Solved: Re: CPU I/O wait

Babar_Qayyum · ‎08 Jun 2022

Dear All,

What exactly is CPU I/O wait, particularly with reference to Kubernetes master/worker nodes?

Regards,

Babar

AntonioSousa · ‎08 Jun 2022

I/O wait is the time your processor sits idle while at least one process it is scheduling is blocked waiting for I/O, typically disk. It does not necessarily mean that the CPU cannot do other things, but in the case of your graph, it seems that the processes that are running (one or more) are waiting for I/O.
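For reference, on a Linux host the same figure can be approximated by hand from `/proc/stat`. The following is only a rough sketch, assuming a Linux node with the standard `/proc/stat` layout (user, nice, system, idle, iowait, irq, softirq, steal, ...); the function names are made up for illustration, and this is not how Dynatrace itself computes the metric.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples, in percent."""
    deltas = [b - a for a, b in zip(before, after)]
    # Only the first eight counters are summed: guest and guest_nice
    # are already included in user and nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (the aggregate over all CPUs)."""
    with open(path) as stat_file:
        return stat_file.readline()


def sample_iowait(interval=1.0):
    """Take two samples `interval` seconds apart and return the iowait percentage."""
    before = parse_cpu_line(read_cpu_line())
    time.sleep(interval)
    after = parse_cpu_line(read_cpu_line())
    return iowait_percent(before, after)
```

Calling `sample_iowait(5.0)` on the affected node during the problem window would give a number comparable to the graph. The counters are host-wide, so they say nothing about which process is waiting.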

It might be difficult to find out exactly what is going on. You could try correlating with the logs, or better, check disk activity by process in Dynatrace; you might get a quick clue there...

Finally, on a regular Linux host, you might try to put the disk in debug mode, but I'm not sure it can be done here the way I usually do it in Linux...

Antonio Sousa

Babar_Qayyum · ‎08 Jun 2022

Hello @AntonioSousa

I looked into the disk latency, especially for disk reads, and there is some latency, but the CPU I/O wait started a couple of minutes before this latency. How do you interpret this?

Regards,

Babar

AntonioSousa · ‎08 Jun 2022

@Babar_Qayyum,

On the server that had issues, in the old host view, click "Consuming processes", then the "I/O" tab, and you should be able to figure out which process did the most I/O.
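If you have shell access to the node, a similar ranking can be approximated outside Dynatrace from `/proc/<pid>/io`. This is only a sketch, assuming a Linux host and enough privileges (usually root) to read other users' processes; the function names are hypothetical. Note that these counters are cumulative since each process started, so for a problem window you would need to sample twice and compare.

```python
import os


def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def top_io_processes(limit=5, proc_root="/proc"):
    """Return (total_bytes, pid, name) tuples for the processes with the most disk I/O."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "io")) as io_file:
                counters = parse_proc_io(io_file.read())
            with open(os.path.join(proc_root, entry, "comm")) as comm_file:
                name = comm_file.read().strip()
        except OSError:
            continue  # process exited, or we lack permission to read it
        total = counters.get("read_bytes", 0) + counters.get("write_bytes", 0)
        results.append((total, int(entry), name))
    results.sort(reverse=True)
    return results[:limit]
```

Running `top_io_processes()` as root lists the heaviest readers and writers first. `read_bytes` and `write_bytes` count traffic that actually reached the storage layer, which is the part relevant to I/O wait.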

Antonio Sousa

Babar_Qayyum · ‎09 Jun 2022

Hello @AntonioSousa

There is no I/O shown during the problem; the only thing that stands out is the maximum CPU used by "Other processes".

Regards,

Babar

AntonioSousa · ‎09 Jun 2022

OK, so it seems that the "Other processes" are grabbing the I/O. Does this happen often?

Antonio Sousa

Babar_Qayyum · ‎09 Jun 2022

Hello @AntonioSousa

No. It recently happened with one of the OpenShift clusters.

Regards,

Babar

AntonioSousa · ‎09 Jun 2022

@Babar_Qayyum,

You could try to put the filesystem in debug mode to find out who is accessing it, but that of course generates a massive amount of data. Given that you are not able to replicate the issue, that would make it more difficult. But it seems the disk usage is low, so it might be possible?

Antonio Sousa