In DC RUM, does the Failures (total) metric mean application request failures? Does it include the operation count? I know it is a combination of TCP, transport, and application failures, so does it mean complete client-to-server request failures? Please help me understand.
When you add a dimension or measure to a DMI table, the header will tell you what the definition is. For Failures (total), it reads:
"The total number of failures, that is all Failures (transport) + all Failures (TCP) + all Failures (application)"
So, this metric does not include operation count. You would want to plot this with operation count to compare how many operations experienced a failure, whether at the TCP, Transport, or Application level of the IP stack. Also, this means client <-> server, bidirectionally.
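To make the arithmetic concrete, here is a rough Python sketch of how the total relates to the three failure metrics and how you might set it against operation count. The class name, field names, and numbers are made up for illustration; they are not DC RUM identifiers or real data, and DMI does this for you when you put both measures in the same table or chart.

```python
from dataclasses import dataclass


@dataclass
class IntervalMetrics:
    """One reporting interval; all field names are hypothetical."""
    operations: int
    failures_tcp: int
    failures_transport: int
    failures_application: int

    @property
    def failures_total(self) -> int:
        # Mirrors the definition quoted from the DMI header:
        # transport + TCP + application. Operations are not part of the sum.
        return (self.failures_transport
                + self.failures_tcp
                + self.failures_application)

    def failures_per_100_operations(self) -> float:
        # Simple ratio for comparison only; guard against an idle interval.
        if self.operations == 0:
            return 0.0
        return 100.0 * self.failures_total / self.operations


# Invented sample values, only to show the calculation.
sample = IntervalMetrics(operations=2000,
                         failures_tcp=12,
                         failures_transport=5,
                         failures_application=3)
print(sample.failures_total)
print(sample.failures_per_100_operations())
```

Treat the ratio as a rough comparison rather than a strict percentage, since a failure at the TCP or transport level may not correspond to a counted operation.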
Let me know if you need further clarification 🙂
Thanks! I understand the failure description. But in a real scenario, the app team validated the failure details, and later they said they couldn't see any failure messages, errors, or exceptions at the app level or on the server end. I then enabled the TCP, Transport, and Application failure metrics, and the results show TCP and Transport failures. So could these be network packet drops? And does that mean those requests never reach the app server level?
The team didn't see issues because issues in the TCP and Transport layers won't reach the application layer. Those issues are handled below the application, by the network stack and the NICs, so the app will never be aware of them. This is the purpose of DC RUM, because otherwise there is no way to monitor what's happening on the network unless you're sitting between the two NICs. TCP Failures could be caused by packet drops. Transport Failures are one level higher and deal with the connection established between the two servers. You can google a list of Transport Failures in the TCP stack to see what they could be.
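If it helps, the reasoning above can be written down as a small Python sketch: look at which layers report failures, and only expect the app team to see something when the application count is non-zero. The layer keys and counts here are invented for illustration and are not DC RUM field names or real results.

```python
from typing import Dict, List


def layers_with_failures(counts: Dict[str, int]) -> List[str]:
    """Return the layers that reported at least one failure."""
    return [layer for layer, n in counts.items() if n > 0]


def expected_in_app_logs(counts: Dict[str, int]) -> bool:
    """Only application-level failures should be visible to the app team."""
    return counts.get("application", 0) > 0


# Invented counts resembling the situation described in this thread.
observed = {"tcp": 14, "transport": 6, "application": 0}

print(layers_with_failures(observed))
print(expected_in_app_logs(observed))
```

With counts like these, failures show up only at the TCP and transport layers, which matches the app team finding nothing in their logs.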