The customer's AMDs are in Sampling mode, probably due to network overload; they're dropping a huge number of packets and can't recover.
So, my question is about the best way to provide evidence of network overload on the AMD interfaces, to "certify" the need for AMD Sampling.
Thanks a lot in advance,
When an AMD is overloaded (dropping packets), there are a couple of places I start.
First, open an SSH session (PuTTY or similar) on the AMD and run the top command. Press the "1" key to show all of the CPUs on the system. Look at CPU 0, the si column. Is it above 90%? That tells us too many packets are hitting the interface and the AMD cannot process them. CPU 0 has to process the interrupts on the system, including NIC interrupts.
Second, with top displaying the cores, is one of the cores above 90%? This indicates that one of the analyzers, or the auto-detect process, is overloaded. Review the capacity recommendations for the different types of analyzers. You can create a simple DMI report showing packets/analyzer to help determine which analyzer may be overloaded.
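If you need something you can attach to a ticket rather than a screenshot of top, the two checks above can be approximated by sampling /proc/stat twice and computing the same percentages. This is only a rough sketch: it assumes the AMD exposes the standard Linux /proc/stat layout, and the function names and the 90% threshold are illustrative, not part of any DC RUM tooling.

```python
import time


def parse_proc_stat(text):
    """Return {cpu_name: [jiffies counters]} for the per-CPU lines of /proc/stat."""
    samples = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-CPU lines look like: cpu0 user nice system idle iowait irq softirq steal ...
        # The aggregate "cpu" line is skipped on purpose.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            samples[fields[0]] = [int(v) for v in fields[1:]]
    return samples


def cpu_percentages(before, after):
    """Compute softirq ("si") and busy percentages per CPU between two samples."""
    result = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None or len(new) < 7 or len(old) < 7:
            continue
        deltas = [n - o for n, o in zip(new, old)]
        # Only the first eight counters are summed: guest time is already
        # included in user time, so adding it would count it twice.
        total = sum(deltas[:8])
        if total <= 0:
            continue
        idle = deltas[3] + deltas[4]  # idle + iowait
        result[cpu] = {
            "si": 100.0 * deltas[6] / total,       # softirq share, top's "si" column
            "busy": 100.0 * (total - idle) / total,
        }
    return result


def overloaded_cpus(interval=5.0, threshold=90.0):
    """Sample /proc/stat over `interval` seconds and return CPUs above `threshold`.

    Both defaults are illustrative assumptions, not documented limits.
    """
    with open("/proc/stat") as handle:
        before = parse_proc_stat(handle.read())
    time.sleep(interval)
    with open("/proc/stat") as handle:
        after = parse_proc_stat(handle.read())
    stats = cpu_percentages(before, after)
    return {
        cpu: values
        for cpu, values in stats.items()
        if values["si"] > threshold or values["busy"] > threshold
    }
```

Calling overloaded_cpus() during a busy period and logging the result with a timestamp gives a record you can line up against the drop counts later. A high "si" on cpu0 matches the first check; a high "busy" on another core matches the second.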
Some of the stats are also available in the AMD Statistic diagnostic reports in the CAS. I like to use the AMD Capacity Status report to correlate packet drops with CPU and Memory use.
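For a second, independent data point alongside the CAS reports, the kernel's own per-interface counters can be sampled from /proc/net/dev. Treat this as a sketch under assumptions: it relies on the standard Linux /proc/net/dev layout, the function names are made up for illustration, and drops that happen inside the AMD's own capture path may not show up in these kernel counters at all, so the AMD reports remain the primary source.

```python
def parse_net_dev(text):
    """Return {interface: {"rx_packets": int, "rx_drop": int}} from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines():
        # The two header lines contain no colon; data lines look like "eth0: 123 45 ...".
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 4:
            continue
        # Receive columns are: bytes, packets, errs, drop, ...
        counters[name.strip()] = {
            "rx_packets": int(fields[1]),
            "rx_drop": int(fields[3]),
        }
    return counters


def rx_deltas(before, after):
    """Return per-interface growth in received and dropped packets between two samples."""
    deltas = {}
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue
        deltas[name] = {
            "rx_packets": new["rx_packets"] - old["rx_packets"],
            "rx_drop": new["rx_drop"] - old["rx_drop"],
        }
    return deltas
```

Reading /proc/net/dev at the start and end of the same interval used for the CPU check, and passing both snapshots through parse_net_dev and rx_deltas, shows whether the drop counter is growing while the CPU is saturated.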
This is usually a good place to start. I'm sure others have additional stats/logs to look at as well - but this is where I start my investigation.
What version of DC RUM does the customer use? If they're on 12.3 or below, I recommend they upgrade to 12.4. In 12.4 we improved the traffic diagnostics, so it automatically identifies traffic quality issues and is simpler to use. It is available on CAS from the Tools > Diagnostics > Traffic diagnostics menu. The diagnostics is complemented by a set of redesigned AMD statistics reports (Tools > Diagnostics > AMD statistics), where you can further check whether traffic quality issues are caused by AMD performance (which is hard to determine based only on the traffic stream). There's an educational module on Dynatrace University that covers the new diagnostics. In addition to automatic identification of traffic quality issues, you can also capture a traffic trace and analyze it with Trace Trimmer or any other packet level analysis tool of your choice, for example, Dynatrace Network Analyzer.
Yep, the customer is on 12.3. They plan to upgrade to 12.4.x, but not soon. I've already had the chance to "play around" with the Traffic diagnostics in 12.4, and they provide good insights into traffic quality. So my question was aimed more at the 12.3 releases while we wait for the upgrade.