Problem description

100% CPU usage is reported in the Console under Devices Management and in the CPU Utilization section of the /var/log/adlex/rtm_perf.log file.

Remember that the RUM Console reports CPU Utilization as the value of the single most heavily used core.
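As a rough illustration of why the Console can show 100% while the machine as a whole is far from saturated, the following sketch compares the busiest-core value with the average across cores. The per-core numbers are made up for the example and do not come from a real AMD.

```python
# Hypothetical per-core utilization samples, in percent (made-up numbers).
per_core = [12.0, 100.0, 8.0, 5.0]

# Value the Console would report according to the note above: the busiest core.
reported = max(per_core)

# Average across all cores, for comparison.
average = sum(per_core) / len(per_core)

print(f"reported={reported:.0f}% average={average:.2f}%")
```

In this made-up case a single saturated core is enough to produce a 100% reading, even though the other cores are mostly idle.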

Solutions checklist

Any of the following actions could resolve the problem. The steps are ordered from the most likely solution to the least likely, and we recommend that you perform them in the given sequence. Before you start, we recommend that you back up your current configuration. See Backup and Recovery Procedures for details. If none of the steps resolves the problem, contact our support team.

1. Assign additional threads to the SSL analyzer.

One possible cause is that SSL traffic is being decrypted by too few cores (one by default). To check this, open your /var/log/adlex/rtm_perf.log file and find the section named Packets used by analyzers:

[...]
RT              Packets used by analyzers:
RT                      0:0 1:0 2:68239 3:93781 4:0 5:1784 6:0 7:0 8:0 9:0 
RT                      10:0 11:0 12:0 13:0 14:0 15:0 16:0 17:0 18:0 19:0 
RT                      20:0 21:0 22:0 23:0 24:0 25:0 26:0 27:0 28:0 29:0 
RT                      30:0 31:0 32:0 33:0 34:0 35:0 36:0 37:0 38:0 39:0 
RT                      40:0 41:0 42:0 43:0 44:0 45:0 46:0 47:0 48:0 49:0 
RT                      50:0 51:0 52:0 53:0 54:0 55:0 56:0 57:0 58:0 59:0 
RT                      60:0 61:0 62:0 63:0 64:0 65:0 66:0 67:0 68:0 69:0 
RT                      70:0 71:0 72:0 73:0
[...]
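The section above can also be read programmatically. The following is a minimal sketch, assuming the section always consists of a header line followed by lines of id:count pairs prefixed with RT, as in the excerpt. The function name packets_per_analyzer is hypothetical, and the sample is an abbreviated copy of the excerpt.

```python
import re

# One "analyzer_id:packet_count" token, e.g. "3:93781".
PAIR = re.compile(r"^(\d+):(\d+)$")

def packets_per_analyzer(log_text):
    """Return {analyzer_id: packet_count} from the first matching section."""
    counts = {}
    in_section = False
    for line in log_text.splitlines():
        if not in_section:
            # Skip everything up to and including the section header.
            in_section = "Packets used by analyzers:" in line
            continue
        tokens = line.split()
        if tokens and tokens[0] == "RT":
            tokens = tokens[1:]  # drop the "RT" line prefix
        matches = [PAIR.match(t) for t in tokens]
        if not tokens or not all(matches):
            break  # first line that is not made of id:count pairs ends the section
        for m in matches:
            counts[int(m.group(1))] = int(m.group(2))
    return counts

# Abbreviated copy of the excerpt above.
sample = """\
RT              Packets used by analyzers:
RT                      0:0 1:0 2:68239 3:93781 4:0 5:1784 6:0 7:0 8:0 9:0
RT                      10:0 11:0 12:0 13:0 14:0 15:0 16:0 17:0 18:0 19:0
[...]
"""

counts = packets_per_analyzer(sample)
# Analyzers that actually received packets, busiest first.
busiest = sorted(((a, n) for a, n in counts.items() if n > 0),
                 key=lambda item: item[1], reverse=True)
print(busiest)
```

For the excerpt this lists analyzers 3, 2 and 5, in that order, as the only ones receiving packets.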

Analyzer #3 handles SSL-encrypted traffic. As mentioned above, by default it uses just one core. You can force it to use a specified number of cores by adding the following line to your /usr/adlex/config/rtm.config file:

ssl.engine.param=threads:4

Four cores should be enough even for a large amount of traffic.
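If you prefer to script the change, the following sketch sets a key in configuration text. It assumes rtm.config consists of plain key=value lines and that an existing ssl.engine.param line can be replaced as a whole. Both are assumptions, so check the existing line for other parameters before overwriting it. The function name set_config_value and the other keys and values in the examples are hypothetical.

```python
def set_config_value(config_text, key, value):
    """Return config_text with key=value set.

    Every existing line for the key is replaced; if there is none,
    a new line is appended at the end.
    """
    new_line = f"{key}={value}"
    lines = config_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if line.strip().startswith(key + "="):
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"

# Hypothetical config contents, used only to demonstrate both cases.
appended = set_config_value("some.other.key=1\n", "ssl.engine.param", "threads:4")
updated = set_config_value("ssl.engine.param=threads:1\n", "ssl.engine.param", "threads:4")
print(appended, updated, sep="")
```

The sketch works on strings only. Reading and writing the actual file, ideally after making a backup copy, is left to the caller.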

Remember to restart the rtm service after making this (or any other) change to your rtm.config:

service rtm restart

You should now see the SSL decryption workload distributed across the first four cores in the system.

See the DC RUM documentation for more information on multi-core and multithreaded processing on the AMD.

2. Enable the custom driver on the AMD.

A second possible cause applies when the high CPU usage comes from the ksoftirqd daemon. This is likely a consequence of using the native driver on the AMD while filtering traffic coming from the sniffing NICs (the AMD configuration filters out packets that do not belong to monitored software services but are still present on the wire).
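One way to check whether ksoftirqd threads are the ones consuming CPU is to inspect the output of ps -eo pcpu,comm. The sketch below parses such output; the sample text is made up, and the function name ksoftirqd_usage is hypothetical. Note that ps reports CPU usage averaged over the lifetime of each process, so treat the numbers as a rough indication and use top for a live view.

```python
def ksoftirqd_usage(ps_output):
    """Return [(thread_name, cpu_percent)] for ksoftirqd threads, busiest first."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)  # "%CPU" column, then the command name
        if len(parts) != 2 or not parts[1].startswith("ksoftirqd/"):
            continue  # header line or an unrelated process
        try:
            rows.append((parts[1].strip(), float(parts[0])))
        except ValueError:
            continue  # first column was not a number
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Made-up output in the format produced by: ps -eo pcpu,comm
sample_ps = """\
%CPU COMMAND
 0.1 systemd
97.3 ksoftirqd/0
 1.2 ksoftirqd/1
 0.0 sshd
"""

usage = ksoftirqd_usage(sample_ps)
print(usage)
```

On a real system the input could be obtained with subprocess.run(["ps", "-eo", "pcpu,comm"], capture_output=True, text=True).stdout.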

In this case, the high CPU usage should improve simply by enabling the custom driver on the AMD. This requires a NIC that is compatible with our custom driver.

To enable the custom driver on the AMD, make these changes in the /usr/adlex/config/rtm.config file:

driver.shm=true
force.native.drv=false

You then need to reboot the whole AMD machine (the Linux OS, not just the AMD processes) so that the shared memory driver is loaded into non-fragmented space in RAM.
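Before rebooting, it may be worth confirming that both settings are actually present in the file. The following is a minimal sketch, assuming rtm.config consists of key=value lines and that lines starting with # are comments (the comment syntax is an assumption). The function names are hypothetical.

```python
# The two settings from this step and their required values.
REQUIRED = {"driver.shm": "true", "force.native.drv": "false"}

def parse_config(config_text):
    """Parse key=value lines into a dict, skipping blanks and '#' lines."""
    settings = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings

def missing_driver_settings(config_text):
    """Return the required settings that are absent or have another value."""
    settings = parse_config(config_text)
    return {k: v for k, v in REQUIRED.items() if settings.get(k) != v}

# Hypothetical config contents for demonstration.
ok = missing_driver_settings("driver.shm=true\nforce.native.drv=false\n")
bad = missing_driver_settings("driver.shm=true\nforce.native.drv=true\n")
print(ok, bad)
```

An empty result means both settings are in place. Otherwise the returned dict lists the values that still need to be set.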

 

FOR AMD HS:

Three recommendations that might help with high CPU utilization:

Install AMD 17.0.3.60, which contains a set of rtmgate memory and performance tweaks (APMODCRUMM-9230).
Enable Software load balancing mode and change First affinity CPU to 1 (APMODCRUMM-9160).
Verify the sequence number gap rate, because it might cause high CPU utilization (APMODCRUMM-9011).

 

What to do next

If none of the provided solutions resolves the issue, collect the AMD diagnostic information (DC RUM Console -> Manage devices -> (select the proper AMD server) -> Export diagnostic information) and contact our Support team.