
Memory related monitoring for Linux systems

Hi,

May I know effective ways to monitor memory-related KPIs without causing alert fatigue or too much noise?

Possible options I could think of are:

1. OOTB Memory saturation (combination of memory used % and page faults per sec)

2. builtin:host.mem.usage:filter using metric event (metric key)

3. Swap space used % using metric event (metric selector)

Anything else apart from this?

Regards,

Srikanth

6 REPLIES

Esam_Eid
Advisor

Hi @SrikanthSamraj,

Process memory could also be an option, so you can monitor whether a particular process is consuming more memory.

Regards,
Esam 

Thanks @Esam_Eid. Would this require a specific process name to be called out explicitly in the metric event?

Yes. If you suspect that one of the processes is eating memory/CPU resources and you want to keep an eye on it, then you need to specify the process name.

@Esam_Eid - In our case, we don't know the exact process name, and we don't intend to focus on any specific process; we want the monitoring to be generic.

ChadTurner
DynaMight Legend

My question would be: what is the issue driving the need? I always recommend letting Dynatrace use the AI to baseline and determine what needs to be alerted on. If that doesn't meet your need, then look at adjusting it or putting in the custom metric segment. But without understanding the pain point, I can't give you a rock-solid solution.

-Chad

At present, we monitor swap space used % through a metric event (we are not using the predefined event type "Memory saturation").

The client requirement is to monitor physical memory in correlation with swap space usage.

I recently became aware that such correlation is possible using the Davis AI anomaly detector, but I don't have access to this module yet.

While we're working on the access part, I'm keen to know if there are any other ways of monitoring memory on Linux servers, apart from "Memory saturation" and metric events (swap space, memory used %), to avoid server outages due to spikes in memory.
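
To make the requirement concrete, here is a rough sketch of what "physical memory in correlation with swap" could look like as a check on the host itself. This is not a Dynatrace feature or API, just plain Python over text in the /proc/meminfo format. The function names and the 90% / 50% thresholds are placeholders I made up for illustration, and it assumes a kernel that reports MemAvailable. The idea is to raise an alert only when both conditions hold at once, which should be less noisy than alerting on either metric alone.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_pressure(info, mem_threshold=90.0, swap_threshold=50.0):
    """Return (mem_used_pct, swap_used_pct, alert).

    Thresholds are hypothetical defaults, not recommended values.
    alert is True only when BOTH memory and swap usage exceed their limits.
    """
    mem_total = info["MemTotal"]
    mem_available = info["MemAvailable"]  # assumes the kernel reports this field
    mem_used_pct = 100.0 * (mem_total - mem_available) / mem_total

    swap_total = info.get("SwapTotal", 0)
    swap_free = info.get("SwapFree", 0)
    # Hosts without swap report SwapTotal as 0; treat that as 0% used.
    swap_used_pct = (
        100.0 * (swap_total - swap_free) / swap_total if swap_total else 0.0
    )

    alert = mem_used_pct >= mem_threshold and swap_used_pct >= swap_threshold
    return mem_used_pct, swap_used_pct, alert


# Made-up sample in the /proc/meminfo format, for demonstration only.
SAMPLE = """MemTotal:        8000000 kB
MemAvailable:     400000 kB
SwapTotal:       2000000 kB
SwapFree:         500000 kB
"""

mem_pct, swap_pct, alert = memory_pressure(parse_meminfo(SAMPLE))
print(f"memory used: {mem_pct:.1f}%, swap used: {swap_pct:.1f}%, alert: {alert}")
```

On a real Linux host, the contents of /proc/meminfo would be passed to parse_meminfo instead of the sample string. High memory usage with little swap in use would not trigger the alert here, which is the kind of correlation the client is asking for.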
