In the Infrastructure dashlet I see detailed information for Java and web server agents, such as "Memory is unhealthy" or a CPU issue, but I do not see the same information for host agents when a host becomes unhealthy.
Can someone please explain why we are not seeing detailed information for host agents when a host becomes unhealthy?
For Host Memory to be shown as unhealthy in the details, both of the conditions mentioned below (Min Avail Value and Min Avail Percent) have to be met; alternatively, the Max Page Faults threshold will also trigger an alert on its own.
Note: To configure the thresholds, right-click the Host group --> Configure Host group --> Thresholds
In your case, I assume the application agents detected the memory issue by evaluating the above conditions, while on the host the conditions might not have been met.
I tweaked the thresholds in order to reproduce the issue and could see issues being shown for all the OS metrics: CPU, Memory and Disk.
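The memory rule described above can be sketched in a few lines of Python. This is only an illustration of the AND/OR logic, not AppMon's actual implementation: the names `MemoryThresholds` and `memory_unhealthy` are hypothetical, the threshold values are placeholders, and the use of strict comparisons is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MemoryThresholds:
    # Hypothetical field names; the values used below are placeholders,
    # not confirmed AppMon defaults.
    min_avail_mb: float
    min_avail_percent: float
    max_page_faults: float


def memory_unhealthy(avail_mb, avail_percent, page_faults, t):
    """Return True when the rule described above would flag memory."""
    # Both availability conditions must be violated together...
    low_memory = avail_mb < t.min_avail_mb and avail_percent < t.min_avail_percent
    # ...or the page fault threshold is exceeded on its own.
    heavy_paging = page_faults > t.max_page_faults
    return low_memory or heavy_paging


thresholds = MemoryThresholds(min_avail_mb=1024, min_avail_percent=5, max_page_faults=1000)

# Only the absolute value is below its threshold, so nothing is flagged.
print(memory_unhealthy(500, 20, 10, thresholds))
# Both availability conditions are violated, so memory is flagged.
print(memory_unhealthy(500, 3, 10, thresholds))
```

Under this reading, a host with little free memory in absolute terms can still be reported as healthy as long as the free percentage stays above its threshold, which would explain why an application agent and a host agent can disagree.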
Thank you for your answer. But on the host I am talking about, the /tmp filesystem disk utilization was at 100% for at least 90 minutes. I didn't see a memory unhealthy message in the details, and I also didn't see the /tmp filesystem coming to the top of the "Disks" section for this host.
Correction: it's Disk status, not memory. Are you able to see the /tmp mount point under Infrastructure Health of the host --> Disks? If you are not able to see /tmp, please paste snapshots of Server Settings --> Infrastructure --> Exclusions, as well as the file system configuration at the server level.
Thanks for sharing the snaps. I can see there is no issue with monitoring of the /tmp mount point: it is included, and /tmp is mounted on a file system, with the file system chosen as "any" in the AppMon settings. Hence, there is no issue with that.
Can you share one more snap, of the Threshold config? It will help me determine the alert condition as well as the health status.
Under Infrastructure Health --> Host Group --> (Right-click) Configure Host Group --> Thresholds
It took me some time to reply, as I was working on this with the vendor and the Linux team. Are you able to see the /tmp directory in the Disks health of Infrastructure, as in the one below?
If you are not able to see the /tmp directory there, please paste the output of the command below.
# cat /proc/mounts --> this command shows whether the /tmp directory is mounted or not.
If it is mounted, the output will contain an entry like this:
/dev/mapper/centos-tmp /tmp ext4 rw,seclabel,nosuid,nodev,relatime,data=ordered 0 0
You won't find an entry like the above if it's not mounted. In that case, work with the Linux admin to get that directory mounted.
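For anyone who prefers to script this check, here is a minimal Python sketch that looks for a mount point in `/proc/mounts`-style text. The helper name `find_mount` is made up for illustration; the sample line is the one quoted above.

```python
def find_mount(mounts_text, mount_point):
    """Return the /proc/mounts fields for mount_point, or None if absent."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line reads: device mount_point fs_type options dump pass
        if len(fields) >= 4 and fields[1] == mount_point:
            return {
                "device": fields[0],
                "fs_type": fields[2],
                "options": fields[3].split(","),
            }
    return None


sample = "/dev/mapper/centos-tmp /tmp ext4 rw,seclabel,nosuid,nodev,relatime,data=ordered 0 0\n"
entry = find_mount(sample, "/tmp")
print(entry["device"], entry["fs_type"])

# On a Linux host, read the live mount table instead:
# with open("/proc/mounts") as f:
#     print(find_mount(f.read(), "/tmp"))
```

A result of `None` corresponds to the "not mounted" case, meaning /tmp is just a directory on another file system rather than its own mount.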
Thanks for the info. I have just tested by filling the /tmp directory on the server to 100% and could see it start to be reflected in the health on Infrastructure.
And Health overview
I suspect the issue might have already resolved by the time you were verifying the health.
What I see is that Dynatrace AppMon does detect disk issues for disks that are mounted on the server.
If you are fine with testing the health detection, alter the Memory threshold by doubling the values, to Memory Free 2048 MB and Memory Free % 10.
Thanks for your inputs. I guess the same: the issue might have resolved by the time I started looking into it. What I wonder is this: when I set the timeframe to when the issue happened, I don't see /tmp at 100%, but when I chart the same timeframe for that host, I do see the disk at 100%.
So does AppMon automatically remove the unhealthy info in the Disks section, even if I go back to that timeframe?
If you are looking at the Infrastructure health overview, then yes, AppMon doesn't show the disk as unhealthy even though you have selected the incident period as the timeframe.
However, in the host information dashlet you are able to see the Heat Field for a particular incident timeframe, 12:20 to 12:43 (red line of History). The snip below shows the same.
Yeah, that's what I have seen. I can see the red line for that timeframe but not the actual disk showing as full. So Dynatrace automatically removes the info once the host has returned to a healthy state.
I really appreciate your help Ravi. Thanks a lot for your time and findings.