Consider this scenario: your z/OS monitoring is not working as expected, and you would like to do an initial investigation before opening a support ticket, but you have no idea how to read the logs.
Don't fret; I got you. 😉
I've compiled some common issues that I've experienced, and I'll guide you through troubleshooting them. If your problem isn't listed here, leave a comment and I'll help you as much as I can.
This is a 2-part blog. In this first part, I'll cover the basics: the logs, how to extract them, and the important information they contain.
Downloading the Support Archive
To start troubleshooting, you need logs. Whether your problem is on the zRemote or the zDC, it is always a good idea to have full visibility of what's happening. To download your z/OS logs, please check this page: https://community.dynatrace.com/t5/Troubleshooting/How-to-download-the-z-OS-OneAgent-diagnostics-log...
How can I check the logs that are related to my problem?
The first obstacle you might run into is the sheer number of logs available to read. Depending on your setup, the logs could be huge and overwhelming. But there is a way to filter them.
Extract the support_archive folder and search for the keyword:
After that, look for this line. It specifies the location of the zLocal agent logs and the accompanying zRemote logs.
YYYY-MM-DD HH:MM:SS UTC [00000000] info [native] Log info data[zLocalAgent logfile=/location/zlocal.log, zRremoteAgent logfile=location/zremote.log, zRemoteAgentName=oneagentname, zRemoteAgent TimeStamp=YYYY-MM-DD HH:MM:SS]
Why do you need to do this? You might have multiple zRemotes, zDCs and zLocals. This way, you know which logs are related to each other, and you won't get confused.
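If you have many files to go through, a small script can do the matching for you. Here is a minimal sketch in Python, assuming the line looks like the sample above (key=value pairs separated by commas inside the brackets after "Log info data"). The function names `parse_log_info` and `find_log_pairs` are my own and are not part of any Dynatrace tool.

```python
import re
from pathlib import Path

# Assumption: the fields sit inside the brackets that follow "Log info data",
# as in the sample line from this post.
LOG_INFO = re.compile(r"Log info data\[(?P<body>.*)\]")


def parse_log_info(line):
    """Return the key=value pairs of a 'Log info data' line, or None."""
    match = LOG_INFO.search(line)
    if match is None:
        return None
    pairs = {}
    for field in match.group("body").split(","):
        key, sep, value = field.partition("=")
        if sep:  # skip fragments that are not key=value
            pairs[key.strip()] = value.strip()
    return pairs


def find_log_pairs(archive_dir):
    """Scan every file under the extracted archive for 'Log info data' lines."""
    results = []
    for path in Path(archive_dir).rglob("*"):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going on files with odd encodings
        with path.open(errors="replace") as handle:
            for line in handle:
                info = parse_log_info(line)
                if info:
                    results.append(info)
    return results
```

Calling `find_log_pairs("support_archive")` on the extracted folder returns one dictionary per matching line, so you can see at a glance which zLocal log belongs to which zRemote log.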
Checking the versions
Now that you've identified the correct logs, you need to verify the versions of your DT mainframe components.
In your zLocal logs, you can search for the keyword:
The most recently installed zLocal version: 220.127.116.118
Why is this important? Most problems that I encounter are fixed by upgrading the mainframe components. The zDC, zLocal and zRemote are connected to each other, so version compatibility between these 3 components is very important.
As a best practice, I recommend keeping the zDC, zLocal and zRemote on the same version, or no more than 3 versions apart. If the zDC is on a higher version than the zRemote, a problem might occur.
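To make the last point concrete, here is a small Python sketch that flags the case where the zDC is on a higher version than the zRemote. It assumes purely numeric, dot-separated version strings such as 1.243.0; if your versions carry extra suffixes, the parsing needs adjusting. Both function names are hypothetical helpers of mine.

```python
def parse_version(text):
    """Turn a dotted version such as '1.243.0' into a tuple of ints.

    Assumption: every part is numeric. Tuples compare part by part,
    so (1, 243, 0) correctly sorts after (1, 9, 0).
    """
    return tuple(int(part) for part in text.strip().split("."))


def zdc_newer_than_zremote(zdc_version, zremote_version):
    """Return True when the zDC is on a higher version than the zRemote."""
    return parse_version(zdc_version) > parse_version(zremote_version)
```

Comparing tuples of integers rather than raw strings matters here, because a plain string comparison would rank "1.9.0" above "1.243.0".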
Exploring the logs
There is a lot of important information that you can see in the logs.
For example, in the zRemote logs, you will be able to see:
1. Your zRemote information
ZRemote number of cpus=n, total memory=nGB
For more information regarding sizing, please check this documentation: https://www.dynatrace.com/support/help/setup-and-configuration/dynatrace-oneagent/installation-and-o...
2. The platform it was built on
Build platform .............. Linux x86 64-bit
3. Tenant ID information
4. Process group IDs
5. IP address for the Server/Collector
6. The available Physical Memory
Physical memory limit: 99.99% available (nnnn of nnnn MB taken)
7. The Active Sensors
[native] ASID[nn], smfID[xxxx], sysid[xxxx], jobName[xxxx ] - ZDTP006I - ZDTP006I ZDTPLT (ICE) compiled on Jun 4 2022 16:31:17 VER 1.243.0 .
[native] ASID[nn], smfID[xxxx], sysid[xxxx], jobName[xxxx ] - ZDTP020I - ZDTP020I Active Sensors: MQ DB2 SOAP CTG DB2Fetch DLI HTTP ZCON .
and so much more...
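If you would rather not read these lines by eye, here is a rough Python sketch that pulls the compiled version and the active sensor list out of the ZDTP006I and ZDTP020I messages. It assumes the messages look like the samples above; the patterns and the function names `compiled_version` and `active_sensors` are my own, so treat them as a starting point.

```python
import re

# Assumption: message shapes copied from the sample lines in this post.
COMPILED_VER = re.compile(r"ZDTP006I.*?\bVER\s+(?P<version>\d+(?:\.\d+)*)")
ACTIVE_SENSORS = re.compile(r"ZDTP020I Active Sensors:\s*(?P<sensors>.*)")


def compiled_version(line):
    """Return the version after 'VER' in a ZDTP006I message, or None."""
    match = COMPILED_VER.search(line)
    return match.group("version") if match else None


def active_sensors(line):
    """Return the sensor names from a ZDTP020I message, or None."""
    match = ACTIVE_SENSORS.search(line)
    if match is None:
        return None
    # The sample message ends with a lone ".", which is not a sensor name.
    return [name for name in match.group("sensors").split() if name != "."]
```

Run both functions over each line of the zRemote log and keep the results that are not `None`.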
In the zLocal and zDC logs, you will see specific mainframe information like the z/OS version, the values inside ZDCSYSIN, LPAR information, SMOs and the number of transactions processed.
I will leave the rest for you to explore. But as a support engineer, these are the things that I verify before starting an investigation. I make sure that I have all the basic information that I need.
In the next blog, we will talk about the common problems that you might encounter and how to troubleshoot them: https://community.dynatrace.com/t5/Troubleshooting/Troubleshooting-your-z-OS-monitoring-Common-Issue...