on 29 Jan 2025 11:31 AM
Dynatrace allows you to automatically push problem notifications to your preferred third-party incident management or ChatOps service. Open problems are continuously updated based on evolving impact and correlating events. To avoid spam notifications, problem notifications are only pushed to third-party systems when problems are initially detected and when they are ultimately resolved.
There are times when notifications don't reach your incident management tool.
In such cases, please follow these troubleshooting steps:
1. Check if the problem was created
Go to the Problems page and use the available filter to look for the problem.
For example, my problem is "Test123".
2. Verify the alerting profile configuration
Ensure that the alerting profile that the notification configuration uses actually matches the problem. If the problem doesn't match the profile's filters, no notification is triggered.
For example, my alerting profile is "JT".
3. Use Data Explorer
Go to Data Explorer and select the metric "Server - Notifications - Problem Notifications (dsfm:server.notifications.problem_notifications)". Filter by the affected notification configuration to see whether an error occurred while sending the notification.
Other useful dimensions for splitting or filtering: notification.type, notification.display_name, notification.reason, http_status_code, alerting_profile.display_name, executed_retries, notification.delivery_status, http_status_class, notification.id, alerting_profile.id, will_retry, problem.status.
More details can be viewed from the preset Dashboard: Davis® health self-monitoring
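If you prefer to check the same metric outside the UI, the sketch below builds a query URL for the Metrics API v2 (/api/v2/metrics/query). It is a minimal illustration, not an official procedure: it assumes the self-monitoring metric is exposed through that API in your environment, and the base URL and the notification name "ServiceNow prod" are hypothetical placeholders. The code only builds the URL; sending the request with an API token is left to you.

```python
from urllib.parse import urlencode

# Metric key taken from the Data Explorer step above.
METRIC_KEY = "dsfm:server.notifications.problem_notifications"


def build_selector(display_name,
                   split_by=("notification.delivery_status", "http_status_code")):
    """Build a metric selector filtered to one notification configuration."""
    # Quotes and tildes need escaping in selector strings; this sketch
    # simply rejects them instead of handling the escaping.
    if '"' in display_name or "~" in display_name:
        raise ValueError("display name contains characters that need escaping")
    dims = ",".join(f'"{d}"' for d in split_by)
    return (f'{METRIC_KEY}'
            f':filter(eq("notification.display_name","{display_name}"))'
            f':splitBy({dims})')


def build_query_url(base_url, display_name, time_from="now-2h"):
    """Return the full Metrics API v2 query URL (base_url is an assumption)."""
    params = {"metricSelector": build_selector(display_name), "from": time_from}
    return f"{base_url.rstrip('/')}/api/v2/metrics/query?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical environment URL and notification name.
    print(build_query_url("https://example.invalid/e/my-env", "ServiceNow prod"))
```

Splitting by notification.delivery_status and http_status_code should show at a glance whether the failed deliveries share one error.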
4. For Managed clusters, check the notification audit log
Download the Cluster support archive and look for the file named "audit.notifications.0.log" (\SupportArchive*.zip\SERVER<nodeId>\logs). Search the file for the problem ID to find the corresponding log entry, for example:
2025-01-28 10:46:57 UTC {"eventType":"SEND","tenantId":"kqu***","userId":"Notification service","userIdType":"SERVICE_NAME","userOrigination":"Notification service (Internal)","sessionId":null,"identity":"P-25019141","identityCategory":"NOTIFICATION","success":false,"timestamp":1738061217224,"message":"distributorType: SERVICE_NOW_EVENT tenantUuid: kqu*** integrationConfigId: 8********98 alertingProfileConfigId: c**********2 reasons: [NEW, NEW_EVENT] status: OPEN problemId: 5970829412515165095_1738060980000V2 nRetries: 0 willRetry: false callDuration: PT0.318733069Sresult: NotificationResult [retryRecommended=false, message=Invalid status code., exception=, deliveryStatus=INVALID_HTTP_STATUS_CODE, response=NotificationResponse [httpStatusCode=404, body={\"success\":false,\"error\":{\"message\":\"Token \\\"c********6\\\" not found\",\"id\":\"\"}}], uri=https://webhook.site/c******6/api/global/em/jsonv2, suppressed=[]]","jsonPatch":null}
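An entry like the one above is a timestamp followed by a JSON object whose "message" field carries the delivery details as plain text. The following sketch pulls out the fields that usually matter for troubleshooting. The field names come from the sample entry; the function name, the regular expressions, and the shortened SAMPLE line are illustrative assumptions, and the real log format may vary between versions.

```python
import json
import re

# Shortened version of the entry shown above, used only for illustration.
SAMPLE = (
    '2025-01-28 10:46:57 UTC {"eventType":"SEND","identity":"P-25019141",'
    '"success":false,"message":"distributorType: SERVICE_NOW_EVENT '
    'reasons: [NEW, NEW_EVENT] status: OPEN '
    'problemId: 5970829412515165095_1738060980000V2 nRetries: 0 '
    'willRetry: false result: NotificationResult [retryRecommended=false, '
    'message=Invalid status code., deliveryStatus=INVALID_HTTP_STATUS_CODE, '
    'response=NotificationResponse [httpStatusCode=404, body={}]]"}'
)


def parse_audit_line(line):
    """Split one audit log line into timestamp, JSON fields, and message details."""
    # Everything before the first " {" is the timestamp; the rest is JSON.
    logged_at, _, payload = line.partition(" {")
    record = json.loads("{" + payload)
    message = record["message"]

    def grab(pattern):
        match = re.search(pattern, message)
        return match.group(1) if match else None

    status_code = grab(r"httpStatusCode=(\d+)")
    return {
        "logged_at": logged_at.strip(),
        "success": record["success"],
        "identity": record["identity"],
        "distributor_type": grab(r"distributorType: (\S+)"),
        "problem_id": grab(r"problemId: (\S+)"),
        "retries": grab(r"nRetries: (\d+)"),
        "will_retry": grab(r"willRetry: (\w+)"),
        "delivery_status": grab(r"deliveryStatus=(\w+)"),
        "http_status_code": int(status_code) if status_code else None,
    }


if __name__ == "__main__":
    for key, value in parse_audit_line(SAMPLE).items():
        print(f"{key}: {value}")
```

In this sample, the receiving endpoint answered with HTTP 404 and Dynatrace did not schedule a retry, which points to the target URL or token on the receiving side rather than to Dynatrace.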
Note: The log entry can be on any one of the cluster nodes, so check the server files of every node.
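Because the entry can be on any node, it can help to scan every copy of the file in the support archive at once. The sketch below does this with Python's standard zipfile module. The function name and the archive filename in the usage example are hypothetical, and the sketch assumes the archive is an ordinary ZIP file with one audit.notifications.0.log per SERVER<nodeId> folder, as described above.

```python
import zipfile


def find_notification_entries(archive_path, needle,
                              log_name="audit.notifications.0.log"):
    """Return (member, line) pairs for every log line that contains needle."""
    hits = []
    with zipfile.ZipFile(archive_path) as archive:
        for member in archive.namelist():
            # Compare only the file name, whatever the folder separator is.
            if member.replace("\\", "/").rsplit("/", 1)[-1] != log_name:
                continue
            with archive.open(member) as handle:
                for raw in handle:
                    line = raw.decode("utf-8", errors="replace")
                    if needle in line:
                        hits.append((member, line.rstrip("\r\n")))
    return hits


if __name__ == "__main__":
    # Hypothetical archive name; the problem ID is the one from the sample entry.
    for member, line in find_notification_entries("SupportArchive.zip", "P-25019141"):
        print(f"{member}: {line}")
```

Searching for either the displayed problem ID (P-25019141 in the sample) or the internal problemId value should work, since both appear in the entry.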
Retries: Dynatrace makes 3 retry attempts if the notification is not sent. The retry strategy (which error codes trigger a retry, the interval, and the retry count) is hardcoded per notification type and can't be configured.
Once you have the error code, you can continue troubleshooting with your internal incident management or middleware team.
If you're still unable to find the issue, please create a ticket with the Dynatrace Support team, including the link to the problem ID and notification profile.