31 Oct 2024 07:39 AM - last edited on 09 May 2025 03:00 PM by AgataWlodarczyk
Minimize the chance of human mistakes by automating the correlation of problem details.
Is your website malfunctioning, leaving users frustrated and unable to complete their tasks due to persistent errors? Such an ongoing issue will clearly have a significant impact on your business.
Without proper observability, you might not even be aware of these problems. You need proactive, real-time insights about issues with your services and an automated way to identify their root causes.
Details about the impact, severity, and root cause of problems can be scattered across metrics, logs, or traces. That's why, to troubleshoot effectively, you need all these signals in a single observability platform.
Using Dynatrace can significantly reduce your mean time to identify (MTTI) for critical issues, so you can fix them before they affect customer experience and minimize the business impact of outages. By consolidating all signals into a single observability platform, you also reduce the risk of human error from manually correlating problem details.
How is the problem tackled?
Let's use the OpenTelemetry demo app, which includes a feature flag that, when activated, generates critical errors. As logs are collected, the OpenTelemetry Collector enriches them with deployment metadata. This process tags important logs and adds support information to them.
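To make the enrichment idea concrete, here is a minimal Python sketch of what "enriching logs externally" could look like in principle. It is not the OpenTelemetry Collector's configuration or API. The attribute names (`owner.team`, `support.runbook`, `alert.candidate`), the sample values, and the `enrich_log` helper are all hypothetical and chosen only for illustration.

```python
from typing import Any

# Hypothetical deployment metadata. In the demo, this kind of information
# would be supplied by the collector, not hard-coded in Python.
DEPLOYMENT_METADATA = {
    "owner.team": "checkout-team",            # assumed ownership attribute
    "support.runbook": "runbook-placeholder",  # assumed support reference
}

IMPORTANT_SEVERITIES = {"ERROR", "FATAL"}


def enrich_log(record: dict[str, Any], metadata: dict[str, str]) -> dict[str, Any]:
    """Return a copy of the log record with metadata merged into its attributes.

    Records with an important severity are additionally tagged so that a
    downstream pipeline can pick them out. The application itself is untouched.
    """
    attributes = {**record.get("attributes", {}), **metadata}
    if record.get("severity") in IMPORTANT_SEVERITIES:
        attributes["alert.candidate"] = "true"  # hypothetical tag name
    return {**record, "attributes": attributes}


raw_error = {
    "severity": "ERROR",
    "body": "request failed",
    "attributes": {"service.name": "cartservice"},
}
raw_info = {"severity": "INFO", "body": "request ok", "attributes": {}}

enriched_error = enrich_log(raw_error, DEPLOYMENT_METADATA)
enriched_info = enrich_log(raw_info, DEPLOYMENT_METADATA)
```

The point of the design is that the original record is never modified: the metadata is layered on during collection, so the application code does not need to change.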
Then, all this data is sent to Dynatrace, where the OpenPipeline looks for specific logs enriched with metadata to generate events. This leads to a problem being identified with full context, including ownership details, links to additional support information, and the ability to drill down for further analysis.
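The matching step can be pictured with a small Python sketch. It only illustrates the idea of turning tagged, metadata-rich logs into events and is not OpenPipeline's actual configuration or processing logic. The attribute names (`alert.candidate`, `owner.team`, `support.runbook`), the event fields, and the `logs_to_events` helper are hypothetical.

```python
from typing import Any


def logs_to_events(logs: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Create one event for every log that carries the (hypothetical) alert tag.

    Ownership and support metadata are copied onto the event so that the
    resulting problem already has its context attached.
    """
    events = []
    for log in logs:
        attrs = log.get("attributes", {})
        if attrs.get("alert.candidate") != "true":
            continue  # ordinary logs do not produce events
        events.append(
            {
                "title": f"Critical error in {attrs.get('service.name', 'unknown')}",
                "description": log.get("body", ""),
                "owner": attrs.get("owner.team"),
                "runbook": attrs.get("support.runbook"),
            }
        )
    return events


# Sample input: logs that were already enriched before ingestion.
sample_logs = [
    {
        "body": "request failed",
        "attributes": {
            "service.name": "cartservice",
            "alert.candidate": "true",
            "owner.team": "checkout-team",
            "support.runbook": "runbook-placeholder",
        },
    },
    {"body": "request ok", "attributes": {"service.name": "cartservice"}},
]

events = logs_to_events(sample_logs)
```

Because the context travels with the log, whoever picks up the resulting problem does not have to look up the owning team or the support material by hand.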
Go to the playground and try it yourself.
In this proposed solution, instead of altering the app, we enriched logs externally. This allowed us to resolve problems swiftly and prevent any negative impact on customer experience, thus minimizing business disruptions.
We've gathered some useful materials below to help you try hands-on problem detection in your environment.
You can find a thorough description of log-based problem detection in the documentation and a hands-on demo on GitHub Codespaces.
To gain more context and see this solution in action, watch the end-to-end self-service use case based on the OpenTelemetry Demo app.