Explore our self-service use case for problem detection with logs

GosiaMurawska · 31 Oct 2024

Minimize the chance of human mistakes by automating the correlation of problem details.

Is your website malfunctioning, leaving users frustrated and unable to complete their tasks due to persistent errors? Such an ongoing issue will clearly have a significant impact on your business.

Without proper observability, you might not even be aware of these problems. You need proactive, real-time insights about issues with your services and an automated way to identify their root causes.

Details about the impact, severity, and root cause of problems can be scattered across metrics, logs, or traces. That’s why, to effectively troubleshoot, you need all these signals in a single observability platform.

Using Dynatrace can significantly reduce your mean time to identify (MTTI) for critical issues, allowing you to fix them before they affect customer experience and minimizing the business impact of outages. By consolidating all signals into a single observability platform, you reduce the risk of human error from manually correlating problem details.

How is the problem tackled?

Let’s use the OpenTelemetry demo app, which includes a feature flag that, when activated, generates critical errors. We enrich the deployment information so that, as logs are collected, the OpenTelemetry Collector adds metadata to them. This process tags important logs and enriches them with support information.
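To make the enrichment step concrete, here is a minimal sketch in plain Python of what "tagging important logs and adding support information" could look like. It is not the OpenTelemetry Collector configuration used in the demo. The attribute names (service.owner, support.runbook, alert.candidate), the example URL, and the matching pattern are all assumptions made for illustration.

```python
import re
from typing import Any

# Hypothetical deployment metadata. In the demo this would come from the
# enriched deployment information, not from a hard-coded dictionary.
DEPLOYMENT_METADATA = {
    "service.owner": "team-checkout",
    "support.runbook": "https://example.com/runbooks/checkout",
}

# Assumed pattern for the logs that are worth tagging.
CRITICAL_PATTERN = re.compile(r"\b(error|failed|exception)\b", re.IGNORECASE)


def enrich_log(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the log record with metadata attributes added."""
    enriched = dict(record)
    attributes = dict(record.get("attributes", {}))
    # Add ownership and support information to every record.
    attributes.update(DEPLOYMENT_METADATA)
    # Tag the record if its body matches the critical pattern.
    body = record.get("body", "")
    attributes["alert.candidate"] = bool(CRITICAL_PATTERN.search(body))
    enriched["attributes"] = attributes
    return enriched
```

The point of the sketch is that the application's own log statement stays untouched: the extra context is attached while the record is in transit.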

Then, all this data is sent to Dynatrace, where OpenPipeline looks for the specific logs enriched with metadata and generates events from them. This leads to a problem being identified with full context, including ownership details, links to additional support information, and the ability to drill down for further analysis.
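The matching step can be sketched in the same way. The function below is a plain Python illustration of the idea, not OpenPipeline configuration or any Dynatrace API. It assumes a hypothetical alert.candidate attribute marks the logs of interest and that ownership and runbook details travel with the record under the hypothetical keys service.owner and support.runbook.

```python
from typing import Any, Optional


def log_to_event(record: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Build an event from a tagged log record, or return None if untagged."""
    attrs = record.get("attributes", {})
    # Only records tagged during enrichment produce an event.
    if not attrs.get("alert.candidate"):
        return None
    service = attrs.get("service.name", "unknown service")
    return {
        "title": f"Critical log in {service}",
        "description": record.get("body", ""),
        # Context carried over from the enriched log record.
        "owner": attrs.get("service.owner"),
        "runbook": attrs.get("support.runbook"),
    }
```

Because the event copies its context from the log attributes, whoever picks up the problem sees the owner and the support link without having to correlate them by hand.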

Go to the playground and try it yourself.

In this proposed solution, instead of altering the app, we enrich logs externally. This makes it possible to resolve problems swiftly and limit the negative impact on customer experience, thus minimizing business disruption.

We’ve gathered some useful materials below to help you try hands-on problem detection in your environment.

You can find a thorough description of log-based problem detection in the documentation, and a hands-on demo on GitHub Codespaces.

To gain more context and see this solution in action, watch the end-to-end self-service use case based on the OpenTelemetry Demo app.