
Take the "Alert of the day" challenge!

laima_vainina
Community Team


Hello Community!  

We're pleased to share a second challenge where you can also win a new and unique badge!

At Dynatrace, we continuously develop a product that can save you time and let your teams focus on innovation. We believe that Dynatrace alerting capabilities provide a lot of options to avoid any delays in your customers' digital journey, but we're curious to hear about your personal experience.

  

Tell us about the alerts that give you the most precise solutions and improve your work. Maybe you have an “Alert saved the day” story? Or just your favorite, most time-saving alert configuration?

Share your expertise with your fellow Community members and let's learn from each other.  

 

Everyone who shares their alert story or screenshot will get a new limited-edition badge and +100 bonus points, which can help you get a higher rank in the Dynatrace Community.

Entries close on May 11th and on that day we will reward all participants with the new badge. 

 

Can’t wait to see your favorite alerts!

13 REPLIES

There are alerts and alerts... Most of the alerts that occur in Dynatrace happen when there is a problem. And that's not normally good news. What's better though is that Dynatrace alerts normally give us excellent MTTI (Mean Time To Identify), which also normally gives us even better MTTRs (Mean Time To Repair) given the mostly available root cause detection.

Given this, one of the types of alerts that I like most is the kind that tells me that bad things might be about to happen, but have not happened yet. For example, a disk that is getting almost full. Such an alert has probably saved the day, simply because the problem didn't happen! It's also at the top of my wish list for Dynatrace: giving me an indication of what might happen in the future!

Karolina_Linda
Community Team

When monitoring our legacy community platform, we found the alerts on user action duration degradation and JavaScript error rate increase extremely helpful. That way, we immediately knew that, for example, our workarounds for particular forum issues weren't following the best development practices 😉 The same applied to overly large banners, images, etc. The things that work for us don't necessarily work for others, so Dynatrace alerts made us aware of issues that we were not seeing at first glance.

Keep calm and build Community!

ChadTurner
Leader

Over the years I have shared multiple use cases and methods with Dynatrace. One that we didn't share centered on 3rd party requests and a rogue firewall change. 

 

Early on in our Dynatrace Journey we were alerted to an increase in failures to our 3rd party vendor. Dynatrace alerted us to this issue because it deviated from our baseline. But keep in mind we were also very new to Dynatrace. Immediately we started to look into the failure rate increase alert and found that at 8am failures went from 0% to 100%. Dynatrace provided us with the URL and the port number that this request was coming from and its 3rd party destination.

 

Immediately we raised this issue to senior staff, who called out to Ecommerce to confirm/deny our findings. Senior staff asked Ecommerce to initiate a "Synthetic" sale. Shortly thereafter, Ecommerce stated that their Synthetic sale completed without issue.

 

At this point we were all thinking that Dynatrace might have given us a false positive. It wasn't until AppDev came running, stating that they were also seeing failures... 30 mins after Dynatrace detected the first failure. We worked together with the Network Team to discover that another employee had made a rogue firewall change. Immediately corrective action was taken, and at 9am we saw failures drop to 0%.

 

There was still a question though: "Why did the Ecommerce test pass without issue?" As it turned out, the system was designed to never show users any errors, but rather give them a "simulated response". This event sparked discussions and corrective action to ensure similar future scenarios would provide true test data.

 

Ultimately, Dynatrace did its job. We had a major issue that lasted 1 hour, but if we had trusted Dynatrace more and the tests had come back with true results, then we could have reduced this downtime by half! Because of this issue, not only was corrective action taken in the Ecommerce Synthetics, but staff also saw that they can trust Dynatrace, and it helped foster a growing relationship between App Dev and the Dynatrace Team.

-Chad

pahofmann
Champion

There were a few great alerts, but as a partner they are always harder to share if it's not "your" environment.

 

Some of my favorite alerts were during PoCs, though, where the customer already had a vague idea of what was causing problems but wasn't able to verify/prove it, e.g. to a third party vendor.

 

Dynatrace picks up the problem, identifies the root cause automatically --> great use case for the presentation. 

That's indeed a nice scenario 🙂

Keep calm and build Community!

Radoslaw_Szulgo
Dynatrace Leader

Of course, I can't imagine working without Dynatrace! One of the most powerful alerts, for one of the most common issues that hit one of our internal services, concerns the "N+1 JPA" issue, which is related to retrieving data from a database. The problem occurs when a SQL query is executed to fetch N records and each of the N records needs an additional query to fetch some related records... and so on. As a result, one operation results in hundreds or thousands of SQL queries, which of course is slow and impacts the end user. Dynatrace alerts on "Response time degradation" quickly and shows the cause right away - it reduces MTTR (as @AntonioSousa mentions) to the max!
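To make the N+1 pattern concrete, here is a minimal sketch. It uses Python with an in-memory SQLite database rather than JPA, and the `author`/`book` tables and all function names are made up for illustration; it only shows how the query count grows, not how Dynatrace detects it.

```python
import sqlite3


def build_db(n=100):
    """Create a hypothetical schema with n authors and one book per author."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO author VALUES (?, ?)",
        [(i, f"author-{i}") for i in range(1, n + 1)],
    )
    conn.executemany(
        "INSERT INTO book VALUES (?, ?, ?)",
        [(i, i, f"book-{i}") for i in range(1, n + 1)],
    )
    return conn


def titles_n_plus_one(conn):
    """One query for the N authors, then one extra query per author (N+1 total)."""
    queries = 0
    result = {}
    authors = conn.execute("SELECT id, name FROM author").fetchall()
    queries += 1  # counted by hand to keep the sketch simple
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM book WHERE author_id = ?", (author_id,)
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_joined(conn):
    """Same data fetched with a single JOIN."""
    rows = conn.execute(
        "SELECT a.name, b.title FROM author a "
        "LEFT JOIN book b ON b.author_id = a.id"
    ).fetchall()
    result = {}
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:
            result[name].append(title)
    return result, 1


if __name__ == "__main__":
    conn = build_db()
    slow, slow_queries = titles_n_plus_one(conn)
    fast, fast_queries = titles_joined(conn)
    print(slow_queries, fast_queries, slow == fast)
```

In JPA the per-record queries are usually triggered implicitly by lazily loaded relations, which is why the pattern tends to show up in traces before anyone spots it in the code.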

 

The other one is small and quick, yet powerful: who remembers or checks every day whether the disk space on each of dozens or hundreds of hosts is sufficient? Dynatrace! A quick alert reminder and the problem is solved:

Radoslaw_Szulgo_0-1619439609466.png

 

Marcelo_santand
Contributor

Hello team.

 

The alerts we use the most are custom alerts. These help us identify, with greater sensitivity, possible failures in the applications or services that we are monitoring. For certain applications, we like to know much earlier that an error is just beginning to show. This gives us much more time to respond, and the impact on end customers is smaller. (For 100% critical applications.)

Additionally, our team works with IIS, where it is very important that an app pool crash is identified quickly.

 

>The Process group availability monitoring alert

  >>if any process becomes unavailable

 

This alert is very useful.

 

Best Regards

I truly enjoy the variety of use cases shared in this challenge 🙂 It's interesting to see how we utilize the same functionality depending on our needs, teams we work in, business goals, etc.

Thanks for sharing!

Keep calm and build Community!

alonso_decosio
Dynatrace Advisor

 

Really useful ones are the alerts we receive from our HTTP monitors.

We have a situation where the application relies heavily on information from 3rd party integrations to work. By creating some multi-request HTTP monitors, we simulate those integrations and notify the team about an issue ahead of time.


These alerts have changed the way that the operations team works and reduced the MTTD and MTTR.

 

r_weber
Pro

One of my favourite alerts recently was a case of calls to an external service, where the customer got charged by the volume of successful requests. Dynatrace alerted on the calling service because in 60% of the cases it returned an HTTP 400 to the client due to a wrong payload.
However, the call to the external service was still being made (and "successful") and therefore charged. By fixing the wrong payload handling and immediately returning instead of performing the 3rd party service call, the cost was reduced significantly.

Without the service dependency and traces this would have gone unnoticed or the cost impact would not have been discovered.
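The shape of that fix can be sketched roughly as follows. This is an illustration under assumptions, not the customer's code: `BillableClient`, `is_valid`, the `order_id` rule and both handler names are hypothetical, and the real service and validation logic are not described in this thread.

```python
class BillableClient:
    """Hypothetical stand-in for an external service charged per successful call."""

    def __init__(self):
        self.billed_calls = 0

    def call(self, payload):
        self.billed_calls += 1  # every call that reaches the vendor is charged
        return {"status": "ok"}


def is_valid(payload):
    """Hypothetical validation rule: a dict with an integer order_id."""
    return isinstance(payload, dict) and isinstance(payload.get("order_id"), int)


def handle_call_then_validate(client, payload):
    """Problem pattern: the billable call happens even for requests answered with 400."""
    client.call(payload)
    if not is_valid(payload):
        return 400
    return 200


def handle_validate_first(client, payload):
    """Fixed pattern: reject bad payloads before touching the billable service."""
    if not is_valid(payload):
        return 400
    client.call(payload)
    return 200


if __name__ == "__main__":
    payloads = [{"order_id": 1}, {"order_id": "x"}, {}, {"order_id": 2}, None]
    before, after = BillableClient(), BillableClient()
    print([handle_call_then_validate(before, p) for p in payloads], before.billed_calls)
    print([handle_validate_first(after, p) for p in payloads], after.billed_calls)
```

The client sees the same status codes in both variants; only the number of billed calls changes, which is why the cost impact is easy to miss without the service dependency view.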

DanielS
Advisor

My favorite alerts are those where, in a PoC, a client gets an alert and discovers a problem they did not know they had. There are lots on the list: I/O, Garbage Collector, Memory Saturation. Of course I always have a little help from DAVIES ‌😜‌ I love seeing the expressions on their faces when they finally say, DAVIES is not just a pretty face.

The true delight is in the finding out rather than in the knowing.

Now I would love to see their faces, too! 😄

Keep calm and build Community!

Kind of like that:

via GIPHY