Sudden jump in unmonitored hosts causing purepath chain breakages - what happened?

john_kennedy1
Newcomer

We recently ran into a situation where requests to unmonitored hosts jumped and have stayed at the higher level. This was noticed by some folks in AppDev when they were chasing some PurePaths and found many broken chains. Some research indicated that "something" happened on 12/10/20 around 4:00 AM. See chart below. Has anyone run into this type of situation? Of course "nothing" has changed. 🙂

13 REPLIES

ChadTurner
Leader

Was any host monitoring turned off from the Dynatrace perspective? What about the level of monitoring? Was OneAgent reduced to infrastructure-only monitoring on some hosts?

-Chad

Good question. We have reviewed both situations and this has not been identified as the issue. We look good from a monitoring perspective as well as what we have in Full Stack vs. Infra Mode.

Hey Chad. Thanks for the response. Full stack monitoring is enabled on all servers and in fact we did patching this past weekend and full reboots were done.

kalle_lahtinen
Advisor

Could it be that there's a component (LB/proxy/integration service/etc.) that's stripping the x-dynatrace headers, in effect terminating the PurePaths and making those requests look like they're towards unmonitored hosts? So for example connections towards domain names which resolve to load balancer VIPs and as such do not correspond to any specific server with OneAgent running..?
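One way to test the header-stripping theory is to capture the headers a caller sends and the headers the backend actually receives, then compare them. The sketch below is only an illustration under assumptions: the function name, the `x-dynatrace` prefix match, and the sample header values are hypothetical, and how you capture the two header sets (for example with a temporary echo endpoint behind the LB) is up to you.

```python
def missing_trace_headers(sent, received, prefix="x-dynatrace"):
    """Return tracing header names that were sent but never arrived.

    Header names are compared case-insensitively, since HTTP header
    names are not case-sensitive.
    """
    received_names = {name.lower() for name in received}
    return sorted(
        name.lower()
        for name in sent
        if name.lower().startswith(prefix)
        and name.lower() not in received_names
    )


# Hypothetical capture: what the caller sent vs. what the backend saw.
sent = {"X-dynaTrace": "example-tag", "Accept": "*/*"}
received = {"Accept": "*/*", "X-Forwarded-For": "10.0.0.1"}

print(missing_trace_headers(sent, received))
```

A non-empty result for a given hop would suggest that something between caller and backend is dropping the tag; an empty result points away from header stripping on that path.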

Another solid question, and that is where we have been concentrating. The issue seems to surround calls in and out of our Web Service Managed infrastructure, which is front-ended by an LB. We are running fully in AWS on a mix of EC2 and native services, and the ELB in this case is a CLB. We have reviewed changes at the time of the spike (approx. 12/10 at 4:00 AM) and there were no scheduled changes by our organization at that time.

Anonymous
Not applicable

Might this be new traffic? If you look at the backtrace from before that time, is the number of calls the same? Did the calls that previously went to the injected service drop, and do they show up here now?
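The comparison can be sketched roughly as follows. This is a minimal illustration, not a Dynatrace feature: the function name, the 10% tolerance, and the call counts are all made up, and the real numbers would come from whatever before/after counts you can still pull for the same time window.

```python
def classify_change(before, after, tolerance=0.10):
    """Compare call counts before and after the spike.

    before/after are dicts with 'monitored' and 'unmonitored' counts
    for comparable time windows.
    """
    total_before = before["monitored"] + before["unmonitored"]
    total_after = after["monitored"] + after["unmonitored"]
    if total_before == 0:
        return "no baseline"

    growth = (total_after - total_before) / total_before
    # Total roughly flat but monitored calls dropped: existing calls
    # probably moved from the injected service to "unmonitored hosts".
    if abs(growth) <= tolerance and after["monitored"] < before["monitored"]:
        return "shifted"
    # Total grew noticeably: the extra calls look like new traffic.
    if growth > tolerance:
        return "new traffic"
    return "inconclusive"


# Hypothetical counts, for illustration only.
print(classify_change({"monitored": 900, "unmonitored": 100},
                      {"monitored": 400, "unmonitored": 600}))
```

A "shifted" result would fit the broken-chain theory, while "new traffic" would suggest a new caller or destination appeared around that time.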


Hi Dante. This is an interesting thought. I'll work with the AppDev team on this. We only keep 10 days of traces in Prod, so going back further for comparison's sake is problematic.

BabarQayyum
Leader

Hello John K.

Can you check the backtrace for this traffic to know who made these calls?

Regards,

Babar

Hi Babar. Good thought. The usual and expected suspects are the callers in all of our situations. We don't see any outliers.

sarahthomas1975
Newcomer

The call is of the Web Request type. It's not calling a database service, or at least not directly. That's why you will need full-stack monitoring on the host rather than infrastructure-only monitoring. Try enabling full-stack monitoring (and restarting the processes on the target host) to see if this makes any difference.



Hi Sanders. Thanks for the response. Full stack monitoring is enabled on all servers and in fact we did patching this past weekend and full reboots were done.

ChadTurner
Leader

I would recommend opening a support ticket for this. That way support can dig deeper into the issue and hopefully find a resolution or explanation for this.

-Chad

This has been done.

SUP-64416