Total user actions when sampling RUM on partially instrumented, load-balanced servers?

DynaMight Guru

I have an application that is set up to analyze only 50% of user sessions. The application runs on two web servers, but only one is instrumented by OneAgent.

Now, I would expect that I'm capturing 1/4 of all user actions, but that doesn't seem to be the case.

The problem seems to be related to the load balancing I'm using. It's basically round-robin, so any given request can land on either server. So, even when a page is instrumented by OneAgent, the request for ruxitagentjs*.js might fail because it is directed to the server that is not instrumented. And even if I get the JavaScript from the instrumented server, I might then be unable to submit the data in the XHR requests, for the same reason.

So, it's basically a combinatorics problem. Has anyone run into this before, or does anyone have a formula for extrapolating total user actions in such scenarios?

Antonio Sousa

Not applicable

Do you have a Cluster ActiveGate? You could try redirecting the beacon to it, so that once the JS is served, the XHR goes to something that is always listening. I did something similar for an application with strict firewall rules, where we couldn't submit the POST to the OneAgent injecting the app because the firewall wouldn't let it through...

Interesting idea, but it won't work in this case, as there is no Cluster ActiveGate.

Antonio Sousa

Dynatrace Guru

I wonder whether the browser is simply not caching the web pages, so that it loads the JavaScript agent even when you switch web servers. Maybe check a few user sessions to see how they start and finish.

Senior Product Manager,
Dynatrace Managed expert

I had already checked from the browser side using Developer Tools, after discovering a discrepancy from the values I expected. So basically I had initially thought along these lines:

  • There is no browser cache involved for the pages, as they have "pragma: no-cache" headers.
  • When I get a page from the server that is not instrumented, there is no Dynatrace RUM inserted into the page. So, on average, 50% of the pages do not get instrumented.
  • On the OneAgent instrumented server, on average 50% of the sessions are instrumented. So, 25% of the total web pages are expected to have the Dynatrace RUM JavaScript.
  • The browser has to get the Dynatrace RUM JavaScript. So, once again it might hit one of the two servers. But here it gets particularly tricky, as there is a cache in front of the servers. This request is probably served from that cache most of the time, as I see it coming with a "cache-control: max-age=1800" header. Sometimes I do see 404s, but interestingly the browser sometimes seems to retry the request, and normally gets the JavaScript.
  • The POST of RUM data to the endpoint will on average land on the instrumented server only 50% of the time. Interestingly, here too, it seems to POST the data a second time when the first attempt fails.

I believe this will be even more complex if more than two servers are involved. But even with two, it seems more complicated than I had anticipated.
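
To make the steps above concrete, here is a rough sketch of how the capture probability could be estimated and used to extrapolate. It is not a Dynatrace formula. It assumes that every request is routed independently and uniformly across the servers (strict round-robin only approximates this), that a retry is routed independently of the first attempt, and that the cache hit rate for the agent JavaScript is something you measure yourself. The function and parameter names are made up for illustration.

```python
def capture_probability(sample_rate, instrumented_servers, total_servers,
                        js_cache_hit_rate=0.0, js_attempts=1, beacon_attempts=1):
    """Estimate the share of user actions that end up being captured.

    Assumes independent, uniform routing of every request (an approximation).
    """
    # Chance that a single request lands on an instrumented server.
    p = instrumented_servers / total_servers

    # The page must come from an instrumented server AND the session must be sampled.
    page_tagged = p * sample_rate

    # The agent JavaScript is either served by the cache in front of the servers,
    # or at least one of the attempts has to reach an instrumented server.
    js_loaded = js_cache_hit_rate + (1 - js_cache_hit_rate) * (1 - (1 - p) ** js_attempts)

    # At least one of the beacon POST attempts has to reach an instrumented server.
    beacon_accepted = 1 - (1 - p) ** beacon_attempts

    return page_tagged * js_loaded * beacon_accepted


def extrapolate_total(observed_actions, probability):
    """Scale the observed user actions up to an estimated total."""
    if probability <= 0:
        raise ValueError("capture probability must be positive")
    return observed_actions / probability


# Two servers, one instrumented, 50% sampling, no cache hits, no retries.
print(capture_probability(0.5, 1, 2))

# Same setup, but the agent JavaScript always comes from the cache
# and the beacon POST is retried once.
print(capture_probability(0.5, 1, 2, js_cache_hit_rate=1.0, beacon_attempts=2))
```

Under these assumptions, with no caching and no retries, the captured share is 1/16 rather than the 1/4 I first expected, because the JavaScript download and the beacon POST each add another factor of 1/2. The cache and the retries push the number back up, which is why the real value is hard to pin down without measuring them.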

Antonio Sousa