In relation to metrics, the documentation states:
"For each IPv4 and IPv6 TCP connection, defined by:
Source address and port
Destination address and port
PID of the communicating process
Network namespace
the following metrics are collected:"
I understand that each connection has to be defined manually. Is that so?
If so, where must this definition be specified?
Moreover, where are the metrics stored?
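To make the quoted definition concrete: one way to read it is that the listed fields together act as a key that identifies a connection, so two connections differ if any field differs. The sketch below only illustrates that reading. The class name, the field names, and the `bytes_sent` counter are hypothetical, and it says nothing about how NetTracer is actually implemented or where Dynatrace stores the metrics.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionKey:
    """The fields the quoted documentation lists as defining one TCP connection."""
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int
    pid: int
    netns: int  # network namespace, shown here as an inode number (assumption)


# Hypothetical per-connection metric; the name is illustrative only.
bytes_sent = Counter()


def record(key: ConnectionKey, nbytes: int) -> None:
    """Add nbytes to the counter of the connection identified by key."""
    bytes_sent[key] += nbytes


# Same addresses, ports and PID, but different network namespaces:
a = ConnectionKey("10.0.0.5", 40312, "10.0.0.9", 443, 1234, 4026531992)
b = ConnectionKey("10.0.0.5", 40312, "10.0.0.9", 443, 1234, 4026532201)
record(a, 100)
record(a, 50)
record(b, 7)
```

Because the key is a frozen dataclass, it is hashable, so `a` and `b` end up as two separate entries even though only the namespace differs.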
Later, the Dynatrace documentation says:
"We added the following binaries to the OneAgent installer package with all their required permissions and capabilities:"
How is this integration done?
I hope it will be available shortly as well. I think the main benefit is more precise network monitoring with less overhead than the current libpcap approach. Because it's based on eBPF, it will work only on more recent Linux distributions.
@maxi_moscardi, @ct_27, @Julius_Loman,
Some additional documentation has appeared here: https://www.dynatrace.com/support/help/how-to-use-dynatrace/networks/network-monitoring-with-nettrac...
I'm also very interested in knowing how this works in more detail...
@AntonioSousa eBPF itself is very cool and provides excellent observability options at the system level. A recommended source to get familiar with it is this book for example.
I've been following Gregg's work for some years now, especially at the systems performance level. It's very deep, and probably not needed for almost all scenarios. I particularly like it when it blends into other domains I follow, specifically security. I especially dream about the days where we could have "method hotspots" at the system level 😎
Just to add that the detail I was looking for is how Dynatrace will give us access to all this detail 😁
In this first step, I guess it's mostly about overcoming limitations of the current network monitoring approach in containerized environments. Anyway, I'm looking forward to seeing more of eBPF within Dynatrace.
I'm looking for some guidance on NetTracer. We updated the OneAgent on a few Linux machines and ensured that their CentOS (Red Hat) and Ubuntu versions were up to date. We restarted the machines, but of the 7+ machines only 2 are reporting NetTracer metrics.
I found this in the logs. I'm not a system administrator, so I'm not great with these things. I'm guessing that the dtuser account for some reason can't run
/opt/dynatrace/oneagent/agent/lib64/oneagentnettracer:
error [nettracer] perf_event_open_map for pfd -1 failed: Permission denied (13)
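For anyone hitting the same error: in general, a "Permission denied (13)" from the `perf_event_open` system call tends to be related to the kernel's `perf_event_paranoid` setting or to missing capabilities on the calling process. Whether that is the cause for NetTracer here is only an assumption. A minimal sketch to collect those values for a support case could look like this; the helper names are hypothetical, and `getcap` is only used if it happens to be installed.

```python
import shutil
import subprocess
from pathlib import Path

# Binary path taken from the log above.
NETTRACER = "/opt/dynatrace/oneagent/agent/lib64/oneagentnettracer"


def read_int(path):
    """Return the integer stored in a /proc or /sys file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def file_capabilities(binary):
    """Return getcap's output for the binary, or None if it cannot be checked."""
    getcap = shutil.which("getcap")
    if getcap is None or not Path(binary).exists():
        return None
    result = subprocess.run([getcap, binary], capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # Kernel settings that commonly restrict perf events and eBPF (assumed relevant here).
    print("perf_event_paranoid:", read_int("/proc/sys/kernel/perf_event_paranoid"))
    print("unprivileged_bpf_disabled:", read_int("/proc/sys/kernel/unprivileged_bpf_disabled"))
    print("file capabilities:", file_capabilities(NETTRACER))
```

Comparing the output between a host that reports metrics and one that does not may help narrow things down before opening a case.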
@ct_27 are the Linux hosts that are not providing metrics supported for NetTracer? It works on kernel 4.15+, so older distros such as CentOS 7 or Red Hat 7 won't report them.
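A quick way to check the kernel requirement on many hosts is to compare the release string against the 4.15 minimum mentioned above. This is a minimal sketch; the function names are hypothetical, and it only looks at the major and minor numbers, so distribution backports are not taken into account.

```python
import platform
import re

MIN_KERNEL = (4, 15)  # minimum kernel version mentioned in this thread


def kernel_version(release):
    """Extract (major, minor) from a release string such as '5.4.0-109-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release, minimum=MIN_KERNEL):
    """Return True if the kernel release is at least the given minimum."""
    return kernel_version(release) >= minimum


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints on Linux
    print(release, "OK" if meets_minimum(release) else "too old")
```

For example, `meets_minimum("5.4.0-109-generic")` is true, while a CentOS 7 style release such as `"3.10.0-1160.el7.x86_64"` is rejected.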
@Julius_Loman The host is on 4.18.0-348.20.1.el8_5.x86_64.
I created another host to troubleshoot, running Ubuntu 20.04.4 LTS. I'm an admin on this box, and after some permission adjustments I was able to run oneagentnettracer manually, but the OneAgent itself (user dtuser, I assume) is still showing the Permission denied error in the logs.
What baffles me is that we have hundreds of Linux machines, and when we turned this on globally only 2 hosts reported metrics. It's on OneAgent 1.233.177, but even so I have many other agents at this level that are not working. I have hosts running agents on many versions between 1.233.77 and 1.239.226, so it doesn't appear to be a OneAgent version issue. The hosts are all running compatible OS versions, hence I can run oneagentnettracer manually.
I'm wondering if anyone else has had issues, or if it has all been "flip the toggle and away they go". I'll open a DT Support case on Monday if I can't make progress.
I'll talk to my sysadmin on Monday and see if he can find something wrong with the dtuser account.
@ct_27 are the hosts that are not reporting metrics running any containers? In my environment I see data only for process groups running in containers. There is no data in the NetTracer metrics for processes on the other hosts.
@Julius_Loman Thanks so much for pointing that out. I can't believe I missed it, and it's in the first line of the documentation "in containerized Linux hosts.".
So disappointing. Looks good on paper but I'd only be able to use this for an extremely small percentage of our monitored infrastructure.
@ct_27 to be honest, I also missed the line. Only a few of the metrics have the container name as the dimension. I looked at the data in my lab environment and I immediately noticed I have data only for processes which run in a container.
However, for other (non-containerized) processes we already have the same metrics, just with a different metric ID, don't we?