I have dealt with plenty of concerns about the OneAgent before, but now I'm facing an extremely "stubborn" Linux admin at the hoster of one of my clients.
I have already tried a lot of convincing; here is his stance concerning auto-injection via LD_PRELOAD:
The problem with LD_PRELOAD is that the Dynatrace library gets loaded into each process started on the system. What if there is an error in the library (such as a memory leak)? It would endanger the entire system.
This can go so far that the system can no longer be started. In addition, there is a whole range of other reasons, e.g. access to key material. For us, a global LD_PRELOAD means that we can no longer give any SLA, because we cannot guarantee recovery in a reasonable time window. We would then also remove all keys from the host, which, for example, makes monitoring no longer possible for us.
(This is an e-commerce hoster, not banking, insurance or governmental, btw.)
I'm running out of fact-based arguments, thus asking the community for creative answers.
Anything is appreciated! 🙂
Here's what ChatGPT had to say when I asked it to make the case for using LD_PRELOAD for a monitoring tool like Dynatrace. You could also mention that the OneAgent is rigorously tested by Dynatrace, and that you would additionally test it in a non-production environment before deploying it to your client's production hosts.
LD_PRELOAD is a powerful mechanism in the Linux environment that allows developers to intercept and modify system calls and library functions. While its primary use is to provide compatibility and debugging features, it can also be leveraged for monitoring purposes, such as integrating a tool like Dynatrace. Here are some compelling reasons to consider using LD_PRELOAD for a monitoring tool like Dynatrace:
1. Minimal code changes: LD_PRELOAD offers a non-intrusive way to monitor applications without modifying their source code. By leveraging LD_PRELOAD, you can inject Dynatrace's monitoring functionality into an application's runtime without the need for extensive code modifications. This is particularly valuable when dealing with legacy or third-party applications that you don't have direct control over.
2. Comprehensive visibility: Dynatrace is known for its powerful monitoring capabilities, providing deep insights into application performance, resource consumption, and dependencies. By utilizing LD_PRELOAD, you can extend Dynatrace's reach to monitor lower-level interactions and system calls within the application, enabling a more holistic view of its behavior and performance. This level of visibility can be crucial for diagnosing and resolving complex performance issues.
3. Real-time monitoring: LD_PRELOAD allows for dynamic instrumentation, enabling the injection of monitoring code at runtime. This capability ensures that monitoring is applied continuously and in real-time without requiring application restarts. As a result, you can monitor applications without interrupting their normal operation, which is especially important for mission-critical systems or environments where downtime is not acceptable.
4. Interception of system calls and library functions: LD_PRELOAD enables the interception and modification of system calls and library functions at the user level. This interception capability allows Dynatrace to track and analyze various aspects of an application's behavior, such as file operations, network communication, memory allocation, and more. By capturing these interactions, Dynatrace can provide detailed insights into resource utilization, performance bottlenecks, and potential security vulnerabilities.
5. Flexibility and customization: LD_PRELOAD offers flexibility in terms of which functions and system calls you choose to intercept and monitor. This level of customization allows you to tailor the monitoring approach to your specific requirements, focusing on the aspects that are most relevant to your application and performance goals. You can selectively intercept only the necessary functions, reducing the potential overhead associated with monitoring.
6. Compatibility and portability: LD_PRELOAD is widely supported across different Linux distributions and architectures. This compatibility ensures that Dynatrace can be easily integrated into various environments without the need for platform-specific modifications or configurations. Whether you're running your applications on-premises, in the cloud, or within containers, LD_PRELOAD provides a consistent and portable approach to instrumenting and monitoring applications.
While LD_PRELOAD can be a powerful tool, it's important to use it judiciously and with proper caution. Careful consideration should be given to the potential impact on application performance, security, and stability. Additionally, it's recommended to thoroughly test the integration of Dynatrace using LD_PRELOAD in a controlled environment to ensure that it meets your monitoring requirements effectively.
Fantastic idea to ask ChatGPT for reasoning and advantages of LD_PRELOAD. Happy to see that it came up with the same advantages as I did.
The conclusion, however (and the fear of the frightened Linux admin), is exactly the last statement:
"While LD_PRELOAD can be a powerful tool, it's important to use it judiciously and with proper caution. Careful consideration should be given to the potential impact on application performance, security, and stability."
For the paranoid Linux admin/hoster, it outweighs all the benefits and reasoning. It's fear of the unknown, or maybe an idea set deep in his mind that LD_PRELOAD is bad because someone could exploit it (e.g. https://www.hackingarticles.in/linux-privilege-escalation-using-ld_preload/).
However, I think this is no longer true, since the dynamic linker checks the effective and real user IDs before honoring LD_PRELOAD.
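To make the euid/ruid point a bit more concrete: according to the `ld.so` man page, the dynamic linker enters secure-execution mode for set-user-ID and set-group-ID programs, and in that mode it ignores `LD_PRELOAD` entries that contain slashes. The following is only a rough sketch for spotting such binaries on a host. The function name `list_secure_exec_candidates` is made up for illustration, and the check does not cover other triggers such as file capabilities.

```shell
# Hypothetical helper: print those of the given files that have the
# set-user-ID or set-group-ID bit set. For such programs the dynamic
# linker runs in secure-execution mode and restricts LD_PRELOAD.
# Note: file capabilities can also trigger that mode; not checked here.
list_secure_exec_candidates() {
    for f in "$@"; do
        if [ -u "$f" ] || [ -g "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Example call (paths are assumptions and differ between distributions):
#   list_secure_exec_candidates /usr/bin/passwd /usr/bin/ls
```

This only covers the `LD_PRELOAD` environment variable as discussed in the linked article; it says nothing about how a system-wide preload file is treated.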
That Dynatrace is well tested and proven is clear... but you know, some OSS fanatics do not trust anything that is commercial.
This can be rather tricky... Maybe he has something to hide? Or even worse, he can make your life miserable afterwards.
I normally approach this type of person by offering them something in return. And always start with a demo environment first, so he gains confidence and sees that nothing really breaks...
Agreed, it seems a common situation when you face a hardcore sysadmin who is proud of his environment. Another general approach is to talk to his managers, but I guess this was tried already. Maybe there is another solution in place that this sysadmin takes care of? (Zabbix/Nagios/Icinga ...)
I'd also ask if they have any antimalware / antivirus or other security tools. Most of the tools work exactly the same way as Dynatrace using LD_PRELOAD on Linux. Examples include Aqua, ESET ...
@AntonioSousa @Julius_Loman good points. Indeed, I think it's a combo of “my system”, “no one should see more than my monitoring” and “my stuff is good enough”.
Security: good idea! I’ll bring the topic of AppSec to the table: without automatic deep injection, there is no security-relevant application monitoring!
FYI: the hoster is not the owner of Dynatrace, he’s merely the provider of the VM as a service….
There are already some good points in the previous comments, and the two main things I'd also concentrate on would be:
In addition, regarding the more security related concerns: a dedicated and capable Linux admin can use tools such as filesystem namespaces to prevent the `/etc/ld.so.preload` file from being visible to critical processes. (Note that even though we are talking about LD_PRELOAD here, most Linux systems actually use that file instead of the environment variable for preloading the Dynatrace process module). An example command for preventing preloading for a process would be:
unshare -m -- sh -c 'mount --bind /dev/null /etc/ld.so.preload && <program to execute>'
(Note that in the real world you would have to integrate the `unshare` command into places like systemd service files and perhaps combine it with tools like `su` to run the process as a specific user, so the command as shown here is only provided as a reference / testing tool.)
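For testing purposes, the command above can be wrapped in a small function and combined with a check of the resulting process. This is a minimal sketch, not an official Dynatrace procedure: the function names `run_without_preload` and `maps_mention_library` are made up here, the wrapper needs root (or equivalent privileges) to create the mount namespace, and the search string `oneagent` used in the example is only an assumption about how the preloaded library shows up in the process mappings.

```shell
# Hypothetical wrapper around the unshare command shown above.
# Starts the given program in a private mount namespace in which
# /etc/ld.so.preload is hidden behind /dev/null. Requires root.
run_without_preload() {
    unshare -m -- sh -c \
        'mount --bind /dev/null /etc/ld.so.preload && exec "$@"' sh "$@"
}

# Hypothetical check: succeed if the given maps file (for example
# /proc/<pid>/maps) contains the given fixed string in any mapping.
maps_mention_library() {
    grep -qF -- "$2" "$1" 2>/dev/null
}

# Example (as root; program path and library substring are assumptions):
#   run_without_preload /usr/bin/myservice &
#   maps_mention_library /proc/<pid of myservice>/maps oneagent \
#       && echo "library is mapped" || echo "library is not mapped"
```

The intermediate `sh` and `mount` processes are started before the bind mount is in place, so only the program that is finally executed sees the empty preload file.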
Edit: more details are now available in a separate troubleshooting article.