
Run oneagent on AWS docker swarm

richard_wilson
Newcomer

Hi, apologies if this has already been asked, I've searched and can't see anything linked to the issue I'm facing.

I've got 2 docker swarms linked to an AWS account. I've successfully linked the AWS account to our Dynatrace account and am getting high-level reporting, but it says I need to install the OneAgent to get more info about each EC2 instance. The EC2 instances are created as managers and nodes by docker swarm, so I don't have access to the AMIs that they use. I can modify the CloudFormation template and include the install commands via the UserData section, but this fails.

Trying to install it manually via SSH I get the following message:

Error: Dynatrace OneAgent cannot be installed inside a docker container, setup won't continue.

But it isn't inside a docker container, it's the host. I'm guessing that because the OS is Alpine, the OneAgent installer assumes it's a container and fails there.

Any suggestions for how I can get OneAgent installed on EC2 instances deployed via docker swarm?

Thanks in advance,

Let me know if you need any more information.

Rich

7 REPLIES

moritz_becker
Participant

When you connect to a Docker Swarm Node/Manager instance created by Docker Swarm for AWS (i.e. via the provided CloudFormation template), then you indeed connect to a container running on this instance, namely the docker4x/shell-aws container. This happens transparently because the host SSH port is forwarded to the docker4x/shell-aws container.
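
To check which side of that SSH forward a shell is on, a rough heuristic can help. The sketch below is an assumption about how such a check might look, not the test the OneAgent installer performs (its logic is not shown in this thread). It relies on two common Docker conventions: the /.dockerenv marker file and a docker path in the cgroup v1 listing of PID 1. The function name looks_like_container is made up for illustration.

```shell
#!/bin/sh
# Heuristic sketch: guess whether the current shell runs inside a Docker container.
# Both arguments are optional and exist so the function can be tried on sample files.
looks_like_container() {
    cgroup_file="${1:-/proc/1/cgroup}"
    marker_file="${2:-/.dockerenv}"
    # Docker normally creates /.dockerenv inside containers.
    [ -e "$marker_file" ] && return 0
    # With cgroup v1, PID 1 of a container is listed under a docker cgroup path.
    [ -r "$cgroup_file" ] && grep -q docker "$cgroup_file" && return 0
    return 1
}

if looks_like_container; then
    echo "container"
else
    echo "host"
fi
```

If this prints "container" right after logging in on port 22, the session is most likely inside docker4x/shell-aws rather than on the EC2 host.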

moritz_becker
Participant


After hours of trying, here is what I figured out. I have tried to install OneAgent using two approaches.


The following relates to a Docker Swarm on AWS cluster that was created using the CloudFormation template supplied by Docker. EC2 instances in such a deployment are based on the AMI Moby Linux 17.10.0-ce-aws1 edge (ami-b20bb2dd) for the latest version.

1. Install OneAgent on the "real" Docker Host


First I needed an SSH connection to the actual host of a swarm EC2 instance in order to try the installation. I managed that by inserting the following instructions into the User Data section before the rc-service docker restart instruction.


echo "http://dl-2.alpinelinux.org/alpine/edge/community/" >> /etc/apk/repositories
apk add --no-cache openssh sudo shadow sed
echo "docker ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
# start sshd
sed -i 's/#Port 22/Port 2222/g' /etc/ssh/sshd_config
mkdir -p ~/.ssh && docker cp shell-aws:/home/docker/.ssh/authorized_keys ~/.ssh/authorized_keys
/etc/init.d/sshd start
This allowed me to SSH into the Swarm EC2 instance at docker@{ec2-public-dns}:2222. The next issue was that the provided AMI is based on Alpine Linux but OneAgent requires glibc. I worked around this by installing glibc compatibility using the following commands:
# install alpine-glibc
wget -q -O /etc/apk/keys/sgerrand.rsa.pub "https://raw.githubusercontent.com/sgerrand/alpine-pkg-glibc/master/sgerrand.rsa.pub"
wget "https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.27-r0/glibc-2.27-r0.apk"
apk add --force glibc-2.27-r0.apk

I then tried to reuse the OneAgent installation instructions provided during the Dynatrace SaaS setup. Before that, I had to install OpenSSL support as well as the trusted root CAs:

apk update && apk upgrade wget && apk add openssl coreutils && apk --no-cache add ca-certificates
The installation commands provided by Dynatrace are as follows:
wget -O Dynatrace-OneAgent-Linux.sh "https://{ACCOUNT-ID}.live.dynatrace.com/api/v1/deployment/installer/agent/unix/default/latest?Api-Token={API-TOKEN}&arch=x86&flavor=default"
# wget "https://ca.dynatrace.com/dt-root.cert.pem" ; ( echo 'Content-Type: multipart/signed; protocol="application/x-pkcs7-signature"; micalg="sha-256"; boundary="--SIGNED-INSTALLER"'; echo ; echo ; echo '----SIGNED-INSTALLER' ; cat Dynatrace-OneAgent-Linux.sh ) | openssl cms -verify -CAfile dt-root.cert.pem > /dev/null
/bin/sh Dynatrace-OneAgent-Linux.sh APP_LOG_CONTENT_ACCESS=1

I needed to deactivate the signature verification because the root ca could not be downloaded:

Connecting to ca.dynatrace.com (52.85.184.247:443)
wget: error getting response: Connection reset by peer
Next, the GLIBC version check carried out by the OneAgent install script fails because it uses ldd --version and ldd in the AMI still targets the musl libc despite the prior installation of glibc compatibility. I manually deactivated the check in the script as a workaround.
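
To see what the installer's check runs into, the output of ldd --version can be classified by hand. This is a minimal sketch and not the code from the installer script. The function name libc_flavor is made up, and the matching relies on the banner strings that musl and glibc builds of ldd usually print.

```shell
#!/bin/sh
# Classify the text printed by `ldd --version` (read from stdin).
libc_flavor() {
    out=$(cat)
    case "$out" in
        *musl*) echo musl ;;
        *GLIBC*|*"GNU libc"*) echo glibc ;;
        *) echo unknown ;;
    esac
}

# musl's ldd prints its banner on stderr and may exit non-zero,
# so merge both streams and ignore the exit status.
{ ldd --version 2>&1 || true; } | libc_flavor
```

On the Swarm AMI described here this would be expected to report musl even after the glibc compatibility package is installed, which matches the failing check.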
After that, the invocation of base64 failed - it seems that Alpine comes with an uncommon version:
/bin/base64: unrecognized option: i
BusyBox v1.25.1 (2017-11-23 08:48:46 GMT) multi-call binary.
Usage: base64 [-d] [FILE]
Base64 encode or decode FILE to standard output
-d Decode data
tar: short read


I solved this by simply removing the -i option from the base64 invocation (readonly UNPACK_BINARY_ARGS="-d").
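
Instead of hard-coding the flag, the available base64 can be probed first. The following is only a sketch of that idea; pick_base64_args is a hypothetical helper and not part of the installer script.

```shell
#!/bin/sh
# Print the decode flags the local base64 accepts: "-di" when supported, else "-d".
pick_base64_args() {
    # Try decoding a tiny sample with -di; BusyBox base64 rejects the -i option.
    if printf 'aGk=' | base64 -di >/dev/null 2>&1; then
        echo "-di"
    else
        echo "-d"
    fi
}

args=$(pick_base64_args)
# $args is left unquoted on purpose so it is passed as an option.
printf 'aGVsbG8=' | base64 $args
echo
```

With GNU coreutils this selects -di, with the BusyBox applet shown in the error above it falls back to -d.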
The next problem was that the AMI's root device is not mounted at / and the filesystem mounted as / is a tmpfs of only ~500MB. The installer script therefore terminates because the default install path /opt/dynatrace does not have enough space.
I thought of 2 possible workarounds for this problem:



  1. Specify a different install path (e.g. under /var, which is mounted on the root device and has sufficient space)

  2. Mount the root device in /opt via sudo mount /dev/xvdb1 /opt

I first tried option nr 2 and observed that the agent installation was working, although autostart could not be registered:
15:11:18 Error: Couldn't add oneagent to autostart. Please adjust and add it manually.
Dynatrace-OneAgent-Linux.sh: line 3305: ex: not found


After option nr 2 worked, I did not try option nr 1.
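
Before choosing between the two options it helps to see how much space each candidate path really has. A minimal sketch using df follows. The helper names avail_kb and has_space are made up, and the 1 GiB threshold is only an illustrative value, not the installer's actual requirement.

```shell
#!/bin/sh
# Print the free space (in KiB) of the filesystem that holds the given path.
avail_kb() {
    # -P forces the POSIX single-line format, -k reports 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed when the path has at least the required number of KiB free.
has_space() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}

# 1048576 KiB (1 GiB) is an illustrative threshold only.
for dir in / /var; do
    if has_space "$dir" 1048576; then
        echo "$dir: enough space"
    else
        echo "$dir: too small"
    fi
done
```

On the AMI described above, / would be expected to come out too small because it is the ~500MB tmpfs.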


Here are the final instructions - leaving out the SSH setup - that need to be inserted into the User Data section of the Swarm CloudFormation template right before the rc-service docker restart instruction (you may need to adapt the OneAgent download URL):


"\n",
"# install openssl & ca certs\n",
"apk update && apk upgrade wget && apk add openssl coreutils && apk --no-cache add ca-certificates\n",
"# install alpine-glibc\n",
"wget -q -O /etc/apk/keys/sgerrand.rsa.pub https://raw.githubusercontent.com/sgerrand/alpine-pkg-glibc/master/sgerrand.rsa.pub\n",
"wget https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.27-r0/glibc-2.27-r0.apk\n",
"apk add --force glibc-2.27-r0.apk\n",
"# install Dynatrace OneAgent\n",
"mkdir -p /opt/dynatrace\n",
"mount /dev/xvdb1 /opt/dynatrace\n",
"if [ -d /opt/dynatrace/oneagent ]; then\n",
" /opt/dynatrace/oneagent/agent/initscripts/oneagent start\n",
"else\n",
" wget -O Dynatrace-OneAgent-Linux.sh \"https://{ACCOUNT-ID}.live.dynatrace.com/api/v1/deployment/installer/agent/unix/default/latest?Api-Token={API-TOKEN}&arch=x86&flavor=default\"\n",
" sed -i 's/arch_checkGlibc$/: #arch_checkGlibc/;s/readonly UNPACK_BINARY_ARGS=\\\"-di\\\"/readonly UNPACK_BINARY_ARGS=\\\"-d\\\"/' Dynatrace-OneAgent-Linux.sh\n",
" /bin/sh Dynatrace-OneAgent-Linux.sh APP_LOG_CONTENT_ACCESS=1\n",
"fi\n",
"\n",

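Because the sed patch depends on the exact text of the installer script, it can silently stop matching when Dynatrace changes the script. The sketch below applies the same two substitutions and then checks that they took effect. patch_installer is a hypothetical wrapper, and the demo runs on a small stand-in file rather than the real installer.

```shell
#!/bin/sh
# Apply the two workarounds to an installer script and confirm they took effect.
patch_installer() {
    script="$1"
    # Same substitutions as in the User Data snippet, written without JSON escaping.
    sed 's/arch_checkGlibc$/: #arch_checkGlibc/;s/readonly UNPACK_BINARY_ARGS="-di"/readonly UNPACK_BINARY_ARGS="-d"/' \
        "$script" > "$script.patched" || return 1
    mv "$script.patched" "$script"
    # Fail if either original pattern is still present.
    if grep -q 'UNPACK_BINARY_ARGS="-di"' "$script"; then return 1; fi
    if grep -q '^[[:space:]]*arch_checkGlibc$' "$script"; then return 1; fi
    return 0
}

# Demo on a stand-in file; the real installer is much larger.
demo=$(mktemp)
printf '%s\n' 'readonly UNPACK_BINARY_ARGS="-di"' 'arch_checkGlibc() {' '}' 'arch_checkGlibc' > "$demo"
patch_installer "$demo" && cat "$demo"
rm -f "$demo"
```

Only the bare call to arch_checkGlibc is turned into a no-op; the function definition stays untouched because its line does not end with the function name.
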
2. Use OneAgent docker image


I noticed that there is also a docker image for running the OneAgent and tried that as well.


The docker image relies on a volume mount -v /:/mnt/root - this path is hard coded in the entrypoint.sh script. Again, the problem is that / is not mounted on the root device but on a tmpfs device with limited space. Hence, the installation fails due to insufficient space. The only way I was able to get it to work was by removing the volume mount and changing the following line in the entrypoint.sh script:
readonly DOCKER_HOST_ROOT_PREFIX=/

Conclusion


Installing OneAgent on a Swarm EC2 instance works but it would be very nice if Dynatrace could modify their install procedures to make this work out-of-the-box.

So, what did your docker run statement look like in its entirety?


Sorry, but your question does not make sense to me.


Sorry, my bad. I meant your option #2 of using the OneAgent docker image. I was wondering if you made any changes to the command as it is prescribed in the following link? https://www.dynatrace.com/support/help/deploy-dynatrace/oneagent/docker/how-do-i-deploy-dynatrace-oneagent-as-docker-container/ For some odd reason, after download and install of the image, in a very similar setup to what you describe, the container fails. The command that we run is:

docker run -d --privileged=true --restart=unless-stopped --pid=host --net=host --ipc=host --name=oneagent -v /:/mnt/root -e ONEAGENT_INSTALLER_SCRIPT_URL="https://***.live.dynatrace.com/api/v1/deployment/installer/agent/unix/default/latest?Api-Token=***&arch=x86&flavor=default" dynatrace/oneagent APP_LOG_CONTENT_ACCESS=1

The output in the docker logs then looks like this. There's a check and failure for access rights on an initial mount point, but it appears to move along and then fails on the "-e" option passed to readlink. Very odd.

20:17:50 Error: Insufficient access rights () on: /var/lib

20:17:50 Error: /var/lib path must be globally readable (r-x permissions for others).

20:17:50 Error: Please adjust the permissions and then retry the installation.

20:17:51 Started agent deployment as a Docker container, PID 30911.

20:17:51 Downloading agent to /tmp/Dynatrace-OneAgent-Linux.sh via https://***.live.dynatrace.com/api/v1/deployment/installer/agent/unix/default/latest?Api-Token=***&arch=x86&flavor=default

20:17:51 Download was skipped as the agent version did not change

20:17:51 Validating downloaded agent installer

20:17:53 Verification successful

20:17:53 Deploying to: /mnt/host_root

20:17:53 Starting installer...

readlink: unrecognized option: e

BusyBox v1.25.1 (2016-10-26 16:15:20 GMT) multi-call binary.

Usage: readlink [-fnv] FILE

Display the value of a symlink

-f Canonicalize by following all symlinks ...........
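
For context on the log above: readlink -e is a GNU coreutils option, while the BusyBox applet shown in the usage text only accepts -f, -n and -v. A script that has to run on both could probe for -e and fall back, roughly as sketched below. This is an illustration only and not a patch to the OneAgent scripts. canonical_existing is a hypothetical helper name.

```shell
#!/bin/sh
# Canonicalize a path that must exist, preferring GNU `readlink -e`
# and falling back to `readlink -f` plus an explicit existence test.
canonical_existing() {
    if readlink -e / >/dev/null 2>&1; then
        readlink -e "$1"
    else
        # -f tolerates a missing last component, so check existence separately.
        [ -e "$1" ] && readlink -f "$1"
    fi
}

canonical_existing /
```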


As far as I can recall, I used the prescribed docker run command. Other than that, I have no idea what the problem could be in your case. When I have time next week I may check whether the OneAgent docker image still works for me, in which case I will update my answer.


divya_sharma
Newcomer

I have a question posted here. Could you please suggest any ideas if you have some? https://answers.dynatrace.com/spaces/482/dynatrace-open-qa/questions/201930/unable-to-get-the-services-running-in-docker-conta.html