Dynatrace operator cannot pull ag image - connection denied

jordan_rose
Helper

Hi All, I'm posting here since we are pretty stuck on this topic.

 

We are attempting to migrate to the Dynatrace Operator. When the AG pod starts and attempts to pull the ActiveGate image from our Managed cluster, we get "image pull back off" errors. The pod output alludes to the connection being denied by our cluster. We manage our own domain name and SSL certs and have tried adding the cert in a config map.

 

I can't seem to find solid documentation on how to resolve this or on adding the certs properly. Any help here is much appreciated.

13 REPLIES

dannemca
DynaMight Guru

Did you confirm that the API token has the required scopes?

 

  • Read configuration
  • Write configuration
  • Read settings
  • Write settings
  • Read entities
  • Installer download
  • Access problem and event feed, metrics, and topology
  • Create ActiveGate tokens
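
As a rough way to double-check a token against the list above, here is a minimal Python sketch. It only compares scope display names exactly as written in this post; the function name `missing_scopes` and the example token are hypothetical, and the API identifiers for these scopes may differ from the display names.

```python
# Scope display names copied from the list above (not API identifiers).
REQUIRED_SCOPES = {
    "Read configuration",
    "Write configuration",
    "Read settings",
    "Write settings",
    "Read entities",
    "Installer download",
    "Access problem and event feed, metrics, and topology",
    "Create ActiveGate tokens",
}


def missing_scopes(token_scopes):
    """Return the required scopes that the token does not have, sorted by name."""
    return sorted(REQUIRED_SCOPES - set(token_scopes))


if __name__ == "__main__":
    # Hypothetical token that has every scope except one.
    example_token_scopes = REQUIRED_SCOPES - {"Create ActiveGate tokens"}
    print(missing_scopes(example_token_scopes))
```

An empty result means that every scope on the list is present on the token.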

 

Site Reliability Engineer @ Kyndryl

I used the "Deploy Dynatrace > OpenShift" screen to create the tokens. This method creates a data ingest token and an Operator token, but neither includes the "Create ActiveGate tokens" permission. I will try again with that permission added to the Operator token.

We got the same image pull error when using the new token with the "Create ActiveGate tokens" permission added.

Can you post the error line you are getting in the pod's log?

Site Reliability Engineer @ Kyndryl

Thanks for the reply. We have been through many iterations of trying this, and this is the latest error we are getting. We are using the cluster node IP address in the apiUrl since the VIP seems to be unreachable from the pods. (I replaced some IPs and the environment ID with "x"s.)

 

Generated from kubelet on workernode, 2 times in the last 0 minutes:
Failed to pull image "xx.xx.xx.xx/e/envIDxxx/linux/activegate:latest": rpc error: code = Unknown desc = error pinging docker registry xx.xx.xx.xx: Get "https://xx.xx.xx.xx/v2/": x509: cannot validate certificate for xx.xx.xx.xx because it doesn't contain any IP SANs
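
To illustrate what the "doesn't contain any IP SANs" message is complaining about, here is a small Python sketch. It assumes the certificate is available as a dict in the layout returned by `ssl.SSLSocket.getpeercert()`; the helper names and the example host names are hypothetical, and the wildcard handling is deliberately simplified. The point is that an IP literal can only match an "IP Address" SAN entry, never a DNS name.

```python
import ipaddress


def _parse_ip(text):
    """Return an ip_address object, or None if `text` is not an IP literal."""
    try:
        return ipaddress.ip_address(text.strip())
    except ValueError:
        return None


def san_covers_host(cert, host):
    """Check whether the certificate's subjectAltName entries cover `host`.

    `cert` uses the dict layout of ssl.SSLSocket.getpeercert(), for example
    {"subjectAltName": (("DNS", "dynatrace.example.com"),)}.
    """
    host_ip = _parse_ip(host)
    for kind, value in cert.get("subjectAltName", ()):
        if host_ip is not None:
            # An IP literal only matches "IP Address" entries.
            if kind == "IP Address" and _parse_ip(value) == host_ip:
                return True
        elif kind == "DNS":
            if value.lower() == host.lower():
                return True
            # Simplified wildcard rule: "*.example.com" matches one extra label.
            if value.startswith("*.") and "." in host:
                if host.lower().split(".", 1)[1] == value[2:].lower():
                    return True
    return False


if __name__ == "__main__":
    # Hypothetical certificate issued for a DNS name only.
    cert = {"subjectAltName": (("DNS", "dynatrace.example.com"),)}
    print(san_covers_host(cert, "dynatrace.example.com"))
    print(san_covers_host(cert, "10.0.0.5"))
```

Under this reading, a certificate issued only for the cluster's domain name would validate when the registry is addressed by that name but not when it is addressed by a node IP, which matches the error above.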

This is a certificate issue. You can either work out how to properly add the certificate for this Managed host to your cluster, or set 'skipCertCheck: true' in Dynakube.yaml, right below apiUrl:

 

# Optional: Disable certificate validation checks for installer download and API communication
#
skipCertCheck: true
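
For comparison, here is a hedged Python sketch of how the config map approach might look next to `skipCertCheck`. It builds a ConfigMap and a DynaKube as plain dicts and prints them as a JSON list that `kubectl apply -f -` can read. The `trustedCAs` field, the `certs` key, and the `dynatrace.com/v1beta1` API version are assumptions based on my recollection of the DynaKube resource, so please verify them against your Operator version; the resource names and the URL are placeholders.

```python
import json


def build_manifests(api_url, ca_pem, name="dynakube", namespace="dynatrace",
                    ca_configmap="dynatrace-trusted-cas"):
    """Return a ConfigMap holding the CA bundle and a DynaKube that references it."""
    config_map = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": ca_configmap, "namespace": namespace},
        # Assumption: the Operator reads the PEM bundle from the "certs" key.
        "data": {"certs": ca_pem},
    }
    dynakube = {
        # Assumption: the API version depends on the installed Operator release.
        "apiVersion": "dynatrace.com/v1beta1",
        "kind": "DynaKube",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "apiUrl": api_url,
            "skipCertCheck": False,
            # Assumption: trustedCAs takes the name of the ConfigMap above.
            "trustedCAs": ca_configmap,
        },
    }
    return [config_map, dynakube]


if __name__ == "__main__":
    # Placeholder URL and certificate; replace with your own values.
    items = build_manifests(
        "https://dynatrace.example.com/e/ENVIRONMENT_ID/api",
        "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n",
    )
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

As far as I understand, these settings affect how the Operator and its components talk to the Dynatrace API. The image pull itself is done by the container runtime on the node, so it may also need the CA to be trusted at the node or cluster level.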
Site Reliability Engineer @ Kyndryl

We do have skipCertCheck set to true. As for adding the cert, we can't seem to find any solid documentation on it.

 

When you say add the "certificate for this managed host in your cluster", we should be focused on adding the cert for our Dynatrace Managed cluster URL to the Dynatrace Operator, correct? We tried to do this via a config map and still couldn't get it working.

 

It is worth noting that we have had a support ticket open for a while now and haven't found a resolution.

techean
Dynatrace Champion

Hi Jordan, 

This error is generally thrown by the machine where the commands are executed, because that machine doesn't trust the Docker registry's self-signed certificate.

You can make Docker trust the self-signed certificate by placing it at "/etc/docker/certs.d/<docker_registry_hostname>:<docker_registry_host_port>/ca.crt" on the machine where you are trying to run the docker command.

You can find the steps for trusting a self-signed certificate for a Docker registry in the official Docker documentation.
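
The path convention described above can be sketched in Python as follows. This only computes where the CA file would go and does not copy anything; the function name and the example registry are hypothetical. Nodes that run a different container runtime may use another base directory (I believe CRI-O uses /etc/containers/certs.d, but treat that as an assumption), which is why the base is a parameter.

```python
from pathlib import PurePosixPath


def registry_ca_path(registry_host, registry_port=None, base="/etc/docker/certs.d"):
    """Return the ca.crt location for a registry under the certs.d convention.

    The directory name must match the registry exactly as it appears in the
    image reference, including the port if one is used there.
    """
    if registry_port is None:
        directory = registry_host
    else:
        directory = f"{registry_host}:{registry_port}"
    return PurePosixPath(base) / directory / "ca.crt"


if __name__ == "__main__":
    # Hypothetical registry host and port.
    print(registry_ca_path("registry.example.com", 5000))
```

After the file is in place on each node that pulls the image, the pull can be retried.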

KG

It turns out it was a cert/trusted connection issue. We ended up pulling the image from an AG and adding the certs there. Our dev environment doesn't have access to connect directly to our VIP on the prod NetScaler, and we couldn't bypass the VIP without having proper certs in place.

oakdag
Participant

Hi everyone,

Do we need to set up an AG to do this? Just one cluster node is not enough, is it? What is the exact guidance on this? Can't I install for OPC if I don't have an AG?

Mizső
DynaMight Guru

Hi @oakdag,

 

First question: I think the best practice is to use an AG if possible, but with proper firewall rules your OneAgents in OPC can connect directly to the DT cluster nodes. I recommend using an AG.

Get started with Kubernetes/OpenShift monitoring | Dynatrace Docs

"Pods must allow egress to your Dynatrace environment or to your Environment ActiveGate in order for metric routing to work properly."

 

You can use your existing AG (outside of OPC), or you can install a containerized AG within OPC with the Dynatrace Operator (you can define the number of containerized AGs in the DynaKube custom resource YAML). I recommend the containerized AG.

 

Second question: Which cluster node do you mean, OPC or DT? In the case of DT, one node is enough with the recommended resources (which can be found in the documentation). In the case of OPC, one worker node is enough for the containerized AG.

 

Third question: Get started with Kubernetes/OpenShift monitoring | Dynatrace Docs

 

Fourth question: See above, answer is No.

 

I hope it helps.

 

Best regards,

 

Mizső 

 

 

Dynatrace Community RockStar 2024, Certified Dynatrace Professional

Hi Mizső,

Thanks a lot. I've solved this issue: it was resolved after I installed a valid certificate. I had actually done this a few times before without installing a certificate and it was not a problem, but this time I could only solve it after installing the certificate. Moreover, it did not work even though I used the noskipcert parameter. Thanks again for your comments.

Regards.

Hi @oakdag, can you explain how you "installed" the certificate? We are facing the same issue when trying to pull the images from a Managed cluster: x509: "certificate signed by unknown authority". How did you obtain the certificate, and did you install it using a ConfigMap?
