20 Dec 2023 09:03 PM - edited 17 Jun 2024 07:50 AM
TL;DR
Envoy has deprecated and will remove OpenTracing. This affects the Dynatrace Envoy code module starting with Envoy version 1.30 (released in April 2024) and Istio version 1.22 (released in May 2024). Dynatrace offers a new solution based on OpenTelemetry for Istio/Envoy observability: https://docs.dynatrace.com/docs/shortlink/otel-integrations
Please make sure to run at minimum OneAgent version 1.283.123.20240201-075622 if you're updating to Envoy >= 1.29.
Note: Initially, the Envoy community planned to remove OpenTracing already with Envoy 1.29. As this was shifted to Envoy 1.30, this post has been updated accordingly.
Read more:
---
1. OpenTracing deprecation in Envoy
Envoy announced the deprecation of OpenTracing and OpenCensus alongside the Envoy 1.28 release, in favor of OpenTelemetry.
The deprecation and removal of OpenTracing in Envoy directly affects Dynatrace's Envoy code module, as this code module is based on the OpenTracing API. Envoy's deprecation and breaking-change policy follows a three-step approach: a feature is first marked deprecated (Envoy 1.28 for OpenTracing), then disabled by default in the following release (Envoy 1.29, where it can still be re-enabled via a runtime flag - see FAQ), and finally removed (Envoy 1.30).
Update: Following the initial Envoy deprecation policy, Dynatrace OneAgent 1.281 won't inject into Envoy containers with Envoy version >= 1.29 (this can be overruled by support - see FAQ). Starting with OneAgent 1.283.123.20240201-075622 (or later), this behavior has been adapted as follows:
* Injection into Envoy 1.29 should work as expected
* Injection into Envoy 1.30+ is prohibited in order to avoid any configuration failures raised by Envoy.
We recommend updating to OneAgent 1.283.123.20240201-075622 before updating to Envoy 1.29+.
---
2. What about Istio?
Istio and other service meshes (e.g. Kong Mesh, HashiCorp Consul, AWS App Mesh, Open Service Mesh, etc.) leverage Envoy proxies as their data plane. Consequently, any change in Envoy directly affects any service mesh using upcoming Envoy versions.
Istio provides a mapping between Istio and Envoy versions in the Istio documentation. The next Istio version, 1.22.x, is expected to use Envoy 1.30.
---
3. Future Istio/Envoy observability with Dynatrace
Dynatrace planned the transition from OpenTracing to OpenTelemetry ahead of time and worked on improved Istio/Envoy observability. Based on the feedback and product ideas we got from you, we've identified and analyzed the most important requirements.
For this purpose, we have heavily contributed OpenTelemetry functionality to Envoy over the last releases (e.g. the HTTP exporter, support for resource detectors, and sampling).
On top of Envoy, we're contributing additional configurations to Istio in order to unlock the new Envoy OpenTelemetry capabilities. You can find the instructions on how to configure Istio/Envoy with OpenTelemetry for Dynatrace in our documentation: https://docs.dynatrace.com/docs/shortlink/otel-integrations
New unified service detection
The new Istio observability based on OpenTelemetry already builds on Unified Services (Dynatrace version 1.274+).
Outlook
We will continue our community contributions to Envoy/Istio and plan to add further options for intelligent sampling (update 2024-04: Dynatrace sampling was added to Envoy 1.30 / Istio 1.22). More details on the new OpenTelemetry-based Envoy/Istio observability will be shared in an upcoming blog post and in the product documentation.
---
4. FAQ
Can I change the Envoy configuration to still allow OpenTracing in Envoy 1.29?
Yes, according to the Envoy breaking change policy this is possible. In this case, the runtime flag envoy.features.enable_all_deprecated_features needs to be enabled within Envoy (see the sketch below). Moreover, please reach out to Dynatrace support to re-enable the Dynatrace code-module injection for Envoy 1.29.
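For illustration, a minimal sketch of how this runtime flag can be set in the Envoy bootstrap (the layered_runtime block below is plain Envoy bootstrap configuration; if Envoy is managed by a service mesh, set the flag through the mesh's own mechanisms instead):

layered_runtime:
  layers:
    - name: static_layer_0
      static_layer:
        # re-enables all deprecated features, including the OpenTracing tracer (relevant for Envoy 1.29 only)
        envoy.features.enable_all_deprecated_features: true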
What do I need to consider when updating to Envoy 1.29?
Please make sure to run at minimum OneAgent 1.283.123.20240201-075622 in your environments before updating to Envoy 1.29. For OneAgent 1.281, deep monitoring for Envoy needs to be explicitly enabled in your environment by Dynatrace support.
Is there any (immediate) action needed for older Envoy versions (up until Envoy 1.28)?
No action is needed. However, you can already configure the OpenTelemetry tracer with HTTP export in Envoy 1.28; a minimal sketch follows.
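For illustration, here is a minimal sketch of such an OpenTelemetry tracer with HTTP export, configured in the HttpConnectionManager's tracing section (the service name, tenant URL, cluster name dynatrace_otlp, and token are placeholders you need to adapt; prefer injecting the token via your deployment tooling instead of hard-coding it):

tracing:
  provider:
    name: envoy.tracers.opentelemetry
    typed_config:
      "@type": type.googleapis.com/envoy.config.trace.v3.OpenTelemetryConfig
      service_name: my-envoy-service
      http_service:
        http_uri:
          uri: https://<tenant>.live.dynatrace.com/api/v2/otlp/v1/traces
          # this cluster must be defined in your bootstrap and point at the OTLP endpoint
          cluster: dynatrace_otlp
          timeout: 10s
        request_headers_to_add:
          - header:
              key: Authorization
              value: Api-Token <your-token>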
Note: We'll update this post once we have additional information/insights regarding upcoming Istio versions and provide links to the new documentation.
12 Feb 2024 08:51 PM
@stefan_penner Thank you for sharing this
25 Jan 2025 01:08 AM
Hi @stefan_penner, I see that the main direction and recommendation from Dynatrace for Envoy and Istio service mesh integrations is to use OTel. We've noticed this approach creates a security gap in how we manage the configuration (meaning the DT API token): in a POC we did, the DT API token is stored and used in clear text in the Istio meshConfig.extensionProviders.opentelemetry.http.headers. We ultimately would like to know if OTel is the direction for Envoy and Istio service mesh metrics. There are also the Envoy tenant settings and Istio OneAgent features that we can enable; would those remain available, or will they be going away in the future?
Here's an example of the DT API token usage in clear text:
extensionProviders:
  - name: dynatrace-otel
    opentelemetry:
      dynatrace_sampler:
        cluster_id: -xxxxxxxxx
        tenant: xxx12345
      http:
        headers:
          - name: Authorization
            value: Api-Token dt0c01.xxxxx   # <-- token in clear text
        path: /api/v2/otlp/v1/traces
        timeout: 10s
      port: 80
      resource_detectors:
        dynatrace: {}
      service: istio-system/sampletenant[.]live[.]dynatrace[.]com
28 Jan 2025 12:54 PM
Hi @steven_v, this approach is very simple and is only meant for when you don't have an OTel Collector.
I would strongly suggest looking at putting an OTel Collector in place to handle the connectivity to Dynatrace.
This will reduce the load on your Istio components and allow for benefits like transformations and more secure connectivity (which also addresses your issue), not to mention giving you an OTel Collector for general ingestion of data (Prometheus and traces).
Where to start:
1. GitHub - Dynatrace/dynatrace-otel-collector: Dynatrace distribution of the OpenTelemetry Collector.
Links for config & documentation are in there.
- Do you need to use the Dynatrace one? No, you could also use the out-of-the-box OTel Collector.
- Why the Dynatrace one? It has some handy plugins built in, it is based on the OTel Collector, and Dynatrace supports it.
2. Set up the OTel Collector to have HTTP & gRPC endpoints.
3. In your OTel Collector, you can configure your endpoints to use secrets, pulled from environment variables or mount points:
alternateConfig:
  exporters:
    otlphttp:
      endpoint: "${env:DT_ENDPOINT}"
      headers:
        Authorization: "Api-Token ${env:DT_API_TOKEN}"
4. Configure your OTel Collector to do whatever else you need (transforms for metric/trace enhancements, Prometheus, ...).
5. Get your OTel Collector connected to Dynatrace (you can do this via proxy secrets, directly, or however you need).
From here you can quickly and easily configure your Istio connector using the default Istio configuration for tracing: Istio / OpenTelemetry.
All you need to do is replace the endpoint in that document with the endpoint of your OTel Collector - nice and clean. A sketch is shown below.
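For illustration, a minimal sketch of the same extensionProvider pointing at the collector instead of Dynatrace, with no API token in the mesh config (the service name, namespace, and OTLP/gRPC port 4317 are assumptions; as described in the linked doc, you still need an Istio Telemetry resource that selects the provider):

extensionProviders:
  - name: dynatrace-otel
    opentelemetry:
      # in-cluster OTel Collector endpoint; no Authorization header needed here
      service: otel-collector.otel.svc.cluster.local
      port: 4317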
As an added bonus, you can use the OTel Collector to scrape your Istio Prometheus data and send it to Dynatrace.
This will save a significant amount of pain compared to going through the ActiveGate.
Anyway, here is a Helm-based sample if you want it.
mode: deployment
# Replica count of pods to support sharding of Prometheus
replicaCount: 3
image:
  repository: "<my repo>/dynatrace/dynatrace-otel-collector/dynatrace-otel-collector"
  tag: latest
# Name of pods
command:
  name: dynatrace-otel-collector
# Pod labels for support & enabling Istio if required
podLabels:
  Service: "Dynatrace Otel"
  Support: "my support team"
  #sidecar.istio.io/inject: "true" ### enable if you want OTel to run behind Istio - if you do, you'll need the ServiceEntry & NetworkPolicy
extraEnvs:
  - name: HTTP_PROXY
    valueFrom:
      secretKeyRef:
        key: http_proxy
        name: otelproxy
  - name: HTTPS_PROXY
    valueFrom:
      secretKeyRef:
        key: https_proxy
        name: otelproxy
  - name: NO_PROXY
    valueFrom:
      secretKeyRef:
        key: no_proxy
        name: otelproxy
  - name: DT_API_TOKEN
    valueFrom:
      secretKeyRef:
        name: dynatrace-otelcol-dt-api-credentials
        key: dt-api-token
  - name: DT_ENDPOINT
    valueFrom:
      secretKeyRef:
        name: dynatrace-otelcol-dt-api-credentials
        key: dt-endpoint
  - name: SHARDS
    value: "3"
  - name: POD_NAME_PREFIX
    value: otel-prometheus-collector
  - name: POD_NAME
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: metadata.name
resources:
  requests:
    cpu: 750m
    memory: 4Gi
    ephemeral-storage: "1Gi"
  limits:
    cpu: 5
    memory: 8Gi
    ephemeral-storage: "2Gi"
podDisruptionBudget:
  enabled: true
  minAvailable: 2
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 180
      selectPolicy: Max
      policies:
        - type: Pods
          value: 5
          periodSeconds: 30
        - type: Percent
          value: 100
          periodSeconds: 30
    scaleDown:
      stabilizationWindowSeconds: 180
      selectPolicy: Min
      policies:
        - type: Pods
          value: 3
          periodSeconds: 30
        - type: Percent
          value: 50
          periodSeconds: 30
  targetCPUUtilizationPercentage: 90
  targetMemoryUtilizationPercentage: 95
rollout:
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0
  strategy: RollingUpdate
dnsPolicy: "ClusterFirst"
# Additional settings for sharding
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: app
              operator: In
              values:
                - otel-collector
        topologyKey: "kubernetes.io/hostname"
presets:
  kubernetesAttributes:
    enabled: true
useGOMEMLIMIT: true
ports:
  jaeger-compact:
    enabled: false
  jaeger-thrift:
    enabled: false
  jaeger-grpc:
    enabled: false
  zipkin:
    enabled: false
  metrics:
    enabled: true
serviceAccount:
  create: true
  annotations: {}
  name: "k8s-otel-collector-sa"
clusterRole:
  create: true
  annotations: {}
  name: "k8s-otel-collector-role"
  rules:
    - apiGroups:
        - ""
      resources:
        - pods
        - services
        - endpoints
        - namespaces
      verbs:
        - get
        - list
        - watch
    - apiGroups:
        - extensions
      resources:
        - deployments
        - replicasets
      verbs:
        - get
        - list
        - watch
    - apiGroups:
        - apps
      resources:
        - daemonsets
        - statefulsets
        - replicasets
      verbs:
        - get
        - list
        - watch
    - apiGroups:
        - networking.k8s.io
      resources:
        - ingresses
      verbs:
        - get
        - list
        - watch
  clusterRoleBinding:
    annotations: {}
    name: "k8s-otel-collector-role-binding"
alternateConfig:
  exporters:
    otlphttp:
      endpoint: "${env:DT_ENDPOINT}"
      headers:
        Authorization: "Api-Token ${env:DT_API_TOKEN}"
  extensions:
    health_check:
      endpoint: ${env:MY_POD_IP}:13133
  processors:
    attributes:
      actions:
        - key: k8s.cluster.name
          value: '<my cluster name>'
          action: insert
    cumulativetodelta: {}
    filter:
      metrics:
        exclude:
          match_type: expr
          expressions:
            - MetricType == "Summary"
    memory_limiter:
      check_interval: 5s
      limit_percentage: 80
      spike_limit_percentage: 25
    batch/traces:
      send_batch_size: 5000
      send_batch_max_size: 5000
      timeout: 60s
    batch/metrics:
      send_batch_size: 3000
      send_batch_max_size: 3000
      timeout: 60s
    batch/logs:
      send_batch_size: 1800
      send_batch_max_size: 2000
      timeout: 60s
    k8sattributes:
      auth_type: serviceAccount
      passthrough: false
      extract:
        metadata:
          - k8s.pod.name
          - k8s.pod.uid
          - k8s.deployment.name
          - k8s.statefulset.name
          - k8s.daemonset.name
          - k8s.cronjob.name
          - k8s.namespace.name
          - k8s.node.name
          - k8s.cluster.uid
      pod_association:
        - sources:
            - from: resource_attribute
              name: k8s.pod.name
            - from: resource_attribute
              name: k8s.namespace.name
        - sources:
            - from: resource_attribute
              name: k8s.pod.ip
        - sources:
            - from: resource_attribute
              name: k8s.pod.uid
        - sources:
            - from: connection
    transform:
      error_mode: ignore
      trace_statements:
        - context: resource
          statements:
            - set(attributes["dt.kubernetes.workload.kind"], "statefulset") where IsString(attributes["k8s.statefulset.name"])
            - set(attributes["dt.kubernetes.workload.name"], attributes["k8s.statefulset.name"]) where IsString(attributes["k8s.statefulset.name"])
            - set(attributes["dt.kubernetes.workload.kind"], "deployment") where IsString(attributes["k8s.deployment.name"])
            - set(attributes["dt.kubernetes.workload.name"], attributes["k8s.deployment.name"]) where IsString(attributes["k8s.deployment.name"])
            - set(attributes["dt.kubernetes.workload.kind"], "daemonset") where IsString(attributes["k8s.daemonset.name"])
            - set(attributes["dt.kubernetes.workload.name"], attributes["k8s.daemonset.name"]) where IsString(attributes["k8s.daemonset.name"])
            - set(attributes["dt.kubernetes.cluster.id"], attributes["k8s.cluster.uid"]) where IsString(attributes["k8s.cluster.uid"])
      log_statements:
        - context: resource
          statements:
            - set(attributes["dt.kubernetes.workload.kind"], "statefulset") where IsString(attributes["k8s.statefulset.name"])
            - set(attributes["dt.kubernetes.workload.name"], attributes["k8s.statefulset.name"]) where IsString(attributes["k8s.statefulset.name"])
            - set(attributes["dt.kubernetes.workload.kind"], "deployment") where IsString(attributes["k8s.deployment.name"])
            - set(attributes["dt.kubernetes.workload.name"], attributes["k8s.deployment.name"]) where IsString(attributes["k8s.deployment.name"])
            - set(attributes["dt.kubernetes.workload.kind"], "daemonset") where IsString(attributes["k8s.daemonset.name"])
            - set(attributes["dt.kubernetes.workload.name"], attributes["k8s.daemonset.name"]) where IsString(attributes["k8s.daemonset.name"])
            - set(attributes["dt.kubernetes.cluster.id"], attributes["k8s.cluster.uid"]) where IsString(attributes["k8s.cluster.uid"])
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: ${env:MY_POD_IP}:4317
        http:
          endpoint: ${env:MY_POD_IP}:4318
    ##################################################################
    ## PROMETHEUS SCRAPE SETTINGS GO HERE                           ##
    ##################################################################
    prometheus:
      config:
        scrape_configs:
          - job_name: opentelemetry-collector
            scrape_interval: 30s
            static_configs:
              - targets:
                  - ${env:MY_POD_IP}:8888
          - job_name: 'kube-dns'
            scrape_interval: 30s
            kubernetes_sd_configs:
              - role: pod
            relabel_configs:
              - source_labels: [__meta_kubernetes_namespace]
                action: keep
                regex: kube-system
              - source_labels: [__meta_kubernetes_pod_label_k8s_app]
                action: keep
                regex: kube-dns
              - source_labels: [__meta_kubernetes_pod_container_name]
                action: keep
                regex: sidecar
              - source_labels: [__meta_kubernetes_pod_container_port_number]
                action: keep
                regex: 10054
              - source_labels: [__address__]
                action: replace
                regex: (.*):\d+
                replacement: $$1:10054
                target_label: __address__
            metric_relabel_configs:
              - source_labels: [__name__]
                action: drop
                regex: ^go_.*
            metrics_path: /metrics
            scheme: http
          - job_name: 'istio-ingressgateway'
            scrape_interval: 15s
            metrics_path: /metrics
            scheme: http
            static_configs:
              - targets: ['otel-ingressgateway.istio-internal.svc.cluster.local:15020']
            relabel_configs:
              - source_labels: [__address__]
                action: replace
                regex: (.*):\d+
                target_label: __address__
                replacement: $$1:15020
            metric_relabel_configs:
              - source_labels: [__name__]
                action: drop
                regex: ^go_.*
          - job_name: 'istiod'
            scrape_interval: 15s
            metrics_path: /metrics
            scheme: http
            static_configs:
              - targets: ['istiod.istio-internal.svc.cluster.local:15014']
            metric_relabel_configs:
              - source_labels: [__name__]
                action: drop
                regex: ^go_.*
    ##################################################################
    ##################################################################
  service:
    telemetry:
      metrics:
        address: ${env:MY_POD_IP}:8888
      logs:
        level: debug
    extensions:
      - health_check
    pipelines:
      logs:
        exporters:
          - otlphttp
        processors:
          - attributes
          - k8sattributes
          - memory_limiter
          - batch/logs
        receivers:
          - otlp
      metrics:
        exporters:
          - otlphttp
        processors:
          - attributes
          - cumulativetodelta
          - memory_limiter
          - batch/metrics
          - k8sattributes
          - filter
        receivers:
          - prometheus
      traces:
        exporters:
          - otlphttp
        processors:
          - transform
          - memory_limiter
          - batch/traces
        receivers:
          - otlp
The secrets in this sample cover the proxy and the DT API credentials (in the Dynatrace Api-Token format); examples below.
*If you are scraping Prometheus, you need to add the node, pod & service CIDRs to no_proxy.
apiVersion: v1
data:
  dt-api-token: ZHQwYzAxLm15YXBpdG9rZW4
  dt-endpoint: aHR0cHM6Ly90ZW5hbnRpZC5saXZlLmR5bmF0cmFjZS5jb20vYXBpL3YyL290bHAv
kind: Secret
metadata:
  annotations: {}
  name: dynatrace-otelcol-dt-api-credentials
type: Opaque
---
apiVersion: v1
data:
  http_proxy: aHR0cDovL3VzZXJuYW1lOnBhc3N3b3JkQHByb3h5OnBvcnQ
  https_proxy: aHR0cDovL3VzZXJuYW1lOnBhc3N3b3JkQHByb3h5OnBvcnQ
  no_proxy: MTI3LjAuMC4xLFBPRF9DSURSLE5PREVfQ0lEUixTRVJWSUNFX0NJRFI
kind: Secret
metadata:
  annotations: {}
  name: otelproxy
type: Opaque
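For reference, the data values above are just base64-encoded placeholders (dt-api-token decodes to dt0c01.myapitoken, dt-endpoint to https://tenantid.live.dynatrace.com/api/v2/otlp/). Instead of encoding values by hand, you can also let kubectl create secret generic ... --from-literal=... do the base64 encoding for you.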
have fun