22 Apr 2026 09:35 PM - last edited on 23 Apr 2026 08:13 AM by Michal_Gebacki
Hi everyone! I built an automated observability maturity auditor that uses Claude AI + Dynatrace MCP to run a 15-agent audit across infrastructure, configuration, DEM, operations, and security — producing a scored HTML report with root cause analysis and actionable next steps. The entire audit runs from a single command: "audit tenant uhv42169".
As a consultant, I audit Dynatrace tenants regularly for clients across Latin America. Each audit follows a repeatable pattern.
This process used to take 1-3 days manually. With the Dynatrace MCP server, I automated it down to ~15 minutes.
The key insight is using CLAUDE.md as an executable playbook. Instead of writing Python code to call APIs, I wrote a markdown file that instructs Claude AI to execute the audit step by step, using the Dynatrace MCP server as its data source.
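To make that concrete, here is an illustrative excerpt of what such a playbook can look like; the wording and step layout are my assumptions, not the actual CLAUDE.md:

```markdown
## Audit workflow (illustrative sketch, not the real CLAUDE.md)
1. Call get_environment_info to verify connectivity and capture the tenant ID.
2. For each agent file (01 through 15), run its DQL queries or MCP tool calls.
3. Grade every check as PASS / WARN / FAIL / INFO and weight it by blast radius.
4. Compute the agent and global scores, then render all findings into the HTML template.
```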
| MCP Tool | Used By Agents | Purpose |
| --- | --- | --- |
| get_environment_info | Setup | Verify connectivity, get tenant ID |
| execute_dql | 01-09, 12, 15 | Query entities, tags, management zones, services |
| list_problems | 10, 11, 13 | Problem history, MTTR, noise analysis, custom alerts |
| list_davis_analyzers | 10 | Verify Davis AI capabilities |
| list_vulnerabilities | 14 | Security posture assessment |
| get_kubernetes_events | 15 | K8s cluster health and event analysis |
| chat_with_davis_copilot | Exploration | Settings discovery (dt.setting workaround) |
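To make the execute_dql rows concrete, here is the kind of query block an agent file might contain; the section heading and entity attribute names (tags, managementZones) are my assumptions, not copied from the repo:

```markdown
## Query (illustrative, attribute names assumed)
fetch dt.entity.host
| fieldsAdd tags, managementZones
| summarize hosts = count(), by: { managementZones }
```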
Each agent is a markdown file defining: DQL queries or MCP tool calls, checks with PASS/WARN/FAIL/INFO criteria, blast radius (CRITICAL/HIGH/MEDIUM/LOW), remediation text, and analysis guidelines with root cause/recommendations/next steps.
Each finding is weighted by blast radius:
| Blast Radius | Weight |
| --- | --- |
| CRITICAL | 4.0 |
| HIGH | 3.0 |
| MEDIUM | 2.0 |
| LOW | 1.0 |
Status scoring: PASS = 100% of weight, WARN = 50%, FAIL = 0%, INFO = excluded.
Agent score = (earned_weight / total_weight) × 100. Global score = average of all agents with data.
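As a quick illustrative calculation (not one of the agents below): an agent with a CRITICAL check at PASS (4.0 of 4.0), a HIGH check at WARN (1.5 of 3.0), and a MEDIUM check at FAIL (0 of 2.0) earns 5.5 of 9.0 weight, so it scores roughly 61.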
| # | Agent | Score | Data Source |
| --- | --- | --- | --- |
| 01 | OneAgent & ActiveGate | 73.6 | execute_dql |
| 02 | Host Groups | 0.0 | execute_dql |
| 03 | Management Zones | 0.0 | execute_dql |
| 04 | Auto Tags | 0.0 | execute_dql |
| 05 | Manual Tags | 50.0 | execute_dql |
| 06 | Ownership | 0.0 | execute_dql |
| 07 | Security Context | N/A | dt.setting unavailable |
| 08 | RUM | N/A | No apps (intentional) |
| 09 | Synthetic Monitors | 27.3 | execute_dql |
| 10 | Anomaly Detection | 83.3 | list_davis_analyzers + list_problems |
| 11 | Problem Notifications | 66.7 | list_problems (CUSTOM_ALERT inference) |
| 12 | SLOs | 0.0 | execute_dql |
| 13 | Problem History | 45.5 | list_problems + execute_dql |
| 14 | Vulnerabilities | 100.0 | list_vulnerabilities |
| 15 | Kubernetes | 30.0 | get_kubernetes_events + execute_dql |
The core innovation is that CLAUDE.md IS the automation. No Python, no scripts, no SDK wrappers. The AI reads the playbook and follows it.
Each agent is a self-contained markdown file. Here's a simplified example of Agent 10 (Anomaly Detection), which was redesigned to use MCP tools instead of the unavailable dt.setting:
# Agent 10: Anomaly Detection
Category: configuration | Blast Radius: HIGH
## MCP Tools
list_davis_analyzers — verify AI capabilities
list_problems(timeframe="30d") — check if detection fires
## Checks
davis_analyzers_available: PASS if ≥3 analyzers
anomaly_detection_firing: PASS if SLOWDOWN/RESOURCE problems exist
anomaly_problem_ratio: WARN if ≥60% anomaly-based (noisy)
The generated HTML report is interactive (dark mode, search, and filtering) and includes per-agent scores, root cause analysis, and actionable next steps.
A significant challenge: fetch dt.setting is not available as a DQL data object, which initially blocked 4 agents (Security Context, Anomaly Detection, Problem Notifications, and partially Management Zones/Auto Tags/Ownership/SLOs).
The solution was to leverage other MCP tools creatively: chat_with_davis_copilot for settings discovery, list_davis_analyzers plus list_problems for Anomaly Detection, and list_problems with CUSTOM_ALERT inference for Problem Notifications.
This turned a limitation into a feature — the audit now uses 7 different MCP tools instead of relying solely on DQL, making it more resilient and comprehensive.
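As a sketch of how such a fallback reads inside an agent file (the wording below is hypothetical, not copied from the repo):

```markdown
## Fallback when dt.setting is unavailable (hypothetical wording)
- chat_with_davis_copilot: ask which alerting and notification settings are configured
- list_problems(timeframe="30d"): infer active notifications from CUSTOM_ALERT problems
- list_davis_analyzers: confirm anomaly-detection capabilities without reading settings
```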
| Metric | Before (Manual) | After (MCP) |
| --- | --- | --- |
| Audit duration | 1-3 days | ~20 minutes |
| Dimensions checked | 8-10 | 15 (5 categories) |
| MCP tools used | N/A | 7 tools |
| Report format | Google Slides / PDF | Interactive HTML (dark mode, search, filter) |
| Consistency | Varies by analyst | 100% repeatable |
Stack: Claude AI + Dynatrace MCP Server + Markdown playbooks
Source: Available on request — the entire framework is ~15 markdown files + 1 HTML template
23 Apr 2026 07:03 PM
Hey @tracegazer ,
would you mind sharing it? Reminds me of the tenant review https://github.com/dynatrace-oss/CustomerSuccess/tree/main
I tried to build a similar solution internally for Dynatrace Managed using Claude, though not with the agent approach; instead, I had it generate a utility that produces such reports. (Managed installations are typically air-gapped, and customers have AI regulations.)
23 Apr 2026 08:14 PM
Hi @Julius_Loman, the current solution runs with the help of Claude and Dynatrace's MCP. However, I have a previous solution that used the Dynatrace APIs, which, as I understand it, should work in both SaaS and Managed environments. Let me check if I have it committed and published on Git.
24 Apr 2026 06:26 PM
Hi @Julius_Loman, I've finally uploaded the repository. Here's the URL:
git clone https://github.com/alanfuentes92/observability-auditor.git
cd observability-auditor/audit-mcp
cp mcp-config.example.json .mcp.json
Edit .mcp.json with your tenant URL and token (a sketch of the file's shape is below)
claude .
Then say: "audit this tenant" → an HTML report will be generated in the output/ folder.
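For reference, a .mcp.json for Claude Code generally follows the shape below; the package name and variable names here are my assumptions, so check mcp-config.example.json in the repo for the exact keys:

```json
{
  "mcpServers": {
    "dynatrace": {
      "command": "npx",
      "args": ["-y", "@dynatrace-oss/dynatrace-mcp-server"],
      "env": {
        "DT_ENVIRONMENT": "https://<your-tenant>.apps.dynatrace.com",
        "DT_PLATFORM_TOKEN": "<your-token>"
      }
    }
  }
}
```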
Feel free to reach out with any questions, feedback, suggestions, or ideas—everything is welcome.