
MCP Server Challenge entry #9: R.E.A.D.Y. - Reliability Evidence Assessment for Dynatrace Readiness

MaximilianoML
Champion

R.E.A.D.Y. Project Use Case Write-Up

R.E.A.D.Y. is a Dynatrace-native app that uses Dynatrace Remote MCP, DQL, and Dynatrace platform APIs to transform observability data into two practical operator workflows:

  1. Problems Intelligence for real operational triage
  2. Ready Report Generation for fleet-level operational readiness

The goal is not to create another chatbot-first experience. Instead, R.E.A.D.Y. reduces manual observability work by collecting, normalizing, and scoring evidence before any AI explanation is generated.

The workflow follows a simple pattern:

Dynatrace MCP / DQL / APIs
-> normalized evidence
-> deterministic checks and scoring
-> optional OpenAI summarization
-> operator-ready insight

Problem We Wanted to Solve

In many environments, the signals already exist in Dynatrace, but answering operational questions still requires too many manual steps.

Operators often need to:

  • review recent Davis Problems
  • understand which services, applications, or entities are most affected
  • compare categories such as Error, Slowdown, Resource, Availability, or Custom
  • inspect duration and recurrence patterns
  • identify the next entity to investigate
  • assess whether a service or application fleet is operationally ready

The data is available, but the workflow is fragmented. R.E.A.D.Y. brings that evidence together into a structured, repeatable, and operator-friendly experience.

Tools Used

1. Dynatrace Remote MCP

Dynatrace Remote MCP is used as the main evidence and context bridge.

MCP tools used in the project include:

  • execute-dql
  • get-entity-id
  • get-entity-name
  • query-problems
  • get-problem-by-id
  • find-documents
  • find-troubleshooting-guides
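
Under the MCP specification, each of these named tools is invoked through a JSON-RPC 2.0 `tools/call` request. As a minimal sketch of what the app sends for `execute-dql` (the argument key `dqlStatement` is an assumption here; the actual key is defined by the server's published tool schema):

```typescript
// Sketch: building a JSON-RPC 2.0 "tools/call" request, per the MCP spec.
// The tool name comes from the list above; "dqlStatement" is illustrative.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const req = buildToolCall(1, "execute-dql", {
  dqlStatement:
    "fetch dt.davis.problems | summarize count(), by: {event.category}",
});
```

Because every tool follows this one request shape, the app can swap or add tools without changing its transport code.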

2. Dynatrace Platform APIs and App Functions

Dynatrace App Functions run on the app's backend to orchestrate evidence collection and report generation securely.

This keeps sensitive configuration and tokens out of the browser and makes the flow more production-friendly.

3. OpenAI API

OpenAI is used only after the evidence has already been collected, normalized, and scored.

The AI layer is optional and is used to generate concise operator-facing explanations from the real evidence set, not to invent conclusions.

4. Playbooks

The Playbooks layer provides structured guidance for the LLM, defining how it should use Dynatrace MCP tools, DQL, and platform evidence during the analysis flow. Instead of allowing the model to guess, each playbook orients the LLM to first collect evidence, resolve entities, query Problems or telemetry when needed, validate the available context, and only then generate an operator-facing explanation. This keeps the output grounded in real Dynatrace data, reduces hallucination risk, and makes the workflow repeatable across different operational scenarios.
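
A playbook of this kind could be sketched as an ordered step list with a guardrail that the LLM-facing step always comes last; the step and tool names below are illustrative, not the app's actual schema:

```typescript
// Sketch of a playbook definition, assuming a simple ordered-step model.
// Action and tool names are illustrative, not R.E.A.D.Y.'s real schema.
interface PlaybookStep {
  action:
    | "collect-evidence"
    | "resolve-entities"
    | "query-problems"
    | "validate-context"
    | "explain";
  tool?: string; // MCP tool to invoke, when the step needs one
}

const problemsTriagePlaybook: PlaybookStep[] = [
  { action: "collect-evidence", tool: "execute-dql" },
  { action: "resolve-entities", tool: "get-entity-name" },
  { action: "query-problems", tool: "query-problems" },
  { action: "validate-context" },
  { action: "explain" }, // only after evidence is collected and validated
];

// Guardrail: the LLM explanation step must be the final step.
const explainIsLast =
  problemsTriagePlaybook[problemsTriagePlaybook.length - 1].action ===
  "explain";
```

Encoding the ordering as data rather than prompt prose is what makes the flow repeatable across scenarios.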

Primary Use Case 1: Problems Intelligence for Real Operational Triage

Scenario

An operator wants to understand the health of a selected scope, such as services, applications, frontends, or infrastructure, over a time window like the last 2 hours, 24 hours, 7 days, or 30 days.

Instead of opening multiple Dynatrace views manually, the operator opens the Problems view in R.E.A.D.Y. and filters by:

  • time window
  • impact
  • status
  • category or type

What the App Does

  1. Queries recent Davis Problems through Dynatrace MCP.
  2. Normalizes the result set into a stable Problems overview payload.
  3. Aggregates total Problems, active vs. closed Problems, status, category, time trends, duration statistics, recurrent Problems, top affected entities, and slowdown-related endpoints or services.
  4. Resolves entity IDs into readable names using MCP.
  5. Links affected entities directly to Dynatrace topology views.
  6. Optionally generates one AI Operator Insight from the real evidence set.
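
The normalization and aggregation in steps 2 and 3 could look roughly like this; the field names are assumptions for illustration, not the app's real payload:

```typescript
// Sketch: aggregating a normalized Davis Problems result set into an
// overview payload. Field and category names are illustrative.
interface Problem {
  id: string;
  status: "ACTIVE" | "CLOSED";
  category: string; // e.g. "ERROR", "SLOWDOWN", "RESOURCE"
  durationMinutes: number;
}

interface ProblemsOverview {
  total: number;
  active: number;
  closed: number;
  byCategory: Record<string, number>;
  medianDurationMinutes: number;
}

function aggregateProblems(problems: Problem[]): ProblemsOverview {
  const byCategory: Record<string, number> = {};
  for (const p of problems) {
    byCategory[p.category] = (byCategory[p.category] ?? 0) + 1;
  }
  const durations = problems
    .map((p) => p.durationMinutes)
    .sort((a, b) => a - b);
  const mid = Math.floor(durations.length / 2);
  return {
    total: problems.length,
    active: problems.filter((p) => p.status === "ACTIVE").length,
    closed: problems.filter((p) => p.status === "CLOSED").length,
    byCategory,
    medianDurationMinutes: durations.length ? durations[mid] : 0,
  };
}

const overview = aggregateProblems([
  { id: "P-1", status: "ACTIVE", category: "SLOWDOWN", durationMinutes: 42 },
  { id: "P-2", status: "CLOSED", category: "ERROR", durationMinutes: 10 },
  { id: "P-3", status: "CLOSED", category: "SLOWDOWN", durationMinutes: 90 },
]);
```

Because the overview is computed deterministically before any AI call, the same problem set always produces the same numbers.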

Why MCP Matters

MCP makes the integration practical because the app can use stable, named tools instead of hardcoding every tenant-specific retrieval path into the UI.

It allows the app to:

  • discover available tools
  • query Problems consistently
  • resolve entity names
  • execute DQL
  • search related documentation and troubleshooting content

Results Achieved

  • a live Problems dashboard backed by real Dynatrace data
  • filtering by scope, status, impact, category, and time window
  • readable entity names where resolution is possible
  • direct links to affected entities in Dynatrace
  • histograms showing how long Problems stay open
  • category-level duration distribution
  • recurrence and concentration signals
  • optional AI-generated operator insight based only on collected evidence

Practical Value

This helps operators answer questions such as:

  • Are current Problems active pressure or mostly historical noise?
  • Which category dominates this time window?
  • Are Slowdown Problems short-lived or staying open too long?
  • Is one service, application, or entity repeatedly involved?
  • Which entity should be inspected next?
  • Where is operational risk concentrated?

Primary Use Case 2: Fleet-Level Ready Reports for Operational Readiness

Scenario

A platform team or SRE team wants to assess whether a fleet is operationally ready.

This is different from simply checking whether telemetry exists. A service may have traces and metrics but still be missing important operational metadata, ownership, documentation, or governance signals.

R.E.A.D.Y. currently supports readiness generation for:

  • All Services
  • All Applications

What the App Does

The Ready Report workflow collects evidence for the selected scope and evaluates readiness using deterministic checks.

Example signals include:

  • ownership metadata
  • team tags
  • environment tags
  • runbook or contact metadata
  • dashboard evidence
  • documentation evidence
  • governance metadata
  • application or service operational context

The app then generates a structured report containing:

  • overall readiness score
  • domain-level results
  • detected gaps
  • recommendations
  • evidence status

The result clearly separates:

  • evidence present
  • evidence missing
  • evidence unknown or unavailable

That distinction is important. R.E.A.D.Y. does not pretend to know more than the data supports.
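
A minimal sketch of deterministic scoring with that three-way split, assuming equal weighting and illustrative check names (the real rule set and weights are the app's own):

```typescript
// Sketch: readiness scoring that keeps present / missing / unknown
// explicit. "unknown" checks are excluded from the score instead of
// being silently counted as failures.
type EvidenceStatus = "present" | "missing" | "unknown";

interface Check {
  name: string; // e.g. "ownership metadata", "team tags"
  status: EvidenceStatus;
}

interface ReadinessResult {
  score: number; // 0..100, computed only over known evidence
  gaps: string[]; // checks whose evidence is missing
  unknown: string[]; // checks that could not be evaluated
}

function scoreReadiness(checks: Check[]): ReadinessResult {
  const known = checks.filter((c) => c.status !== "unknown");
  const present = known.filter((c) => c.status === "present").length;
  return {
    score: known.length ? Math.round((present / known.length) * 100) : 0,
    gaps: checks.filter((c) => c.status === "missing").map((c) => c.name),
    unknown: checks.filter((c) => c.status === "unknown").map((c) => c.name),
  };
}

const result = scoreReadiness([
  { name: "ownership metadata", status: "present" },
  { name: "team tags", status: "present" },
  { name: "runbook metadata", status: "missing" },
  { name: "dashboard evidence", status: "unknown" },
]);
```

Keeping "unknown" out of the denominator is what stops the report from claiming more certainty than the data supports.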

Why This Is Useful

Readiness reviews are often manual and inconsistent. One team may consider a service ready because it has traffic. Another team may expect ownership, SLOs, dashboards, alerts, documentation, and runbooks.

R.E.A.D.Y. makes that conversation more explicit and repeatable by turning readiness into an evidence-based workflow.

Results Achieved

  • real fleet-level readiness generation for Services
  • real fleet-level readiness generation for Applications
  • deterministic scoring before AI explanation
  • clear visibility into missing operational metadata
  • structured recommendations based on the collected evidence
  • a repeatable report format that can be reused across environments

Repeatable Workflow Pattern

This project is not tied to one specific tenant. The same pattern can be reused in other Dynatrace environments.

  1. Configure the Dynatrace environment URL and platform token.
  2. Configure the Dynatrace MCP server endpoint.
  3. Discover available MCP tools.
  4. Collect evidence through MCP, DQL, and platform APIs.
  5. Normalize the evidence into a stable internal structure.
  6. Apply deterministic checks and scoring.
  7. Use AI only after the evidence is already structured.
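
Steps 5 through 7 above can be sketched as one small pipeline, with the AI step optional and always last; the function names here are illustrative stubs, not the app's code:

```typescript
// Sketch of the pipeline: normalize -> deterministic score -> optional AI.
interface Evidence {
  entity: string;
  signals: Record<string, boolean>;
}
interface Scored {
  entity: string;
  score: number;
}

// Step 5: normalize raw evidence into a stable internal structure.
function normalize(raw: Record<string, boolean>, entity: string): Evidence {
  return { entity, signals: raw };
}

// Step 6: deterministic checks and scoring.
function score(e: Evidence): Scored {
  const values = Object.values(e.signals);
  const present = values.filter(Boolean).length;
  return {
    entity: e.entity,
    score: values.length ? Math.round((present / values.length) * 100) : 0,
  };
}

// Step 7: AI runs only on already-structured evidence, and only if enabled.
function summarize(s: Scored, aiEnabled: boolean): string {
  return aiEnabled
    ? `AI summary for ${s.entity}: readiness ${s.score}%`
    : `${s.entity}: readiness ${s.score}%`;
}

const out = summarize(
  score(normalize({ ownership: true, runbook: false }, "checkout-service")),
  false
);
```

Because the AI stage only formats an already-scored result, disabling it changes presentation, never the numbers.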

This same pattern can be extended to:

  • Kubernetes workload readiness
  • synthetic monitor readiness
  • host readiness
  • deployment-change correlation workflows
  • fleet-wide governance audits
  • service ownership validation
  • operational metadata quality checks

Why This Is a Good Dynatrace MCP Use Case

This project demonstrates Dynatrace MCP beyond a simple chat interface.

R.E.A.D.Y. uses MCP as part of a real operator workflow for:

  • evidence discovery
  • DQL execution
  • entity resolution
  • Problems analytics
  • documentation search
  • troubleshooting context
  • readiness assessment

The key value is that MCP becomes an operational building block. It helps answer questions teams already care about:

  • Where should we investigate first?
  • What is recurring?
  • Which entities are driving risk?
  • What evidence is missing?
  • Is this fleet operationally ready?
  • What should be improved before production readiness is accepted?

Architecture Pattern

UI
-> Dynatrace App Function orchestration
-> Dynatrace MCP / DQL / Platform APIs
-> normalized evidence layer
-> deterministic rules and scoring
-> optional AI summarization
-> operator-facing result

This design keeps the system explainable, testable, auditable, extensible, and grounded in Dynatrace data.

What Makes It Creative

The creativity in R.E.A.D.Y. is not about replacing operators with AI.

The creative part is combining Dynatrace-native evidence collection, MCP-powered context retrieval, deterministic operational scoring, and optional AI explanation into one repeatable workflow.

Business and Operational Outcome

R.E.A.D.Y. helps teams:

  • shorten time to triage
  • standardize readiness reviews
  • identify recurring Problems
  • detect concentration of operational risk
  • highlight missing ownership or governance metadata
  • produce structured reports instead of relying on tribal knowledge
  • make operational readiness more evidence-based

In short, Dynatrace MCP is used here not as a novelty, but as a repeatable operational building block for observability-driven decision support.

Below are some screenshots of the app:

Problems Intelligence

MaximilianoML_3-1777970963661.png

MaximilianoML_1-1777970881063.png

MaximilianoML_4-1777971057463.png

Fleet-Level Ready Reports for Operational Readiness

MaximilianoML_5-1777971217464.png

MaximilianoML_6-1777971334014.png

MaximilianoML_7-1777971404041.png

 

Max Lopes