cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

Creating a Problem Overview Dashboard by Tag

yuval1983
Organizer

 

Hello everyone,

I’m excited to start using the new dashboard, and I’m looking for assistance in creating a solution that will help me display and analyze problems in my system more effectively.

In my setup, I have multiple TAGs that I receive alerts for, and the system works great. Now, I want to create a new dashboard that focuses on problems based on the variables TAG and work area (e.g., sandbox, PROD, etc.).

I’d like to filter the results using TAGs as the main variable. Below is the query I’ve been working on. I’d appreciate any advice or guidance to refine it:

 

fetch dt.entity.service
| expand tags
| filter contains(tags, "SRE Observability:")
| fields tags, managementZones

 

 

fetch events
| filter event.kind == "DAVIS_PROBLEM"
| sort timestamp desc

// First Summarzation Start - Group Problem Events by Problem ID
| summarize {startTime = takeFirst(event.start),
            endTime = takeFirst(event.end),
            problemClosedDuration = takeFirst(resolved_problem_duration),
            status = takeFirst(event.status),
            event.name = takeFirst(event.name),
            severityLevel = takeFirst(event.category),
            affected = takeFirst(affected_entity_ids),
            rootCause = takeFirst(root_cause_entity_id),
            dt.davis.is_duplicate = takeFirst(dt.davis.is_duplicate),
            event.id = takeFirst(event.id)}, 
            by:{display_id}
| filter `dt.davis.is_duplicate` == false
| fieldsAdd status = if((status == "ACTIVE"),"OPEN", 
                else:if((status == "CLOSED"), "CLOSED"))
| fieldsAdd currentTime = toTimestamp(now())-1m                
// First Summarzation Emd - Group Probem Events by Problem ID

// Dashboard Fitlers Start
| filter ("ALL" == $Status OR $Status == status) 
| filter ("ALL" == $Severity OR $Severity == severityLevel)
// Dashboard Fitlers End
              
// Second Summarzation Start - Summarize Closed Problem Count
| summarize Total = count(), by:{timestamp = if(status == "CLOSED",endTime, else:currentTime),status}
//| makeTimeseries Total = Count(), by:{status}
| makeTimeseries ProblemCount = SUM(Total), by:{status}
// Second Summarzation End - Summarize Closed Problem Count

 

 

 

 

 

fetch events
| filter event.kind == "DAVIS_PROBLEM" // Filter only Davis problems
| sort timestamp desc

// Expand affected and root cause entity IDs
| expand affected_entity_ids
| expand root_cause_entity_id

// Lookup services and their tags
| lookup [fetch dt.entity.service 
          | expand tags 
          | filter contains(tags, "SRE Observability:")], 
         sourceField:affected_entity_ids, lookupField:id, prefix:"serviceData"

// Summarize Problems
| summarize {
    startTime = takeFirst(event.start),
    endTime = takeFirst(event.end),
    status = takeFirst(event.status),
    severityLevel = takeFirst(event.category),
    managementZone = takeFirst(serviceData.managementZones), // Zone info
    affectedServices = collectDistinct(serviceData.servicesentity.name), // Affected services
    tagList = collectDistinct(serviceData.tags), // Grouped tags
    rootCauseServices = collectDistinct(lookup.rootcause.entity.servicesentity.name) // Root cause services
}, by: {display_id}

// Add Status and Duration
| fieldsAdd currentTime = toTimestamp(now())
| fieldsAdd statusLabel = if((status == "ACTIVE"), "🔴 OPEN", else: "🟢 CLOSED")
| fieldsAdd durationMinutes = if((status == "CLOSED"), (endTime - startTime) / 60000000000, 
                                 else: (currentTime - startTime) / 60000000000)

// Final Output Formatting
| fields 
    ProblemID = display_id,
    Problem = concat(display_id, " - ", takeFirst(event.name)),
    Status = statusLabel,
    Severity = severityLevel,
    `Management Zone` = managementZone,
    `Affected Services` = affectedServices,
    Tags = tagList, // Renamed for clarity
    `Root Cause Services` = rootCauseServices,
    StartTime = startTime,
    EndTime = if((status == "OPEN"), "In Progress", else: endTime),
    `Duration (min)` = durationMinutes
| sort StartTime, direction:"descending"

 

 

fetch events
| filter event.kind == "DAVIS_PROBLEM"
| sort timestamp desc

// First Summarzation Start - Group Probem Events by Problem Type
| summarize {startTime = takeFirst(event.start),
            endTime = takeFirst(event.end),
            problemClosedDuration = takeFirst(resolved_problem_duration),
            status = takeFirst(event.status),
            event.name = takeFirst(event.name),
            severityLevel = takeFirst(event.category),
            affected = takeFirst(affected_entity_ids),
            rootCause = takeFirst(root_cause_entity_id),
            dt.davis.is_duplicate = takeFirst(dt.davis.is_duplicate),
            event.id = takeFirst(event.id)}, 
            by:{display_id}
| filter `dt.davis.is_duplicate` == false
| fieldsAdd status = if((status == "ACTIVE"),"OPEN", 
                else:if((status == "CLOSED"), "CLOSED"))
// Summarzation End - Group Probem Events by Problem Type

// Dashboard Fitlers Start

| filter ("ALL" == $Status OR $Status == status)
| filter ("ALL" == $Severity OR $Severity == severityLevel)
// Dashboard Fitlers End


// Second Summarzation Start - Summarize by Problem Type
| summarize Total = count(), by:{Type = event.name}
| sort Total, direction:"descending"
// Second Summarzation Start - Summarize by Problem Type

 

2 REPLIES 2

yuval1983
Organizer

I would also appreciate help here, I don't understand why it doesn't work

fetch events
| filter event.kind == "DAVIS_PROBLEM"
| sort timestamp desc

// First Summarzation Start - Group Problem Events by Problem ID
| summarize {startTime = takeFirst(event.start),
            endTime = takeFirst(event.end),
            problemClosedDuration = takeFirst(resolved_problem_duration),
            status = takeFirst(event.status),
            event.name = takeFirst(event.name),
            severityLevel = takeFirst(event.category),
            affected = takeFirst(affected_entity_ids),
            rootCause = takeFirst(root_cause_entity_id),
            dt.davis.is_duplicate = takeFirst(dt.davis.is_duplicate),
            event.id = takeFirst(event.id)}, 
            by:{display_id}
| filter `dt.davis.is_duplicate` == false
| fieldsAdd status = if((status == "ACTIVE"),"OPEN", 
                else:if((status == "CLOSED"), "CLOSED"))
| fieldsAdd currentTime = toTimestamp(now())-1m                
// First Summarzation Emd - Group Probem Events by Problem ID

// Dashboard Fitlers Start
| filter ("ALL" == $Status OR $Status == status) 
| filter ("ALL" == $Severity OR $Severity == severityLevel)
// Dashboard Fitlers End
              
// Second Summarzation Start - Summarize Closed Problem Count
| summarize Total = count(), by:{timestamp = if(status == "CLOSED",endTime, else:currentTime),status}
//| makeTimeseries Total = Count(), by:{status}
| makeTimeseries ProblemCount = SUM(Total), by:{status}
// Second Summarzation End - Summarize Closed Problem Count

`eventType` isn't allowed here. Please check the autocomplete suggestions before the error for alternative options.

 

fetch events
| filter event.kind == "DAVIS_PROBLEM" // Filter only Davis problems
| expand entity_tags
| filter contains(entity_tags, $Tag) // Filter dynamically based on $Tag
| sort timestamp desc

// First Summarization Start - Group Problem Events by Problem ID
| summarize startTime=takeFirst(event.start),
            endTime=takeFirst(event.end),
            problemClosedDuration=takeFirst(resolved_problem_duration),
            eventType=takeFirst(event.type), // Use `event.type` as a fallback for categorization
            impactedEntities=takeFirst(affected_entity_ids),
            rootCause=takeFirst(root_cause_entity_id),
            isDuplicate=takeFirst(dt.davis.is_duplicate),
            eventID=takeFirst(event.id),
            tags=takeFirst(entity_tags)
| filter isDuplicate == false
| fieldsAdd statusLabel=if(condition=(takeFirst(event.status) == "ACTIVE"), then="OPEN", else="CLOSED") // Derive status dynamically
| fieldsAdd currentTime=toTimestamp(now())-1m
// First Summarization End - Group Problem Events by Problem ID

// Dashboard Filters Start
| expand tags
| filter ("ALL" == $Status OR $Status == statusLabel)
| filter ("ALL" == $EventType OR $EventType == eventType) // Updated to use `event.type` for filtering
| filter contains(tags, $Tag) // Ensure filtering by $Tag works dynamically
// Dashboard Filters End

// Second Summarization Start - Summarize Closed Problem Count
| summarize ClosedProblems=countIf(statusLabel == "CLOSED"),
            OpenProblems=countIf(statusLabel == "OPEN"),
            TotalProblems=count(),
            by eventType
| makeTimeseries ProblemCount=SUM(ClosedProblems),
                 OpenCount=SUM(OpenProblems),
                 TotalIssues=SUM(TotalProblems)
// Second Summarization End - Summarize Closed Problem Count

 

 

Has anyone done something similar?
I’d really appreciate your help!

Featured Posts