Log monitoring: How-to "Inspect 1-minute intervals of log events ingest"

gilles_tabary
Advisor

Hello.

I get the `Ingested log data is trimmed` message in the Dynatrace Managed CMC events. So I went to the related FAQ, which says:

 

6. Inspect 1-minute intervals of log events ingest.
- If you see that log events are trimmed to the Maximum ingest of Log Events limit set for this environment, you need to increase it.
- If log ingest was below the limit in subsequent intervals, your log entries will be re-ingested and should be available later, but you could consider increasing the limit to avoid a delay in data processing.

 

But how do I do that? How do I know which case I am in? In my environment's Log Viewer, am I supposed to find (or not find) something related to "dt.ingest.warnings"? Or is it supposed to be investigated through the "Format Table" field named "trimmed"? In that case, what am I supposed to find or not find?

Regards.

4 REPLIES

Mizső
DynaMight Leader

Hi @gilles_tabary 

You can check and allocate the maximum ingest of log events per minute in the CMC, in the environment settings.

By clicking on refresh cluster limit you get the overall cluster-level limit, which can be split among the environments. In this example the cluster limit is ~168k per minute (based on memory and CPU capacity): Env1 has 140k per minute, Env2 has 20k, Env3 has 8k...

[Screenshot: Mizs_10-1695848067315.png]
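
As a quick sanity check on the split described above, here is a minimal sketch that sums the per-environment limits and compares them to the cluster limit. The numbers are the rounded figures from this example; the function name and the dictionary layout are just illustrative assumptions, not anything Dynatrace provides.

```python
# Rounded figures from the example above (log events per minute).
CLUSTER_LIMIT = 168_000
allocations = {"Env1": 140_000, "Env2": 20_000, "Env3": 8_000}


def remaining_capacity(cluster_limit, per_env):
    """Return the part of the cluster limit not yet allocated to any environment."""
    allocated = sum(per_env.values())
    if allocated > cluster_limit:
        raise ValueError(
            f"allocated {allocated} exceeds cluster limit {cluster_limit}"
        )
    return cluster_limit - allocated


if __name__ == "__main__":
    # With the example figures the cluster limit is fully allocated.
    print(remaining_capacity(CLUSTER_LIMIT, allocations))
```

If the result is 0, as it is with these figures, raising one environment's limit means lowering another's or adding cluster capacity.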

With these two metrics you can monitor and alert on incoming or rejected logs:

dsfm:server.log_and_events_monitoring.events_incoming_count:splitBy():sum:sort(value(sum,descending))

dsfm:server.log_and_events_monitoring.events_rejected_count:splitBy():sum:sort(value(sum,descending))
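
If you prefer to pull these numbers at 1-minute resolution outside the UI, one option may be the Metrics API v2 query endpoint (`/api/v2/metrics/query`). The sketch below only builds the request URL; the base URL and environment ID are hypothetical placeholders, and whether the `dsfm:` self-monitoring metrics are exposed through this endpoint in your deployment is an assumption you should verify.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Selector copied from the post above.
REJECTED_SELECTOR = (
    "dsfm:server.log_and_events_monitoring.events_rejected_count"
    ":splitBy():sum:sort(value(sum,descending))"
)

# Hypothetical Managed environment URL; replace with your own.
BASE_URL = "https://managed.example.com/e/ENVIRONMENT_ID"


def build_query_url(base_url, selector, resolution="1m", time_from="now-1h"):
    """Build a Metrics API v2 query URL for one metric selector."""
    params = {
        "metricSelector": selector,
        "resolution": resolution,  # one data point per minute
        "from": time_from,         # relative timeframe: the last hour
    }
    return f"{base_url.rstrip('/')}/api/v2/metrics/query?{urlencode(params)}"


if __name__ == "__main__":
    # Send this URL with an "Authorization: Api-Token <your token>" header.
    print(build_query_url(BASE_URL, REJECTED_SELECTOR))
```

The same function works for the `events_incoming_count` selector, which gives per-minute values you can compare with the limit set in the CMC.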

I hope it helps.

Best regards,

Mizső

Dynatrace Community RockStar 2024, Certified Dynatrace Professional

gilles_tabary
Advisor

Thanks. I knew about that already. Anything about the actual question?

victor_balbuena
Dynatrace Mentor

You should probably consider increasing the maximum number of log events per minute regardless, as per Mizső's message, but to answer your original question:

The message you see about logs being trimmed doesn't always mean that your logs are actually being trimmed. It is generated by Davis AI, which can overreact to a spike in log ingest and conclude that logs will need to be trimmed even when there is no data loss. So we need to work out whether that is the case or whether data is really being trimmed.

The way to do this is to check, as per the first case in the FAQ, whether your log events are being trimmed to the maximum set for the environment. If they are, you need to increase the maximum to avoid data loss. If logs are not really reaching the maximum set for the environment (or only once or twice in a row), then Davis AI is estimating, because of a spike, that there will be too many logs, even though this is not true. In this last case there will be a delay in log ingestion (which is why increasing the maximum is still advised), but no data will be lost.

Hopefully that helps you understand which one is your case.

gilles_tabary
Advisor

Hello.

Interesting point about Davis. Thanks.

After much chatting, here is my understanding. Say the trimmed warning in the CMC has a timestamp of 10:23:12. In the log viewer, set the time frame to exactly one hour, from 09:23 to 10:23, then look at the graph (i.e. the plot). Don't search or filter for a special kind of log line: the attribute "dt.ingest.warnings" and the table format field named "trimmed" are not to be used; this is not where to look or what to search for. On the graph, check how high the bars reach. Each bar shows the number of log events ingested over a one-minute interval, and this is what can be compared to the "Maximum number of log events per minute" set in the CMC. If the bars consistently (many minutes in a row) reach approximately this maximum (not higher, not lower), there may indeed be a problem. If the bars are higher than this maximum, it shows (as stated by @victor_balbuena) that all events eventually got ingested, but with a delay. If the bars are lower: no problem.
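
To make the rule of thumb above concrete, here is a minimal sketch that applies it to a list of per-minute counts (for example, read off the bars or exported from a metric). It only encodes the understanding stated in this post, which has not been confirmed; the function name, the 2% tolerance for "approximately at the maximum" and the run length of 3 for "many minutes in a row" are arbitrary assumptions.

```python
def classify_ingest(per_minute_counts, limit, tolerance=0.02, min_run=3):
    """Classify one hour of per-minute log event counts against the CMC limit.

    Returns "possibly trimmed" if at least `min_run` consecutive minutes sit
    within `tolerance` of the limit, "delayed" if some minute exceeds the
    limit beyond the tolerance, and "ok" otherwise.
    """
    margin = limit * tolerance
    run = longest_run = 0
    above_limit = False
    for count in per_minute_counts:
        if count > limit + margin:
            above_limit = True
        if abs(count - limit) <= margin:
            run += 1
            longest_run = max(longest_run, run)
        else:
            run = 0
    if longest_run >= min_run:
        return "possibly trimmed"
    if above_limit:
        return "delayed"
    return "ok"


if __name__ == "__main__":
    limit = 20_000  # "Maximum number of log events per minute" in the CMC
    print(classify_ingest([1_000, 20_000, 20_000, 20_000, 4_000], limit))
    print(classify_ingest([1_000, 25_000, 3_000], limit))
    print(classify_ingest([1_000, 5_000, 3_000], limit))
```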

Let me know if this is an acceptable statement (otherwise I'll correct it so as not to confuse readers 🙂).

I feel the FAQ doc could be amended, maybe with annotated screenshots, stating explicitly where exactly to look, what to expect, what not to expect, and what could be considered a confirmation of the problem or a confirmation that there is no problem.

Regards.
