We often experience CAS processing delays. When I look at the Processing Status report, two of the charts seem to contradict each other.
From the Delay chart, I see zdata processing had a maximum delay of 1 hour 42 minutes on 01/21.
From the zdata processing chart, I see a maximum processing time of just 5 minutes.
We all know a zdata file is generated every 5 minutes. So what was the actual delay on that date? How can I use these charts to narrow down the cause of the delays?
I believe this happens because your first chart is aggregated to a 1-day resolution.
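To illustrate the effect of the resolution, here is a minimal Python sketch. It does not use any CAS or DMI API; the sample values, the year, and the helper name `max_by_bucket` are all made up for illustration. It only shows that a maximum taken over a whole day reports the single worst 5-minute period and says nothing about how many periods were actually delayed.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical delay samples in minutes, one per 5-minute zdata period.
# The year and the position of the delayed period are invented.
start = datetime(2024, 1, 21, 0, 0)
delays = [0.0] * 288          # 288 five-minute periods in one day
delays[100] = 102.0           # one period delayed by 1 h 42 min

samples = [(start + timedelta(minutes=5 * i), d) for i, d in enumerate(delays)]

def max_by_bucket(samples, key):
    """Group samples by key(timestamp) and keep the maximum per group."""
    buckets = defaultdict(float)  # delays are non-negative, so 0.0 is a safe start
    for ts, value in samples:
        k = key(ts)
        buckets[k] = max(buckets[k], value)
    return dict(buckets)

# 1-day resolution: a single number for the whole day.
daily = max_by_bucket(samples, lambda ts: ts.date())

# 1-period resolution: one number per 5-minute period.
per_period = max_by_bucket(samples, lambda ts: ts)
delayed_periods = [ts for ts, v in per_period.items() if v > 0]

print(daily[start.date()])    # worst delay of the day
print(len(delayed_periods))   # how many periods were delayed at all
```

Switching the real chart to a finer resolution should show in the same way which periods carried the delay.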
When you troubleshoot sample delay remember that:
Also remember that you can edit default diagnostic reports/charts, modify them and save as your own.
To get more insight into the cause, you can explore the DMI and create additional graphs that answer questions such as the number of files, file sizes, or processed records.
And as Adam says, the benchmark tool (don't run it during production hours, and warn the SQL people that it will trigger alerts) will give you a better understanding of whether you need to upgrade your SQL.
In release 12.3.1 it's also very easy to add nodes to a farm/cluster to share the load, so recovering from your current state doesn't have to be complicated.
Hi Adam and Ulf,
Thanks for the responses. They are really helpful.
Are you sure http://<CAS>/DiagConsole#/diag/DUMP+STACK is the correct link? I tried it and got a 404 error.
Regarding the Processing time report, I have three questions:
Ad.0. You should replace <CAS> with your CAS' IP or host name to make it work.
Ad.1. It's the same because, I believe, the resolution on your report is 1 period. Averages are useful only if you aggregate a bigger time range using a resolution other than 1 period.
Ad.2. It may come from the processing of other file types. I can't tell more without server.log.
Ad.3. The AMD serves one "native" zdata file and another from the NFC process. You can count them as one.
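To make the point in Ad.1 concrete, here is a small Python sketch with made-up processing times (not taken from any real report; `summarize` is a hypothetical helper). With a resolution of 1 period each bucket holds a single sample, so the average and the maximum are the same number. They only diverge once several periods are aggregated into one bucket.

```python
from statistics import mean

# Hypothetical processing times in minutes for 12 consecutive
# 5-minute periods (one hour of data). The values are invented.
processing_minutes = [2.0, 3.0, 5.0, 1.0, 4.0, 3.0, 2.0, 5.0, 1.0, 2.0, 3.0, 5.0]

def summarize(values):
    """Return the average and maximum of one aggregation bucket."""
    return {"avg": mean(values), "max": max(values)}

# Resolution of 1 period: every bucket contains exactly one sample,
# so avg and max are identical in each bucket.
one_period = [summarize([v]) for v in processing_minutes]

# Resolution of 1 hour: all 12 samples fall into one bucket,
# and the average now differs from the maximum.
one_hour = summarize(processing_minutes)

print(all(s["avg"] == s["max"] for s in one_period))
print(one_hour)
```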