On our server, someone previously configured an alert for Apache Working threads by creating a rate for Busy Threads / Max Threads. But we have not seen any alerts in the last couple of months, whereas the application team says they have seen issues with the number of threads on the web server. So I would like to know how it is configured and how it works if we want to trigger the alert at 1000/1200 threads.
I would chart the two metrics and make sure you're getting values that you expect.
And do those values produce a ratio that makes sense for the Upper Severe threshold you've set of 90%?
BTW: 1000/1200 is 83%, not 90%. Is that possibly why you've not seen any incidents triggered?
Chetan, I would plot the individual measures and see if you're getting values for them, and whether those values make sense. Then work out whether the ratio calculation (busy/max) would give you a value between 0 and 1. If it doesn't, then something's wrong with the measures.
If you conclude you should get a value between 0-1, then I would focus on the incident. Perhaps change the aggregation, or Evaluation TimeFrame and see if that results in any changes.
One thing that puzzles me: you state you want to trigger when your ratio is 1000/1200, but your Upper Severe threshold is set to 90%. I assume 1000/1200 are the Busy/Max values. Therefore, shouldn't your threshold be 83%?
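To make the arithmetic concrete, here is a minimal sketch of the ratio check in Python. The function names are hypothetical, and it assumes the incident fires when busy/max is at or above the threshold; the product's actual comparison and aggregation may differ.

```python
def busy_ratio(busy: int, max_threads: int) -> float:
    """Return busy/max as a fraction (0 to 1 when busy <= max)."""
    if max_threads <= 0:
        raise ValueError("max_threads must be positive")
    return busy / max_threads


def exceeds(busy: int, max_threads: int, threshold: float) -> bool:
    """Assumed rule: trigger when the ratio is at or above the threshold."""
    return busy_ratio(busy, max_threads) >= threshold


ratio = busy_ratio(1000, 1200)
print(f"{ratio:.1%}")              # 83.3%
print(exceeds(1000, 1200, 0.90))   # False: a 90% threshold is not reached
print(exceeds(1000, 1200, 0.83))   # True: an 83% threshold is reached
```

So with 1000 busy threads out of 1200, a 90% threshold would stay quiet even if everything else were working.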
Thanks for your reply Joe!
Even if I reduce the threshold to 1%, it never triggers an alert. I think something is wrong with the measures. When I charted the Busy threads and Max threads measures, they both show the same numbers for count with the aggregation set to average. Maybe I am looking at the wrong aggregation for these measures.
Also, when I check the thread counts on the default Monitoring dashboard for one of the hosts, the values are different from the counts for the measures above.
I am attaching screenshots of the charts for the Busy and Max threads measures and of the Monitoring dashboard thread counts.
Please let me know if I am missing something in the measure configuration.
I believe that rate measures are calculated on the client side and are therefore only available for charting. This would explain why incidents never fire, since incidents are managed on the server side. This could be a good RFE if there isn't one already.