I see this message in Dynatrace->Server->Settings asking me to partition my database. So far I have not done the partitioning. Will there be any adverse effects if I do not do this?
Database partitioning is mandatory for XLarge and Plus environments.
Check the following link for the 'Database Server / Performance Warehouse Best Practices' and the 'Performance Warehouse Administration':
Here's what will likely happen. As the PW gets bigger and bigger in your XLarge environment, the purge job will become slower and slower, to the point that it will run for hours and eventually not finish. When the PW is partitioned, the purge job can be accomplished by dropping a partition (as opposed to huge numbers of logged row deletes). This is really fast.
I ran across this "rule of thumb" in an internal support document:
As partitioning adds complexity to database administration, it should only be used when the aging task runs very long ( > 10 hours).
I like this rule, since sometimes we purposely over-provision the server to make use of more memory and cores, but may not be sure whether the DB really needs to be partitioned.
Using this rule of thumb allows observation of a concrete metric to determine when partitioning is necessary.
If you want to go with this rule of thumb, periodically look under Status Overview/Tasks and Monitors and find the "Performance Warehouse Clean Up Task". On the leaf node, right-click, select "Details", and check the "Duration".
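If you note down the "Duration" values by hand over a few days, the rule of thumb can be sketched as a small check. This is only an illustration: the function name and the sample durations are made up, and nothing here queries AppMon itself. Only the 10-hour threshold comes from the rule quoted above.

```python
from datetime import timedelta

# Threshold from the rule of thumb quoted above (aging task > 10 hours).
AGING_TASK_LIMIT = timedelta(hours=10)


def partitioning_recommended(recent_durations, limit=AGING_TASK_LIMIT):
    """Return True if any recorded clean-up run took longer than the limit.

    recent_durations: durations copied manually from the task's "Details"
    dialog (hypothetical input; this function does not talk to AppMon).
    """
    return any(duration > limit for duration in recent_durations)


# Sample values are invented for illustration only.
durations = [timedelta(hours=2, minutes=30), timedelta(hours=11)]
print(partitioning_recommended(durations))
```

A single long run may be an outlier, so you may prefer to look at several days before deciding.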
We recently switched to XLarge64 while staying on an unpartitioned DB. It massively increased our AppMon performance, but the cleanup task time didn't increase.
I use the "Performance Warehouse - Write" dashboard to monitor the Cleanup Task, and the Measure Health to monitor the number of measures that we manage: it's incredible how easy it is to lose control of them and of BTs.
I have seen this message in Incidents for the last three days:
Daily Performance Warehouse Summary (Events and Exceptions)
Exceeded time for writing measurements - Count: 6
No exceptions since last summary.
Is this related?
That incident means the time spent writing measures is too high, and that the PWH is not able to keep up with the measure writing load it is taking on. Have you looked at the "Performance Warehouse - Write" dashboard? It will show if the time spent is consistently high or just having a few blips of taking too long.
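To make the "consistently high or just a few blips" distinction concrete, here is a rough sketch that classifies a series of write times read off the dashboard. The function name, the limit, and the 50% cut-off are all assumptions for illustration. They are not AppMon settings.

```python
def classify_write_times(write_seconds, limit_seconds, sustained_fraction=0.5):
    """Classify write durations as 'ok', 'occasional blips' or 'consistently high'.

    write_seconds: write durations noted from the dashboard (hypothetical input).
    limit_seconds: duration you consider too long (assumed, pick your own).
    sustained_fraction: share of samples over the limit that counts as
        'consistently high' (assumed default of 0.5).
    """
    over_limit = [s for s in write_seconds if s > limit_seconds]
    if not over_limit:
        return "ok"
    if len(over_limit) / len(write_seconds) >= sustained_fraction:
        return "consistently high"
    return "occasional blips"


# Invented sample: one slow write among mostly fast ones.
print(classify_write_times([5, 6, 70, 4, 5, 6], limit_seconds=60))
```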
Here is the explanation.
This may happen because the dynaTrace server is getting overloaded by the data coming into the server while, at the same time,
the clean-up task in the database is taking too long to age/delete old data. The data backs up on the server while the tables
are locked, and this carries over into normal operations when load is much higher.
To combat this, you can move the repository clean-up task from 2am to 8pm, which gives it many more off-peak hours to complete.
The backed-up data plus the new incoming data fill up the server memory and cause performance issues.
Partitioning reduces the time of the clean-up task in the warehouse because the row-deletion job is removed from the clean-up task:
the data is stored in one partition per day. When it is time to remove aged data, it is as simple as dropping that day's partition
instead of running a long delete statement with timestamps.
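The difference between the two purge styles can be sketched in a few lines. This is not how the Performance Warehouse is actually implemented: SQLite has no native partitioning, so one table per day stands in for a daily partition, and the table names and dates are invented for the example. On a real PWH database the partition drop would be done with that database's own partition DDL.

```python
import sqlite3


def demo():
    conn = sqlite3.connect(":memory:")

    # Unpartitioned layout: one big table, purged with a timestamp predicate.
    conn.execute("CREATE TABLE measurements (day TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO measurements VALUES (?, ?)",
        [("2024-01-01", 1.0), ("2024-01-01", 2.0), ("2024-01-02", 3.0)],
    )
    # Every matching row is deleted individually; on a large table this is
    # the slow part of the clean-up task.
    deleted = conn.execute(
        "DELETE FROM measurements WHERE day < ?", ("2024-01-02",)
    ).rowcount

    # Emulated daily partitions: one table per day (stand-in for real partitions).
    for day in ("20240101", "20240102"):
        conn.execute(f"CREATE TABLE measurements_{day} (value REAL)")
        conn.execute(f"INSERT INTO measurements_{day} VALUES (1.0)")
    # Purging a whole day is a single metadata operation, independent of row count.
    conn.execute("DROP TABLE measurements_20240101")

    remaining = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    conn.close()
    return deleted, remaining


print(demo())
```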