
Log monitoring lost at once on tens of log files

gilles_tabary
Advisor

Hello.

Some of my users reported that, from one day to the next, they lost all log monitoring on tens and tens of previously configured log files. It happened twice last week and twice this week. This is a major issue because each time, the metrics and alerts based on those logs are lost in prod!

 

After much testing and digging in the Configuration Change Audit logs, I found some very strange entries.

For instance, I wanted to test activating the monitoring of one log:
             "value": "/my/wanted/log/path/I/want/to/monitore/logs/dmgr/SystemOut.log",
but the audit log shows that it actually replaces an existing, already monitored path, which is not what I want!

              "oldValue": "/why/the/heck/this/log/path/is/replaced/var/log/tomcat/xyz/xyz-daily.log"

What? Why? Where does that come from?

 

Furthermore, the audit logs show many removals:

              { "op": "remove", "path": "/selections/1"},

Why? What is removed? How does this happen?

 

 

2021-07-26 12:48:31 UTC 
{
"eventType": "UPDATE",
"tenantId": "41...",
"userId": "u1MY_PERSONAL_USER",
"userIdType": "USER_NAME",
"userOrigination": "webui (10.x.x.x)",
"sessionId": "no...",
"identity": "LOG_STORAGE_CONFIG",
"identityCategory": "CONFIG",
"success": true,
"timestamp": 1627303711110,
"message": null,
"jsonPatch": "
[
{
	"op": "replace",
	"path": "/selections/0/selection/paths/0/path",
	"value": "/my/wanted/log/path/I/want/to/monitore/logs/dmgr/SystemOut.log",
	"oldValue": "/why/the/heck/this/log/path/is/replaced//var/log/tomcat/xyz/xyz-daily.log"
},
{
	"op": "replace",
	"path": "/selections/0/selection/pgsOrOses/0/procGroup/longId",
	"value": 301xxx,
	"oldValue": -601xxx
},
{	"op": "remove",	"path": "/selections/0/selection/hosts/0"},
{	"op": "remove",	"path": "/selections/1"},
{	"op": "remove",	"path": "/selections/1"},
{	"op": "remove",	"path": "/selections/1"},
... many many times ...
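A note on reading the repeated `remove` entries above. If the `jsonPatch` field follows standard JSON Patch semantics (RFC 6902), which its `op`/`path`/`value` shape suggests but which is an assumption here, then each `remove` on `/selections/1` deletes a different item: after one removal the remaining items shift down, so the next `remove` at index 1 hits what used to be index 2, and so on. The Python sketch below replays such a patch on a made-up configuration (the `name` field and the four items are hypothetical, not taken from the real config) to show that the list ends up with only its first item.

```python
import copy


def apply_ops(doc, ops):
    """Apply a minimal subset of JSON Patch ('replace' and 'remove') to a copy of doc.

    Simplification: JSON Pointer escapes (~0, ~1) are not handled.
    """
    doc = copy.deepcopy(doc)
    for op in ops:
        # "/selections/1" -> ["selections", "1"]
        parts = op["path"].lstrip("/").split("/")
        target = doc
        for key in parts[:-1]:
            target = target[int(key)] if isinstance(target, list) else target[key]
        last = int(parts[-1]) if isinstance(target, list) else parts[-1]
        if op["op"] == "remove":
            del target[last]
        elif op["op"] == "replace":
            target[last] = op["value"]
        else:
            raise ValueError("unsupported op: " + op["op"])
    return doc


# Hypothetical configuration with four monitored-log selections.
config = {
    "selections": [
        {"name": "log-a"},
        {"name": "log-b"},
        {"name": "log-c"},
        {"name": "log-d"},
    ]
}

# Three identical-looking operations, as in the audit entry.
ops = [{"op": "remove", "path": "/selections/1"}] * 3

result = apply_ops(config, ops)
print(result)  # {'selections': [{'name': 'log-a'}]}
```

Under that reading, a long run of `remove /selections/1` lines means every selection except the first was dropped in a single save.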

 

No one here uses API calls (yet) to configure these logs.

The audit logs show the same kind of event for different users.

This is not DDU licence exhaustion: we still have some left.

Has anyone experienced this already? Any feedback is welcome.

Regards.

 

 

Key phrases for further searches: log concentration lost, log monitoring disabled, my log is not monitored anymore and I don't know why.

6 REPLIES

ct_27
DynaMight Pro

Not sure at all if this is related, but something similar happened early in our DT implementation, when we didn't know the system as well. Under Settings -> Log sources and storage, we used the Host perspective tab to select a bunch of logs to monitor. A few weeks later, someone came in and selected a single log under the 'Process groups perspective'. DT shows a warning, but it doesn't make it clear enough that saving under a different perspective completely clears out all the selections you had made in the "other" perspective.

 

Maybe this is what is happening to you as well. It took us a few days to notice that logs had stopped coming in, then another few days of trial and error (and pulling audit logs) to figure out the root cause.
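For anyone who has to go through the audit logs the same way, here is a rough Python sketch of how the search could be narrowed down. It assumes the audit entries have already been exported as dictionaries with the fields shown in the original post (`eventType`, `identity`, `userId`, `timestamp`, `jsonPatch`), and that `jsonPatch` is a string. The function names and the threshold of 3 are made up for illustration; how the entries are fetched is not covered.

```python
import re

# Matches a 'remove' op on a whole selection, e.g. "/selections/1",
# but not on a nested path such as "/selections/0/selection/hosts/0".
REMOVE_SELECTION = re.compile(
    r'"op"\s*:\s*"remove"\s*,\s*"path"\s*:\s*"/selections/\d+"'
)


def count_selection_removals(entry):
    """Count whole-selection removals in one LOG_STORAGE_CONFIG update entry."""
    if entry.get("identity") != "LOG_STORAGE_CONFIG":
        return 0
    if entry.get("eventType") != "UPDATE":
        return 0
    # A regex is used instead of json.loads because exported patches
    # may contain redacted values that are not valid JSON.
    patch_text = entry.get("jsonPatch") or ""
    return len(REMOVE_SELECTION.findall(patch_text))


def suspicious_entries(entries, threshold=3):
    """Return (timestamp, userId, removals) for entries at or above the threshold."""
    found = []
    for entry in entries:
        removals = count_selection_removals(entry)
        if removals >= threshold:
            found.append((entry.get("timestamp"), entry.get("userId"), removals))
    return found


# Hypothetical sample entries modeled on the audit entry in the original post.
sample = [
    {
        "eventType": "UPDATE",
        "identity": "LOG_STORAGE_CONFIG",
        "userId": "user-a",
        "timestamp": 1627303711110,
        "jsonPatch": '[{"op": "remove", "path": "/selections/0/selection/hosts/0"},'
                     ' {"op": "remove", "path": "/selections/1"},'
                     ' {"op": "remove", "path": "/selections/1"},'
                     ' {"op": "remove", "path": "/selections/1"}]',
    },
    {
        "eventType": "UPDATE",
        "identity": "LOG_STORAGE_CONFIG",
        "userId": "user-b",
        "timestamp": 1627303800000,
        "jsonPatch": '[{"op": "remove", "path": "/selections/4"}]',
    },
]

hits = suspicious_entries(sample)
print(hits)  # [(1627303711110, 'user-a', 3)]
```

An entry with many whole-selection removals in a single save is the kind of event that would match a perspective switch as described above, though a legitimate bulk cleanup would look the same.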

HigherEd

This sounds very much like it could be the cause of our problem. It is so strange. 😉 Thanks.

But then how did you fix this very risky mess? By not using the "Process groups perspective"? By managing the monitored logs only through the API?

Unfortunately there is no way to fix this. It's a ticking time bomb. We've learned our lesson well enough that all DT administrators now know never to change the Log Monitoring perspective.

 

I have asked DT to fix this. We're also waiting on DT to release the new Logging pages, but haven't heard anything on this topic for a long time now.

HigherEd

Is there a link to an RFE I could vote on?

gilles_tabary
Advisor

For the record: support has been informed through ticket 6445.
