13 Jul 2020 01:08 PM - last edited on 17 May 2021 10:41 AM by MaciejNeumann
Hello:
We recently upgraded from a single-node cluster to a 3-node HA cluster:
(Node ID 1 Original) (Node ID 6 New) (Node ID 7 New)
First, when a problem is detected, how do the other nodes know not to raise an alert on the same issue?
Is there a way to determine from the GUI which node detected a problem?
Is the information copied to all 3 nodes? If so, which log is used?
Secondly, we have a lot of notifications that use external API web hooks.
I see information about the notification attempts in the audit.notifications.0.0.log file.
How is it determined which node will send out the alert to the source?
Is the notification attempt kept only on that single node sending the alert, or is it copied somewhere onto all 3 servers?
I want to thank the community for taking time to look at my post.
Kindly,
Chris
13 Jul 2020 01:32 PM
Cluster nodes continuously exchange data with each other, so each node is aware of what the others have seen and executed. Cluster nodes also use distributed data storage engines (Cassandra and Elasticsearch), so data is replicated and available to all nodes.
There's no need to identify which node detected the problem, as this is transparent to the user. If one node goes down, another takes over the responsibility of correlating events to raise a problem and trigger notifications. This is internal business logic embedded in the cluster node.
Notification attempts are stored in Cassandra and replicated.
Do you have any additional questions?
13 Jul 2020 01:42 PM
I do understand the transparency to users. However, if an alert is supposed to go out and does not, I need to know the best way to trace and validate the attempt.
I was working with support, and they suggested looking at audit.notifications.0.0.log on each cluster member.
When I did this, I saw messages related to attempts for specific problems. However, I did not see the same messages replicated across the 3 logs.
That being said, how would I determine which server to search without having to look through all 3 logs?
Would it be necessary to query the distributed databases from outside the application itself?
Is this supported or a common practice?
Thank you again for your input. 🙂
13 Jul 2020 01:50 PM
If you really need to, you have to browse the logs on all nodes, or use a log processing tool such as Dynatrace, Splunk, Sumo Logic, or another.
However, the question still remains: why would you need to bother with that?
13 Jul 2020 02:04 PM
I would need to bother with this because we have had situations where alerts show up in the problem window but don't reach their destination.
For instance, in a log I can see a web-hook attempt and whether it succeeded or failed. Here is an example from one log.
So the ability to trace it from a centralized query or location would be more efficient than going through 3 or more logs in an HA cluster environment (1 per node).
13 Jul 2020 02:32 PM
I understand. Currently, there's no better way. Feel free to post a product idea! 🙂
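Until something like that exists, the manual search across nodes can at least be scripted. The following is a minimal sketch, not a supported Dynatrace tool. It assumes that each node's audit.notifications.0.0.log has already been copied to a local directory, and that a plain substring match (for example on a problem identifier) is enough to find the relevant lines. The directory layout, the node labels, and the identifier `P-12345` are hypothetical, and no particular log line format is assumed.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def find_notification_lines(
    node_logs: Dict[str, Iterable[str]], needle: str
) -> Dict[str, List[str]]:
    """Return, per node, the log lines that contain `needle`.

    Nodes without any matching line are left out of the result, so the
    keys of the returned dict tell you which node(s) logged the attempt.
    """
    hits: Dict[str, List[str]] = {}
    for node, lines in node_logs.items():
        matched = [line.rstrip("\n") for line in lines if needle in line]
        if matched:
            hits[node] = matched
    return hits


def load_node_logs(paths: Dict[str, Path]) -> Dict[str, List[str]]:
    """Read each node's local log copy; a missing file yields an empty list."""
    logs: Dict[str, List[str]] = {}
    for node, path in paths.items():
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            logs[node] = text.splitlines()
        else:
            logs[node] = []
    return logs


if __name__ == "__main__":
    # Hypothetical local copies of the log, one per cluster node.
    paths = {
        "node-1": Path("logs/node-1/audit.notifications.0.0.log"),
        "node-6": Path("logs/node-6/audit.notifications.0.0.log"),
        "node-7": Path("logs/node-7/audit.notifications.0.0.log"),
    }
    # Hypothetical search string; use whatever identifies the problem in your logs.
    hits = find_notification_lines(load_node_logs(paths), "P-12345")
    if not hits:
        print("No matching entries found on any node.")
    for node, lines in hits.items():
        print(f"{node}: {len(lines)} matching line(s)")
        for line in lines:
            print(f"  {line}")
```

Since the notification attempt appears to be logged only on the node that sent it, the node names in the output show where to look further. A log processing tool remains the better option if this becomes a regular task.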