Has anyone written any workflow in ansible (Tower or AWX) that is triggered from Dynatrace, where the problem is reported and a context search is performed, in order to execute the remediation task?
And im not speaking of the classic examples of comments or rollback.
Im trying to build it for Process restart for java apps that go rogue and just crash the box if nobody is watching. Already write it for a single Host. But in a multitude of host with multitude of different process with same name and different host and uses im just scraching my head of how to actually do it.
Im starting to read about Ansible Workflow, but the fact is that after the problem is reporded i need to gather the information (asuming calling the Topology API) and get some context to know what to actually do.
Did anyone had this scene and managed to actually create a self-healing enviroment?
We had started an integration with Ansible where we built a playbook that defined the processes needed to reboot a process when it had crashed. Unfortunately we never finished that play book as other issues had come up and it was side lined.