There are several ways to validate SSL certificates and alert on expiry. I've taken a look at all of them and found automation lacking. I therefore created another one that hopefully overcomes some of the limitations and is easier to use in large environments.
Since this feature has not been available out of the box for a long time, this might be useful.
Summarizing the various attempts and threads from:
SSL Certification expiration checks out of the box - Details? (@Larry R.)
Does Dynatrace monitor SSL certificate validation (@Akshay S.)
Monitor SSL certificate expiry and generate alert (@Dario C.)
(also the contributors @Július L., @Leon Van Z.)
What is different in this plugin?
Where to find it?
You can find the plugin at my personal github repository.
First of all, let me congratulate you for this extension!
I have just installed it and am getting the following error when configuring it for the first time:
Error(No module named '_cffi_backend')
I have checked the code, and cffi_backend doesn't seem to be included... Any ideas?
I should have read this before:
I'm using it with a 1.229.168 ActiveGate. I will try it on a newer one later today...
Hi @AntonioSousa ,
maybe I should consider still building the newest version for older AGs while they are around. Alternatively, you could use one of the older releases, which are built for Python 3.6 (they lack only the minor config-via-tag feature).
Thanks @r_weber ,
I considered switching to an older version but didn't even try: given the way Dynatrace implements versioning in ActiveGate extensions, I believe it wouldn't work after a newer version has already been installed.
I'm going to compile it locally in my 3.6 dev environment.
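For anyone hitting the same thing: a missing '_cffi_backend' is often a sign that the compiled modules bundled with the extension were built for a different Python version than the one the ActiveGate runs. Below is a minimal sketch, not part of the extension, that compares the file names of compiled modules against the suffixes the running interpreter accepts. The function name and the example file names are made up for illustration.

```python
import importlib.machinery


def unloadable_extensions(filenames, suffixes=None):
    """Return compiled-module file names the interpreter would not pick up.

    `suffixes` defaults to the suffixes of the interpreter running this code.
    """
    if suffixes is None:
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    bad = []
    for filename in filenames:
        if not filename.endswith((".so", ".pyd")):
            continue  # not a compiled extension module
        # The importer looks for "<module name>" + one of the known suffixes,
        # so everything after the first dot has to match a suffix exactly.
        _, dot, rest = filename.partition(".")
        if dot + rest not in suffixes:
            bad.append(filename)
    return bad


# Hypothetical directory listing of an extension built for Python 3.6.
example_files = [
    "_cffi_backend.cpython-36m-x86_64-linux-gnu.so",
    "certcheck.py",
]
print(unloadable_extensions(example_files))
```

To be meaningful, this has to run with the same Python version the ActiveGate uses for its extensions, against the real contents of the extension directory.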
When the extension checks for already open problems, how is it doing that? I ask because in my initial testing a few weeks ago I had it check a synthetic, and the problem was created with no issues. I manually closed that problem because I was still vetting this extension before rolling it out to our customers and simply trying to understand it better (the synthetic I used was not mine, so I didn't yet want the app team to see the problem). Now that same synthetic will not open a new problem even though the cert is going to expire in 12 days and the check threshold is 15 days. I pulled the lines below from the AG logs and see an error while trying to get the synthetic hosts, so maybe that is the issue? The 14082 minutes lines up with when the original problem was created, back on 1/31. I replaced the URL with xxxx.
Ideally, yes, we should not be closing problems manually, especially when the problem is not actually resolved 🙂 I was simply testing and now don't know how to get back to a state where a problem will be created, aside from removing and reinstalling the extension.
2022-02-10 13:55:01.758 UTC INFO [Python][SSL_Certificate_Check][ThreadPoolExecutor-0_1] - [getMonitorsWithOpenEvents] A problem for SS9 D01 is already open for 14082 minutes
2022-02-10 13:55:01.993 UTC ERROR [Python][SSL_Certificate_Check][ThreadPoolExecutor-0_1] - [getSSLCheckHosts] Error while trying to get synthetic hosts
2022-02-10 13:55:01.994 UTC INFO [Python][SSL_Certificate_Check][ThreadPoolExecutor-0_1] - [query] Refreshing open problems for: ['xxxxxx']
2022-02-10 13:55:02.062 UTC INFO [Python][SSL_Certificate_Check][ThreadPoolExecutor-0_1] - [getCertExpiry] Certificate for xxxxxxx (MonitorID: SYNTHETIC_TEST-356B7E801199A8CC) expires in 12 days: ERROR
That is interesting. It actually checks the problem-opening error events. Those should also clear when a problem is closed manually.
Can you try setting the threshold to 10, so that the check turns OK on the next run, and then set it back to 15?
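To illustrate why lowering the threshold flips the check, here is a minimal sketch of a threshold comparison. It is not the extension's code: the function names, the made-up notAfter date, and the choice to treat "exactly on the threshold" as ERROR are all assumptions.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` and a certificate's notAfter string."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def check_status(days_left: int, threshold_days: int) -> str:
    """ERROR once the remaining days drop to the threshold or below (assumed)."""
    return "ERROR" if days_left <= threshold_days else "OK"


# Made-up expiry date, chosen so that 12 days remain at the time of the log above.
now = datetime(2022, 2, 10, 13, 55, tzinfo=timezone.utc)
left = days_until_expiry("Feb 22 14:00:00 2022 GMT", now)
print(left, check_status(left, 15), check_status(left, 10))
```

With 12 days left, a threshold of 15 reports ERROR and a threshold of 10 reports OK, which is the state change the suggestion relies on.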
That did work: setting the cert expiry threshold to a lower number, one where the extension sees that the cert's remaining validity is above the threshold and subsequently marks the event as closed. Any clue why this would be needed? We plan to use 1 endpoint, and anyone who wants to use this check will add the tag to their synthetic. Unfortunately, there will be a time when someone closes the problem manually even though the issue is not actually addressed (not regularly, I hope, but it will happen). I would then have to lower the threshold to 'force' the extension to mark the event as closed, or change it to something like 1 day and, after the clearing, change it back to our standard. Ideally I would love to not have to do this 🙂
Also, in my testing, when I close the ticket manually I see the time value for the event being updated to the time when the problem was closed. As an example, I had a problem get created at 18:15 today and I manually closed it at 18:29. While the problem was open, the time value for the event was accurate (it showed 18:15), but after I closed the problem it shows 'today, 18:29-now'. It shows 'now' because the extension thinks the event is still open. This update of the event's start time does not occur with problems that close automatically; I've only seen it when the problem is closed manually.
Hi @travis_anderson ,
interesting observation. It seems that manually closing the problem doesn’t “clear” the problem-opening event, so from the plugin's perspective the opening event is just refreshed, but Dynatrace doesn’t interpret this as another problem-opening event.
I might need to add checks for this special case of manual close... or state in the problem description that a manual close means ignoring the certificate expiry forever 😉
Hi @travis_anderson ,
The new version (1.18) will now reopen problems if the error condition is still met after a problem has been closed manually. (Please note the changed API token permission requirement.)
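For readers following along, the decision described above can be sketched roughly like this. It is a hypothetical illustration of the behaviour, not the code from version 1.18; the function name, parameters, and action labels are invented.

```python
def next_action(condition_met: bool, event_open: bool, problem_open: bool) -> str:
    """Decide what to do for one monitor on one run (hypothetical labels)."""
    if not condition_met:
        # Certificate is fine again: clear a lingering event, otherwise do nothing.
        return "close_event" if event_open else "nothing"
    if problem_open:
        # Problem already exists: keep its opening event alive.
        return "refresh_event"
    # Error condition met but no open problem. This covers the first detection
    # and the manual-close case, where the event lingers but the problem is gone.
    return "open_problem"


# Manual close: the event is still open, the problem is not.
print(next_action(condition_met=True, event_open=True, problem_open=False))
```

The point of the sketch is only that checking the open events is not enough; whether a problem is actually open has to be part of the decision as well.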
Hope that helps! Enjoy!
Since I migrated to AG 1.239, the extension has stopped working. I'm getting the errors below in the gateway logs. I tried some tricks recompiling the extension, but with no luck...
2022-06-08 10:39:00.141 UTC [000045ac] severe [native] 139751677454080(ThreadPoolExecutor-0_0) - [set_full_status] No module named '_cffi_backend'
Traceback (most recent call last):
  File "/var/lib/dynatrace/remotepluginmodule/agent/runtime/engine_unzipped/ruxit/plugin_state_machine.py", line 336, in _execute_next_task
    self._query_plugin()
  File "/var/lib/dynatrace/remotepluginmodule/agent/runtime/engine_unzipped/ruxit/plugin_state_machine.py", line 663, in _query_plugin
    self._plugin_run_data = self._create_plugin_run_data()
  File "/var/lib/dynatrace/remotepluginmodule/agent/runtime/engine_unzipped/ruxit/plugin_state_machine.py", line 636, in _create_plugin_run_data
    plugin_module = importlib.import_module(self.metadata["source"]["package"])
  File "/opt/dynatrace/remotepluginmodule/agent/plugin/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/opt/dynatrace/remotepluginmodule/plugin_deployment/custom.remote.python.certcheck/certcheck.py", line 28, in <module>
    from OpenSSL import SSL
  File "/opt/dynatrace/remotepluginmodule/plugin_deployment/custom.remote.python.certcheck/OpenSSL/__init__.py", line 8, in <module>
    from OpenSSL import crypto, SSL
  File "/opt/dynatrace/remotepluginmodule/plugin_deployment/custom.remote.python.certcheck/OpenSSL/crypto.py", line 11, in <module>
    from OpenSSL._util import (
  File "/opt/dynatrace/remotepluginmodule/plugin_deployment/custom.remote.python.certcheck/OpenSSL/_util.py", line 5, in <module>
    from cryptography.hazmat.bindings.openssl.binding import Binding
  File "/opt/dynatrace/remotepluginmodule/plugin_deployment/custom.remote.python.certcheck/cryptography/hazmat/bindings/openssl/binding.py", line 14, in <module>
    from cryptography.hazmat.bindings._openssl import ffi, lib
ModuleNotFoundError: No module named '_cffi_backend'
2022-06-08 10:39:01.058 UTC [000044b7] info [native] 139751206131456(MainThread) - [one_plugin_loop_step] plugin <RemotePluginEngine, meta_name:custom.remote.python.certcheck id:0x7f1a39a37580> threw exception No module named '_cffi_backend'
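One way to narrow down an import chain like this is to try each module on its own and see which one fails first. The sketch below is a generic diagnostic, not something shipped with the extension; the function name is made up, and it only tells you something useful when run with the same Python interpreter and module search path that the ActiveGate uses for the extension.

```python
import importlib


def probe_imports(module_names):
    """Try each import separately and return {name: error message or None}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:  # ModuleNotFoundError is a subclass
            results[name] = str(exc)
    return results


# Module names taken from the traceback, innermost dependency first.
for name, error in probe_imports(["_cffi_backend", "cryptography", "OpenSSL"]).items():
    print(name, "->", error or "ok")
```

If '_cffi_backend' already fails on its own, the problem lies with the compiled cffi module rather than with pyOpenSSL or cryptography.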
Thanks! I currently have two separate environments, installed independently, giving the following error:
Error(cannot import name 'x509' from 'cryptography.hazmat.bindings._rust' (unknown location))
Trying to figure it out 😉
Thanks for confirming. I still haven't found a solution. Attaching more logs, which might make it easier for @r_weber or others...
2022-06-07 15:06:54.488 UTC severe [native] 4912(ThreadPoolExecutor-0_0) - [set_full_status] cannot import name 'x509' from 'cryptography.hazmat.bindings._rust' (unknown location)
Traceback (most recent call last):
  File "C:\ProgramData/dynatrace/remotepluginmodule/agent/runtime/engine_unzipped\ruxit\plugin_state_machine.py", line 336, in _execute_next_task
    self._query_plugin()
  File "C:\ProgramData/dynatrace/remotepluginmodule/agent/runtime/engine_unzipped\ruxit\plugin_state_machine.py", line 663, in _query_plugin
    self._plugin_run_data = self._create_plugin_run_data()
  File "C:\ProgramData/dynatrace/remotepluginmodule/agent/runtime/engine_unzipped\ruxit\plugin_state_machine.py", line 636, in _create_plugin_run_data
    plugin_module = importlib.import_module(self.metadata["source"]["package"])
  File "C:\Program Files/dynatrace/remotepluginmodule/agent/plugin/python3.8\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "C:\Program Files/dynatrace/remotepluginmodule/plugin_deployment/custom.remote.python.certcheck\certcheck.py", line 28, in <module>
    from OpenSSL import SSL
  File "C:\Program Files/dynatrace/remotepluginmodule/plugin_deployment/custom.remote.python.certcheck\OpenSSL\__init__.py", line 8, in <module>
    from OpenSSL import crypto, SSL
  File "C:\Program Files/dynatrace/remotepluginmodule/plugin_deployment/custom.remote.python.certcheck\OpenSSL\crypto.py", line 8, in <module>
    from cryptography import utils, x509
  File "C:\Program Files/dynatrace/remotepluginmodule/plugin_deployment/custom.remote.python.certcheck\cryptography\x509\__init__.py", line 6, in <module>
    from cryptography.x509 import certificate_transparency
  File "C:\Program Files/dynatrace/remotepluginmodule/plugin_deployment/custom.remote.python.certcheck\cryptography\x509\certificate_transparency.py", line 10, in <module>
    from cryptography.hazmat.bindings._rust import x509 as rust_x509
ImportError: cannot import name 'x509' from 'cryptography.hazmat.bindings._rust' (unknown location)
@AntonioSousa I see someone else already logged an issue for this:
I tried to get what is needed installed on my Windows host, but alas it is still not working.
I have also tried some more tricks, but have not managed to solve the issue.
One thing that does surprise me is the involvement of "certificate_transparency". It shouldn't be needed, and maybe going to an older version of openssl might help. Maybe @r_weber might help diagnosing this?