The Business Need for Certificate Monitoring and Management
PKI certificates make the world go around, but they come with a serious flaw: they expire, and—as they do that—cause business disruption and lead to loss of customer confidence. Countless hours have been spent installing, monitoring, and rotating certificates to keep the Internet running.
This problem of certificates expiring in production and causing downtime has been plaguing the IT world since day one. So, it’s a little surprising that we’re still talking about it, nearly three decades later. In recent years there’s also been a strong push to reduce certificate lifetimes for better security, requiring more frequent rotation. More work? Fortunately, in parallel, we’ve been improving our automation capabilities, but we still have a way to go until that’s ubiquitous.
Traditional Monitoring Approaches Fall Short
To understand why certificate expiration is still a problem, perhaps we need to look at the traditional monitoring tools. A newcomer coming to this field might find three types of solutions. From simple to complex, they are:
- Homegrown scripts; many organizations start like this. After their first or second certificate expires in production, they write a quick script to continuously fetch and check the certificates installed on their websites.
- Observability platforms; given that all companies need observability, it’s only natural to embed certificate monitoring as a feature of those platforms.
- Certificate Lifecycle Management (CLM) products; on the high end of the spectrum we have CLM platforms, which are designed to control and automate the entire certificate lifecycle, including renewal. In this case, the thinking is that comprehensive automation will avoid the problem of missed renewals.
There are different challenges associated with each of these approaches. Homegrown solutions are quick to whip up, but severely underestimate the complexity of PKI and the amount of time it takes to develop a robust solution. The situation is better with observability platforms, but their generic approach to monitoring still leaves a lot to be desired in the PKI space. CLM products don’t have that problem, but they’re notoriously difficult to deploy widely enough within an organization to completely solve the problem.
More broadly, a comprehensive solution to the problem requires looking at the problem from a fresh perspective. We’ve found that there are 4 necessary factors:
- Expertise; on the surface, detecting expiring certificates seems easy, but it really isn’t. Scratch the surface and you will discover a great deal of complexity that may require years of knowledge and experience to untangle. There are numerous edge cases to deal with. For example, certificates nominally expire at the designated time, but in practice, there have been many early revocations, for real problems or imposed by major user agents for CAs’ transgressions. Further, even though we usually talk about certificate expiration, we really want our certificates to work, which means that we need to monitor for other types of failure. Examples include certificate chain misconfiguration, Certificate Transparency compatibility, TLS misconfiguration, and so on.
- Coverage; once you’re equipped with sufficient knowledge, the next question is all about being able to observe all locations where certificates are deployed. And this is where even the best traditional solutions fail, because they all require manual configuration. For these tools to be effective, you have to tell them where your certificates are. In modern enterprises, no one knows. Tracking certificates in spreadsheets doesn’t work in today’s world where infrastructure changes every second of the day.
- Third-party monitoring; this aspect is really about coverage, but from a different perspective. These days, it’s very rare to have a company in full control over their infrastructure. On the contrary, even the smallest companies work with dozens of vendors and their certificates.
Traditional approaches fail because they haven’t been designed to fully solve the problem.
Certificate Transparency to the Rescue
Certificate Transparency (CT) is a relatively new addition to the PKI ecosystem. Deployed in 2018 after Chrome made it mandatory, CT was designed to support auditing of the behaviour of various Certification Authorities (CAs) issuing certificates. Before CT, we had no visibility into CAs’ issuing practices. This led to a variety of problems with CAs getting hacked and the attackers issuing certificates for some of the most popular websites in the world. The most infamous such incident was the full compromise and exploitation of a Dutch CA called DigiNotar in 2011.
With CT we get a reliable public record of every website certificate issued anywhere in the world. High-profile companies can monitor CT to ensure that all certificates issued for their properties are legitimate.
CT is also invaluable in the context of certificate monitoring because, if we monitor CT, we discover the domains and subdomains embedded in them, and, with that, we are able to discover the majority of deployed locations. As a result, we get much better coverage. And, even more important, we remove the need to manually configure anything.
A tool that consumes the public certificate stream from CT will be able to continuously look for misissued certificates as well as update the company asset inventory to contribute to comprehensive monitoring of correct deployment. Although technically possible, for most organizations it’s not feasible to monitor CT directly due to the large volume of data. To find their own certificates it’s necessary to consume the entire certificate stream in real time, which amounts to hundreds of events per second.
Certificate Monitoring with Red Sift Certificates
At Red Sift, our approach was to build a dedicated certificate monitoring solution from the ground up, aimed at solving this problem in the best possible way. Our key design decision was to combine PKI expertise with an independent auditor perspective. This enables fast and easy deployments in parallel with any existing tools, such as more complex certificate lifecycle management platforms. Our focus only on monitoring means we don’t have competing priorities and also can’t break anything.
The next challenge to solve is the discovery of all certificates and all installed locations. To help with this, we’ve been monitoring Certificate Transparency since 2017, processing many billions of certificates and building our discovery data stores. In essence, we already know how to find all of our customers’ certificates, before we make even a single monitoring network request.
That said, CT is only one of the discovery starting points. We maintain multiple discovery databases focusing on all key aspects of the global infrastructure. In addition, we connect directly to CAs as well, ensuring we collect all the information our customers want to have in one place.
On the monitoring side, we take advantage of the breadth of our monitoring capabilities, where we inspect a wide range of network and security standards. That helps us identify the connections among services and reliance on third-party services. We go after all identified points of failure. After all, if your vendor’s certificate breaks, your website will go down, so we monitor all third-party certificates just to be sure.
Learn more about Red Sift Certificates here.