SSL Certificate Expiry Outages: Microsoft, Ericsson, Spotify, and What They Had in Common

SSL certificate expiry is one of the most preventable causes of major service outages. The expiry date is written into the certificate when it is issued, months in advance. Yet it keeps causing production incidents, because most monitoring setups catch it too late or watch the wrong certificate entirely.

Documented outages

Microsoft Teams — February 2020

An expired authentication certificate took Microsoft Teams offline for millions of users on February 3, 2020, only weeks before the COVID-era surge in remote work. The certificate expired without triggering renewal. Microsoft's post-incident review cited monitoring gaps on internal certificate infrastructure.

Ericsson — December 2018

An expired software certificate caused a major mobile network outage across multiple operators globally. Ericsson confirmed the root cause was an expired certificate in the software running on SGSN-MME nodes. The outage reportedly affected operators in 11 countries, including O2 in the UK and SoftBank in Japan.

Spotify — 2020

Spotify experienced an outage traced to an expired TLS certificate on an internal service. The certificate was on an infrastructure component that wasn't covered by standard monitoring because it wasn't customer-facing.

Let's Encrypt root certificate — September 2021

The DST Root CA X3 certificate, which had cross-signed Let's Encrypt's ISRG Root X1, expired on September 30, 2021. That broke HTTPS for older clients (for example, software built on OpenSSL 1.0.2) and for systems that hadn't updated their trust stores. Affected services included Stripe, AWS API endpoints, and numerous banking apps.

The pattern behind every SSL outage

These outages share a structure:

Certificate monitored at 30 days — or not monitored at all
Renewal pipeline silently broken for weeks or months
By the time the alert fires, there's no runway to debug the renewal failure
Or: the monitored certificate is the CDN/load balancer cert, not the origin cert

The CDN trap

Sites behind Cloudflare, Fastly, or AWS CloudFront have two certificates: the CDN edge certificate (what users see) and the origin certificate (between the CDN and the server). Most monitoring checks the CDN cert. The origin cert expires silently behind the edge.

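# Check what users see (CDN cert). </dev/null closes stdin so s_client exits instead of waiting for input:
openssl s_client -connect yourdomain.com:443 -servername yourdomain.com </dev/null 2>/dev/null | openssl x509 -noout -dates

# Check origin cert directly (bypass CDN). -servername sends the SNI name the origin expects:
openssl s_client -connect YOUR_SERVER_IP:443 -servername yourdomain.com </dev/null 2>/dev/null | openssl x509 -noout -dates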
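
To compare the edge and origin certificates in a monitor, it helps to turn the `notAfter` date into a number of days remaining. The following is a minimal sketch, assuming bash, OpenSSL, and GNU `date` (for the `-d` option). The helper names `cert_enddate` and `days_left` are hypothetical, and `yourdomain.com` and `YOUR_SERVER_IP` are the same placeholders used in the commands above.

```shell
#!/usr/bin/env bash
# Sketch: convert a certificate's notAfter date into days remaining.
# Assumes GNU date (-d) and OpenSSL; helper names are illustrative.

# Print the notAfter date of the cert served at ADDR for SNI name HOST.
# ADDR defaults to HOST (the CDN edge); pass the origin IP to bypass the CDN.
cert_enddate() {
  local host="$1" addr="${2:-$1}"
  openssl s_client -connect "${addr}:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -enddate \
    | cut -d= -f2
}

# Whole days (truncated) between NOW and END_DATE.
# NOW is in epoch seconds and defaults to the current time.
days_left() {
  local end_date="$1" now="${2:-$(date +%s)}"
  local end_epoch
  end_epoch=$(date -u -d "$end_date" +%s) || return 1
  echo $(( (end_epoch - now) / 86400 ))
}

# Usage (placeholders, needs network access):
#   days_left "$(cert_enddate yourdomain.com)"                  # edge cert
#   days_left "$(cert_enddate yourdomain.com YOUR_SERVER_IP)"   # origin cert
```

If the two numbers differ widely, the edge and origin certificates are on separate renewal schedules and both need their own alert.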

The 200-day rule

Let's Encrypt certificates expire in 90 days. If renewal breaks on day 1, a 30-day alert first fires on day 60 (90 minus 30), leaving you 30 days to debug a failure that has already been silent for 60. Monitor at 200 days instead. A 90-day certificate always sits under that threshold, so record it as an intentional short-validity exception; anything else expiring in under 200 days likely has a broken renewal pipeline.
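
The arithmetic generalizes: with a validity of V days and an alert threshold of T days, a renewal that breaks on day 1 stays silent for V - T days. The sketch below expresses that, plus the two-way split this section describes, in bash. The function names `silent_days` and `classify` are hypothetical, and treating any certificate with a total validity of 200 days or less as "short-validity" is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the 200-day rule. Function names are illustrative.

THRESHOLD_DAYS=200

# Days a broken renewal goes unnoticed: the alert first fires when the
# days remaining drop to the threshold, i.e. on day (validity - threshold).
silent_days() {
  local validity="$1" threshold="$2"
  local d=$(( validity - threshold ))
  if (( d < 0 )); then
    d=0   # threshold longer than validity: the alert fires immediately
  fi
  echo "$d"
}

# Classify a cert from its days remaining and its total validity period.
classify() {
  local remaining="$1" validity="$2"
  if (( remaining >= THRESHOLD_DAYS )); then
    echo "ok"
  elif (( validity <= THRESHOLD_DAYS )); then
    echo "short-validity"   # e.g. 90-day certs: expected, track separately
  else
    echo "investigate"      # per the rule: possible broken renewal pipeline
  fi
}

silent_days 90 30     # prints 60
classify 45 90        # prints short-validity
classify 150 365      # prints investigate
```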

Weekly SSL monitoring cron job

0 9 * * 1 flock -n /tmp/ssl-check.lock /usr/local/bin/check-ssl.sh
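
The entry runs every Monday at 09:00, and `flock -n` skips a run if the previous one still holds the lock. The contents of `check-ssl.sh` are not shown in this article. One possible sketch follows, assuming bash and OpenSSL; the domain list path `/etc/ssl-check/domains.txt` (one hostname per line), the `WARN_DAYS` variable, and the output format are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check-ssl.sh: warn about certs expiring within WARN_DAYS.
# The domain list path and the output format are assumptions.

WARN_DAYS="${WARN_DAYS:-200}"
DOMAINS_FILE="${DOMAINS_FILE:-/etc/ssl-check/domains.txt}"

# Read a PEM certificate on stdin; succeed if it will still be valid
# WARN_DAYS from now. openssl's -checkend option takes seconds.
pem_outlives_threshold() {
  openssl x509 -noout -checkend "$(( WARN_DAYS * 86400 ))" >/dev/null
}

# Fetch the leaf certificate that HOST serves on port 443, as PEM.
fetch_pem() {
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null \
    | openssl x509 2>/dev/null
}

# Read hostnames from stdin, one per line; return non-zero on any finding.
check_domains() {
  local host pem status=0
  while read -r host; do
    [ -z "$host" ] && continue
    pem=$(fetch_pem "$host")
    if [ -z "$pem" ]; then
      echo "ERROR $host: could not fetch certificate"; status=1
    elif ! pem_outlives_threshold <<<"$pem"; then
      echo "WARN  $host: expires within $WARN_DAYS days"; status=1
    fi
  done
  return "$status"
}

# Only run when the domain list exists, so sourcing the file is harmless.
if [ -r "$DOMAINS_FILE" ]; then
  check_domains < "$DOMAINS_FILE"
fi
```

Because cron mails any output by default when mail delivery is configured, printing only on findings keeps healthy weeks quiet. To cover the CDN trap, the origin address would need to be checked separately with `-connect` pointed at the server IP.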

