System down due to Unknown Expired Cert Issue

Greetings,

I am running version 1.6.15, and I restarted my controller this morning, and things went badly. None of my routers can connect to the controller, as they show this message:

Jun 30 18:04:55 storziti01.ops.gq1.comanyname.com ziti[1670998]: [65095.469] ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {error=[error connecting ctrl (tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2026-06-30T18:04:55Z is after 2026-06-28T20:01:55Z)] endpoint=[tls::5443]} unable to connect controller

When I look at the certs referenced in my router config, they look ok. I get the following:

[root@storziti01 ziti]# openssl x509 -noout -dates -in /opt/ziti/er-gq1-fsx1.cert

notBefore=Oct 31 19:30:35 2025 GMT

notAfter=Oct 31 19:31:35 2026 GMT

[root@storziti01 ziti]# openssl x509 -noout -dates -in /opt/ziti/er-gq1-fsx1.server.chain.cert

notBefore=Oct 31 19:30:35 2025 GMT

notAfter=Oct 31 19:31:35 2026 GMT

When I look at the certs referenced in my controller config file, I get:

[root@ip-10-3-2-176 ziti]# openssl x509 -noout -dates -in /var/lib/sia/certs/storage-ops.aws.openziti.cert.pem

notBefore=Jun 30 14:39:46 2026 GMT

notAfter=Jul 7 15:39:46 2026 GMT

[root@ip-10-3-2-176 ziti]# openssl x509 -noout -dates -in /var/lib/sia/certs/storage-ops.aws.openziti.cert.pem

notBefore=Jun 30 14:39:46 2026 GMT

notAfter=Jul 7 15:39:46 2026 GMT

[root@ip-10-3-2-176 ziti]# openssl x509 -noout -dates -in /opt/ziti/pki/comanyname-test-signing-intermediate/certs/companynametest-signing-intermediate.cert

notBefore=Aug 23 00:52:11 2022 GMT

notAfter=Aug 20 00:53:10 2032 GMT

I can't figure out where my expired cert is, and I already tried deleting and re-creating/enrolling a router, and it also fails for certificate issues. Any ideas?

Thanks!

Hi @greggw01, most likely your controller's server cert is expired.

Let's probe using hosts/ports. Open the controller's config file and find the ctrl-plane address: near the top, look for the section that starts with ctrl:. Under it there's a line advertiseAddress: tls:HOST:PORT. The HOST and PORT after tls: are the ctrl-plane address that your routers connect to and validate. Store that host:port in a shell variable, for example: ctrl_plane=host:8440.

Do the same thing for the APIs. Find the edge address: scroll down to the section that starts with web:. Inside it, look under bindPoints: for a line named advertise: (it reads advertise: HOST:PORT). That HOST and PORT are the edge/client API address. Store that host:port in a second variable, for example: ctrl_apis=host:8441.
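As a rough sketch of the two lookups above, the helpers below print the matching lines from a config file and strip the tls: prefix so the value can be handed to openssl. The function names and /path/to/ctrl.yaml are placeholders I made up, and the pattern only covers the key names mentioned above; adjust it if your config uses different keys.

```shell
# Print lines that look like advertise addresses, with line numbers.
# $1 is the controller config file.
show_advertise_lines() {
  grep -nE '^[[:space:]]*(-[[:space:]]*)?(advertiseAddress|advertise):' "$1"
}

# Strip a leading "tls:" so "tls:HOST:PORT" becomes "HOST:PORT".
strip_tls() {
  printf '%s\n' "${1#tls:}"
}

# Example with placeholder values:
# show_advertise_lines /path/to/ctrl.yaml
# ctrl_plane=$(strip_tls tls:host:8440)
```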

Now probe each address by running:

openssl s_client -connect "$ctrl_plane" </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer -dates
openssl s_client -connect "$ctrl_apis" </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer -dates
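If you'd rather get a yes/no answer than compare dates by eye, something like this should work. It is only a sketch: check_endpoint is a name I made up, and it relies on openssl x509 -checkend, which looks at notAfter only (it won't catch a cert that is not yet valid).

```shell
# Report whether the cert served at host:port ($1) is past its notAfter.
check_endpoint() {
  if openssl s_client -connect "$1" </dev/null 2>/dev/null \
      | openssl x509 -noout -checkend 0 >/dev/null 2>&1; then
    echo "$1: certificate has not expired"
  else
    # Also lands here when the connection fails or no cert comes back.
    echo "$1: certificate is expired or could not be fetched"
  fi
}

# Example:
# check_endpoint "$ctrl_plane"
# check_endpoint "$ctrl_apis"
```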

Let's make sure both of the endpoints have certs that are valid. I'll bet one of them isn't, most likely the control plane one...

Assuming that's the case, you'll want to roll that cert and restart the controller.

Mint a new, longer-lived server cert using something along these lines:

ziti pki create server \
  --pki-root /path/to/pki \
  --ca-name <intermediate-ca-name> \
  --key-file <controller>-server \
  --dns "<ctrl-advertise-host>,localhost" --ip 127.0.0.1,::1 \
  --server-file <new-cert-name> \
  --expire-limit 3650  # in days; 10 years is probably more than you want, but it's up to you

Make sure the --dns list includes the exact host your advertiseAddress uses, or routers will still reject the cert. Use a long --expire-limit (in days) so this doesn't recur in a week. (I noticed in your output that one cert has a very short lifetime: /var/lib/sia/certs/storage-ops.aws.openziti.cert.pem is only good for 7 days?)
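To double-check the DNS names on the new cert before restarting anything, a sketch along these lines may help. check_san is a made-up helper name, the file and host in the example are placeholders, and the -ext option needs OpenSSL 1.1.1 or newer.

```shell
# Report whether the cert in file $1 lists DNS name $2 in its subjectAltName.
check_san() {
  if openssl x509 -noout -ext subjectAltName -in "$1" 2>/dev/null \
      | tr ',' '\n' | sed 's/^[[:space:]]*//' | grep -qxF "DNS:$2"; then
    echo "ok: $2 is in the SAN list"
  else
    echo "missing: $2 not found (or cert unreadable)"
  fi
}

# Example with placeholder values:
# check_san /path/to/new-server.cert ctrl.example.com
```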

Then update the identity.server_cert (and the matching cert/key if they changed) in the config to point at the new files, and restart the controller. Re-run the two openssl s_client probes to confirm the new notAfter is years out (or whatever), not days.
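For that final confirmation, here is a small sketch that checks whether a cert will still be good N days from now instead of reading notAfter by hand. lasts_at_least_days is a hypothetical helper; it reads a PEM cert on stdin, so it can sit at the end of an s_client probe.

```shell
# Read a PEM cert on stdin; report whether it is still valid $1 days from now.
lasts_at_least_days() {
  seconds=$(( $1 * 86400 ))
  if openssl x509 -noout -checkend "$seconds" >/dev/null 2>&1; then
    echo "valid for at least $1 more days"
  else
    echo "expires within $1 days (or no cert was read)"
  fi
}

# Example:
# openssl s_client -connect "$ctrl_plane" </dev/null 2>/dev/null | lasts_at_least_days 365
```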