TLS Handshake Failed: Remote Error "tls: bad certificate" in Ziti Controller

I'm currently setting up an OpenZiti controller on an AlmaLinux server, and I keep encountering repeated TLS handshake failures with the error "remote error: tls: bad certificate". Here's the detailed output from journalctl -u ziti-controller -f:

...
{"_context":"tls:0.0.0.0:1280","error":"remote error: tls: bad certificate","file":"github.com/openziti/transport/v2@v2.0.138/tls/listener.go:257","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","remote":"10.252.252.30:XXXX","time":"2024-11-06T22:XX:XX.XXXZ"}
...

The error appears after I modified the config.yml file with the following settings:

identity:
  ca:          "pki/root/certs/root.cert"
  key:         "/etc/openziti/certs/letsencrypt/privkey.pem"
  server_cert: "/etc/openziti/certs/letsencrypt/fullchain.pem"
  cert:        "pki/intermediate/certs/client.cert"

The controller service is running, as confirmed by systemctl status ziti-controller. However, these TLS errors suggest that there might be a problem with certificate configuration or compatibility between the controller and the connecting devices.

Here are some specifics of my setup:

  • Controller hostname: almalinuxztna-174
  • IP: 10.252.252.174
  • The certificate used is generated specifically for the domain ziti.telkomuniversity.ac.id.

I've verified that the certificates are in place, but I suspect there could be misconfiguration either in the certificate chain or within the controller configuration file.

That specific error message is not actually an error. Here's the GitHub issue where I've explained and am requesting refinement of that message.

Is there any malfunction besides the confusing "error" message?

Hi @garry0, you really shouldn't change that identity block for Let's Encrypt support. You should instead leverage "alt_server_certs". By changing that block, you'll likely be breaking your overlay's PKI en masse.

I assume you just want the API to be served by a 3rd party verifiable cert, but the overlay should maintain its own PKI.

Have you discovered alt_server_certs? If not, look into it. A good walkthrough imho (with video) is the one I did for BrowZer: Example Enabling BrowZer | OpenZiti

EDIT:

here's an example of what my block looks like:

identity:
  cert:        "/home/ubuntu/.ziti/quickstart/ip-172-31-11-231/pki/ip-172-31-11-231-intermediate/certs/ip-172-31-11-231-client.chain.pem"
  server_cert: "/home/ubuntu/.ziti/quickstart/ip-172-31-11-231/pki/ip-172-31-11-231-intermediate/certs/ip-172-31-11-231-server.chain.pem"
  key:         "/home/ubuntu/.ziti/quickstart/ip-172-31-11-231/pki/ip-172-31-11-231-intermediate/keys/ip-172-31-11-231-server.key"
  ca:          "/home/ubuntu/.ziti/quickstart/ip-172-31-11-231/pki/cas.pem"
  alt_server_certs:
    - server_cert:  "/data/docker/letsencrypt/live/clint.demo.openziti.org/fullchain.pem"
      server_key:   "/data/docker/letsencrypt/live/clint.demo.openziti.org/privkey.pem"

Good catch! I overlooked the letsencrypt part. I'd expect some things to break with that configuration too, misleading client handshake errors notwithstanding.

Here's one more example that resembles the configuration you shared.

identity:
  cert:        "pki/intermediate/certs/client.cert"
  server_cert: "pki/intermediate/certs/server.cert"
  key:         "pki/intermediate/certs/server.key"
  ca:          "pki/root/certs/root.cert"
  alt_server_certs:
    - server_cert:  "/etc/openziti/certs/letsencrypt/fullchain.pem"
      server_key:   "/etc/openziti/certs/letsencrypt/privkey.pem"

Importantly, the subject alternative name (SAN) in /etc/openziti/certs/letsencrypt/fullchain.pem must be distinct from the SAN in pki/intermediate/certs/server.cert because the TLS client handshake contains the server name indication (SNI) that tells Ziti how to route the request, i.e., which TLS server certificate to present.
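To see which certificate the controller actually presents for a given SNI name, something like the following sketch may help. It assumes OpenSSL 1.1.1 or newer (for the -ext option) and a listener on port 1280, as in the log above. The function name presented_cert is only an illustrative helper, not part of Ziti.

```shell
# Print the subject, issuer and SANs of the certificate that a TLS server
# presents when the client sends the given SNI name.
#   $1 = host or IP, $2 = port, $3 = SNI name to send
presented_cert() {
  openssl s_client -connect "$1:$2" -servername "$3" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -ext subjectAltName
}

# Example call (address and name taken from this thread):
#   presented_cert 10.252.252.174 1280 ziti.telkomuniversity.ac.id
```

Running it once per hostname and comparing the issuer lines should show whether the SNI name selects the certificate you intended.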

What's your use case for the alt server cert, web console, browzer, something else?

Most Ziti edge SDK use cases, including all Ziti tunnelers, don't require a publicly trusted cert because they negotiate trust with the Ziti network through other mechanisms like enrollment.

Thanks Ken,

@garry0 one more vital point: make sure the DNS SANs within the "OpenZiti-internal" PKI don't overlap with those in the Let's Encrypt cert.

If the SANs overlap, your overlay will behave non-deterministically. Just make sure you use two separate URLs. For example, the "internal" PKI could map to this DNS SAN:

https://openziti-internal-pki.your.domain

while your LE-based cert would look like this:

https://openziti.your.domain

They MUST be different to have deterministic behavior. :)
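A quick way to check two certificates for overlapping DNS SANs, sketched in POSIX shell. It assumes OpenSSL 1.1.1 or newer for -ext; the helper names (dns_sans, extract_dns_names, common_names) are made up for this example.

```shell
# Read SAN text on stdin ("DNS:a, DNS:b, IP Address:...") and print only
# the DNS names, one per line, sorted and de-duplicated.
extract_dns_names() {
  tr ',' '\n' | sed -n 's/^[[:space:]]*DNS://p' | sort -u
}

# Print the DNS SANs of the first certificate in a PEM file.
dns_sans() {
  openssl x509 -in "$1" -noout -ext subjectAltName | extract_dns_names
}

# Print every name that appears in both newline-separated lists.
common_names() {
  printf '%s\n' "$1" | while IFS= read -r name; do
    [ -n "$name" ] || continue
    if printf '%s\n' "$2" | grep -Fxq -- "$name"; then
      printf '%s\n' "$name"
    fi
  done
}

# Example (paths as in the config above; adjust to your layout):
#   internal=$(dns_sans pki/intermediate/certs/server.cert)
#   public=$(dns_sans /etc/openziti/certs/letsencrypt/fullchain.pem)
#   common_names "$internal" "$public"   # any output means the SANs overlap
```

An empty result from common_names is what you want before adding the Let's Encrypt cert under alt_server_certs.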

I have tried configuring Let’s Encrypt certificates in alt_server_certs, but the domain ziti.telkomuniversity.ac.id is inaccessible (connection refused). However, when the Let’s Encrypt cert is placed directly in the identity configuration, ziti.telkomuniversity.ac.id becomes accessible.

Notably, the subject alternative name (SAN) in both certificates is the same, with output as follows:

[root@almalinuxztna-174 certs]# openssl x509 -in /etc/openziti/certs/letsencrypt/fullchain.pem -text -noout | grep -A1 "Subject Alternative Name"
            X509v3 Subject Alternative Name:
                DNS:ziti.telkomuniversity.ac.id
[root@almalinuxztna-174 certs]# openssl x509 -in /var/lib/ziti-controller/pki/intermediate/certs/server.cert -text -noout | grep -A1 "Subject Alternative Name"
            X509v3 Subject Alternative Name:
                DNS:localhost, DNS:ziti.telkomuniversity.ac.id, IP Address:127.0.0.1, IP Address:0:0:0:0:0:0:0:1

As for my use case, I don’t have a specific one—I applied the Let’s Encrypt certificate because, after creating the domain ziti.telkomuniversity.ac.id, I encountered an error (NET::ERR_CERT_AUTHORITY_INVALID) when attempting to access it. I concluded that this might be due to a lack of a valid certificate, so I generated the Let’s Encrypt certificate for ziti.telkomuniversity.ac.id.

I'm also still unclear on how to effectively implement Zero Trust in a production environment. What should I prepare? Should I follow the use cases you mentioned earlier, or is it more about the network topology?

I have been spending the last few days reviewing the Ziti router, Ziti controller, and the Zero Trust concepts themselves, trying to gain a better understanding before fully implementing.

Is it possible for me to create a new SAN entry in one of the certificates using a command like this with a new DNS SAN?

openssl req -new -x509 -key /var/lib/ziti-controller/pki/intermediate/private/server.key \
-out /var/lib/ziti-controller/pki/intermediate/certs/server.cert \
-subj "/CN=openziti-internal-pki.your.domain" \
-addext "subjectAltName=DNS:localhost,DNS:openziti-internal-pki.your.domain,IP:127.0.0.1,IP:::1"

Would this ensure there’s no overlap in SANs between the OpenZiti-internal PKI and Let’s Encrypt certificates, and help achieve deterministic behavior?

Having the same SAN (ziti.telkomuniversity.ac.id) in both certificates will lead to non-deterministic behavior. You must resolve this or you'll have a hard time.

The first thing you'll need to do is ensure you have an A record for openziti-internal-pki.your.domain. In your case, I'd assume this would be something like ziti-internal-pki.telkomuniversity.ac.id.
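To confirm the A record is in place before reinstalling, a small check like this could be used. It is only a sketch: it assumes dig is installed (bind-utils on AlmaLinux), and the helper names ipv4_only and a_records are invented for illustration.

```shell
# Keep only lines that look like IPv4 addresses; "dig +short" can also
# print CNAME targets, which are filtered out here.
ipv4_only() {
  grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Print the IPv4 addresses that a hostname resolves to via A records.
a_records() {
  dig +short A "$1" | ipv4_only
}

# Example, using the name suggested above:
#   a_records ziti-internal-pki.telkomuniversity.ac.id
```

If a_records prints nothing, the record does not exist yet or has not propagated to the resolver you are using.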

Once you have an A record, I would recommend you just go back through the whole installation process. It's quicker and easier and will ensure your PKI is set up properly using the "internal pki" domain name.

Once installed, use ziti ops verify-traffic to verify the OpenZiti overlay is correctly installed using the internal pki.

After confirming the internal pki works, add the alt server certs.

An important point to note: as a zero trust overlay, we expect you to own and operate your own PKI or, in this case, allow OpenZiti to do that on your behalf. Also, in a production environment, after everything works you really should take the root CA's private key offline and put it "anywhere else". I don't think we have docs explaining this, but there are plenty of resources on the internet that discuss offlining keys and why that's a good idea, if you need them.

I think that's the cleanest/easiest path forward for now. Get a new A record for your "internal pki" domain, reinstall the overlay, make sure it works, then apply the alt server certs after that's all set up.

hth

Thank you for the detailed explanation,

I’ll proceed with creating a new A record for the internal PKI with a unique domain, like ziti-internal-pki.telkomuniversity.ac.id, as you suggested. I understand this is crucial to avoid SAN overlap and ensure deterministic behavior during TLS handshakes.

However, I did attempt to reinstall the Ziti controller, but it didn’t go smoothly. During the bootstrap process, I encountered the following error:

ERROR: failed to create default admin in database
ERROR: something went wrong during bootstrapping
WARN: set VERBOSE=1 or DEBUG=1 for more output
WARN: see output in '/tmp/tmp.u4hl470TeB'

And here’s the content from /tmp/tmp.u4hl470TeB:

{"error":"unable to load identity (open pki/intermediate/certs/server.chain.pem: no such file or directory)","file":"github.com/openziti/ziti/controller/subcmd/init.go:150","func":"github.com/openziti/ziti/controller/subcmd.configureController","level":"fatal","msg":"could not read configuration file [/var/lib/private/ziti-controller/config.yml]","time":"2024-11-08T21:43:43.645Z"}

From this log, it appears that server.chain.pem is missing, although I haven’t modified anything related to that certificate.
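One way to list every certificate or key file the config refers to but that does not exist on disk is sketched below. It is a naive line match, not a YAML parser, and it assumes relative paths are resolved against the directory that holds config.yml. The helper names identity_paths and missing_identity_files are made up for this example.

```shell
# Print the values of cert, server_cert, key, ca and server_key entries
# found anywhere in a config file (simple line matching only).
identity_paths() {
  sed -n -E 's/^[[:space:]]*(-[[:space:]]*)?(cert|server_cert|key|ca|server_key):[[:space:]]*"?([^"]+)"?[[:space:]]*$/\3/p' "$1"
}

# Print each referenced file that does not exist. Relative paths are
# resolved against the config file's directory (an assumption).
missing_identity_files() {
  base=$(dirname "$1")
  identity_paths "$1" | while IFS= read -r p; do
    case $p in
      /*) full=$p ;;
      *)  full=$base/$p ;;
    esac
    if [ ! -e "$full" ]; then
      printf '%s\n' "$full"
    fi
  done
}

# Example (config path taken from the log above):
#   sudo sh -c '. ./check.sh; missing_identity_files /var/lib/private/ziti-controller/config.yml'
```

The check.sh file name in the example is hypothetical; it stands for wherever the two functions are saved.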

Once I resolve this issue, I’ll follow through with the rest of the steps, including verifying the overlay setup with ziti ops verify-traffic and then adding the Let’s Encrypt certificate in the alt_server_certs section as advised.

It looks like you're in a partially-configured state. Since you're already trying to reinstall, here are the steps to reset the state.

  1. Clean the service state.

    sudo systemctl disable --now ziti-controller.service;
    sudo systemctl reset-failed ziti-controller.service;
    sudo systemctl clean --what=state ziti-controller.service || sudo rm -rf /var/lib/private/ziti-controller
    
  2. Purge the package, including configuration files, unless you wish to re-use the answers you gave the first time.

    APT - Debian, Ubuntu, etc.

    sudo apt-get purge openziti-controller
    

    RPM - RedHat, Fedora, etc.

    sudo dnf remove openziti-controller
    
  3. Verify the state directory is empty if it still exists.

    sudo ls -lR /var/lib/private/ziti-controller
    
  4. Reinstall the package openziti-controller if you purged or uninstalled it.

  5. Run bootstrap.bash to generate a config or create config.yml manually in the state dir.

  6. Start ziti-controller.service

These are adapted from the Linux controller guide's "uninstall" section: Controller Deployment | OpenZiti

I added the rm command to the cleaning step because someone had a version of systemd that didn't support the --what=state option.