Using public CA certificates with Docker Compose?

Hi everyone,

I was wondering if there was an easy way to provide the “alternative” certificates issued by a public CA such as Let’s Encrypt via the environment file when using Docker Compose to deploy OpenZiti.

An alternative I thought about is mounting the folder /persistent, and editing the files as explained in Using Public CA Certificates | OpenZiti, but being able to provide the certificates via .env file would be better.

Thank you

Hi @Reverse5431. Welcome to the community and to OpenZiti!

I have actually been working towards that from a quickstart perspective. I have some WIP docker compose files but haven’t had the opportunity to finish it off yet.

You can most definitely do this and I do it all the time when I install browZer or when I self-host zrok. Like I said, we just don’t have a decent (or any) guide for it yet. There are a couple of minor ‘gotchas’ when using LE, most notably that if you mount the certs from the filesystem, you’ll have to figure out how to grant the container permission to read the files.

If you’re ok with what I have and don’t need too much help, here’s the most basic of steps. I don’t know if this is an ok level of detail or not though (since it’s not much):

  • make a folder somewhere, we’ll call it $HOME/docker-compose-alt-certs: mkdir -p $HOME/docker-compose-alt-certs

  • make a file named (this is important, docker ‘things’) .env in that directory: vi $HOME/docker-compose-alt-certs/.env
    NOTE: “obviously” replace YOUR_PASSWORD_HERE, YOUR_EXT_DNS_HERE, YOUR_EXT_IP_HERE, etc. with your own values

    Contents of .env
    # OpenZiti Variables
    ZITI_IMAGE=openziti/quickstart
    ZITI_VERSION=latest
    
    # the user and password to use
    # Leave password blank to have a unique value generated or set the password explicitly
    ZITI_USER=admin
    ZITI_PWD=YOUR_PASSWORD_HERE
    
    # controller name, address/port information
    ZITI_CTRL_NAME=ziti-controller
    ZITI_CTRL_EDGE_ADVERTISED_ADDRESS=YOUR_EXT_DNS_HERE
    ZITI_CTRL_ADVERTISED_ADDRESS=YOUR_EXT_DNS_HERE
    ZITI_CTRL_EDGE_IP_OVERRIDE=YOUR_EXT_IP_HERE
    ZITI_CTRL_EDGE_ADVERTISED_PORT=8441
    ZITI_CTRL_ADVERTISED_PORT=8440
    
    # The duration of the enrollment period (in minutes), default if not set. shown - 7days
    ZITI_EDGE_IDENTITY_ENROLLMENT_DURATION=10080
    ZITI_ROUTER_ENROLLMENT_DURATION=10080
    
    # router address/port information
    ZITI_ROUTER_NAME=ziti-edge-router
    ZITI_ROUTER_PORT=8442
    ZITI_ROUTER_IP_OVERRIDE=YOUR_EXT_IP_HERE
    ZITI_ROUTER_LISTENER_BIND_PORT=8444
    ZITI_ROUTER_ROLES=public,aws
    
    # pki setup
    ZITI_PKI_ALT_SERVER_CERT=/etc/letsencrypt/live/clint.demo.openziti.org/fullchain.pem
    ZITI_PKI_ALT_SERVER_KEY=/etc/letsencrypt/live/clint.demo.openziti.org/privkey.pem
    
  • make a file named docker-compose.yml in that directory: vi $HOME/docker-compose-alt-certs/docker-compose.yml

    Contents of docker-compose.yml
    version: '2.4'
    services:
      ziti-controller:
        image: "${ZITI_IMAGE}:${ZITI_VERSION}"
        env_file:
          - ./.env
        ports:
          - ${ZITI_CTRL_EDGE_ADVERTISED_PORT:-1280}:${ZITI_CTRL_EDGE_ADVERTISED_PORT:-1280}
          - ${ZITI_CTRL_ADVERTISED_PORT:-6262}:${ZITI_CTRL_ADVERTISED_PORT:-6262}
        environment:
          - ZITI_CTRL_NAME=${ZITI_CTRL_NAME:-ziti-edge-controller}
          - ZITI_CTRL_EDGE_ADVERTISED_ADDRESS=${ZITI_CTRL_EDGE_ADVERTISED_ADDRESS:-ziti-edge-controller}
            #- ZITI_CTRL_EDGE_ALT_ADVERTISED_ADDRESS=${ZITI_CTRL_EDGE_ALT_ADVERTISED_ADDRESS:-}
          - ZITI_CTRL_EDGE_ADVERTISED_PORT=${ZITI_CTRL_EDGE_ADVERTISED_PORT:-1280}
          - ZITI_CTRL_EDGE_IP_OVERRIDE=${ZITI_CTRL_EDGE_IP_OVERRIDE:-127.0.0.1}
          - ZITI_CTRL_ADVERTISED_PORT=${ZITI_CTRL_ADVERTISED_PORT:-6262}
          - ZITI_EDGE_IDENTITY_ENROLLMENT_DURATION=${ZITI_EDGE_IDENTITY_ENROLLMENT_DURATION}
          - ZITI_ROUTER_ENROLLMENT_DURATION=${ZITI_ROUTER_ENROLLMENT_DURATION}
          - ZITI_USER=${ZITI_USER:-admin}
          - ZITI_PWD=${ZITI_PWD}
          - ZITI_PKI_ALT_SERVER_CERT=${ZITI_PKI_ALT_SERVER_CERT:-}
          - ZITI_PKI_ALT_SERVER_KEY=${ZITI_PKI_ALT_SERVER_KEY:-}
        networks:
          ziti:
            aliases:
              - ziti-edge-controller
        volumes:
          - ziti-fs:/persistent
          - /etc/letsencrypt/:/etc/letsencrypt/
        entrypoint:
          - "/var/openziti/scripts/run-controller.sh"
    
      ziti-controller-init-container:
        image: "${ZITI_IMAGE}:${ZITI_VERSION}"
        depends_on:
          - ziti-controller
        environment:
          - ZITI_CTRL_EDGE_ADVERTISED_ADDRESS=${ZITI_CTRL_EDGE_ADVERTISED_ADDRESS:-ziti-edge-controller}
          - ZITI_CTRL_EDGE_ADVERTISED_PORT=${ZITI_CTRL_EDGE_ADVERTISED_PORT:-1280}
        env_file:
          - ./.env
        networks:
          ziti:
        volumes:
          - ziti-fs:/persistent
        entrypoint:
          - "/var/openziti/scripts/run-with-ziti-cli.sh"
        command:
          - "/var/openziti/scripts/access-control.sh"
          
      ziti-edge-router:
        image: "${ZITI_IMAGE}:${ZITI_VERSION}"
        env_file:
          - ./.env
        depends_on:
          - ziti-controller
        ports:
          - ${ZITI_ROUTER_PORT:-3022}:${ZITI_ROUTER_PORT:-3022}
          - ${ZITI_ROUTER_LISTENER_BIND_PORT:-10080}:${ZITI_ROUTER_LISTENER_BIND_PORT:-10080}
        environment:
          - ZITI_CTRL_ADVERTISED_ADDRESS=${ZITI_CTRL_ADVERTISED_ADDRESS:-ziti-controller}
          - ZITI_CTRL_ADVERTISED_PORT=${ZITI_CTRL_ADVERTISED_PORT:-6262}
          - ZITI_CTRL_EDGE_ADVERTISED_ADDRESS=${ZITI_CTRL_EDGE_ADVERTISED_ADDRESS:-ziti-edge-controller}
          - ZITI_CTRL_EDGE_ADVERTISED_PORT=${ZITI_CTRL_EDGE_ADVERTISED_PORT:-1280}
          - ZITI_ROUTER_NAME=${ZITI_ROUTER_NAME:-ziti-edge-router}
          # example value from the author's own host; replace with your router's external DNS name
          - ZITI_ROUTER_ADVERTISED_HOST=ec2-13-58-222-94.us-east-2.compute.amazonaws.com
          - ZITI_ROUTER_PORT=${ZITI_ROUTER_PORT:-3022}
          - ZITI_ROUTER_LISTENER_BIND_PORT=${ZITI_ROUTER_LISTENER_BIND_PORT:-10080}
          - ZITI_ROUTER_ROLES=public
          - ZITI_PKI_ALT_SERVER_CERT=${ZITI_PKI_ALT_SERVER_CERT:-}
          - ZITI_PKI_ALT_SERVER_KEY=${ZITI_PKI_ALT_SERVER_KEY:-}
        networks:
          - ziti
        volumes:
          - ziti-fs:/persistent
          - /etc/letsencrypt/:/etc/letsencrypt/
        entrypoint: /bin/bash
        command: "/var/openziti/scripts/run-router.sh edge"
        
      ziti-console:
        image: openziti/zac
        working_dir: /usr/src/app
        environment:
          - ZAC_SERVER_CERT_CHAIN=${ZITI_PKI_ALT_SERVER_CERT:-}
          - ZAC_SERVER_KEY=${ZITI_PKI_ALT_SERVER_KEY:-}
          # example value from the author's own domain; replace with your controller's external DNS name
          - ZITI_CTRL_EDGE_ADVERTISED_ADDRESS=ctrl.clint.demo.openziti.org
          - ZITI_CTRL_EDGE_ADVERTISED_PORT=${ZITI_CTRL_EDGE_ADVERTISED_PORT:-1280}
          - ZITI_CTRL_NAME=${ZITI_CTRL_NAME:-ziti-edge-controller}
          - PORTTLS=8443
        depends_on:
          - ziti-controller
        ports:
          - 8443:8443
        volumes:
          - ziti-fs:/persistent
          - /etc/letsencrypt/:/etc/letsencrypt/
        networks:
          - ziti
    
    networks:
      ziti:
    
    volumes:
      ziti-fs:
    
  • Make the letsencrypt files available to the docker images…

    Add a user, group, chown and chmod the LE files
    sudo groupadd -g 2171 zitiweb
    sudo useradd -u 2171 -M ziggy
    sudo usermod -aG zitiweb ziggy
    sudo usermod -aG zitiweb $USER
    sudo chown -R root:zitiweb /etc/letsencrypt/
    sudo chmod -R g+rX /etc/letsencrypt/
    

    [this doc I copied from a PR I just made for something else]:
    If you use Certbot to get the LE certs, it will make the files it creates available to root only (a good practice). If you run your network as root, you’ll have no problems, but generally it’s a better practice to not run as root when you don’t need to. In order to run this example as “us” (not the root user) we’ll need to grant specific users the ability to read the files.

    A flexible way to allow other processes to access these files is to make a new group and a new user, which is what the commands above do. In Linux, groups and users are assigned IDs. 2171 looks like “ziti”, so we’ll use UID 2171 and GID 2171. The commands make a new group named zitiweb. This group is then granted ownership of the letsencrypt folder via chown. Changing the group ownership of the files allows any user in that group to read them, so be careful granting this group to users. Then we add the user we are currently logged in with to that group so that “we” can see the files for debugging or other purposes. Finally, we make a ziggy user that is also in this group so that, if we want to, we can run processes as ziggy. Please plan accordingly here. This is just a reasonable example to get you going; change it to suit your needs and do not take it as authoritative. There are many ways to solve this problem, and it’s up to you to pick ‘the best’ way.
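
One way to confirm that the group change took effect is to test read access as the user in question. The snippet below is only an illustrative sketch: check_readable is a hypothetical helper (it is not part of OpenZiti or Certbot), and the paths in the usage note are the example paths used elsewhere in this thread.

```shell
# Hypothetical helper: report whether the current user can read each file.
# Returns non-zero if at least one file is not readable.
check_readable() {
  rc=0
  for f in "$@"; do
    if [ -r "$f" ]; then
      echo "readable: $f"
    else
      echo "NOT readable: $f"
      rc=1
    fi
  done
  return "$rc"
}
```

For example: check_readable /etc/letsencrypt/live/example.com/fullchain.pem /etc/letsencrypt/live/example.com/privkey.pem. New group membership usually only applies to new login sessions, so log out and back in first. This only tests the host user running it; the user inside the container may differ.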

After making the .env and docker-compose.yml files, you should be able to just docker compose up that compose file.

I THINK this will work…


Hi @TheLumberjack, thank you very much for your help.

I have generated a wildcard certificate using Certbot (DNS challenge) and followed all the instructions. I changed the ZITI_ROUTER_ADVERTISED_HOST variable to ZITI_ROUTER_ADVERTISED_ADDRESS, as I noticed this commit: Updating ZITI_ROUTER_ADVERTISED_HOST to use ZITI_ROUTER_ADVERTISED_AD… · openziti/ziti@6e7f3d0 · GitHub.

Despite having followed your instructions step by step, I am failing to have OpenZiti up and running. This is the .env file (I’ve replaced sensitive information such as password, domain and IP):

.env

# OpenZiti Variables
ZITI_IMAGE=openziti/quickstart
ZITI_VERSION=latest

# Ziti default user credentials
ZITI_USER=admin
ZITI_PWD=password

# controller name, address/port information
ZITI_CTRL_NAME=ziti-controller
ZITI_CTRL_EDGE_ADVERTISED_ADDRESS=edge.example.com
ZITI_CTRL_ADVERTISED_ADDRESS=ctrl.example.com
ZITI_CTRL_EDGE_IP_OVERRIDE=xxx.xxx.xxx.xxx
ZITI_CTRL_EDGE_ADVERTISED_PORT=8441
ZITI_CTRL_ADVERTISED_PORT=8440

# The duration of the enrollment period (in minutes), default if not set. shown - 7days
ZITI_EDGE_IDENTITY_ENROLLMENT_DURATION=10080
ZITI_ROUTER_ENROLLMENT_DURATION=10080

# router address/port information
ZITI_ROUTER_NAME=ziti-edge-router
ZITI_ROUTER_ADVERTISED_ADDRESS=router.example.com
ZITI_ROUTER_PORT=8442
ZITI_ROUTER_IP_OVERRIDE=xxx.xxx.xxx.xxx
ZITI_ROUTER_LISTENER_BIND_PORT=8444
#ZITI_ROUTER_ROLES=public,aws

# pki setup
ZITI_PKI_ALT_SERVER_CERT=/etc/letsencrypt/live/example.com/fullchain.pem
ZITI_PKI_ALT_SERVER_KEY=/etc/letsencrypt/live/example.com/privkey.pem

Splitting this comment into multiple bits to overcome the “maximum 10 links” limit

The controller initialization container starts, but keeps waiting for the edge controller on port 8441 endlessly.

ziti-controller-init-container

NOT OVERRIDING: env var ZITI_BIN_DIR already set. using existing value
NOT OVERRIDING: env var ZITI_BIN_ROOT already set. using existing value
NOT OVERRIDING: env var ZITI_ENV_FILE already set. using existing value
NOT OVERRIDING: env var ZITI_HOME already set. using existing value
NOT OVERRIDING: env var ZITI_NETWORK already set. using existing value
NOT OVERRIDING: env var ZITI_SCRIPTS already set. using existing value
NOT OVERRIDING: env var ZITI_SHARED already set. using existing value

adding /var/openziti/ziti-bin to the path
waiting for https://edge.example.com:8441

From what I can see, the controller still creates its own CA to issue self-signed certificates, even though the Let’s Encrypt certificates have been provided.

ziti-controller

system has not been initialized. initializing…
Populating environment variables
ZITI_NETWORK overridden: ziti
ZITI_HOME overridden: /persistent
ZITI_USER overridden: admin
ZITI_PWD overridden: password
ZITI_BIN_DIR overridden: /var/openziti/ziti-bin
ZITI_CTRL_NAME overridden: ziti-controller
ZITI_CTRL_EDGE_ADVERTISED_PORT overridden: 8441
ZITI_CTRL_EDGE_ADVERTISED_ADDRESS overridden: edge.example.com
ZITI_CTRL_ADVERTISED_ADDRESS overridden: ctrl.example.com
ZITI_CTRL_ADVERTISED_PORT overridden: 8440
ZITI_ROUTER_NAME overridden: ziti-edge-router
ZITI_ROUTER_PORT overridden: 8442
ZITI_ROUTER_LISTENER_BIND_PORT overridden: 8444
ZITI_HOME overridden: /persistent
ZITI_ENV_FILE overridden: /persistent/ziti.env
Your OpenZiti environment has been set up successfully.

A file with all pertinent environment values was created here: /persistent/ziti.env

NOT OVERRIDING: env var ZITI_ARCH already set. using existing value
NOT OVERRIDING: env var ZITI_BINARIES_FILE already set. using existing value
NOT OVERRIDING: env var ZITI_BINARIES_VERSION already set. using existing value
NOT OVERRIDING: env var ZITI_BIN_DIR already set. using existing value
NOT OVERRIDING: env var ZITI_BIN_ROOT already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_ADVERTISED_ADDRESS already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_ADVERTISED_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_EDGE_ADVERTISED_ADDRESS already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_EDGE_ADVERTISED_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_EDGE_IP_OVERRIDE already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_EDGE_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_EDGE_IDENTITY_ENROLLMENT_DURATION already set. using existing value
NOT OVERRIDING: env var ZITI_ENV_FILE already set. using existing value
NOT OVERRIDING: env var ZITI_HOME already set. using existing value
NOT OVERRIDING: env var ZITI_IMAGE already set. using existing value
NOT OVERRIDING: env var ZITI_NETWORK already set. using existing value
NOT OVERRIDING: env var ZITI_OSTYPE already set. using existing value
NOT OVERRIDING: env var ZITI_PKI already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_ALT_SERVER_CERT already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_ALT_SERVER_KEY already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_CTRL_CA already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_CTRL_CERT already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_CTRL_EDGE_INTERMEDIATE_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_CTRL_EDGE_ROOTCA_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_CTRL_INTERMEDIATE_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_CTRL_KEY already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_CTRL_ROOTCA_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_CTRL_SERVER_CERT already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_EDGE_CA already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_EDGE_CERT already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_EDGE_KEY already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_EDGE_SERVER_CERT already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_SIGNER_CERT already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_SIGNER_CERT_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_SIGNER_INTERMEDIATE_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_SIGNER_KEY already set. using existing value
NOT OVERRIDING: env var ZITI_PKI_SIGNER_ROOTCA_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_PWD already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_ADVERTISED_ADDRESS already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_ENROLLMENT_DURATION already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_IP_OVERRIDE already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_LISTENER_BIND_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_SCRIPTS already set. using existing value
NOT OVERRIDING: env var ZITI_SHARED already set. using existing value
NOT OVERRIDING: env var ZITI_USER already set. using existing value
NOT OVERRIDING: env var ZITI_VERSION already set. using existing value

adding /var/openziti/ziti-bin to the path
Populating environment variables
ZITI_HOME overridden: /persistent
ZITI_USER overridden: admin
ZITI_PWD overridden: passwordnano
ZITI_PKI overridden: /persistent/pki
ZITI_PKI_SIGNER_CERT_NAME overridden: ziti-signing
ZITI_PKI_SIGNER_ROOTCA_NAME overridden: ziti-signing-root-ca
ZITI_PKI_SIGNER_INTERMEDIATE_NAME overridden: ziti-signing-intermediate
ZITI_PKI_SIGNER_CERT overridden: /persistent/pki/ziti-signing-intermediate/certs/ziti-signing-intermediate.cert
ZITI_PKI_SIGNER_KEY overridden: /persistent/pki/ziti-signing-intermediate/keys/ziti-signing-intermediate.key
ZITI_BIN_DIR overridden: /var/openziti/ziti-bin
ZITI_CTRL_NAME overridden: ziti-controller
ZITI_CTRL_EDGE_NAME overridden: ziti-edge-controller
ZITI_CTRL_EDGE_ADVERTISED_PORT overridden: 8441
ZITI_CTRL_EDGE_ADVERTISED_ADDRESS overridden: edge.example.com
ZITI_CTRL_ADVERTISED_ADDRESS overridden: ctrl.example.com
ZITI_CTRL_ADVERTISED_PORT overridden: 8440
ZITI_PKI_CTRL_ROOTCA_NAME overridden: ctrl.example.com-root-ca
ZITI_PKI_CTRL_INTERMEDIATE_NAME overridden: ctrl.example.com-intermediate
ZITI_PKI_CTRL_EDGE_ROOTCA_NAME overridden: ziti-edge-controller-root-ca
ZITI_PKI_CTRL_EDGE_INTERMEDIATE_NAME overridden: ziti-edge-controller-intermediate
ZITI_PKI_CTRL_SERVER_CERT overridden: /persistent/pki/ctrl.example.com-intermediate/certs/ctrl.example.com-server.chain.pem
ZITI_PKI_CTRL_KEY overridden: /persistent/pki/ctrl.example.com-intermediate/keys/ctrl.example.com-server.key
ZITI_PKI_CTRL_CA overridden: /persistent/pki/cas.pem
ZITI_PKI_CTRL_CERT overridden: /persistent/pki/ctrl.example.com-intermediate/certs/ctrl.example.com-client.cert
ZITI_PKI_EDGE_CERT overridden: /persistent/pki/ziti-edge-controller-intermediate/certs/edge.example.com-client.cert
ZITI_PKI_EDGE_SERVER_CERT overridden: /persistent/pki/ziti-edge-controller-intermediate/certs/edge.example.com-server.chain.pem
ZITI_PKI_EDGE_KEY overridden: /persistent/pki/ziti-edge-controller-intermediate/keys/edge.example.com-server.key
ZITI_PKI_EDGE_CA overridden: /persistent/pki/ziti-edge-controller-intermediate/certs/ziti-edge-controller-intermediate.cert
ZITI_ROUTER_NAME overridden: ziti-edge-router
ZITI_ROUTER_PORT overridden: 8442
ZITI_ROUTER_LISTENER_BIND_PORT overridden: 8444
ZITI_HOME overridden: /persistent
ZITI_ENV_FILE overridden: /persistent/ziti.env
Your OpenZiti environment has been set up successfully.

A file with all pertinent environment values was created here: /persistent/ziti.env

Generating PKI
Creating CA: ctrl.example.com-root-ca
Success

Creating CA: ziti-edge-controller-root-ca
Success

Creating CA: ziti-signing-root-ca
Success

Creating intermediate: ctrl.example.com-root-ca ctrl.example.com-intermediate 1
Using CA name: ctrl.example.com-root-ca
Success

Creating intermediate: ziti-edge-controller-root-ca ziti-edge-controller-intermediate 1
Using CA name: ziti-edge-controller-root-ca

The Admin Panel uses the wildcard certificate and starts without any problem. A question regarding this: will the control panel be accessible by visiting ctrl.example.com:8443, or should a proxy be used to reach it?

ZAC

running ZAC
ZAC will use this key for TLS: /etc/letsencrypt/live/example.com/privkey.pem
ZAC will present this pem for TLS: /etc/letsencrypt/live/example.com/fullchain.pem
emitting settings.json
Loading Settings File From: /usr/src/app/…/ziti/settings.json
{
  edgeControllers: [
    {
      name: 'ziti-controller',
      url: 'https://edge.example.com:8441',
      default: true
    }
  ],
  editable: true,
  update: false,
  location: '…/ziti',
  port: 1408,
  portTLS: 8443,
  logo: '',
  primary: '',
  secondary: '',
  allowPersonal: true,
  rejectUnauthorized: false,
  mail: { host: '', port: 25, secure: false, auth: { user: '', pass: '' } },
  from: '',
  to: ''
}
TLS initialized on port: 8443
Ziti Admin Console is now listening on port 1408

The edge router, like the init container, keeps waiting for the edge controller on port 8441 endlessly.

ziti-edge-router

_ZITI_ROUTER_NAME set to: ziti-edge-router
NOT OVERRIDING: env var ZITI_BIN_DIR already set. using existing value
NOT OVERRIDING: env var ZITI_BIN_ROOT already set. using existing value
NOT OVERRIDING: env var ZITI_ENV_FILE already set. using existing value
NOT OVERRIDING: env var ZITI_HOME already set. using existing value
NOT OVERRIDING: env var ZITI_NETWORK already set. using existing value
NOT OVERRIDING: env var ZITI_SCRIPTS already set. using existing value
NOT OVERRIDING: env var ZITI_SHARED already set. using existing value

adding /var/openziti/ziti-bin to the path
waiting for https://edge.example.com:8441

Is there something I have missed? Let me know if I can provide any other information to help understand the problem.

If it gets stuck: waiting for https://edge.example.com:8441, that usually means only one of two things:

  • the controller failed to start
  • the firewall doesn't have that port open

Since docker is in this mix, it could be that there's an exposed port mismatch too, but I can clearly see this in your output so I'm pretty sure that's not the case:

ZITI_CTRL_EDGE_ADVERTISED_PORT overridden: 8441

When this happens to me, I think it was because the container couldn't read the configured cert/key and failed at startup. Let's look at the docker logs for the controller using:

# cd to $HOME/docker-compose-alt-certs
docker compose logs ziti-controller
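
Alongside the logs, it can help to probe the advertised address directly, to tell "not up yet" apart from "never coming up". The following is a rough sketch and not part of the quickstart scripts: wait_for_url is a hypothetical helper that polls a URL with curl until something answers or the attempts run out. The URL in the usage note is the example address from this thread.

```shell
# Hypothetical helper (not part of the OpenZiti quickstart scripts).
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -k skips certificate verification: we only care whether anything answers.
    # -s silences progress output, --max-time bounds each attempt.
    if curl -sk --max-time 5 -o /dev/null "$url"; then
      echo "reachable after $i attempt(s): $url"
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "still unreachable after $attempts attempt(s): $url"
  return 1
}
```

For example: wait_for_url https://edge.example.com:8441 60 5. Running it once on the docker host and once from a machine outside your network gives a hint about whether a firewall is in the way.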

I just went to my own server and changed the ownership of the LE certs/keys like this: chown -R root:root /etc/letsencrypt, then started my docker compose... My controller fails to start and I see the same "waiting for ...." message you see, but in my logs I can see that the controller doesn't have permission to open the LE cert/key and throws a fatal error:

panic: unable to load identity (open /etc/letsencrypt/live/clint.demo.openziti.org/fullchain.pem: permission denied)

docker-compose-alt-certs-ziti-controller-1  | [   0.018]   FATAL edge/controller/subcmd.configureController: {error=[unable to load identity (open /etc/letsencrypt/live/clint.demo.openziti.org/fullchain.pem: permission denied)]} could not read configuration file [/persistent/ziti-controller.yaml]
docker-compose-alt-certs-ziti-controller-1  | controller initialized. unsetting ZITI_USER/ZITI_PWD from env
docker-compose-alt-certs-ziti-controller-1  | [   0.022]   ERROR ziti/ziti/controller.run: {go-version=[go1.20.5] version=[v0.0.0] os=[linux] arch=[amd64] error=[unable to load identity (open /etc/letsencrypt/live/clint.demo.openziti.org/fullchain.pem: permission denied)] revision=[local] build-date=[2020-01-01 01:01:01]} error starting ziti-controller
docker-compose-alt-certs-ziti-controller-1  | panic: unable to load identity (open /etc/letsencrypt/live/clint.demo.openziti.org/fullchain.pem: permission denied)

So I'm guessing that's the issue?

As for the network running its own PKI, don't worry about that. YOU WANT THAT (and need it). OpenZiti will still use/maintain/operate its own full PKI for its own mTLS connections. Alternative certs are primarily useful for the REST API, for "websocket" support in routers (to support BrowZer) and for ZAC. The overlay network itself will maintain its own PKI.

If you want to use YOUR PKI for identities, that's actually a different feature called "Third Party CAs".
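
Since two sets of certificates are in play (the overlay's own PKI and the alternative certs), it can be useful to check which certificate a listener actually presents for a given hostname. A minimal sketch, assuming openssl is installed on the machine you test from: show_cert is a hypothetical helper name, and the host and port in the usage note are the example values from this thread.

```shell
# Hypothetical helper: print subject, issuer and expiry of the certificate
# that HOST:PORT presents when asked for HOST via SNI.
show_cert() {
  host=$1
  port=$2
  # s_client reads stdin, so feed it /dev/null to make it exit after the handshake.
  openssl s_client -connect "$host:$port" -servername "$host" </dev/null 2>/dev/null |
    openssl x509 -noout -subject -issuer -enddate
}
```

For example: show_cert edge.example.com 8441. A Let's Encrypt certificate names Let's Encrypt as the issuer organization, so the issuer line shows at a glance which of the two was served.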

Anyway, hopefully the issue is just the permission issue. That's why I included that last blurb cause it's kind of a pain... As a reminder, here's what I run to make docker happy.

sudo groupadd -g 2171 zitiweb
sudo useradd -u 2171 -M ziggy
sudo usermod -aG zitiweb ziggy
sudo usermod -aG zitiweb $USER
sudo chown -R root:zitiweb /etc/letsencrypt/
sudo chmod -R g+rX /etc/letsencrypt/

Hope that helps

Really sorry about this. Turns out the permissions were already set properly. As I am using a very low-power and low-end device for the network, it was just very slow. Once the controller finishes setting up the network, both the controller-init and the edge router successfully connect to edge.example.com:8441.

However, something still doesn’t work properly. The controller fails to complete the handshake due to a bad certificate. It also fails to send and receive heartbeats.

controller

[ 7.111] INFO xweb/v2.(*Server).Start: starting ApiConfig to listen and serve tls on 0.0.0.0:8441 for server client-management with APIs: [edge-management edge-client fabric]
[ 10.806] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[remote error: tls: bad certificate]} handshake failed
[ 11.689] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[remote error: tls: bad certificate]} handshake failed

[ 80.869] ERROR channel/v2.(*heartbeater).sendHeartbeat: {error=[context deadline exceeded] channelId=[ch{ID_a}->u{classic}->i{ID-b}]} failed to send heartbeat
[ 200.438] ERROR fabric/controller/handler_ctrl.(*heartbeatCallback).CheckHeartBeat: {channelType=[router] channelId=[ID_a]} heartbeat not received in time, closing link
[ 204.119] ERROR edge/controller/handler_edge_ctrl.(*createApiSessionHandler).createApiSession: {error=[channel closed]} failed to send response
[1002.883] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[remote error: tls: unknown certificate authority]} handshake failed
[1004.724] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[tls: first record does not look like a TLS handshake]} handshake failed
[1004.738] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[tls: first record does not look like a TLS handshake]} handshake failed
[1004.752] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[tls: first record does not look like a TLS handshake]} handshake failed
[1004.767] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[tls: first record does not look like a TLS handshake]} handshake failed
[1004.783] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[tls: first record does not look like a TLS handshake]} handshake failed
[1004.797] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[tls: first record does not look like a TLS handshake]} handshake failed
[1004.813] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[tls: first record does not look like a TLS handshake]} handshake failed
[1004.826] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[tls: first record does not look like a TLS handshake]} handshake failed
[1004.843] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[tls: first record does not look like a TLS handshake]} handshake failed
[1004.863] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[tls: first record does not look like a TLS handshake]} handshake failed

The Edge Router fails to get the interface for port 8444, but succeeds for port 8442. (No other service running on port 8444) It then fails to send/receive heartbeats as well, to finally throw a fatal error.

edge router

[ 0.435] WARNING fabric/router/xlink_transport.loadListenerConfig: {error=[no network interface found for 0.0.0.0] addr=[tls:0.0.0.0:8444]} unable to get interface for address
[ 0.439] INFO edge/router/xgress_edge.(*listener).Listen: {address=[tls:0.0.0.0:8442]} starting channel listener
[ 0.481] WARNING edge/tunnel/dns.flushDnsCaches: {error=[exec: “resolvectl”: executable file not found in $PATH]} unable to find systemd-resolve or resolvectl in path, consider adding a dns flush to your restart process

[ 141.663] FATAL edge/router/xgress_edge_tunnel.(*servicePoller).pollServices: {error=[timeout waiting for message reply: context deadline exceeded]} xgress_edge_tunnel unable to authenticate to controller. ensure tunneler mode is enabled for this router or disable tunnel listener. exiting

The controller init finishes initializing the docker volume successfully. However, it says that the certificate authority retrieved from the server is untrusted. Is it referring to the ones generated by the controller, or to the Let’s Encrypt ones?

controller init

Untrusted certificate authority retrieved from server
Verified that server supplied certificates are trusted by server
Server supplied 5 certificates
Server certificate chain written to /home/ziti/.config/ziti/certs/edge.example.com
Token: TOKEN
Saving identity ‘default’ to /home/ziti/.config/ziti/ziti-cli.json



New edge router policy all-endpoints-public-routers created with id: ID
New service edge router policy all-routers-all-services created with id: ID
This docker volume has been initialized.

Any idea what might be causing this?

This actually "might be ok" tbh, depending on what is going on... For example, if you have a tunneler running somewhere and you're tearing down the network (docker compose down -v), as I suspect you might be doing, it's quite common for that older tunneler to keep trying to connect to the NEW overlay network using an OLD PKI (which is dead, because a whole new one was generated) and fail. If that's the situation, you'll see {error=[local error: tls: bad record MAC]} handshake failed over and over again at a relatively periodic interval. Now, the errors in your log all look to be coming in the same second, so that doesn't seem to be the problem here.

Can we try this: let's stop the ziti-router, and maybe stop the ziti-console if you have that running too, and take it down to the bare minimum of just the ziti-controller.

Let's add each piece in one at a time and let's see if there's some kind of race condition I've never experienced because for me it's not been this slow? That sound ok?

So basically, in that compose file I referenced before, comment out the whole section for the ziti-edge-router and ziti-console... dump the whole network using docker compose down -v, then bring the network up again using docker compose up -d (detached mode)...

Tail the docker logs using docker compose logs -f and let them settle down, make sure you see NO log messages... If you do, that's an indicator that a ziti-edge-tunnel (or some process) is trying to attach to the controller.. Find that process and stop it...

Once things seem ok, let's just add back the ziti-edge-router service, and then run docker compose up -d again... The router should come online....

Assuming it DOES... exec to the controller:

docker compose exec -it ziti-controller bash

run zitiLogin and let's just see the status of:

ziti edge list edge-routers

There should be one and it should be online:

ziti@eb1988feb641:/persistent$ ziti edge list edge-routers
╭────────────┬──────────────────┬────────┬───────────────┬──────┬────────────╮
│ ID         │ NAME             │ ONLINE │ ALLOW TRANSIT │ COST │ ATTRIBUTES │
├────────────┼──────────────────┼────────┼───────────────┼──────┼────────────┤
│ bHPyFPKDKB │ ziti-edge-router │ true   │ true          │    0 │ public     │
╰────────────┴──────────────────┴────────┴───────────────┴──────┴────────────╯
results: 1-1 of 1
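
The staged bring-up described above can be scripted roughly as follows. This is only a sketch under a couple of assumptions: it uses the service names from the compose file earlier in this thread, and instead of commenting services out it passes service names to docker compose up, which starts just those services and whatever they depend on. reset_network and start_services are hypothetical wrapper names.

```shell
# Run from the directory that contains docker-compose.yml and .env.

# Tear everything down. -v also removes the named volume, so a brand-new
# PKI is generated on the next start.
reset_network() {
  docker compose down -v
}

# Start only the named services (plus their dependencies), detached.
start_services() {
  docker compose up -d "$@"
}

# Intended order, one step at a time:
#   reset_network
#   start_services ziti-controller ziti-controller-init-container
#   docker compose logs -f      # wait until the logs go quiet, then Ctrl-C
#   start_services ziti-edge-router
```

Naming the services on the command line leaves the compose file untouched, which makes it easier to add the router and the console back one at a time.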

Can we try that?

Alright, so I had to regenerate the certificate as something was wrong with it. Then I ran “docker compose down -v”, which removed the containers, network and volume. I commented out the edge-router and the console, and then ran docker compose up -d again. The handshake failure still occurs, and it is the only error that keeps occurring post initialization. There are no other logs indicating that any process is attempting to connect to the controller.

handshake

open-ziti-controller-1 | [ 9.647] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[remote error: tls: bad certificate]} handshake failed

open-ziti-controller-init-container-1 | This docker volume has been initialized.
open-ziti-controller-init-container-1 exited with code 0
open-ziti-controller-1 | [ 50.514] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[remote error: tls: unknown certificate authority]} handshake failed

I noticed that visiting edge.example.com:8441 provides a NetFoundry issued certificate, rather than the Let’s Encrypt one provided via the .env file. Is this intended, as part of the mTLS connections you mentioned before? Also, should port 8441 be changed to 443, to get around network restrictions that block almost any port other than 443?

I then uncommented the ziti-edge-router and ran docker compose up -d again. There are a few errors/warnings here and there, but the router manages to connect to the controller. I have verified it with “ziti edge list edge-routers”.

with the edge router

open-ziti-edge-router-1 | [ 0.426] WARNING edge/router/internal/edgerouter.(*Config).LoadConfigFromMap: Invalid heartbeat interval [0] (min: 60, max: 10), setting to default [60]
open-ziti-edge-router-1 | [ 0.440] WARNING fabric/router/xlink_transport.loadListenerConfig: {addr=[tls:0.0.0.0:8444] error=[no network interface found for 0.0.0.0]} unable to get interface for address
open-ziti-edge-router-1 | [ 0.477] WARNING edge/tunnel/dns.flushDnsCaches: {error=[exec: "resolvectl": executable file not found in $PATH]} unable to find systemd-resolve or resolvectl in path, consider adding a dns flush to your restart process
open-ziti-edge-router-1 | [ 13.947] ERROR channel/v2.(*heartbeater).sendHeartbeat: {error=[context deadline exceeded] channelId=[ch{ctrl}->u{reconnecting}->i{xxxx}]} failed to send heartbeat
open-ziti-controller-1 | [ 732.309] ERROR transport/v2/tls.(*sharedListener).processConn [tls:0.0.0.0:8441]: {error=[remote error: tls: bad certificate]} handshake failed

Also, I can now access ZAC without any problem, but when visiting it for the first time in a new browser window, ZAC prints this error five times:

console error

open-ziti-console-1 | [session-file-store] will retry, error on last attempt: Error: ENOENT: no such file or directory, open 'sessions/random_alphanumerical_lf.json'

Are these errors negligible and nothing to worry about?

The Handshake Issue

So it seems to never settle down? That's what you're saying, right? If that's the case, there's something very strange happening. Do you have a browser tab open to "https://edge.example.com:8441", a ziti-edge-tunnel or Ziti Desktop Edge for Windows/Mac running, or maybe a ziti router process running somewhere? You can rule that out entirely by redoing the entire setup and changing the .env to use a different port, something like:

ZITI_CTRL_EDGE_ADVERTISED_PORT=18441

You could also use netstat or ss, if you're familiar with those, or tcpdump to track down exactly what process from what IP is attaching to your :8441 port. Something seems to be attempting to access it and is presenting a certificate, trying to create an mTLS connection that the controller is rejecting. That's most likely a router, I'd think.
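
If you go the ss route, a small filter like the one below might help summarize who is connected. It's a rough sketch, not something from the quickstart: peers_on_port is a made-up name, and it assumes the column layout that ss -Htn prints (state, two queue counters, local address:port, peer address:port). Depending on how Docker publishes the port, you may need to run ss inside the controller container's network namespace rather than on the host to see the real peers.

```shell
#!/bin/sh
# Hypothetical helper: read `ss -Htn` style lines on stdin and print a count
# of connections per peer IP whose LOCAL port matches the given port.
peers_on_port() {
  port="$1"
  awk -v suffix=":${port}" '
    # $4 = local address:port, $5 = peer address:port
    substr($4, length($4) - length(suffix) + 1) == suffix {
      peer = $5
      sub(/:[0-9]+$/, "", peer)   # strip the peer port, keep the address
      count[peer]++
    }
    END { for (p in count) print count[p], p }
  ' | sort -rn
}

# Usage:
#   ss -Htn | peers_on_port 8441
# To watch new connection attempts live instead (needs root):
#   tcpdump -nn -i any 'tcp port 8441'
```

Any peer address you don't recognize in that list is a good candidate for the process that keeps presenting a certificate the controller rejects.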

Alternative Server Certs - Boring Details

This is the helpful hint I needed. With this, I think I finally understand the difference between what you're doing and what I always do and how I've tested/verified this whole thing in the past and why this is failing for you. It comes down to our notion of "Alternate Server Certs". Technically, I think what you're trying to do is replace the server certs, so they would be the primary certs and not use an alternate set. Let me explain...

When I use Alternate Server Certs, it's because I always use AWS, and AWS provides a DNS name for its VMs. So for example, I'll have a DNS entry of "ec2-1.2.3.4" that AWS provided for my VM. That's my primary DNS entry. Then I'll also register something like "controller.clint.demo.openziti.org", and that's my alternate DNS entry. I have TWO, but you only want to use one. Our notion of "alternate" server certs is to indicate that you want more than one DNS entry to find/locate your server. But here, you're only using one. That changes things enough that we might need to do more work to get it working "right" if you want to use the quickstart. It wasn't built for this particular use case.

Easiest Way to Address this Situation

Right now, imo, the easiest way to deal with this situation and still use the quickstart would be to provision a second DNS name just for OpenZiti. For example, I just made clintnf.demo.openziti.org. It points to 13.58.222.94. You can access it at https://clintnf.demo.openziti.org:8441/ and you'll see it uses the self-signed CA the quickstart creates. There's also https://ctrl.clint.demo.openziti.org:8441/, which points to the exact same place and uses docker as we've been discussing here. The only difference is that I have changed these two lines in my .env file:

ZITI_CTRL_EDGE_ADVERTISED_ADDRESS=clintnf.demo.openziti.org
ZITI_CTRL_ADVERTISED_ADDRESS=clintnf.demo.openziti.org
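
To see which certificate each name actually presents, something like the following openssl sketch should work. The served_cert and cert_summary function names are made up here, the default port of 8441 is just the one used in this thread, and it assumes the openssl command line tool is installed.

```shell
#!/bin/sh
# Hypothetical helper: print issuer and subject of the first PEM certificate
# found on stdin (openssl skips any non-PEM text around it).
cert_summary() {
  openssl x509 -noout -issuer -subject
}

# Hypothetical helper: connect to host:port, sending the host name via SNI,
# and dump the handshake output, including the server certificate, to stdout.
served_cert() {
  host="$1"
  port="${2:-8441}"
  openssl s_client -connect "${host}:${port}" -servername "${host}" \
    </dev/null 2>/dev/null
}

# Usage (needs network access to the controller):
#   served_cert clintnf.demo.openziti.org 8441 | cert_summary
#   served_cert ctrl.clint.demo.openziti.org 8441 | cert_summary
```

With the two-name setup described above, I'd expect the first name to show an issuer from the quickstart's own PKI and the second to show a Let's Encrypt issuer. If both show the same issuer, the alternate certs probably aren't being picked up.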

Conclusion

Are you willing to make two DNS entries, one for the self-signed PKI and one for the LE PKI? If you do that and change the .env file as shown just above, it should just fire up.

Sorry that this is complex, but there are many different ways to accomplish all these things, and without knowing exactly what you were doing and how, my own biases kept me from giving you better guidance at the start.

Let me know how you get on! Hopefully you'll be good now.