Upgrade 1.1.15 -> 1.6.3

Hello everyone,

I am currently trying to upgrade from version 1.1.15 to 1.6.3. After adding trustDomain to my controller-config, all connections are working again.

However, I have moved the ZAC to a separate port. The corresponding config in the controller looks like this:

web:
  - name: public
    [...]
  - name: private
    bindPoints:
      - interface: 0.0.0.0:8080
        address: mgmt.openziti.example.com:443
    options:
      idleTimeout: 5000ms
      readTimeout: 5000ms
      writeTimeout: 100000ms
      minTLSVersion: TLS1.2
      maxTLSVersion: TLS1.3
    apis:
      - binding: edge-client
        options: { }
      - binding: edge-management
        options: { }
      - binding: fabric
        options: { }
      - binding: health-checks
        options: { }
      - binding: zac
        options:
          location: /ziti-console
          indexFile: index.html

In version 1.1.15 I was able to reach ZAC. After the upgrade I now see the following error message in the log:

Jun 26 09:40:07 server01 openziti[49778]: panic: could not validate server at web[1]: identity is not valid for provided host: [mgmt.openziti.example.com]. is valid for: [127.0.0.1, ::1, localhost, openziti.example.com]

So my question is: how do I solve this? Thanks for your help.

Hi @ZzenlD. This is telling you that your configuration is invalid for the certificates you have generated. OpenZiti should never have allowed you to operate with an invalid configuration, but it used to. With version 1.4.3+ we started to check the configured certificates against the configuration. The error is telling you that your certs are valid for the following addresses:

127.0.0.1, ::1, localhost, openziti.example.com

but your controller is trying to advertise mgmt.openziti.example.com, which is not in the list. To fix this, you must create a certificate that matches the desired advertisement.

Depending on how you installed your controller, that can be done in different ways. The easiest way is usually to find the key for the server certs and regenerate them using the ziti pki command.
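Before regenerating anything, it can help to check whether the advertised host is actually among the certificate's DNS SANs. The following is a minimal sketch, not part of the ziti CLI: san_covers is a hypothetical helper name, it only does exact matches (no wildcard handling), and it expects the SAN line in the format that openssl prints.

```shell
# Sketch: check whether a hostname appears among a certificate's DNS SANs.
# san_covers is a hypothetical helper, not something shipped with OpenZiti.
#   $1: the SAN line as printed by `openssl x509 -noout -text`
#   $2: the hostname the controller advertises (without the port)
# The body runs in a subshell so IFS and `set -f` do not leak to the caller.
san_covers() (
  san_line=$1
  host=$2
  IFS=','
  set -f                                  # no globbing while splitting
  for entry in $san_line; do
    entry=${entry#"${entry%%[! ]*}"}      # trim leading spaces
    entry=${entry%"${entry##*[! ]}"}      # trim trailing spaces
    if [ "$entry" = "DNS:$host" ]; then
      exit 0                              # exact match found
    fi
  done
  exit 1                                  # host is not listed
)

# Usage against a real certificate (adjust the path to your own PKI layout):
#   san_line=$(openssl x509 -in path/to/server.cert -noout -text \
#     | grep -A1 "Subject Alternative Name" | tail -n 1)
#   san_covers "$san_line" "mgmt.openziti.example.com" || echo "SAN missing"
```

If the helper reports the host as missing, the controller will refuse to start with that bind point until the certificate is reissued with the extra name.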

I see, the router has the following option in the configuration for exactly this purpose:

edge:
  csr:
    country: US
    province: NC
    locality: Charlotte
    organization: NetFoundry
    organizationalUnit: Ziti
    sans:
      dns:
        - localhost
        - openziti-router
        - router.openziti.example.com
      ip:
        - "127.0.0.1"
        - "::1"

Is there something similar for the controller?

Hi everyone,
I'm currently trying to update the controller from 1.6.3 to 1.6.14 and I think I'm getting the same error as last time. The log says:

panic: could not validate server at web[1]: identity is not valid for provided host: [mgmt.openziti.root-dev.example.com]. is valid for: [127.0.0.1, ::1, localhost, openziti.root-dev.example.com]

The controller configuration looks like this:

v: 3
db:                     "/ziti-controller/bbolt.db"
trustDomain: root-dev.example.com
identity:
  cert:        "pki/intermediate/certs/client.chain.pem"
  server_cert: "pki/intermediate/certs/server.chain.pem"
  key:         "pki/intermediate/keys/server.key"
  ca:          "pki/root/certs/root.cert"
ctrl:
  options:
    advertiseAddress: tls:openziti.root-dev.example.com:443
  listener:             tls:0.0.0.0:1280
healthChecks:
  boltCheck:
    interval: 30s
    timeout: 20s
    initialDelay: 30s
edge:
  api:
    sessionTimeout: 30m
    address: openziti.root-dev.example.com:443
  enrollment:
    signingCert:
      cert: pki/intermediate/certs/intermediate.cert
      key:  pki/intermediate/keys/intermediate.key
    edgeIdentity:
      duration: 180m
    edgeRouter:
      duration: 180m
web:
  - name: public
    bindPoints:
      - interface: 0.0.0.0:1280
        address: openziti.root-dev.example.com:443
    identity:
      ca:          "pki/root/certs/root.cert"
      key:         "pki/intermediate/keys/server.key"
      server_cert: "pki/intermediate/certs/server.chain.pem"
      cert:        "pki/intermediate/certs/client.chain.pem"
    options:
      idleTimeout: 5000ms
      readTimeout: 5000ms
      writeTimeout: 100000ms
      minTLSVersion: TLS1.2
      maxTLSVersion: TLS1.3
    apis:
      - binding: edge-client
        options: { }
  - name: private
    bindPoints:
      - interface: 0.0.0.0:8080
        address: mgmt.openziti.root-dev.example.com:443
    options:
      idleTimeout: 5000ms
      readTimeout: 5000ms
      writeTimeout: 100000ms
      minTLSVersion: TLS1.2
      maxTLSVersion: TLS1.3
    apis:
      - binding: edge-client
        options: { }
      - binding: edge-management
        options: { }
      - binding: fabric
        options: { }
      - binding: health-checks
        options: { }
      - binding: zac
        options:
          location: /ziti-console
          indexFile: index.html

Now for the strange part that I don't understand, shown in the output below: I connect to the controller and view the current certificate, which is missing the mgmt.openziti domain, so I create a new server certificate that includes it. After that, the domain is present in the certificate.

# Show current server certificate
[user@openziti-controller ziti-controller]$ openssl x509 -in pki/intermediate/certs/server.cert -noout -text | grep -A1 "Subject Alternative Name"
            X509v3 Subject Alternative Name: 
                DNS:localhost, DNS:openziti.root-dev.example.com, IP Address:127.0.0.1, IP Address:0:0:0:0:0:0:0:1


# Generate new server certificate with mgmt.openziti domain
[user@openziti-controller ziti-controller]$ ziti pki create server \
        --allow-overwrite \
        --pki-root ./pki \
        --pki-country DE \
        --pki-province NRW \
        --pki-locality Frankfurt \
        --pki-organization "Example" \
        --pki-organizational-unit "IT" \
        --ca-name intermediate \
        --server-name openziti.root-dev.example.com \
        --spiffe-id 'controller/openziti' \
        --dns "localhost,openziti.root-dev.example.com,mgmt.openziti.root-dev.example.com" \
        --ip "127.0.0.1,::1"
Using CA name:  intermediate
Success

# Show new generated server certificate
[user@openziti-controller ziti-controller]$ openssl x509 -in pki/intermediate/certs/server.cert -noout -text | grep -A1 "Subject Alternative Name"
            X509v3 Subject Alternative Name: 
                DNS:localhost, DNS:openziti.root-dev.example.com, DNS:mgmt.openziti.root-dev.example.com, IP Address:127.0.0.1, IP Address:0:0:0:0:0:0:0:1

But when I restart the controller container and view the certificate again, it appears to be the “old” certificate again:

# Show server certificate after controller-container restart
[user@openziti-controller ziti-controller]$ openssl x509 -in pki/intermediate/certs/server.cert -noout -text | grep -A1 "Subject Alternative Name"
            X509v3 Subject Alternative Name: 
                DNS:localhost, DNS:openziti.root-dev.example.com, IP Address:127.0.0.1, IP Address:0:0:0:0:0:0:0:1

Is the certificate being overwritten somewhere? Where am I going wrong?

Thanks for the help.

AAANd now i see you asked this question at the end (i admit i didn't read the issue fully - shame on me)

Is the certificate being overwritten somewhere?

It's possible, yes. I think you need to somehow update the 'answer file' from deployments. @qrkourier is usually the one that would know this off the top of his head. I'll have to dig in some to understand what is happening.... I think it's possible....

The only reason I manually created all the certificates using ziti pki create during the initial setup was that I wanted to specify the Country, Province, Locality, Organization, and Organizational Unit fields in my CA myself, and unfortunately there is no option to set these via environment variables during a controller deployment.

I did some digging and I was able to reproduce the issue with docker compose and version 1.6.14 (you used the word 'container' so i assume you're using docker)...

Update your compose file to set ZITI_AUTO_RENEW_CERTS: "false".

Then recreate your certs and it'll stop this behavior.
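One way to confirm that a restart no longer touches the certificate is to snapshot the file before recreating the container and compare it afterwards. This is only a sketch: snapshot_cert and cert_unchanged are hypothetical helper names, and the copy is written next to the original, so the PKI directory is assumed to be writable.

```shell
# Sketch: detect whether a certificate file was rewritten across a restart.
# Both helper names are made up for this example.

# Keep a byte-for-byte copy of the certificate next to the original.
snapshot_cert() {
  cp -p -- "$1" "$1.before-restart"
}

# Succeeds (exit 0) only if the file is identical to the snapshot.
cert_unchanged() {
  cmp -s -- "$1" "$1.before-restart"
}

# Usage (path as used elsewhere in this thread):
#   snapshot_cert pki/intermediate/certs/server.cert
#   docker compose up -d
#   cert_unchanged pki/intermediate/certs/server.cert \
#     && echo "certificate untouched" \
#     || echo "certificate was rewritten on startup"
```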

Wow, thank you so much for even reproducing the error. Yes, by “container” I mean Docker—or, in my case, Podman.

But if I set the ZITI_AUTO_RENEW_CERTS option, the controller won't renew any certificates at all, and they'll expire unless I do something manually, right?

Is there any way to use an ENV variable to tell the controller which country, locality, IPs, and DNS domains it should use when automatically renewing the certificates?
I think only a small adjustment would be needed, since the feature itself is already implemented; it’s just that the values can’t be defined.

For example, you could set the variables

  • ZITI_CONTROLLER_COUNTRY
  • ZITI_CONTROLLER_LOCALITY
  • ZITI_CONTROLLER_ORGANIZATION
  • ZITI_CONTROLLER_SERVERCERT_DNS

to your current values by default, while still allowing them to be overridden, for example:

ZITI_CONTROLLER_SERVERCERT_DNS=openziti.example.com,mgmt.openziti.example.com

Is this an option?

Yes, you'll have to renew the certs on your own after changing that flag. Personally, I think it's something you should want to own yourself anyway in your situation. To be honest, I'm not sure this sort of change is something that we'll look to make. All changes come with maintenance and support costs, and my guess is that it's not something we would want to do at this time, simply because it hasn't been an issue before. That could change in the future.

But doing it manually is very prone to errors. Which script renews the certificates when the container starts? Unfortunately, I couldn't find any relevant source code for this on GitHub.

It would be enough if, instead of this:

ziti pki create server \
        --allow-overwrite \
        --pki-root ./pki \
        --pki-country US \
        [...]

the script executed this, i.e., with the current values as overridable defaults:

ziti pki create server \
        --allow-overwrite \
        --pki-root ./pki \
        --pki-country "${ZITI_CONTROLLER_COUNTRY:-US}" \
        [...]

The additional maintenance effort isn’t really that much, or am I missing something?
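To illustrate the fallback behaviour of the ${VAR:-default} expansion proposed here, a minimal sketch follows. ZITI_CONTROLLER_COUNTRY is only the name suggested in this post, not a variable the official image reads, and pki_country is a hypothetical helper.

```shell
# Sketch: ${VAR:-default} yields the default when VAR is unset OR empty,
# so an existing deployment that never sets the variable keeps "US".
# ZITI_CONTROLLER_COUNTRY is a proposed name, not an existing setting.
pki_country() {
  printf '%s\n' "${ZITI_CONTROLLER_COUNTRY:-US}"
}

# Usage:
#   pki_country                               # variable unset: default applies
#   ZITI_CONTROLLER_COUNTRY=DE pki_country    # variable set: override applies
```

Because the expansion does not assign anything, the variable itself stays unset for the rest of the script, which is the difference from the ${VAR:=default} form.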

I took a closer look at bootstrap.bash inside the controller container and found a solution that works and that behaves the same way as before if the variables aren't defined.

Here's the change I made:

#/bootstrap.bash
makePki() {

    [...]

    ziti pki create ca \
      --pki-root "${ZITI_PKI_ROOT}" \
      --ca-file "${ZITI_CA_FILE}" \
      --pki-country "${ZITI_CA_COUNTRY}" \
      --pki-province "${ZITI_CA_PROVINCE}" \
      --pki-locality "${ZITI_CA_LOCALITY}" \
      --pki-organization "${ZITI_CA_ORGANIZATION}" \
      --pki-organizational-unit "${ZITI_CA_ORGANIZATION_UNIT}"

    [...]

    ziti pki create intermediate \
      --pki-root "${ZITI_PKI_ROOT}" \
      --ca-name "${ZITI_CA_FILE}" \
      --intermediate-file "${ZITI_INTERMEDIATE_FILE}" \
      --pki-country "${ZITI_CA_COUNTRY}" \
      --pki-province "${ZITI_CA_PROVINCE}" \
      --pki-locality "${ZITI_CA_LOCALITY}" \
      --pki-organization "${ZITI_CA_ORGANIZATION}" \
      --pki-organizational-unit "${ZITI_CA_ORGANIZATION_UNIT}"
      
    [...]
}

issueLeafCerts() {

    [...]
    
    ziti pki create server \
      --pki-root "${ZITI_PKI_ROOT}" \
      --ca-name "${ZITI_INTERMEDIATE_FILE}" \
      --key-file "${ZITI_SERVER_FILE}" \
      --server-file "${ZITI_SERVER_FILE}" \
      --dns "localhost,${ZITI_CTRL_ADVERTISED_ADDRESS},${ZITI_CTRL_ADVERTISED_MGMT_ADDRESS}" \
      --ip "127.0.0.1,::1" \
      --pki-country "${ZITI_CA_COUNTRY}" \
      --pki-province "${ZITI_CA_PROVINCE}" \
      --pki-locality "${ZITI_CA_LOCALITY}" \
      --pki-organization "${ZITI_CA_ORGANIZATION}" \
      --pki-organizational-unit "${ZITI_CA_ORGANIZATION_UNIT}" \
      --allow-overwrite >&3  # write to debug fd because this runs every startup

    [...]
}



# set defaults
: "${ZITI_CA_COUNTRY:=US}"  # country of the ca
: "${ZITI_CA_PROVINCE:=NC}"  # province of the ca
: "${ZITI_CA_LOCALITY:=Charlotte}"  # locality of the ca
: "${ZITI_CA_ORGANIZATION:=NetFoundry}" # organization that owns the ca
: "${ZITI_CA_ORGANIZATION_UNIT:=Ziti}"  # organizational unit responsible for the ca
: "${ZITI_CTRL_ADVERTISED_MGMT_ADDRESS:=$ZITI_CTRL_ADVERTISED_ADDRESS}"  # management address of the Ziti controller; accessible by default via ZITI_CTRL_ADVERTISED_ADDRESS

Would that be a solution?
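One detail of the change above: when ZITI_CTRL_ADVERTISED_MGMT_ADDRESS falls back to ZITI_CTRL_ADVERTISED_ADDRESS, the --dns value lists the same name twice. I have not checked whether ziti pki create server minds duplicate entries, so this may be harmless. If it matters, the list could be deduplicated first. A minimal sketch, where build_dns_sans is a hypothetical helper and not part of bootstrap.bash:

```shell
# Sketch: join hostnames into a comma-separated list, dropping empty
# values and duplicates while keeping the original order.
# build_dns_sans is a made-up name for this example.
# The body runs in a subshell so its variables stay local.
build_dns_sans() (
  out=""
  for name in "$@"; do
    if [ -z "$name" ]; then
      continue                          # skip unset/empty addresses
    fi
    case ",$out," in
      *",$name,"*) ;;                   # already listed, skip
      *) out="${out:+$out,}$name" ;;    # append, with a comma if needed
    esac
  done
  printf '%s\n' "$out"
)

# Usage inside issueLeafCerts (variable names as in the change above):
#   --dns "$(build_dns_sans localhost \
#            "${ZITI_CTRL_ADVERTISED_ADDRESS}" \
#            "${ZITI_CTRL_ADVERTISED_MGMT_ADDRESS}")" \
```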

I didn't look too deeply but it seems like it'd work. I'd be a bit cautious and back that bash up though. If/when you upgrade I am not sure if we would retain changes made to it. Cheers!

Yes, the upgrade worked, but the bootstrap.bash script is included in the container and will, of course, be overwritten again when the image is updated. That is why I’m asking if you could implement the change in the official image.
The maintenance effort is the same as for all other variables in the bootstrap script, and if the variables aren’t defined, the script will work as before.

I understand the ask. It's something we'll have to discuss amongst the team.

That would be great and would make things a lot easier. I'm curious to see what the answer is.
I'd rather not just pass the modified Bootstrap script into the container just for that reason. That's not really a clean solution :slight_smile:

Is there any news on this yet? The changes required are minimal, and the script behaves exactly the same as before, even if the additional variables aren't set.

Oh gee. I completely dropped the ball on this one, to be totally honest. I'll have to get this context back, but I'll put this on my todo list and follow up on it soon, probably mid next week.

Sorry about that

I've not forgotten about this thread -- just haven't been able to get to it.... I'll keep trying to get back here though. just an fyi.

Thanks for the info. I’d be happy if it could be implemented soon. Then the update to the latest 1.6 version will work, followed by the switch to 2.0 - which I’m really looking forward to :star_struck:

Ok. I've finally been able to get around to understanding your problem here. Sorry it's taken me so long... I was able to reproduce the problem locally and I see what you mean. I want to rework how this works a bit but in the meantime you can set ZITI_AUTO_RENEW_CERTS: "false"...

This will mean restarting the container won't affect your certs allowing you to regenerate them properly without having them get overwritten... In the meantime, I'll try to fix this going forward.

add:

  services:
    ziti-controller:          # whatever you named the controller service
      image: openziti/ziti-controller
      environment:
        ZITI_AUTO_RENEW_CERTS: "false"

then regenerate the server cert once with all of your advertised names in --dns:

ziti pki create server --pki-root ./pki \
  --ca-name intermediate \
  --key-file server --server-file server \
  --dns "localhost,openziti.root-dev.example.com,mgmt.openziti.root-dev.example.com" \
  --ip "127.0.0.1,::1" --allow-overwrite

then docker compose up -d to recreate the container...

I think that should allow you to upgrade properly. I'll be trying to get around to doing some upgrade tests like this to see if your proposal works out (thanks for digging in there). I'll try to follow up here if/when I feel like I have a workable solution