Bound to old controller / running ziti in production using docker

Hey, thought I would write about this here for discussion, as I'm not sure it is a bug per se, but it wasn't easy to sort out either.

So, I am looking at how to convert my docker-compose quickstart into a fully fledged, public-internet-facing configuration. I thought I would start with the simplified-docker-compose.yml file (I think this needs to be updated for the zac-controller-init container?). Let's just say that, due to a typo, it got a bit messy. I deleted the old configuration (i.e. deleted the docker volume) and started it again, which meant that I got a new PKI. All was going well until I turned on the client. I deleted all the identities and tried to add my identity again. However, the zac_controller log was littered with:

tls: received record with version 301 when expecting version 303

and the client log was littered with

Certificate verification failed e.g. CRL CA or signature check failed
Processing of the the ServerHello handshake message failed

Basically, the client was still stuck to the old PKI, even though I had removed all the identities from it. In the end, I needed to delete the config file from c:\windows\system32\config\systemprofile\AppData\Roaming\NetFoundry\*random_characters*.json

So, two things:
a) Should we really be writing in the C:\windows\system32\config directory? Maybe C:\ProgramData\NetFoundry would be a better place.
b) More importantly, should there be an option (maybe under advanced settings) to delete that config file? Or, when all identities are deleted, should the config file be deleted as well, since a new JWT would re-enroll it to the (new) controller?

Thoughts?

The hosted quickstart at https://openziti.github.io/ziti/quickstarts/network/hosted.html is geared towards this as well. Yeah, it seems like we should add the init container to simplified-docker-compose.yml, so I added an issue for that.

You would almost certainly have required this anyway. The PKI that the docker-related quickstarts generate is geared towards one host, or 'hardcoded' to the names available in docker compose. This means all the certificates generated will use SANs of ziti-edge-controller, ziti-private-blue etc. When you move to a full-fledged install on a VPS you'll have some new domain name entirely, so your controller cert won't be valid, your routers will all have to re-enroll etc. If you follow the "host ziti anywhere" quickstart, that one is geared to set you up with a starting controller/router with an externally valid DNS/IP.

Once you remove your PKI you'll need to dump the old one. I believe there's a bug in the ZDEW around this, which is why I expect you would want to "stop/start" the client. It's not a high-frequency occurrence, so this particular bug has been waiting to be fixed; it's just not fixed yet. Go to your ZDEW, open the identity, and at the bottom click "forget this identity". Then go back to the main screen and turn the client off/on with the big 'start/stop' button. That'll fix the logs going crazy.


It's the best place at this time. There are three main components to the ZDEW: the UI, the 'data service' (currently called ziti-tunnel.exe, but soon to be replaced with ziti-edge-tunnel.exe) and the monitoring/upgrade service. The data service is a system service. It makes a TUN, adds routes, etc. The identities you enroll are stuffed away inside the %APPDATA% of the user that service runs as, which is that location. I want to replace this with Credential Manager, or something else, but for now that's where they go on Windows.

You'll find when you 'forget' the identity that the file is removed. However, the identity is still resident in memory, which is why you need to restart the client at this time. When you add back a new identity with the same exact name, the old identity is still hanging around in memory waiting to be evicted, trying to reconnect etc. That bug is going to be fixed very soon but, like I said, it's really, really infrequent for someone to remove an identity at all unless you're doing dev work on the ZDEW (like me) or doing the kind of thing you are.

And if you're interested, I have an under-10-minute video on using the 'host it anywhere' quickstart:

Deleting the identity and restarting the client didn’t work in my instance. I'm sure of it. Anyway, as you said, this issue is unlikely to occur in most people's day.

I run all my production servers using docker, with very little (basically only what I can't avoid) installed directly on the OS. I have used the simplified-docker-compose, and changed the names to FQDNs (where I can, in the docker-compose file and in the .env). It worked when I moved from the internal network to 4G, except when I stopped and restarted the desktop client, as it is looking for http://ziti-edge-controller.

I am trying to work out how I can deploy a production ziti with just the simplified-docker-compose script. Thought I would give this a go. Maybe I need to pass through configuration yml files for the components?

The client has the url of the controller in the identity json file. It's set during the enrollment process. If the client is looking for ziti-edge-controller my expectation is the file didn't get deleted properly. I've not seen that happen but that'd explain why you're certain the forget/restart didn't work...
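
If you want to see which controller a given identity file points at, something like the following might help. It is only a sketch: it assumes the identity JSON stores the controller URL under a key named ztAPI (check this against your own file), and the function name and the file name in the example are made up for illustration. It uses plain sed, so it should work in any POSIX-style shell, including Git Bash or WSL on Windows.

```shell
#!/usr/bin/env bash
# controller_url FILE
# Print the controller URL recorded in an identity JSON file.
# Assumption: the URL is stored under the "ztAPI" key of that file.
controller_url() {
  sed -n 's/.*"ztAPI"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Example (hypothetical file name):
#   controller_url ./my-identity.json
```

If the printed URL still names the old controller (for example ziti-edge-controller), the identity was enrolled against the old setup and needs to be forgotten and re-enrolled.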

If you have migrated your controller you need to forget that identity. You can do that two ways. The first is to click the 'forget' button as mentioned and restart. If that's not working, the other way is to stop the data service, either using the UI (the big green stop/start button), using the Services widget ("Services->Ziti Desktop Edge Service"), or by opening an administrator prompt and typing net stop ziti. Once it is stopped you can delete that identity file from the system profile folder. When you then start ziti the identity will be gone forever.

It sounds to me like that identity is maybe 'stuck'. If you go to main menu->feedback and send the zip file it produces to clint at openziti.org, I can look at your logs if you like. Otherwise, dropping that identity and making a new one should, I think, 'fix' the problem.

I think it is more fundamental than that. Just checked the cert on the controller, and it is

 Common Name (CN)             ziti-edge-controller server certificate 
 subjectAltName (SAN)         ziti-edge-controller localhost 127.0.0.1 127.0.0.1 

even though my .env is

ZITI_IMAGE=openziti/quickstart
ZITI_VERSION=latest
ZITI_CONTROLLER_RAWNAME=ziticontroller.example.com
ZITI_EDGE_CONTROLLER_RAWNAME=zitiedgecontroller.example.com

with the docker bit being

  ziti-controller:
    image: "${ZITI_IMAGE}:${ZITI_VERSION}"
    env_file:
      - ./.env
    ports:
      - "1280:1280"
    networks:
      zitiblue:
        aliases:
          - ziti-edge-controller
      zitired:
        aliases:
          - ziti-edge-controller
    volumes:
      - prod-ziti-fs:/openziti
    entrypoint:
      - "/openziti/scripts/run-controller.sh"

what is interesting, is the edgerouter is correct, with the docker config being

 ziti-edge-router:
    image: "${ZITI_IMAGE}:${ZITI_VERSION}"
    environment:
      - ZITI_EDGE_ROUTER_RAWNAME=zitiedgerouter.example.com
    depends_on:
      - ziti-controller
    ports:
      - "3022:3022"
    networks:
      - zitiblue
      - zitired
    volumes:
      - prod-ziti-fs:/openziti
    entrypoint: /bin/bash
    command: "/openziti/scripts/run-router.sh edge"

which results in a cert of:

 Common Name (CN)             ClvKdffxAC 
 subjectAltName (SAN)         zitiedgerouter.example.com localhost 127.0.0.1 
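
For anyone wanting to repeat this check, here is a minimal sketch of one way to list the SANs a controller or router presents, using the openssl command line. The function names are made up for illustration, and the host and port in the usage line are simply the values from the .env and compose snippets above.

```shell
#!/usr/bin/env bash
# extract_sans: read `openssl x509 -noout -text` output on stdin and print
# one subjectAltName entry per line (e.g. "DNS:localhost").
extract_sans() {
  grep -A1 'Subject Alternative Name' \
    | tail -n 1 \
    | tr ',' '\n' \
    | sed 's/^ *//'
}

# fetch_sans HOST PORT: connect to HOST:PORT, take the certificate the server
# presents, and print its SAN entries. Needs the openssl CLI and network access.
fetch_sans() {
  local host="$1" port="$2"
  openssl s_client -connect "${host}:${port}" -servername "${host}" </dev/null 2>/dev/null \
    | openssl x509 -noout -text \
    | extract_sans
}

# Usage:
#   fetch_sans zitiedgecontroller.example.com 1280
```

If the name your clients use to reach the controller is not in the list, enrollment and connections will fail certificate verification, which matches the client errors earlier in the thread.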

Yeah. Looks like the rawnames are not getting translated into the certificates properly. I’ll see if I can give this a go locally and get you a set of steps to get you running in docker. That’ll be a nice add-on to the guides we have already.

The router looking/being correct makes sense. It uses a different process of enrollment and PKI generation. Looks like the controller PKI bootstrapping needs a look.

I’ll give this a go and get back to you. Providing me the mechanism you used to setup/start docker is helpful. Gimme a while and I’ll see if I can make something work. Or if you just have questions around how that PKI is generated and want to dig in, I’m here for that too while I work through trying to do what you’re doing. :slight_smile:

Cheers, I’m sure we’ll get there :slight_smile:

Ok thanks - will look to pull this into another thread as this has gone a little off the original topic. I was originally going to look at pulling the yml file through, then I saw the _RAW variables and wondered what they were, so then I just gave it a splat to see what happened. I think it might be nice to be able to specify the cert names in an environment variable and have them copied out, so for example ZITI_CONTROLLER_SAN_RAW="ziti-controller ziti-controller.example.com", and then it writes the cert with only those two names on it (instead of localhost etc). From what I see, it looks like you need to provide the yml to prevent the 'defaults' from happening. Also (there is always more), I'm looking at changing/specifying the port as well through docker-config.

I pushed a new 0.25.10 docker image with a small change to the scripts that will allow you to set even more DNS values and/or IP addresses. I don’t love what you’ll need to do to get it working but I think that this will work… When I run it locally I can get a cert that is valid for quite a few DNS Names/IP Addresses:

[Image: example DNS names and IP addresses reflected in the certificate SANs]

Remove local images if they exist

docker rmi openziti/quickstart:latest
docker rmi openziti/quickstart:0.25.10

Complete .env file

To make that happen I used the following .env file which is what I don’t “love” but it works… We can look to streamline/clean it up over time. I added a network alias for the controller as billy.bob.com and then was able to start docker with:

# OpenZiti Variables
ZITI_IMAGE=openziti/quickstart
ZITI_VERSION=latest
ZITI_CONTROLLER_RAWNAME=ziti-controller
ZITI_EDGE_CONTROLLER_RAWNAME=billy.bob.com
ZITI_EDGE_CONTROLLER_IP_OVERRIDE=20.20.20.20
ZITI_EDGE_CTRL_ADVERTISED_HOST_PORT=billy.bob.com:1280
ZITI_CTRL_ADVERTISED_ADDRESS=billy.bob.com
ZITI_EDGE_CONTROLLER_HOSTNAME=billy.bob.com
ZITI_CONTROLLER_HOSTNAME=billy.bob.com
EXTERNAL_DNS=something.else
EXTERNAL_IP=1.2.3.4

For docker-compose (not simplified-docker-compose.yml) these were all necessary for various reasons. I didn’t end up seeing if I could pare it down for simplified-docker-compose.yml but I did test to make sure it worked for the simplified file and it does.
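
The network alias mentioned above could look roughly like this in the compose file. This is a sketch built from the ziti-controller service posted earlier in the thread, with billy.bob.com added as an alias on both networks. The volume name comes from that earlier post and may differ in your file, and keeping the original ziti-edge-controller alias alongside the new one is an assumption.

```yaml
  ziti-controller:
    image: "${ZITI_IMAGE}:${ZITI_VERSION}"
    env_file:
      - ./.env
    ports:
      - "1280:1280"
    networks:
      zitiblue:
        aliases:
          - ziti-edge-controller
          - billy.bob.com   # added: matches ZITI_EDGE_CONTROLLER_RAWNAME in the .env
      zitired:
        aliases:
          - ziti-edge-controller
          - billy.bob.com   # added on the second network as well
    volumes:
      - prod-ziti-fs:/openziti   # volume name from the earlier post
    entrypoint:
      - "/openziti/scripts/run-controller.sh"
```

The alias lets the other containers resolve the same name that ends up in the certificate, so the name checks pass inside the compose networks as well as from outside.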

You don’t need to set EXTERNAL_DNS/EXTERNAL_IP - I just added those on in case you wanted a couple extra… I’m going to put a PR up for these changes and I’ll have @gberl002 test them out Monday to make sure the changes are good for “non-docker” stuff too.

Hopefully that works for ya

Thanks. I have configured and all appears to be going OK. I also reset everything and started from scratch, and forgetting the identity and stopping/starting the client did work - so not sure what planet I was on.

I needed to add port 6262 to the published ports for the controller in the docker-compose file. This is explained on the host ziti anywhere page, so it was expected when moving to the production configuration.

There are a lot of variables there, and I don’t really understand what each variable does, i.e. what is the difference between the EDGE_CONTROLLER and CONTROLLER variables? Is it not just the controller?
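
As a sketch, the change amounts to one extra line under the controller's ports in the compose file. Only the ports section is shown here and the rest of the service is unchanged; the comment on 6262 reflects my understanding of what that port is for.

```yaml
  ziti-controller:
    ports:
      - "1280:1280"   # already published in the original service
      - "6262:6262"   # added: controller port that routers outside docker connect to
```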

For testing, I did not use the EXTERNAL*, ZITI_EDGE_CONTROLLER_IP_OVERRIDE or ZITI_EDGE_CONTROLLER_HOSTNAME.

While I am also pushing my luck here, is it possible to define the JWT lifetime through an environment variable as well? Why? Because it may be a common change that most people will make.

I didn't explain them either because each one seemed to be used in different places. They all feel like they need to be cleaned up. We are working on tests for the quickstarts, so once we have those in place I think it'd make sense to clean this all up. They grew organically over time and could benefit from a clean-up initiative.

Do you think it would just be better to allow/show people how to mount their own config file and let them change it however they like?

Yes and no. I think some people will need to do this because they have to change a number of items. However, the JWT lifetime may be a common change, so having it done through a variable means that changes to the config files as they are updated through releases will still be picked up correctly. If you mount a file, then you need to manually confirm that it is still up to date.

@gooseleggs I’ve added a PR with the change allowing you to set a variable and alter the Router or Identity enrollment period. You can see that PR here if you’d like.

Also, it may be helpful to run
ziti create config environment
or
ziti create config environment -h
which will provide some detail as to what each environment variable is used for (though some could use better descriptions). The -h option is a bit cleaner, but leaving it off prints out a bunch of text including the current values for those variables. That output can be placed into a file so you can set up your own set of environment variables and source that file before running the setup. You can also send that output directly to a file by using the -o option like this
ziti create config environment -o "name_of_file.env"
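
A minimal sketch of the "save it, edit it, source it" flow described above. The helper name and the file name are made up for illustration. The example uses plain shell redirection to capture the output, so it does not depend on any particular output flag, and `set -a` makes sure every variable in the file is exported even if a line lacks the `export` keyword.

```shell
#!/usr/bin/env bash
# load_env_file FILE
# Source FILE into the current shell and export every variable it assigns,
# so that child processes (such as the setup scripts) can see them.
load_env_file() {
  local file="$1"
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
  set -a          # auto-export everything assigned while sourcing
  # shellcheck disable=SC1090
  . "$file"
  set +a
}

# Typical flow (commented out because it needs the ziti binary on PATH):
#   ziti create config environment > ./my-ziti.env
#   "${EDITOR:-vi}" ./my-ziti.env      # adjust the values you care about
#   load_env_file ./my-ziti.env
```

Use a path containing a slash (./my-ziti.env rather than my-ziti.env) so the shell does not go looking for the file on PATH.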

Edit: I forgot to mention, this will be updated for the docker environments as well so you’ll just have to modify a value before running the container or docker-compose.

Thanks. Those are great commands!