Controller upgrade + Controller backup/restore

As part of this question (Question about terminator - #30 by Foles_90), I was advised to upgrade the controller version.

I am using the "Host OpenZiti Anywhere" deployment. I searched the documentation for how to perform an upgrade but could not find anything.
I assumed that I just needed to run the expressInstall function; however, that did not work.
I am currently on version 0.27.2 and the function is downloading version 0.30.3.

I cannot find the exact error that I got, but the first thing I saw was this message:

******** Getting OpenZiti Binaries ********
Getting OpenZiti binaries

No existing binary found, creating the ZITI_BIN_DIR directory

Which I found strange because there is an OpenZiti Controller and router running in that VM.

Then, it went ahead and created a PKI, but kept saying that there were already keys present, so I said NO to the option to override the current PKI.
After that it checked for open ports:

******** Ensure the Necessary Ports Are Open ********
Checking Controller's port (6262) Open
Checking Edge Router's port (3022) Open
Checking Edge Controller's port (1280) Open
Checking Router Listener Bind Port's port (10080) Open
Expected ports are all available

So far so good. Then I kept getting this error:

waiting for https://openziti-controller:1280

I created a DNS record with openziti-controller pointing to the loopback but that did not fix the issue.

So my first question here is, did I do anything wrong in the upgrade process? What is the correct way to upgrade the "Host OpenZiti Anywhere" deployment?


At this point the upgrade was not successful and also the controller is not working anymore. So here comes my second question. I could not find any documentation for backup and restore.
I found a few forum posts saying that I should create a snapshot of the db using ziti edge db snapshot and also backup the whole PKI folder.
I did that, installed a clean OpenZiti controller and then tried to restore the db and pki without success.

Is there any documentation on how to do this?

In the end I was able to re-deploy the controller using an old snapshot and it is now back up and running on the older version, but I need to do the upgrade soon since the issue with the "invalid terminators" is affecting performance.

Thanks!!

Hi @Foles_90, funny you should ask, I also noticed we don't have any official documentation on upgrading and I'm actually working on a writeup, coming soon.

Firstly, just so it's documented for anyone seeing this thread in the future: running expressInstall again will not work for an upgrade. It will actually introduce quite a few problems.

I see you mentioned you are running 0.27.2 which is actually kind of perfect since I am using my personal network as the test bed for my write up and it's running version 0.26.5. The main problem I assume you are running into is that, around 0.28.2 we had a fairly major quickstart cleanup (refactor) which renamed a lot of environment variables. There's also a PKI trust chain change that has occurred since that version so it's likely not just a simple, "follow these few steps" type of upgrade.

Here is a synopsis of the steps that I have jotted down for an overview to start my write up. I'm going to go ahead and run through these steps on my 0.26.5 network and update them as I run into any snags. I'll post my updates and let you know.

General Upgrade Process (in its simplest form)

  1. Create a database backup file by running ziti edge db snapshot and copy that file to a backup location (check the ziti controller config's db section for the location of that file)
  2. Backup the controller PKI by copying the $ZITI_HOME/pki/ directory to a backup location
  3. Backup the controller config file, also by copying the file to a backup location
  4. Obtain new binaries by sourcing the latest ziti-cli-functions.sh script and running getZiti. You will be informed that there is a new version of ziti available; select yes when asked to upgrade.
    source /dev/stdin <<< "$(wget -qO- https://get.openziti.io/quick/ziti-cli-functions.sh)"
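
Steps 1 through 3 above could be scripted roughly as follows. This is only a minimal sketch, not an official procedure: the function name backup_ziti_network and the backup layout are hypothetical, and it assumes you have already run ziti edge db snapshot and know where the snapshot file landed.

```shell
# Hypothetical helper: copy a db snapshot, the PKI directory, and the
# controller config file into a timestamped backup directory.
# Usage: backup_ziti_network <snapshot-file> <pki-dir> <config-file> <backup-root>
backup_ziti_network() {
  bzn_snapshot="$1"; bzn_pki="$2"; bzn_config="$3"; bzn_root="$4"
  bzn_dest="${bzn_root}/ziti-backup-$(date +%Y%m%d-%H%M%S)"

  # Refuse to continue if any source is missing, so a partial backup
  # is never mistaken for a complete one.
  [ -f "$bzn_snapshot" ] || { echo "missing snapshot: $bzn_snapshot" >&2; return 1; }
  [ -d "$bzn_pki" ]      || { echo "missing PKI dir: $bzn_pki" >&2; return 1; }
  [ -f "$bzn_config" ]   || { echo "missing config: $bzn_config" >&2; return 1; }

  mkdir -p "$bzn_dest" || return 1
  cp -p  "$bzn_snapshot" "$bzn_dest/"    || return 1
  cp -Rp "$bzn_pki"      "$bzn_dest/pki" || return 1
  cp -p  "$bzn_config"   "$bzn_dest/"    || return 1

  # Print the backup location so the caller can record it.
  echo "$bzn_dest"
}
```

The PKI argument would typically be $ZITI_HOME/pki; the snapshot and config paths depend on your own controller config, so they are left as arguments rather than guessed.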
    
Hey @Foles_90, sorry to keep you waiting. My test network appears to have some PKI related issues I'm working around. I just wanted to let you know I am still working on this, I should have something soon.

Hiya @Foles_90,

I was rereading this issue (we're still dealing with a strange PKI-related issue Geoff mentioned) but I feel like there's probably a small issue here that I'd kinda like to understand better if we can.

You said you tried a backup of the db and pki, but didn't have success, but then you were able to get back running from an older backup. If you get a moment and agree, is there any chance you can outline exactly what you did to see if there's something for us to learn there? I wouldn't have expected that, but it makes me wonder if there's a flow we just haven't considered yet... Maybe not, maybe... Thanks!

Hey, sorry for the late response, just returned from PTO

The controller is deployed on an Ubuntu VM running in an EC2 instance in AWS. Some days prior to trying the upgrade I took a snapshot in AWS of the whole EC2 instance and that was what I used for the rollback.

However, not everything came back perfectly. I am getting some random issues with the GUI console that did not happen before.
But the main issue is that the performance issue is still happening. I can easily replicate the issue that I explained here: Question about terminator - #30 by Foles_90 and I need to try that upgrade to see if this fixes it.

The latest version has an important fix around terminators that might help out. What kind of 'random' issues are you seeing in the UI? Can you explain/demonstrate the issue in a set of steps, video, gif, etc?

Yes, I'm looking forward to upgrading to that one once the upgrade process is fixed, thanks!

So far I have been kicked out of the console in random scenarios. This is the way I found to reproduce it, but it has happened in other situations as well:

Ahhh. I wonder if you're getting logged out without realizing it. I thought we had fixed that issue a long time ago, but I honestly don't recall at this point. Could it be that ZAC just isn't redirecting you to login until you perform some 'action', then it pushes you back?

I uploaded a video where you can see the issue, let me know if you can see it.

Since I am so far behind in the firmware version I don't think it is necessary to troubleshoot these issues until I upgrade.

Please remember that if there is a way to create a new deployment and just migrate the PKI and DB, that is also good for me, whatever can get me to the latest version :sweat_smile:

Oh my goodness, so you did! Discourse showed it as just a drive link. Somehow my brain just skipped right over it... I see you logged in, went to edit a service, then clicked 'save' and it boots you out. That's definitely odd!

Can you open devtools when this happens? Are there any clear errors that are shown to you in the console log? It looks like you're using ZAC 2.5.4 which is quite a bit behind. You can't upgrade just the ZAC component?

If you want to go that route, you should be able to do this any time you want. The only real requirement is that whatever new machine you bring online must use either the same IP or the same DNS entry as the original install. If you do that, you should have no problem replacing the machine with a newer one. If you don't do that, your whole PKI will be invalid and that'll invalidate your entire overlay, so it's pretty important. :slight_smile:

Here is a rough sketch of the steps I'd do if you want to go this route:

  • stop old ziti router and old ziti controller
  • copy the $ZITI_HOME location (assuming you followed the host it anywhere/local quickstart, that'll be at $HOME/.ziti/quickstart/$(hostname -s)/$(hostname -s), or just $ZITI_HOME if you have the .env sourced)
  • copy the systemd unit files as well
  • backup old machine or offline it or 'whatever'...
  • bring up a new machine with the same IP or update DNS with the new IP...
  • put the ziti backup into the same location as before (or pick a new home for it)
  • put the systemd unit files back
  • reload systemd
  • start ziti controller/router
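
The copy and restore bullets above could look something like this. It is a rough sketch under assumptions: archive_ziti_home and restore_ziti_home are hypothetical names, the unit files are assumed to be named ziti-*.service, and the unit directory is passed in because its location (often /etc/systemd/system) depends on how the services were installed.

```shell
# Hypothetical helpers for moving a quickstart install to a new machine.
# Run archive_ziti_home on the old machine AFTER stopping the controller
# and router, then restore_ziti_home on the new one.

# Usage: archive_ziti_home <ziti-home-dir> <unit-file-dir> <archive.tar.gz>
archive_ziti_home() {
  azh_home="$1"; azh_units="$2"; azh_archive="$3"
  azh_stage="$(mktemp -d)" || return 1

  # Stage copies so the archive has a predictable layout:
  #   ziti-home/  everything under the ziti home dir (db, pki, configs, binaries)
  #   units/      any ziti-*.service files found in the unit directory
  mkdir -p "$azh_stage/units" || return 1
  cp -Rp "$azh_home" "$azh_stage/ziti-home" || return 1
  for azh_unit in "$azh_units"/ziti-*.service; do
    [ -e "$azh_unit" ] && cp -p "$azh_unit" "$azh_stage/units/"
  done

  tar -czf "$azh_archive" -C "$azh_stage" ziti-home units || return 1
  rm -rf "$azh_stage"
}

# Usage: restore_ziti_home <archive.tar.gz> <target-dir>
# Unpacks ziti-home/ and units/ under the target dir, keeping permissions.
restore_ziti_home() {
  mkdir -p "$2" && tar -xzpf "$1" -C "$2"
}
```

On the new machine you would then move ziti-home back to the original path, put the unit files back in place, and reload systemd before starting the controller and router.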

Thanks!!!

I will try that soon.
Since I am using an elastic IP on AWS, I can use the same IP and DNS record for the new controller so that part should be easy.

To confirm, I do not need to do a DB snapshot and backup that and the PKI folder separately, right? I can just copy the whole $ZITI_HOME location and the systemd unit files.

Yah. Just shut down the controller first. You can do a backup of course, but you'll be doing that anyway when you copy the whole folder. It'll have the 'db' folder in there.

Sorry for the long delay @Foles_90 I was battling a tricky TLS issue with my network. Luckily, the issue should only be a problem if you upgrade from <v0.27.0 to >=v0.27.0 so since you're running v0.27.2 you shouldn't see it but just to be safe I'll include the workaround at the end.

Upgrading a network's binary version

  1. Create a database backup file by running ziti edge db snapshot and copy that file to a backup location (if necessary, check the ziti controller config's db section for the location of that file)

  2. Backup the controller PKI by copying the $ZITI_HOME/pki/ directory to a backup location

  3. Backup the controller config file, also by copying the file to a backup location

  4. If you're using systemctl services, stop the services

    sudo systemctl stop ziti-controller
    sudo systemctl stop ziti-router
    sudo systemctl stop ziti-console
    
  5. Get the latest version of ziti. Unfortunately, there is a bug in the getZiti function when performing an upgrade, so you will have to unset your ziti variables before running getZiti. You can also perform this step manually if you prefer.

    source /dev/stdin <<< "$(wget -qO- https://get.openziti.io/quick/ziti-cli-functions.sh)"
    unsetZitiEnv
    getZiti
    
  6. Update your services (ziti-controller, ziti-router, and ziti-console as necessary) to point to the newly downloaded binary. Your service is likely using the ziti-controller and ziti-router commands, which have since changed; they should now be written without the hyphen. Here's an example from my network below. Notice the version change and the ziti command change.

    # The old command
    ExecStart="/home/ubuntu/.ziti/quickstart/homeassistant2/ziti-bin/ziti-v0.26.5/ziti-controller" run "/home/ubuntu/.ziti/quickstart/homeassistant2/homeassistant2.yaml"
    
    # The new command
    ExecStart="/home/ubuntu/.ziti/quickstart/homeassistant2/ziti-bin/ziti-v0.30.4/ziti" controller run "/home/ubuntu/.ziti/quickstart/homeassistant2/homeassistant2.yaml"
    
  7. Reload and start the updated services

    sudo systemctl daemon-reload
    sudo systemctl start ziti-controller
    sudo systemctl start ziti-router
    sudo systemctl start ziti-console
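
If you would rather not edit the ExecStart lines from step 6 by hand, a sed filter along these lines should produce the same change. This is a sketch under assumptions: rewrite_execstart is a hypothetical name, and it assumes your unit files quote the binary path the same way the example in step 6 does.

```shell
# Hypothetical filter: rewrite ExecStart lines from the old per-component
# binaries (ziti-controller / ziti-router) to the single ziti binary.
# Reads a unit file on stdin and writes the rewritten text to stdout.
# Usage: rewrite_execstart <old-version> <new-version> < unit-file
rewrite_execstart() {
  # Escape dots so a version like "v0.26.5" is matched literally.
  rex_old="$(printf '%s' "$1" | sed 's/\./\\./g')"
  # .../ziti-<old>/ziti-controller" run  ->  .../ziti-<new>/ziti" controller run
  sed -E "s#ziti-${rex_old}/ziti-(controller|router)\" run#ziti-$2/ziti\" \\1 run#"
}
```

Because it only reads stdin and writes stdout, you can redirect the result to a new file and diff it against the original before installing it. Lines that do not match, such as the ziti-console unit, pass through unchanged.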
    

The TLS Issue

It's unlikely you'll encounter it since your current network is >=v0.27.0 but, if you happen to get a certificate expired error when trying to start up your tunneler(s), you may be seeing the same issue I was which is that you're getting an expired cert returned from your CA bundle. To confirm it is the same issue, run the two following commands.

openssl s_client -connect <your-router-ip>:<and-port>
openssl s_client -connect <your-router-ip>:<and-port> -servername <your-router-SubjAltName>

If one of those (likely the second one) returns with an expired cert, open up your router CA bundle and remove the expired cert. The CA bundle should be located in $ZITI_HOME/pki/routers/<your-router-name>/cas.cert
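
To see which entry in the CA bundle has expired before editing it, something like the following may help. It is a minimal sketch: list_bundle_expiry is a hypothetical name, and it assumes the bundle is PEM encoded and that openssl and awk are installed. It only reports; removing the expired cert is still a manual edit.

```shell
# Hypothetical helper: report, for each certificate in a PEM bundle,
# whether it is still valid or already expired.
# Usage: list_bundle_expiry <bundle-file>
list_bundle_expiry() {
  lbe_bundle="$1"
  lbe_work="$(mktemp -d)" || return 1

  # Split the bundle into one file per certificate (cert-1.pem, cert-2.pem, ...).
  awk -v dir="$lbe_work" '
    /-----BEGIN CERTIFICATE-----/ { n++ }
    n { print > (dir "/cert-" n ".pem") }
  ' "$lbe_bundle"

  for lbe_cert in "$lbe_work"/cert-*.pem; do
    [ -e "$lbe_cert" ] || continue
    lbe_subject="$(openssl x509 -in "$lbe_cert" -noout -subject)"
    # -checkend 0 exits non-zero when the cert has already expired.
    if openssl x509 -in "$lbe_cert" -noout -checkend 0 >/dev/null; then
      echo "valid: $lbe_subject"
    else
      echo "EXPIRED: $lbe_subject"
    fi
  done

  rm -rf "$lbe_work"
}
```

Take a copy of the bundle before removing anything from it, so the original can be put back if the router fails to start.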

Thanks for the detailed explanation, unfortunately it did not work. When I do step 5 this is the result:

ubuntu@openziti-controller:~$ sudo systemctl stop ziti-controller
ubuntu@openziti-controller:~$ sudo systemctl stop ziti-router
ubuntu@openziti-controller:~$ sudo systemctl stop ziti-console

ubuntu@openziti-controller:~$ source /dev/stdin <<< "$(wget -qO- https://get.openziti.io/quick/ziti-cli-functions.sh)"
ubuntu@openziti-controller:~$ unsetZitiEnv

unsetting [ZITI_CTRL_PORT] ZITI_CTRL_PORT=8440
...

ubuntu@openziti-controller:~$ getZiti
The path for ziti binaries has not been set, use the default (/home/ubuntu/.ziti/quickstart/openziti-controller/ziti-bin/ziti-v0.30.5)? (Y/n) Y
INFO: using the default path /home/ubuntu/.ziti/quickstart/openziti-controller/ziti-bin/ziti-v0.30.5
Getting OpenZiti binaries

No existing binary found, creating the ZITI_BIN_DIR directory (/home/ubuntu/.ziti/quickstart/openziti-controller/ziti-bin/ziti-v0.30.5)
Downloading https://github.com/openziti/ziti/releases/download/v0.30.5/ziti-linux-amd64-0.30.5.tar.gz to /home/ubuntu/.ziti/quickstart/openziti-controller/ziti-bin/ziti-v0.30.5/ziti-linux-amd64-0.30.5.tar.gz
OpenZiti binaries v0.30.5 successfully extracted to /home/ubuntu/.ziti/quickstart/openziti-controller/ziti-bin/ziti-v0.30.5

Now if I go into that folder there is only 1 file called ziti

ubuntu@openziti-controller:~$ cd /home/ubuntu/.ziti/quickstart/openziti-controller/ziti-bin/ziti-v0.30.5
ubuntu@openziti-controller:~/.ziti/quickstart/openziti-controller/ziti-bin/ziti-v0.30.5$ ls
ziti

If I go into the old folder, I can see files for the controller, the router and the tunnel

ubuntu@openziti-controller:~/.ziti/quickstart/openziti-controller/ziti-bin/ziti-v0.27.2$ ls
ziti  ziti-controller  ziti-router  ziti-tunnel

Maybe that is ok, but I am also not sure what to do on step 6

That's expected, it looks like you're still on track. We combined the multiple binaries into one, so ziti-controller and ziti-router are now included in the ziti binary, but they are referenced without a dash. So, in step 6, you need to update your services (assuming you created some services; if not, you can disregard this part). Your services, if you have them, will be using the old binaries ziti-router and ziti-controller, so that's why I mentioned this step:

Your service is likely using the ziti-controller and ziti-router commands which have since changed, they should now be without the hyphen.

You'll need to update the service to call ziti controller run rather than ziti-controller run. The same goes for the router (ziti-router run becomes ziti router run). Check out the example difference.

# The old command
ExecStart="/home/ubuntu/.ziti/quickstart/homeassistant2/ziti-bin/ziti-v0.26.5/ziti-controller" run "/home/ubuntu/.ziti/quickstart/homeassistant2/homeassistant2.yaml"

# The new command
ExecStart="/home/ubuntu/.ziti/quickstart/homeassistant2/ziti-bin/ziti-v0.30.4/ziti" controller run "/home/ubuntu/.ziti/quickstart/homeassistant2/homeassistant2.yaml"

If you didn't create services when you created your network then you can simply change whatever startup command you're using to use the ziti router and ziti controller commands instead of ziti-router and ziti-controller

Let me know if this clears things up.

Thanks a lot!
Upgrade was successful!

Awesome, I'm glad it helped. And thanks for letting us know.

Hey, It's me again :sweat_smile:

I realized that we did not change the .env file during the upgrade, and it still looks like this:


export ZITI_ARCH="amd64"
export ZITI_BINARIES_FILE="ziti-linux-amd64-0.27.2.tar.gz"
export ZITI_BINARIES_FILE_ABSPATH="/home/ubuntu/.ziti/quickstart/openziti-controller/ziti-bin/ziti-linux-amd64-0.27.2.tar.gz"
export ZITI_BINARIES_VERSION="v0.27.2"
export ZITI_BIN_DIR="/home/ubuntu/.ziti/quickstart/openziti-controller/ziti-bin/ziti-v0.27.2"
export ZITI_BIN_ROOT="/home/ubuntu/.ziti/quickstart/openziti-controller/ziti-bin/ziti-bin"

Do I need to change that to reflect the new version that I installed? What would that look like?

Thanks!

I am pretty sure you shouldn't need to change those entries. You can change them, and if you were to re-run the helper functions like createControllerSystemdFile or createRouterSystemdFile then the stub file that's generated would probably be "out of date". If you already updated any systemd files you had, you don't need to worry about these too much.

I can't think of a reason that you must update them. If @gberl002 knows of a reason though, I'm keen to learn too! :slight_smile:

Yeah, there is no need to update the env file unless you plan on using the latest ziti-cli-functions.sh, as that references the environment variables quite heavily. However, because you were running a version <v0.28.2, you'll experience some issues if you try to use the cli functions, since a major change to the environment variable names occurred in 0.28.2.

There is a migration function that will update your env file variables' names. However, while writing up my upgrade doc I found a minor bug in that function so that's being fixed.

If, like @TheLumberjack said, you already updated your systemd files then you don't need to worry about these specific variables. The most likely function you'd use is zitiLogin and while that will use the binary in ZITI_BIN_DIR, it shouldn't be an issue logging in with an older binary.
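
If you just want to see which entries in the .env file still point at the old version, a small check like this would list them. It is only a sketch: show_stale_env_entries is a hypothetical name, and it reports matching lines without changing the file or renaming any variables.

```shell
# Hypothetical helper: print (with line numbers) every export line in an
# env file that still mentions the given old version string.
# Usage: show_stale_env_entries <env-file> <old-version>
show_stale_env_entries() {
  # -F treats the version as a literal string, so the dots are not wildcards.
  grep -n -F -- "$2" "$1" | grep -E '^[0-9]+:export '
}
```

With the .env shown earlier and 0.27.2 as the version, this would list ZITI_BINARIES_FILE, ZITI_BINARIES_FILE_ABSPATH, ZITI_BINARIES_VERSION and ZITI_BIN_DIR.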