Move to HA set up from non HA

First things first, CONGRATULATIONS on getting to the 1.0 milestone! Kudos to the entire team! Really happy to see that.

My current OpenZiti setup runs in AWS in non-HA mode. I am thinking of deploying a new controller in HA mode in EKS and then making the current controller a member of the HA config. Would that work?

How would I move the existing bolt DB to the HA config?

How does the DB part work if the controller is deployed in EKS?

TIA.


First things first, CONGRATULATIONS on getting to the 1.0 milestone! Kudos to the entire team! Really happy to see that.

Thanks! Appreciate you coming along on the journey :slight_smile:

My current OpenZiti setup runs in AWS in non-HA mode. I am thinking of deploying a new controller in HA mode in EKS and then making the current controller a member of the HA config. Would that work?

How would I move the existing bolt DB to the HA config?

We've got the start of an upgrade guide here. There are also a few temporary feature flags you'll need to set in the router and Go SDK, which are documented in the developer setup.

I would recommend converting your existing controller to run in HA mode. The upgrade guide will show you how to initialize the cluster using your existing bolt DB.

How does the DB part work if the controller is deployed in EKS?

There's a section in the HA overview about how data storage differs in HA.

Let me know if that makes sense, or if we can clarify anything. The doc still needs some cleanup and has to be pushed to the official doc site before the official release.

Keep in mind the open issues noted in the 1.1.0 release notes for HA alpha2. We're working on fixing those last couple of things and should have a beta1 release out next week. After that, we'll be testing, polishing, etc.

Cheers,
Paul


I just deployed a quickstart configuration with two more public routers and I'm amazed by openziti.

I slightly customized the controller: I changed the default port to 443 and I split the management and the client APIs.

Now I'd like to migrate my configuration to HA, but I'm struggling to understand the exact steps of the entire procedure.

I added the raft stanza and I already configured the advertiseAddress.

My server certificate does not have a SPIFFE ID so I need to recreate it.
Is it sufficient to follow this post:

recreating only the server part and adding the --spiffe-id part?

If my controller is deployed as ziti.example.com, should the SPIFFE ID be 'controller/ziti' or 'example.com/controller/ziti'?

Is the remaining part of the quickstart PKI OK?
Does the root CA already have the "trust-domain" part, or does an explicit --spiffe-id make it superfluous?

After restarting my controller with the raft part and the updated server certificate, I should have a running one-node cluster. Is that right?

How do I add the second node?
I presume the command 'ziti agent cluster add' requires an already deployed controller.
How should I deploy the new controller? Migrating the configuration, pki included, changing the controller name and creating appropriate server certificate?

Sorry for the long and quite confused post, but I'm trying to figure out the whole process.

Thanks
Fabio

Hi @Fabio72 welcome to the community and to OpenZiti!

:blush: that's great!

Nice!

From the quickstart -- no this won't be sufficient. I have been getting ready to comment about HOW one might actually accomplish this.


MAKE SURE YOU BACKUP FIRST

Here are my notes. Given what you've already accomplished, I expect you'll hit no issues here and realize you need all the variables set properly...

MAKE SURE YOU BACKUP FIRST


Did you backup??? :slight_smile: You can just cp -r the entire quickstart folder. That's the easiest thing to do imo.
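Since the commands in this post lean on several environment variables, a quick preflight check can save a failed run. This is only a sketch: check_required_vars is a hypothetical helper name, and the variable names you pass in are the ones used by the commands in this post.

```shell
# Report which of the named environment variables are unset or empty.
# Returns 0 when all are set, 1 otherwise. Names go to stderr as "missing: NAME".
check_required_vars() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example: check_required_vars EXTERNAL_DNS EXTERNAL_IP ZITI_PKI ZITI_PKI_CTRL_INTERMEDIATE_NAME ZITI_PKI_CTRL_EDGE_INTERMEDIATE_NAME || echo "set these before continuing"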

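# Control plane: new server and client certs carrying a SPIFFE ID,
# signed by the control-plane intermediate CA.
ziti pki create server \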
	--spiffe-id  spiffe://$EXTERNAL_DNS/controller/migrated \
	--dns "$EXTERNAL_DNS,$(hostname),localhost" \
	--ip "$EXTERNAL_IP,127.0.0.1" \
	--pki-root "$ZITI_PKI" \
	--ca-name "$ZITI_PKI_CTRL_INTERMEDIATE_NAME" \
	--key-file "${EXTERNAL_DNS}-server" \
	--server-file controller.2025.server \
	--server-name migrated
ziti pki create client \
	--spiffe-id  spiffe://$EXTERNAL_DNS/controller/migrated \
	--pki-root "$ZITI_PKI" \
	--ca-name "$ZITI_PKI_CTRL_INTERMEDIATE_NAME" \
	--key-file "${EXTERNAL_DNS}-server" \
	--client-file controller.2025.client \
	--client-name migrated

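# Locate the chain files just created under the control-plane intermediate CA.
# Searching "$ZITI_PKI" instead of "." makes this independent of the current directory.
NEW_CTRL_SERVER_CERT=$(realpath "$(find "$ZITI_PKI" -name "*2025*server*chain*" | grep "$ZITI_PKI_CTRL_INTERMEDIATE_NAME")")
NEW_CTRL_CLIENT_CERT=$(realpath "$(find "$ZITI_PKI" -name "*2025*client*chain*" | grep "$ZITI_PKI_CTRL_INTERMEDIATE_NAME")")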

# Edge API: the same pair of certs, signed by the edge intermediate CA.
ziti pki create server \
	--spiffe-id  spiffe://$EXTERNAL_DNS/controller/migrated \
	--dns "$EXTERNAL_DNS,$(hostname),localhost" \
	--ip "$EXTERNAL_IP,127.0.0.1" \
	--pki-root "$ZITI_PKI" \
	--ca-name "$ZITI_PKI_CTRL_EDGE_INTERMEDIATE_NAME" \
	--key-file "${EXTERNAL_DNS}-server" \
	--server-file controller.2025.server \
	--server-name migrated
ziti pki create client \
	--spiffe-id  spiffe://$EXTERNAL_DNS/controller/migrated \
	--pki-root "$ZITI_PKI" \
	--ca-name "$ZITI_PKI_CTRL_EDGE_INTERMEDIATE_NAME" \
	--key-file "${EXTERNAL_DNS}-server" \
	--client-file controller.2025.client \
	--client-name migrated

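# Locate the chain files just created under the edge intermediate CA.
# Searching "$ZITI_PKI" instead of "." makes this independent of the current directory.
NEW_EDGE_SERVER_CERT=$(realpath "$(find "$ZITI_PKI" -name "*2025*server*chain*" | grep "$ZITI_PKI_CTRL_EDGE_INTERMEDIATE_NAME")")
NEW_EDGE_CLIENT_CERT=$(realpath "$(find "$ZITI_PKI" -name "*2025*client*chain*" | grep "$ZITI_PKI_CTRL_EDGE_INTERMEDIATE_NAME")")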

echo "Update your controller config file. "
echo "Assuming you ran a quickstart, you should have a separate PKI for CTRL/EDGE"
echo ""
echo "Modify the control plane identity block with:"
echo "server_cert: \"${NEW_CTRL_SERVER_CERT}\""
echo "cert:        \"${NEW_CTRL_CLIENT_CERT}\""
echo ""
echo "Modify the edge api identity block with:"
echo "server_cert: \"${NEW_EDGE_SERVER_CERT}\""
echo "cert:        \"${NEW_EDGE_CLIENT_CERT}\""
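Once the new certs exist, it may be worth confirming that the SPIFFE ID actually landed in them before editing the config. A minimal sketch, assuming openssl is installed; both helper names are made up for illustration and are not part of the ziti CLI.

```shell
# Pull SPIFFE URIs out of "openssl x509 -text" output read from stdin.
spiffe_ids_from_text() {
  grep -o 'URI:spiffe://[^,[:space:]]*' | sed 's/^URI://'
}

# Print the SPIFFE URI(s) found in the subjectAltName of a PEM cert.
# For a chain file, openssl reads only the first cert in the file
# (the leaf, assuming the chain is written leaf-first).
spiffe_ids_in_cert() {
  openssl x509 -in "$1" -noout -text | spiffe_ids_from_text
}
```

Running spiffe_ids_in_cert "$NEW_CTRL_SERVER_CERT" should print the value that was passed to --spiffe-id; an empty result suggests the cert was created without one.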

Yes.

We are still working on finalizing this doc, so you're catching us right in the middle of getting it out. Paul merged some docs JUST TODAY, which you can find at Operating a Controller Cluster | OpenZiti.

Let's start there and move on once you've had a look at my reply.

Thank you very much Clint!
I was hoping to see one of your scripts :grin:
I have a daily scheduled backup (using tar over sshfs and the router itself as terminator) so I should be pretty safe.
I will try to understand and then apply your instructions ASAP.
I will give feedback when ready.

Cheers
Fabio

One more question: is "migrated" the unique ID of the node?
Can I freely choose a naming scheme as long as it is unique per node?

I believe you mean the SPIFFE ID? I believe those should be unique, yes, but within the same trust domain. See Controller Certificates | OpenZiti. This example relies on the trust domain being discovered in the root/intermediate (which you don't have set up), so ensure you use the full SPIFFE ID: spiffe://$EXTERNAL_DNS/controller/ctrl1|2|3 (1, 2, 3, whatever).

ziti pki create server --pki-root ./pki --ca-name ctrl1 --dns "localhost,ctrl1.ziti.example.com" --ip "127.0.0.1,::1" --server-name ctrl1 --spiffe-id 'controller/ctrl1'

ziti pki create server --pki-root ./pki --ca-name ctrl2 --dns "localhost,ctrl2.ziti.example.com" --ip "127.0.0.1,::1" --server-name ctrl2 --spiffe-id 'controller/ctrl2'

ziti pki create server --pki-root ./pki --ca-name ctrl3 --dns "localhost,ctrl3.ziti.example.com" --ip "127.0.0.1,::1" --server-name ctrl3 --spiffe-id 'controller/ctrl3'
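When the trust domain cannot be discovered from the CA, the full SPIFFE ID has to be spelled out for each node. A small sketch of generating them consistently; the function names are hypothetical, and the trust domain is assumed to be whatever you use for $EXTERNAL_DNS.

```shell
# Build the full SPIFFE ID for one controller node.
# $1 = trust domain, $2 = node name (must be unique within the trust domain)
controller_spiffe_id() {
  printf 'spiffe://%s/controller/%s\n' "$1" "$2"
}

# Print one full SPIFFE ID per node name.
# $1 = trust domain, remaining arguments = node names
list_controller_spiffe_ids() {
  trust_domain=$1
  shift
  for node in "$@"; do
    controller_spiffe_id "$trust_domain" "$node"
  done
}
```

For example, list_controller_spiffe_ids "$EXTERNAL_DNS" ctrl1 ctrl2 ctrl3 prints three IDs, each of which can be passed to --spiffe-id in place of the short 'controller/ctrlN' form.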

The cluster is started and everything seems OK. I can use services and I can use commands like "ziti fabric list circuits".
Routers are online with no config modification.
But I cannot use agent commands as a non-privileged user.

~$ ziti agent list
╭─────┬────────────┬────────┬─────────────┬──────────┬─────────────┬───────────╮
│ PID │ EXECUTABLE │ APP ID │ UNIX SOCKET │ APP TYPE │ APP VERSION │ APP ALIAS │
├─────┼────────────┼────────┼─────────────┼──────────┼─────────────┼───────────┤
╰─────┴────────────┴────────┴─────────────┴──────────┴─────────────┴───────────╯
# ziti agent list
╭─────────┬────────────┬───────────┬──────────────────────────────┬────────────┬─────────────┬───────────╮
│     PID │ EXECUTABLE │ APP ID    │ UNIX SOCKET                  │ APP TYPE   │ APP VERSION │ APP ALIAS │
├─────────┼────────────┼───────────┼──────────────────────────────┼────────────┼─────────────┼───────────┤
│ 3753058 │ ziti       │ CgSdT-w9Z │ /tmp/gops-agent.3753058.sock │ router     │ v1.2.2      │           │
│ 3759557 │ ziti       │ oci       │ /tmp/gops-agent.3759557.sock │ controller │ v1.2.2      │           │
╰─────────┴────────────┴───────────┴──────────────────────────────┴────────────┴─────────────┴───────────╯

# ziti agent cluster list -p 3759557
╭─────┬─────────────────────────┬───────┬────────┬─────────┬───────────╮
│ ID  │ ADDRESS                 │ VOTER │ LEADER │ VERSION │ CONNECTED │
├─────┼─────────────────────────┼───────┼────────┼─────────┼───────────┤
│ oci │ tls:ziti.xxxxxx.xx:8440 │ true  │ true   │ v1.2.2  │ true      │
╰─────┴─────────────────────────┴───────┴────────┴─────────┴───────────╯


This is by design. The IPC socket is owned by whatever user started the server. If you run as root, it's owned by root and you'll need to be root to access the socket.

I don't think there's a way to set the ownership of the socket to a group. I'll ask around or point someone here to comment if there's a way to somehow encourage the ownership to be assigned to a group as well.
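To see which agent sockets the current user can reach, something like the following could help. It is a sketch under assumptions: the helper names are made up, and the /tmp/gops-agent.<pid>.sock pattern is taken from the listing earlier in this thread.

```shell
# Return 0 if the path exists and is owned by the current effective user.
owned_by_me() {
  [ -e "$1" ] && [ -O "$1" ]
}

# Walk the agent sockets in a directory (default /tmp) and report ownership.
report_agent_sockets() {
  dir=${1:-/tmp}
  for sock in "$dir"/gops-agent.*.sock; do
    # Skip the unexpanded glob when nothing matches.
    [ -e "$sock" ] || continue
    if owned_by_me "$sock"; then
      echo "$sock: owned by $(id -un)"
    else
      echo "$sock: owned by another user"
    fi
  done
}
```

Sockets reported as owned by another user are the ones for which agent commands will need to run as that user (or as root).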

I wasn't aware that the quickstart was running as root :sweat_smile:
I changed the systemd unit file to run as ziti and I added the CAP_NET_BIND_SERVICE capability to open reserved ports.
Now everything works like a charm :heart_eyes:
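For anyone following along, a change like this could be expressed as a systemd drop-in instead of editing the unit file directly. This is a sketch only: the helper write_ziti_dropin and the unit name ziti-controller.service are assumptions, so substitute whatever your install uses.

```shell
# Write a drop-in that runs the service as user "ziti" and lets it bind
# ports below 1024 without being root.
# $1 = unit name (assumed, e.g. ziti-controller.service)
# $2 = systemd config dir (defaults to /etc/systemd/system)
write_ziti_dropin() {
  unit=$1
  dir=${2:-/etc/systemd/system}
  mkdir -p "$dir/$unit.d"
  cat > "$dir/$unit.d/override.conf" <<'EOF'
[Service]
User=ziti
AmbientCapabilities=CAP_NET_BIND_SERVICE
EOF
}
```

After writing the drop-in, systemctl daemon-reload followed by a restart of the unit picks it up. Files created while the service ran as root may also need their ownership changed to the ziti user.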

Does ziti edge db snapshot still work for making backups?

I would expect it to. If it doesn't, I'd suspect permission issues after modifying the systemd unit, and it's probably good to put this one issue to bed and start up a new thread. :slight_smile: