HA implementation questions

Hi there, I'm back again with a few questions.

I've been working on an HA implementation for a few weeks now with the aim of creating a cluster of Controllers and Edge Routers. I've put together a rough install process based on a number of sources.

The cluster of Controllers sits behind an HAProxy load balancer. The Edge Routers sit behind a second, separate HAProxy load balancer. This means my ziti-edge-tunnel clients only need to connect to two FQDNs in total, not to the individual FQDN of every Controller and Edge Router in the clusters.

I've finally got an installation process that's broadly reliable, albeit with a few minor issues, hence my questions.

For context, here are my install scripts. The 2 Controllers, 2 Routers and 2 load balancers are all on separate Debian VMs (6 total) on a local network. I'm currently installing the latest Debian packages from your https://packages.openziti.org/zitipax-openziti-deb-test repo, as I experience errors relating to the naming of config.yml elements when I use 1.3.3 (the current latest release) from GitHub. I appreciate these are not release versions and come with their own reliability caveats. Please advise me of the correct versions to use if these aren't suitable.

Controller version: 1.4.0~13400091730
Router version: 1.4.0~13400091730
Ziti-edge-tunnel version: 1.5.0

One issue I'm seeing is that when I create an identity, I can only enrol it against the Controller that created it. For example, if I run ziti create identity ... on ctrl1, the client-side ziti-edge-tunnel add --jwt ... command only succeeds if that traffic hits ctrl1. If it hits ctrl2, I get the following error. I'm sure I'm missing something fundamental here!

"Success":false,
"Error":"enrollment failed: ziti controller is not available",
"Code":500

As a temporary workaround, during enrolment only, I add an /etc/hosts entry on the client to force traffic to the correct Controller, bypassing the load balancer.
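
For anyone using the same workaround, the Controller to pin in /etc/hosts can probably be read from the enrollment token itself. The sketch below is a minimal Python example and not part of the install scripts. It decodes the JWT payload without verifying the signature and returns the iss claim, on the assumption that this claim holds the address of the Controller that issued the token. The function names and the client.jwt filename are hypothetical.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT without verifying its signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def issuing_controller(token: str) -> str:
    """Return the iss claim (assumed to be the issuing Controller's URL)."""
    return jwt_claims(token)["iss"]
```

Usage would look like issuing_controller(open("client.jwt").read()). Because the signature is not checked, this is only suitable for inspecting a token you already trust, not for validating one.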

Another question I have is around general version compatibility between clients and infrastructure. In this case, I'm using ziti-edge-tunnel (ZET). If the version of ZET is not kept up to date, will it eventually become incompatible and be unable to connect to the network?

Thanks again in advance!

Our goal is to maintain backwards compatibility for SDKs, which includes tunnelers. This doesn't mean that old SDKs will necessarily be able to take advantage of new features, but they should continue to work. For example, old SDKs will only be able to work with a single node in an HA cluster. So they'll have the same capability as before. New SDKs will be able to work with multiple controllers and will therefore be more resilient. We know upgrading tunnelers can be very difficult, so we take compatibility very seriously.

I've asked @andrew.martinez, who is the most familiar with the enrollment functionality, to answer your question about cross-controller enrollment.

Hope that's helpful,
Paul

Hi Paul, Thanks for this answer.

One issue I'm seeing is that when I create an identity, I can only enrol it against the Controller that created it. For example, if I run ziti create identity ... on ctrl1, the client-side ziti-edge-tunnel add --jwt ... command only succeeds if that traffic hits ctrl1. If it hits ctrl2, I get the following error. I'm sure I'm missing something fundamental here!

You aren't missing anything; this is a known gap that we are aware of and have a plan for.

A feature planned for the HA release will allow enrollments to work against any controller running in HA mode. Most of the groundwork and low-level crypto material is in place, but the enrollment JWT minting process, a few verification endpoints in the controllers, and the SDK changes haven't been implemented yet.

FWIW, the planned implementation is backward-compatible with the old-style enrollment process, which echoes Paul's point about maintaining backwards compatibility.

I had a quick look at your scripts to check that HAProxy is running in TCP mode, which it appears to be. OpenZiti does not tolerate TLS offloading, because terminating TLS at the proxy breaks mTLS. Based on what I see, your setup should work with the new HA enrollment.

Thanks for your answer, @andrew.martinez!