I have a few questions about availability in an OpenZiti network that I couldn’t find clear answers to:
Multiple controllers / redundancy: I’ve read that work is under way in that area (HA, if I’m not mistaken). Is it currently possible to run multiple controllers for redundancy?
If it’s not possible, how is redundancy typically handled?
When exactly is the controller needed? For example, does it need to be 100% available, so that everything stops working if the connection to the controller is lost? Or is it only needed to log into the Ziti network? To update paths? What happens if the controller is down for a few seconds, minutes, or hours?
Multiple controllers aren’t possible just yet, but the effort continues to move forward. It’ll be soon. Look for updates in the coming months.
Data plane redundancy is handled by deploying multiple edge routers. Controller availability is all about getting the controller running again, which generally takes only a few seconds. Right now we recommend backing up the controller’s state (its database) and PKI at whatever interval you decide is acceptable (daily, for example), and then putting a disaster recovery plan in place that restores the database and PKI and starts the controller back up.
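To make the backup/restore idea above a bit more concrete, here is a minimal sketch in Python. It is not an official OpenZiti tool; it only copies files. The function names and all paths are hypothetical and need to be replaced with wherever your controller actually keeps its database file and PKI directory. It also assumes the database file is safe to copy at that moment (for example, the controller is stopped, or you are copying a snapshot file rather than the live database).

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def backup_controller_state(db_file: Path, pki_dir: Path, backup_root: Path) -> Path:
    """Copy the database file and the PKI directory into a new timestamped folder.

    All three paths are placeholders (assumptions), not OpenZiti defaults.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_root / f"controller-backup-{stamp}"
    # Fail loudly rather than overwrite an existing backup folder.
    target.mkdir(parents=True, exist_ok=False)
    shutil.copy2(db_file, target / db_file.name)
    shutil.copytree(pki_dir, target / "pki")
    return target


def restore_controller_state(backup_dir: Path, db_file: Path, pki_dir: Path) -> None:
    """Put the database file and PKI directory back from a backup folder.

    Run this while the controller is stopped, then start the controller.
    """
    shutil.copy2(backup_dir / db_file.name, db_file)
    if pki_dir.exists():
        shutil.rmtree(pki_dir)  # replace the PKI directory wholesale
    shutil.copytree(backup_dir / "pki", pki_dir)
```

You would run the backup function from cron or a similar scheduler at your chosen interval, and keep the resulting folders somewhere other than the controller host. Starting the controller after a restore is left out on purpose, since that depends on how you run it (systemd, container, and so on).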
While the controller is down, persistent connections in the data plane are not affected. Most users never notice the controller restarting.
The controller is needed when making new connections, when authenticating, and when making network configuration changes. I think the previous paragraphs already cover the rest of this question.
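Since new connections need the controller, an application that opens connections on demand may want to retry for a short while instead of failing on the first attempt during a controller restart. The following is a generic sketch of retry with exponential backoff. It does not use the OpenZiti SDK: `operation` is a hypothetical stand-in for whatever call opens your connection, and the assumption that a failure surfaces as `ConnectionError` is mine, not something the SDK guarantees.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation() until it succeeds or the attempts run out.

    The wait doubles after each failure, capped at max_delay seconds.
    Only ConnectionError is retried (an assumption about how a failed
    connection attempt shows up); anything else propagates immediately.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the failure
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise ValueError("attempts must be at least 1")
```

With the defaults this waits 0.5, 1, 2, and 4 seconds between the five attempts, which is in the range of the "few seconds" a controller restart usually takes. For longer outages you would raise `attempts` or `max_delay`, or surface the error to the user.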