Backup / Restore

qrkourier · June 11, 2024, 2:29pm

The need for backup differs for identities, routers, and controllers. For all three, taking a backup does not inherently make the component unavailable, e.g., a database snapshot is not disruptive.

controller: backup is described in the backup guide and includes the database and root certificate authority, at least.
router: it's not strictly necessary to back up a router because the same router ID can be re-enrolled by the admin if the host is lost or compromised.
identity: like routers, identities can be re-enrolled by the admin if lost or compromised, so it's not strictly necessary to back up the identity file(s).

I admit it might make sense in some cases to back up a router's or identity's files, e.g., when requesting a new enrollment from the admin is inconvenient.

sadath-12 · June 11, 2024, 2:34pm

I mean, after a backup and restore of the controller, will the routers and identities still be enrolled and route to their services normally, or do we have to redo the steps of adding routers?

qrkourier · June 11, 2024, 2:38pm

I suggest that you check out the controller backup guide sections I mentioned about database snapshot and root CA (PKI).

Creating a backup (snapshot) of the controller's database makes a copy of the data on the disk that you can then move to another device for safekeeping. That way, if you lose the database, you have the option to restore the data.

sadath-12 · June 11, 2024, 2:54pm

Yes, understood. Just confirming: the controller's data covers its connections and registrations with the different routers, so if we back it up and run it on a different machine with a different IP, the existing routers will connect to the new controller, and the controller will know how to route traffic to them?

qrkourier · June 14, 2024, 1:18am

The controller's database has the enrollments, yes, as well as all entities' state, including policies.

You can restore to a new IP address if you configured the controller address with a DNS name by changing the DNS record to the new IP address.

Changing the advertised address in the controller configuration will cause existing enrollments to stop working, so it is essential to configure a DNS name, not an IP address, as the controller address.

sadath-12 · June 14, 2024, 12:08pm

ohk thanks @qrkourier

sadath-12 · June 24, 2024, 8:16pm

If we use Velero, we wouldn't need to worry about these steps, right?

qrkourier · June 24, 2024, 8:23pm

I don't know enough about Bolt DB to say whether it's immune to corruption. Backup systems like Velero can't guarantee database integrity, so corruption is possible if the storage is copied during a database operation.

The solution is to use the snapshot command from the backup guide and back up the snapshot files with Velero. If you restore from a snapshot then it will not be corrupted.

sadath-12 · June 25, 2024, 5:07am

Yes, maybe a snapshot resource,
and Velero should automate the backup and restore process.

sadath-12 · July 20, 2024, 6:52am

@qrkourier, if I delete /persistent/ctrl.db the system still works. Where is the controller picking up the actual DB from, then?

edited: Needed to restart the pod
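
A likely explanation, assuming the controller keeps the database file open while it runs: on Linux, deleting a path only removes the directory entry, and a process that already has the file open can keep using it until it closes the file or exits. The sketch below shows that behavior with an ordinary temp file, not with the controller itself; the function name is made up for the example.

```shell
# Show that an open file stays readable after its path is deleted.
read_after_delete() {
    tmp=$(mktemp) || return 1
    printf 'still here\n' > "$tmp"
    exec 3< "$tmp"     # open the file on descriptor 3
    rm -f -- "$tmp"    # remove the path; the open descriptor stays valid
    cat <&3            # prints "still here"
    exec 3<&-          # closing the descriptor finally releases the file
}
```

Calling read_after_delete prints the contents even though the path is already gone, which would match the controller only noticing the missing file after a restart.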

janst · January 23, 2025, 12:23pm

I was thinking of triggering a snapshot of the bbolt storage via the Ziti controller's API with some sort of hook inside Velero, and then doing the actual snapshot of the PVC. This way, data integrity should not be a concern. But this will also lead to a growing PVC, as there will be more and more snapshots over time...
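
A minimal sketch of such a hook, assuming Velero's pod annotation hooks (pre.hook.backup.velero.io/container and pre.hook.backup.velero.io/command) and the agent snapshot command rather than the API. The function name, pod name, and snapshot path are placeholders; the container name ziti-controller is assumed to match the deployment.

```shell
# Annotate a controller pod so Velero runs a database snapshot inside it
# before backing up the pod's volumes.
# Arguments: namespace, pod name, snapshot output path (all caller-supplied).
annotate_snapshot_hook() {
    ns=$1
    pod=$2
    snap=$3
    # Velero expects the hook command as a JSON array in the annotation value.
    kubectl -n "$ns" annotate pod "$pod" --overwrite \
        "pre.hook.backup.velero.io/container=ziti-controller" \
        "pre.hook.backup.velero.io/command=[\"ziti\", \"agent\", \"controller\", \"snapshot-db\", \"$snap\"]"
}
```

For example, annotate_snapshot_hook ziti POD /path/to/snapshot/pvc/mountpoint/pre-backup.db. Annotations set on a running pod are lost when the pod is recreated, so in practice they probably belong in the pod template.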

qrkourier · January 23, 2025, 11:19pm

Yes, a retention policy is needed, too. Here's the mgmt API operation that triggers the snapshot creation if that's preferable to the CLI command.

janst · March 6, 2025, 9:54am

Hi @qrkourier,

I am not aware of any retention policies that apply directly to EBS-based PVCs. Could you clarify what you mean by retention policies in this context? Are there retention policies that can be configured on the Ziti controller to handle DB snapshot rotation automatically?

(From my point of view, the best-case scenario would be if the ziti-controller uploaded DB snapshots to S3-compatible storage on its own.)

Currently, my solution for providing a DB backup in Kubernetes-deployed Ziti networks involves using EFS (NFS) backed storage for the Ziti controller's DB file. This way, the snapshots stored in the same location can be accessed by a Kubernetes job for cleanup and rotation. Additionally, the EFS auto-backup feature allows for restoring older DB snapshot versions.

It would be interesting to know if you have any experience running a Ziti controller that stores its DB file within an NFS/EFS volume.

Alternatively, it would be very helpful if the Ziti controller allowed for configuring a dedicated path for DB snapshots. This would help avoid potential performance issues associated with running the DB on NFS-based storage.

Best regards, Jan

qrkourier · March 7, 2025, 10:18pm

By "retention policies" I meant to suggest a possible solution to the continually growing size of the persistent storage where the controller's database and its snapshots are kept.
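
As a rough sketch of what such a retention step could look like, assuming snapshots are written into one directory with sortable names like snapshot-YYYYmmdd-HHMMSS.db (a hypothetical naming convention chosen for this example, as are the function name and the keep count), a cleanup job might keep only the newest few files:

```shell
# Keep only the newest KEEP snapshot files in DIR and delete the rest.
# Only files matching snapshot-*.db are considered, so a live database
# such as ctrl.db in the same directory is never touched.
prune_snapshots() {
    dir=$1
    keep=$2
    # The glob expands in sorted order, so the oldest timestamps come first.
    set -- "$dir"/snapshot-*.db
    # If nothing matched, the pattern is left unexpanded and does not exist.
    [ -e "$1" ] || return 0
    while [ "$#" -gt "$keep" ]; do
        rm -f -- "$1"
        shift
    done
}
```

For example, prune_snapshots /path/to/snapshot/pvc/mountpoint 7 would leave the seven most recent snapshots in place. This could run from a Kubernetes CronJob or any other scheduler that can see the volume.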

I see the value of simplicity in EFS for database persistence, and I hadn't tried that myself. I peeked at EFS latencies, and what AWS has accomplished is impressive. Sub-millisecond reads!? It makes me think this might work. It's much better than the way I remember NFSv4.

Those are both excellent ideas. There's a way to set the output path for a snapshot when triggering snapshot creation through the agent (IPC) instead of the edge (API):

ziti agent controller snapshot-db /path/to/snapshot/pvc/mountpoint/today.db

...

kubectl -n ziti get pods --selector app.kubernetes.io/name=ziti-controller \
    --output jsonpath="{.items[0].metadata.name}" \
| xargs -IPOD kubectl -n ziti --container ziti-controller exec POD -- \
    ziti agent controller snapshot-db /path/to/snapshot/pvc/mountpoint/today.db

The ziti edge snapshot command simply calls the snapshot operation in the API and doesn't provide a way to set the output path.
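
Since a fixed name like today.db would be overwritten or collide on the next run, one option is to generate a timestamped path each time. This is only a sketch: the snapshot- prefix, the timestamp layout, and both function names are illustrative choices, not something the controller requires.

```shell
# Build a timestamped snapshot path such as DIR/snapshot-20250307-221800.db.
# An explicit timestamp can be passed as the second argument; otherwise the
# current UTC time is used.
snapshot_path() {
    dir=$1
    stamp=${2:-$(date -u +%Y%m%d-%H%M%S)}
    printf '%s/snapshot-%s.db\n' "${dir%/}" "$stamp"
}

# Ask the running controller, over the agent (IPC) channel, to write a
# snapshot to a fresh timestamped file, then print the path that was used.
take_snapshot() {
    path=$(snapshot_path "$1") || return 1
    # Send the CLI's own output to stderr so stdout carries only the path.
    ziti agent controller snapshot-db "$path" >&2 || return 1
    printf '%s\n' "$path"
}
```

take_snapshot has to run where the agent can reach the controller process, for example inside the controller container via kubectl exec as in the command above.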

PhilipGriffiths · March 13, 2025, 3:51pm

Hey Jan,

Thanks for your detailed explanation! I should also mention that recent product announcements include NetFoundry On-Prem, which provides installers, a K8s platform to run OpenZiti (available in many different flavours), and the newly developed 'NetFoundry Support Stack', which includes extensive automation (including Helm for upgrades), telemetry, events, and pre-configured ElasticStack, Grafana, and dashboards for deep visibility, troubleshooting, and RCA. NF On-Prem also includes lifecycle management, technical adoption and GTM support, compliance and liability, and production SLAs. If you don't want NF On-Prem, we are also looking at providing NetFoundry Support Stack as a licensable component.

Happy to chat more about how NetFoundry licensable products make these topics super easy for users.
