Boltdb edge snapshot failed - error 500

Hi all,

I am encountering an issue snapshotting the BoltDB database on controller v1.6.6.

error: error updating database/snapshot instance in Ziti Edge Controller at ``https://<redacted>/edge/management/v1``. Status code: 500 Internal Server Error, Server returned: {
"error": {
"cause": {
"code": "UNHANDLED",
"message": "EOF"
},
"code": "UNHANDLED",
"message": "An unhandled error occurred",
"requestId": "oExSqA-jO"
},
"meta": {
"apiEnrollmentVersion": "0.0.1",
"apiVersion": "0.0.1"
}
}

When I check the controller logs, I see this:

{"file":"github.com/openziti/storage@v0.4.22/boltz/db.go:261","func":"github.com/openziti/storage/boltz.(*DbImpl).SnapshotInTx","level":"info","msg":"snapshotting database to file","path":"/persistent/ctrl.db-20251120-195237","time":"2025-11-20T19:52:37.742Z"}
{"error":"EOF","file":"github.com/openziti/ziti/controller/api/responder.go:133","func":"github.com/openziti/ziti/controller/api.(*ResponderImpl).RespondWithError.func1","level":"error","msg":"unhandled error returned to REST API","time":"2025-11-20T19:52:37.994Z","uri":"/edge/management/v1/database/snapshot"}
{"file":"github.com/openziti/storage@v0.4.22/boltz/db.go:261","func":"github.com/openziti/storage/boltz.(*DbImpl).SnapshotInTx","level":"info","msg":"snapshotting database to file","path":"/persistent/ctrl.db-20251121-184838","time":"2025-11-21T18:48:38.667Z"}
{"error":"EOF","file":"github.com/openziti/ziti/controller/api/responder.go:133","func":"github.com/openziti/ziti/controller/api.(*ResponderImpl).RespondWithError.func1","level":"error","msg":"unhandled error returned to REST API","time":"2025-11-21T18:48:38.775Z","uri":"/edge/management/v1/database/snapshot"}

This error is preventing me from upgrading my controller pod to any version past 1.6.6, as it seems the controller is programmed to take a snapshot on startup.

This led me to suspect my ctrl.db got corrupted somehow, so I checked the db status with the ziti command:

#ziti edge db check-integrity-status
In Progress: false
Fixing Errors: false
Too Many Errors: false (if true, additional errors can be found in controller log)
Started At: 2025-11-21T19:00:39.625Z
Finished At: 2025-11-21T19:00:39.981Z
Operation Error:
Issue: unique index externalJwtSigners.kid missing value for id […]rN1F. Fixed: false

So I made a copy of the db and opened it to check whether there is a value for the kid, and I do see one.

I am not sure if this is a bug, or perhaps I inadvertently corrupted something.

Any help would be greatly appreciated and thank you!

I also want to add that ziti does actually create the snapshot: a db file is created in the folder with all the same values.

Weird, I just noticed that the KID value in the GUI is empty.


Is that normal?

Hello @Tetrusp

This is not something we've seen before.

Couple of things to note:

  1. The controller should only perform a bbolt snapshot on startup if there's a data model migration. It does the snapshot to ensure that if something goes wrong while it's migrating data from one format to another, we've still got the old DB available to restore from.
  2. The check-integrity-status tool only checks the ziti constraints; it does not check the internal consistency of the bbolt data structures. There is a tool for that from the bbolt project: bbolt check <path to db file>

I would definitely try the bbolt check to make sure the db file is ok. I tried to track down the EOF you're seeing, and it looks like it can come from the bbolt.Tx.CopyFile method, specifically from inside the WriteTo method. It looks like it happens when the copy is expecting more data from bbolt than it gets, as opposed to an issue with disk space (which I would expect to surface as a write error rather than an EOF).

We should probably add a startup flag that disables the back-up copy. In the meantime, other things you could try:

Compact the DB

  1. stop the controller
  2. make a manual copy of the db file
  3. use the compact tool: ziti ops db compact <src> <dst> to compact your db. If there's an internal bbolt issue, bbolt may fix it when writing out the compacted data. It may also exhibit the same problem as the snapshot.
  4. Try starting the controller with the new compacted db file

If that doesn't work:

Import/Export
There's a beta import/export tool you can try. However, it has an issue where identities that get re-imported don't authenticate properly, which I need to look into. So it'll restore most things, but arguably the most important thing, identities, won't work properly. I'll see if I can dig into that a bit over the next week or so.

  1. ziti ops export > entities.json
  2. stop your controller
  3. make a manual backup of the db file
  4. remove the db file and re-initialize the controller
  5. ziti ops import entities.json

Let me know what you find,
Paul

Good afternoon Plorenz,

Thank you for your time and help with this!

I didn’t see that there was a compact db command. That fixed my issue. I'm not sure what was wrong with the db file, but I was able to update my controller and snapshots don't fail anymore.

Thanks again
