Hi there,
I’ve been testing HA Controller functionality under various high load conditions.
I’m currently using version 1.6.7 in HA mode with 3 Controllers and 2 Routers.
I’ve deliberately created ~135k identities to understand whether a large database has any side effects. The identities are not being used to connect SDK tunnellers, so they have no associated Terminators.
Here’s the entityCount from my system for visibility.
{"namespace":"entityCount","event_src_id":"ctrl-0","timestamp":"2025-08-28T14:49:47.224510788Z","counts":{"apiSessionCertificates":0,"apiSessions":1,"authPolicies":1,"authenticators":2,"cas":0,"configTypes":5,"configs":2,"controllers":3,"edgeRouterPolicies":2,"enrollments":134492,"eventualEvents":0,"externalJwtSigners":0,"identities":134492,"identityTypes":2,"mfas":0,"postureCheckTypes":5,"postureChecks":0,"revocations":0,"routers":4,"routers.edge":4,"serviceEdgeRouterPolicies":1,"servicePolicies":2,"services":1,"services.edge":1,"sessions":0,"terminators":1}}
My database is currently about 1.7G on disk.
[ziggy@ziti-controller-0 ~]$ ls -lah /etc/ziti/config/ctrl-ha.db
-rw-rw---- 1 ziggy ziggy 1.7G Aug 28 14:50 /etc/ziti/config/ctrl-ha.db
One issue I’ve found is that once the DB gets relatively large, the ziti edge create edge-router … command often fails with a 503 returned to the CLI client.
The command I run is:
ziti edge create edge-router test --timeout 1000
and after about 10 seconds I get the following response:
error: error creating edge-routers instance in Ziti Edge Controller at https://localhost:9443/edge/management/v1. Status code: 503 Service Unavailable, Server returned: {
"error": {
"code": "TIMEOUT",
"message": "The requested operation took too much time to reply",
"requestId": "CvTvWIPAH"
},
"meta": {
"apiEnrollmentVersion": "0.0.1",
"apiVersion": "0.0.1"
}
}
In the Controller log I see:
{"file":"github.com/openziti/ziti/controller/raft/fsm.go:256","func":"github.com/openziti/ziti/controller/raft.(*BoltDbFsm).Apply","index":134827,"level":"info","msg":"apply log with type *model.CreateEdgeRouterCmd","time":"2025-08-28T14:42:01.694Z"}
{"_context":"tls:0.0.0.0:7443","error":"EOF","file":"github.com/openziti/transport/v2@v2.0.183/tls/listener.go:260","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","remote":"10.224.0.65:55878","time":"2025-08-28T14:42:07.608Z"}
{"file":"github.com/openziti/ziti/controller/api/timeouts.go:127","func":"github.com/openziti/ziti/controller/api.(*timeoutHandler).ServeHTTP","level":"error","method":"POST","msg":"timeout for request hit, returning Service Unavailable 503","time":"2025-08-28T14:42:11.685Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/edge/management/v1/edge-routers","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"namespace":"entityCount","event_src_id":"ctrl-0","timestamp":"2025-08-28T14:42:10.224087541Z","counts":{"apiSessionCertificates":0,"apiSessions":1,"authPolicies":1,"authenticators":2,"cas":0,"configTypes":5,"configs":2,"controllers":3,"edgeRouterPolicies":2,"enrollments":134489,"eventualEvents":0,"externalJwtSigners":0,"identities":134492,"identityTypes":2,"mfas":0,"postureCheckTypes":5,"postureChecks":0,"revocations":0,"routers":1,"routers.edge":1,"serviceEdgeRouterPolicies":1,"servicePolicies":2,"services":1,"services.edge":1,"sessions":0,"terminators":1}}
{"_context":"tls:0.0.0.0:7443","error":"EOF","file":"github.com/openziti/transport/v2@v2.0.183/tls/listener.go:260","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","remote":"10.224.0.65:45380","time":"2025-08-28T14:42:17.607Z"}
{"error":"http: Handler timeout","file":"github.com/openziti/ziti/controller/api/responder.go:126","func":"github.com/openziti/ziti/controller/api.(*ResponderImpl).RespondWithProducer","level":"error","msg":"could not respond, writing to response failed","path":"/edge/management/v1/edge-routers","requestId":"CvTvWIPAH","time":"2025-08-28T14:42:17.977Z"}
After about 30 seconds, the router does exist when I run ziti edge list edge-routers, so the create succeeds even though the CLI reports a failure.
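To make the timing easier to see, here is a rough Python sketch that computes the offsets between the three log entries tied to this request. The timestamps are copied from the log lines above; the event labels are my own shorthand, and the reading that the create finishes around the last entry is an assumption on my part, not something the logs state.

```python
from datetime import datetime

# Timestamps copied verbatim from the controller log entries above.
# The labels are shorthand for this post, not official event names.
events = {
    "raft apply (CreateEdgeRouterCmd)": "2025-08-28T14:42:01.694Z",
    "API timeout, 503 returned": "2025-08-28T14:42:11.685Z",
    "response write failed (http: Handler timeout)": "2025-08-28T14:42:17.977Z",
}


def parse(ts: str) -> datetime:
    """Parse the millisecond-precision UTC timestamps used in the log."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def elapsed_since_first(entries: dict) -> dict:
    """Return seconds elapsed from the earliest entry to each entry."""
    times = {name: parse(ts) for name, ts in entries.items()}
    start = min(times.values())
    return {
        name: round((t - start).total_seconds(), 3)
        for name, t in times.items()
    }


offsets = elapsed_since_first(events)
for name, secs in offsets.items():
    print(f"{secs:7.3f}s  {name}")
```

So the 503 arrives almost exactly 10 seconds after the raft apply is logged, and the handler only tries to write its response roughly 6 seconds after that.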
Other “creation” type CLI operations complete promptly, as expected. For example, ziti edge create identity... works fine.
Thanks in advance.