HA Cluster: What still works on an isolated controller without quorum?

Hi everyone,

I'm designing a 3-controller HA deployment and want to make sure I fully understand the failure modes before going to production.

The failure scenario I'm concerned about: 2 of 3 voting controllers go offline. The remaining controller C has lost quorum but is still reachable by edge clients.

From the documentation I understand that:

  • C can still serve reads from its local data model (possibly stale)
  • C cannot forward writes to the leader since no leader can be elected
  • Existing circuits will remain up but can't be rerouted

What I haven't been able to determine from the docs is how session management behaves on the remaining controller:

  1. Can new clients authenticate and create API sessions against C?
  2. Can already-authenticated clients create new service sessions (e.g., to access a service they haven't dialed yet)?
  3. Does new circuit creation require a Raft commit, or can C handle it locally?
  4. If sessions do require writes, do clients get a clear error, or do they hang/timeout?

Basically: if C is the only controller my clients can see, what exactly can they still do and what breaks?

Thanks in advance for any insight!

Hi @jnsfndr

I've got to add this to our HA documentation, because it isn't there in any consolidated form yet.

So:

What still works

Management API Reads

Controller C can still serve REST API reads (list services, get identities, check policies, etc.) from its local data store. The data may be slightly stale (it reflects the last Raft index C received before quorum was lost), but all GET/LIST operations work fine since they read directly from the local database without touching Raft.

Sessions

API sessions (authentication): New clients can authenticate against C. API sessions do not require Raft consensus. When using OIDC (which is required for HA failover), API sessions are issued as signed JWTs entirely in memory; no database write is involved at all. Even on the legacy authentication path, the session is written only to the controller's local database, which doesn't go through Raft.
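To illustrate why JWT-based API sessions need no database write or Raft commit, here's a minimal sketch (not OpenZiti's actual code; HMAC stands in for the controller's real signing scheme, and all names are hypothetical) showing that issuing and verifying a signed token is pure local computation:

```python
import base64, hashlib, hmac, json, time

SIGNING_KEY = b"controller-local-secret"  # hypothetical; a real controller uses its own key material

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def issue_session_token(identity: str, ttl_s: int = 3600) -> str:
    # Issuing the token is pure computation: no database write, no Raft commit.
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps({"sub": identity, "exp": int(time.time()) + ttl_s}).encode())
    sig = b64url(hmac.new(SIGNING_KEY, f"{header}.{payload}".encode(), hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify_session_token(token: str) -> dict:
    # Verification is also local: recompute the signature and check expiry.
    header, payload, sig = token.split(".")
    expected = b64url(hmac.new(SIGNING_KEY, f"{header}.{payload}".encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    if claims["exp"] < time.time():
        raise ValueError("expired")
    return claims

token = issue_session_token("client-42")
print(verify_session_token(token)["sub"])  # client-42
```

Nothing in either function touches disk or the network, which is why a quorum-less controller can keep authenticating clients.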

Service sessions have the same characteristics as API sessions. Posture check enforcement now happens in the router, so that should also be unaffected.

Circuit Creation

Circuit creation does not require a Raft commit. Circuits are entirely in-memory objects on the controller that creates them. The flow is all local reads and router communication. No Raft involvement. If a client has (or can create) a valid service session, it can create new circuits through C even without quorum.

Existing Circuits

Circuits are owned by the controller that created them. Once the route messages have been sent and the routers have their forwarding state, the data plane operates independently of the controller. This has important implications:

Circuits owned by C (the surviving controller):

  • Continue to work normally
  • Can be rerouted if a link or router in the path goes down. C has the circuit context and can compute an alternate path

Circuits owned by one of the two controllers that went down:

  • Continue to work because routers hold the forwarding state and keep moving data independently of the controller
  • Get torn down as usual when the circuit completes (client closes the connection, etc.)
  • Cannot be rerouted. If a link or router in the path fails, the owning controller isn't around to compute and install an alternate path, so the circuit breaks
  • Even if the owning controller comes back up, the in-memory circuit context is lost, so it still can't reroute those circuits
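The ownership rules above can be summarized in a small model (a sketch with hypothetical names, not controller code): the data plane depends only on the routers, while rerouting depends on the owning controller being alive with its in-memory context intact.

```python
from dataclasses import dataclass

@dataclass
class Circuit:
    id: str
    owner: str  # controller that created it and holds the in-memory circuit context

def can_forward(circuit: Circuit, routers_up: bool) -> bool:
    # Data plane: routers hold the forwarding state, independent of any controller.
    return routers_up

def can_reroute(circuit: Circuit, live_controllers: set[str]) -> bool:
    # Control plane: only the owning controller can compute an alternate path.
    # A restarted controller has lost its in-memory context, so it would not
    # count as "live" for circuits created before the restart.
    return circuit.owner in live_controllers

live = {"C"}  # controllers A and B are down
mine = Circuit("circ-1", owner="C")
orphan = Circuit("circ-2", owner="A")
print(can_forward(orphan, True), can_reroute(mine, live), can_reroute(orphan, live))  # True True False
```

The orphaned circuit keeps forwarding traffic, but the first path failure it hits is fatal because no controller can repair it.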

What doesn't work

Model Mutations

The things that require Raft are durable model mutations: creating/updating/deleting services, identities, routers, policies, edge router policies, terminators, and similar configuration entities. On C without quorum, any REST API call that tries to write to these will receive an HTTP 503 with error code CLUSTER_NO_LEADER. The error comes back promptly; clients won't hang.
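For clients that want to degrade gracefully, here's a sketch of how you might classify that failure on the client side. The 503 status and CLUSTER_NO_LEADER code are from the behavior described above; the exact error payload shape is an assumption for illustration:

```python
# Hypothetical client-side handling of a write attempt against a quorum-less
# controller. The error payload shape here is assumed, not taken from the API spec.

def classify_write_error(status: int, body: dict) -> str:
    code = body.get("error", {}).get("code", "")
    if status == 503 and code == "CLUSTER_NO_LEADER":
        # The cluster has no leader: fail fast and keep operating read-only
        # rather than retrying the write in a tight loop.
        return "no-leader"
    if 500 <= status < 600:
        return "retry-with-backoff"
    return "do-not-retry"

print(classify_write_error(503, {"error": {"code": "CLUSTER_NO_LEADER"}}))  # no-leader
```

Because the controller answers promptly rather than hanging, the client can make this decision immediately and fall back to its read-only capabilities.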

Biggest Concern

The main thing that could impact operations is that new terminators can't be created. If a hosting router or SDK goes down, the terminator may not come back. In some cases it may be OK because we cache the terminator ID, in hopes of avoiding an unnecessary write for transient failures, but this doesn't cover all cases.

Hope that's helpful,
Paul
