OIDC error in ZDE (HA Cluster)

Hiya

I have a pretty unremarkable ziti cluster with 3 controllers.

I've generated identities, and when I attempt to add these identities to ZDE clients, they hit an OIDC error.

I haven't configured OIDC in any way at all. My web config:

web:
  - name: public
    bindPoints:
      - interface: 0.0.0.0:1280
        address:   01.some.domain.com:1280
      - interface: 0.0.0.0:10080
        address:   01.some.domain.com:10080

    options:
      idleTimeout: 5000ms
      readTimeout: 5000ms
      writeTimeout: 100000ms
      minTLSVersion: TLS1.2
      maxTLSVersion: TLS1.3
    apis:
      - binding: zac
        options:
          location: /etc/ziti/controller/zac
          indexFile: index.html
      - binding: edge-management
        options:
          sessionTimeout: 30m
      - binding: health-checks
        options: {}
      - binding: fabric
        options: {}
      - binding: edge-client
        options: {}

I was running 1.6.3 originally and dropped back to 1.5.4, but the result is the same.

Has anyone seen this? Any ideas? I see other posts relating to this, but nothing specific.

tia!

Some A/B testing

Turning off controllers 02 and 03, reconfiguring 01 to not be in "cluster mode" (i.e., using the bolt DB again), and regenerating all identities/policies, ZDE enrols and operates correctly.

Putting controller 01 back into cluster mode, operating as a single node, and regenerating all identities and policies, ZDE still presents the same error.

The enrollment JWTs generated in the two modes are virtually identical.

So this would strongly suggest a bug, no?

It certainly appears that way to me. I've not seen that myself, but the HA controllers are still new, and we're still actively trying to find and fix these sorts of errors, so it's possible.

You don't have clear steps to reproduce, do you? Is it simply:

  • establish a 3-node cluster
  • enable an ext-jwt-signer
  • try logging in over and over, and once in a while it fails?

Any extra details or steps to reproduce would be really helpful to us. Thanks!

A few points of clarity, plus the solution.

Not ZDE-specific - using the ziti tunneler within ZDE produces the same error.

From my testing, a single controller, from scratch, running in 'cluster:' mode will produce the same results as a 3-node cluster.

To reproduce: simply configure the controller, don't configure anything to do with OIDC or external JWT signers, and init with ziti agent cluster init.... From that point on, any identities will fail to enroll.
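For concreteness, a repro sketch of the above (the identity name, address, and file paths are illustrative, the cluster-init arguments are elided as in my notes, and this assumes a freshly configured controller is already running):

```shell
# Fresh controller, nothing OIDC- or ext-jwt-related configured anywhere.
# Put it into cluster mode as a single node (arguments elided):
ziti agent cluster init ...

# Log in and create a test identity (names/paths are examples only):
ziti edge login 01.some.domain.com:1280 -u admin
ziti edge create identity user01 -o user01.jwt

# Enrolling this identity now fails with the OIDC error:
ziti edge enroll user01.jwt
```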

The solution:

Add an edge-oidc binding to the apis in your web: config.

Seems odd, but it works.
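Concretely, against the config from my first post (other bindings and options trimmed for brevity):

```yaml
web:
  - name: public
    bindPoints:
      - interface: 0.0.0.0:1280
        address:   01.some.domain.com:1280
    apis:
      - binding: edge-management
        options:
          sessionTimeout: 30m
      - binding: edge-client
        options: {}
      # The missing piece - without this, enrollment in cluster mode fails:
      - binding: edge-oidc
        options: {}
```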


The OIDC API is fundamentally different from the other two APIs (Client and Management) and, as such, lives as its own API. It is enabled separately through its own XWeb binding (edge-oidc) and must be enabled on each controller you wish to have handle OIDC auth.

This allows each controller to be configured as to whether it will handle authentication processing - e.g., allowing deployments to offload authentication to specific controllers (regional, voting, non-voting, etc.). As with anything, the additional flexibility adds complexity.

Further, there is a difference in how the APIs are developed and what they do and don't allow from a specification standpoint, which keeps the OIDC API from being merged into the other APIs.

The Client and Management APIs are generated from OpenAPI specs, developed and managed by the OpenZiti team. They have their own generated backends based on the OpenAPI toolset and are JSON web APIs.

The OIDC backend is provided by Zitadel's OIDC library, which has its own backend. Additionally, it follows the OIDC spec, which is not strictly a JSON web API and has its own quirks due to its development history. It is also not defined by an OpenAPI specification.

Both backends (Client/Management vs OIDC) have different assumptions about how default routes should be handled, including who or what handles the root path, allowed inputs and outputs, and so on. They don't play well together on the same HTTP listener instance. I looked into making it happen, and as the workarounds piled up, I abandoned that path for fear of the maintenance cost of keeping the workarounds "working".

So, in short: yes, the OIDC API is required for HA; yes, you get to decide which controllers run it; and yes, having no controller support it causes issues.
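In other words, per controller it comes down to whether the edge-oidc binding appears in that controller's web: apis. A sketch of a deployment that offloads auth to a dedicated controller (file names and layout are illustrative, not a prescribed convention):

```yaml
# auth-controller.yml - this controller handles OIDC authentication
apis:
  - binding: edge-client
    options: {}
  - binding: edge-oidc
    options: {}
```

```yaml
# data-controller.yml - no edge-oidc binding, so no auth processing here
apis:
  - binding: edge-client
    options: {}
```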