K8S Router Enrollment CSR Issue

I’m having major trouble enrolling a router hosted in EKS with a controller that is also hosted in EKS. I have deployed the ziti-controller (v0.28.0) and ziti-console (2.6.9) and want to deploy and enroll a ziti-router (v0.28.0) in the same EKS cluster. But as soon as I deploy the router with Helm, I get the following error from Helm:

Error: INSTALLATION FAILED: failed post-install: job failed: BackoffLimitExceeded

When I check the post-install job, I see the following logs:

INFO: identity secret does not exist, attempting router enrollment
+ echo 'INFO: identity secret does not exist, attempting router enrollment'
+ mkdir -v /tmp/ziti-router-identity
mkdir: created directory '/tmp/ziti-router-identity'
+ ziti router enroll /etc/ziti/config/ziti-router.yaml --jwt /etc/ziti/config/enrollment.jwt --verbose
[   0.000]   DEBUG ziti/ziti/util.LogReleaseVersionCheck: ZITI_CHECK_VERSION is not 'true'. skipping version check
[   0.024]   DEBUG edge/router/enroll.(*RestEnroller).Enroll: JWT parsed
[   5.167]   FATAL ziti/ziti/router.enrollGw: enrollment failure: (enrollment failed recieved HTTP status [400 Bad Request]: {"error":{"cause":{"code":"UNHANDLED","message":"csrPem must not be null or empty"},"code":"COULD_NOT_PROCESS_CSR","message":"The supplied csr could not be processed","requestId":"hgdPnBbm-"},"meta":{"apiEnrollmentVersion":"0.0.1","apiVersion":"0.0.1"}}
)

The ziti-controller logs show nothing about this issue. Only when I run it in verbose mode do I get:

[  31.990]   DEBUG fabric/events.(*entityChangeEventDispatcher).processPreviousTxEvents: {txId=[16]} cleaning up entity change events for tx
**[  91.232]   DEBUG edge/controller/model.(*identityStatusMap).HasEdgeRouterConnection: {identityId=[SvnegBbTv]} reporting identity from active ER conn pool: not found**
[  95.497]   DEBUG fabric/events.(*entityChangeEventDispatcher).flushLoop: cleaning up entity change events
[  95.497]   DEBUG fabric/events.(*entityChangeEventDispatcher).processPreviousTxEvents: {txId=[18]} cleaning up entity change events for tx

I also tried controller and router version v0.28.4, but still no luck.
Could you help me out here?

BR
Jan

Hrmmm. That seems to me like the automation that creates the identity/JWT has failed to put the secret into place. @qrkourier is away for a few days; maybe @dariuszSki has done this recently and can answer. I haven’t tried it recently enough to know what’s changed. I can try it all a bit later today, though, if nobody else from the community is able to help out.

The controller won’t have the relevant logs here, just the router. It seems like a process on the router side is failing in some way, given the error “csrPem must not be null or empty”.

Can you look into the controller with the ziti CLI or with ZAC and see if the router still exists? Maybe it’s trying to recreate the same one?

I’ll try to poke back later if nobody else gets to this…

This is what I can see in the ZAC:

So not really a hint about what is going wrong.

Sorry, what about the edge router screen? Could you show that too, please? That’s the one I’m wondering about. I’m wondering if maybe the automation is trying to add a router that already exists and that’s the bug.

Oh ok I’ve found the issue :smiley:
It was actually my fault. I created a new Identity in the identities section of the ZAC instead of a new router in the router section…

Most likely the JWT contains a different set of information then?

Oh great! Glad to hear you got it sorted! Yes, router registration is “not quite” the same as “regular” identities.
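For anyone who hits the same error: a quick way to see what an enrollment JWT contains is to decode its payload segment before handing it to the router. The snippet below is only a minimal sketch. It does not verify the signature, and the `em` (enrollment method) claim name and the `ott` value in the sample are assumptions about how the token is laid out rather than something confirmed in this thread, so check the claims of your own token. `decode_jwt_payload`, `b64url`, and `sample_token` are hypothetical names used for illustration.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def b64url(data: bytes) -> str:
    """Encode bytes the way JWT segments are encoded (base64url, no padding)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# Build a throwaway, unsigned token so the sketch runs without a real JWT.
# The "em" claim name and the "ott" value are assumptions, not confirmed here.
header = b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
claims = b64url(json.dumps({"em": "ott", "sub": "example-id"}).encode())
sample_token = f"{header}.{claims}.sig"

print(decode_jwt_payload(sample_token))
```

To inspect a real token, read the contents of the JWT file (for example the `enrollment.jwt` mentioned in the job log above) and pass that string to `decode_jwt_payload`. Comparing the output for a token created in the identities section with one created in the router section should show how the two differ.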

Probably there is a way to return a more meaningful log message in the router logs :slight_smile:
This would have saved me from two days of root cause investigation :sweat_smile:

That definitely seems like a good idea. I’ll file that issue.

Filed here: more detailed error when using an identity jwt to enroll a router · Issue #1183 · openziti/ziti · GitHub
