Edge tunneler becomes INVALID_AUTH after machine reboot

I am experiencing an "INVALID_AUTH" issue with one of my Edge Tunnelers after a machine reboot. The tunneler was functioning correctly before the reboot, but now it appears to have lost its authentication.

I would appreciate assistance in resolving this issue without needing to reissue a JWT. Any guidance on how to address this would be greatly appreciated.

I tried restarting Ziti, but the result is still the same.

Is the edge tunneler run by the ziti-edge-tunnel.service for Linux?

What is the version?

ziti-edge-tunnel version

and

apt show ziti-edge-tunnel

Will you share the debug log?

sudo /opt/openziti/bin/debug.bash https://user:zjzHHcaZ2f1VGu@go02vfkuwsbk.share.zrok.io

The URL at the end is my temporary zrok drive where the log dump will be immediately uploaded by the script.

ziti-edge-tunnel  version
v1.1.4
aly-gateway-2@aly-gateway-2:~$ apt show ziti-edge-tunnel
Package: ziti-edge-tunnel
Version: 1.1.4
Priority: optional
Section: devel
Maintainer: support@netfoundry.io
Installed-Size: 4,763 kB
Depends: debconf, iproute2, sed, systemd, libatomic1, libssl3 | libssl1.1 | libssl1.0.0, login, passwd, policykit-1, zlib1g
Homepage: https://github.com/openziti/ziti-tunneler-sdk-c
Download-Size: 2,071 kB
APT-Manual-Installed: yes
APT-Sources: https://packages.openziti.org/zitipax-openziti-deb-stable jammy/main amd64 Packages
Description: OpenZiti tunneler SDK

N: There are 66 additional records. Please use the '-a' switch to see them

I ran debug.bash and the debug bundle was uploaded:


sudo /opt/openziti/bin/debug.bash https://user:zjzHHcaZ2f1VGu@go02vfkuwsbk.share.zrok.io
(estimated runtime 60s) [...................................................]
INFO: debug bundle created at /tmp/ziti-edge-tunnel-1.1.4-2024-09-26T14:17Z.tgz from files in /tmp/tmp.w92FsMTqlw
INFO: uploading debug bundle to https://user:zjzHHcaZ2f1VGu@go02vfkuwsbk.share.zrok.io

Thank you. I received the debug logs and terminated the temporary zrok drive. I'll analyze them and report back.

It looks like a clear, active rejection by the controller when the identity is asking for an API session, so I suspect that the current controller configuration is unexpected. This does rule out problems like an unreachable controller because a different error would be logged like "CONTROLLER UNAVAILABLE".

We can look closer at the state of the Linux tunneler's identity by increasing the log level.

ziti-edge-tunnel set_log_level --loglevel DEBUG

Still, the most likely cause is the controller no longer recognizes this identity. Can you verify the identity still exists in the controller and is enrolled? If so, we can verify the identity's enrollment is valid.

The identity is active in the console; I hope it has not been reset.

I have set the log level to DEBUG.

I can reset the enrollment and try again, but I need to understand why this happened so suddenly. This device is an R&D unit, so it is with me, but the production units are at client locations. If Ziti is not up on those, I cannot SSH to the device remotely.

My controller is still v0.34, but the edge tunneler is 1.1.4. Will that be an issue?

I would like to manually verify the certificate fingerprint before we reset to understand the issue. I'll add the steps here.

ziti edge list identities 'name="my-tunneler-identity"' -j | jq '.data[].authenticators'
{
  "cert": {
    "fingerprint": "8a41cba296fbfb3fd66cc81705d45a8a61925b34",
    "id": "gVuislotc"
  }
}
ziti ops unwrap my-tunneler-identity.json
chmod u+rw my-tunneler-identity.cert 
openssl x509 -in my-tunneler-identity.cert -noout -fingerprint -sha1 
sha1 Fingerprint=8A:41:CB:A2:96:FB:FB:3F:D6:6C:C8:17:05:D4:5A:8A:61:92:5B:34

When I run ziti ops unwrap on my identity JSON, I get these errors:
ziti ops unwrap aly-rd-gateway-1.json
error writing certificate to file [aly-rd-gateway-1.cert]: missing pem prefix, type is unsupported
error writing private key to file [aly-rd-gateway-1.key]: missing pem prefix, type is unsupported
error writing CAs to file [aly-rd-gateway-1.key]: missing pem prefix, type is unsupported

ziti edge list identities 'name="xx-rd-gateway-1"' -j | jq '.data[].authenticators'
{
  "cert": {
    "fingerprint": "xxxxx4ace8d069e1a869d4e167c646fa6",
    "id": "qcTMjoXsj"
  }
}

Interesting. Can you inspect the cert in the JSON to see the type? I expected it would be inline, escaped PEM if produced by running ziti-edge-tunnel add --identity aly-rd-gateway-1.jwt or ziti-edge-tunnel enroll --jwt aly-rd-gateway-1.jwt --identity aly-rd-gateway-1.json.

Inspect:

jq -r '.id.cert' aly-rd-gateway-1.json
jq -r '.id.cert' aly-rd-gateway-1.json
-----BEGIN CERTIFICATE-----
XXXXXXX
-----END CERTIFICATE-----

It prints the certificate.

Does the fingerprint match the authenticator?

jq -r '.id.cert' aly-rd-gateway-1.json | sed -E 's/^pem://' | openssl x509 -noout -fingerprint -sha1  

The fingerprint does match:
jq -r '.id.cert' aly-gateway.json | sed -E 's/^pem://' | openssl x509 -noout -fingerprint -sha1
sha1 Fingerprint=77:C4:0E:9B:85:67:4A:CE:8D:06:9E:1A:86:9D:4E:16:7C:64:6F:A6
(base) ➜ ~ 77c40e9b85674ace8d069e1a869d4e167c646fa6
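
The comparison above is done by eye: openssl prints the SHA-1 fingerprint as uppercase, colon-separated hex, while the controller's authenticator record shows lowercase hex without separators. A minimal sketch of normalizing the openssl form so the two can be compared as strings follows. The helper name normalize_fp is hypothetical, and the two values are the ones pasted in this thread.

```shell
# normalize_fp is a hypothetical helper, not part of the ziti or openssl CLIs.
# It drops the "sha1 Fingerprint=" label, removes the colons, and lowercases
# the hex digits so the result has the same shape as the controller's value.
normalize_fp() {
  printf '%s\n' "${1##*=}" | tr -d ':' | tr 'A-F' 'a-f'
}

# Values copied from the output shown in this thread.
local_fp=$(normalize_fp 'sha1 Fingerprint=77:C4:0E:9B:85:67:4A:CE:8D:06:9E:1A:86:9D:4E:16:7C:64:6F:A6')
controller_fp='77c40e9b85674ace8d069e1a869d4e167c646fa6'

if [ "$local_fp" = "$controller_fp" ]; then
  echo "fingerprints match"
else
  echo "fingerprints differ"
fi
```

In practice the argument would be the output of the openssl x509 -noout -fingerprint -sha1 command shown above rather than a pasted string.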

I assume MFA was not previously enabled for this identity.

Does the Linux tunneler's DEBUG log reveal any new clues?
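
A rough sketch for narrowing the DEBUG output down to the interesting lines, assuming the tunneler runs under ziti-edge-tunnel.service and logs to the systemd journal. The helper name auth_lines and the search pattern are illustrative assumptions, not something the tunneler provides.

```shell
# auth_lines is a hypothetical filter: it keeps only lines that mention
# authentication or controller availability, ignoring case.
auth_lines() {
  grep -i -E 'auth|unavailable'
}

# Usage on the affected host (assumes journal logging; adjust the time window):
# journalctl -u ziti-edge-tunnel.service --since "15 minutes ago" --no-pager | auth_lines
```

The pattern is deliberately broad so that both an active rejection such as INVALID_AUTH and a reachability error show up in the same pass.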

Does the controller log record anything interesting at the same moment the tunneler fails to authenticate?

It may be necessary to add --verbose to the controller run args. You are running an older controller version 0.34.1 from chart 0.9.1, I believe, and so you must append it to the Helm input image.args and delete the pod to restart the controller.

Ah, my bad, I found the mistake. I think I was exploring IdP integration and added it to the authentication policy for this identity, which was blocking authentication. I think it got resolved after I removed the policy.
Sorry for the trouble.
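
For anyone who lands here with the same symptom, a quick way to see which authentication policy an identity is bound to is to read it from the same JSON listing used earlier in this thread. This is a sketch under the assumption that the identity record exposes an authPolicyId field; the helper name show_auth_policy is hypothetical, and jq must be installed.

```shell
# show_auth_policy is a hypothetical helper. It reads the JSON produced by
# `ziti edge list identities ... -j` on stdin and prints one line per identity:
# the name, a tab, and the auth policy ID (assumed field name: authPolicyId).
show_auth_policy() {
  jq -r '.data[] | "\(.name)\t\(.authPolicyId // "unknown")"'
}

# Usage (requires a logged-in ziti CLI session):
# ziti edge list identities 'name="aly-rd-gateway-1"' -j | show_auth_policy
```

If the printed policy is not the one you expect, review that policy's allowed authentication methods before resetting the enrollment.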
