No api session found for token

zrok stopped working in HA mode. It is very unstable.

Here is the log of a ziti router. The system's log is full of errors.

Apr 27 22:28:28 server.name ziti[13143]: {"error":"no api session found for token [....ab1476], subjects [[CN=EJcj8MF8e,O=NetFoundry,C=US CN=NetFoundry Inc. Intermediate CA yXPqI1xuf,OU=ADV-DEV,O=NetFoundry,L=Charlotte,C=US]]","file":"github.com/openziti/channel/v3@v3.0.39/impl.go:124","func":"github.com/openziti/channel/v3.AcceptNextChannel.func1","level":"error","msg":"failure accepting channel edge with underlay u{classic}-\u003ei{EJcj8MF8e/Lbya}","time":"2025-04-27T22:28:28.169Z"}
Apr 27 22:29:41 server.name ziti[13143]: {"error":"token is unverifiable: error while executing keyfunc: public key not found","file":"github.com/openziti/ziti/router/state/manager.go:715","func":"github.com/openziti/ziti/router/state.(*ManagerImpl).GetApiSession","level":"error","msg":"JWT validation failed","time":"2025-04-27T22:29:41.414Z"}
Apr 27 22:29:41 server.name ziti[13143]: {"error":"token is unverifiable: error while executing keyfunc: public key not found","file":"github.com/openziti/ziti/router/state/manager.go:715","func":"github.com/openziti/ziti/router/state.(*ManagerImpl).GetApiSession","level":"error","msg":"JWT validation failed","time":"2025-04-27T22:29:41.543Z"}
Apr 27 22:29:41 server.name ziti[13143]: {"error":"token is unverifiable: error while executing keyfunc: public key not found","file":"github.com/openziti/ziti/router/state/manager.go:715","func":"github.com/openziti/ziti/router/state.(*ManagerImpl).GetApiSession","level":"error","msg":"JWT validation failed","time":"2025-04-27T22:29:41.800Z"}
Apr 27 22:29:42 server.name ziti[13143]: {"error":"token is unverifiable: error while executing keyfunc: public key not found","file":"github.com/openziti/ziti/router/state/manager.go:715","func":"github.com/openziti/ziti/router/state.(*ManagerImpl).GetApiSession","level":"error","msg":"JWT validation failed","time":"2025-04-27T22:29:42.313Z"}
Apr 27 22:29:43 server.name ziti[13143]: {"error":"token is unverifiable: error while executing keyfunc: public key not found","file":"github.com/openziti/ziti/router/state/manager.go:715","func":"github.com/openziti/ziti/router/state.(*ManagerImpl).GetApiSession","level":"error","msg":"JWT validation failed","time":"2025-04-27T22:29:43.338Z"}
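With this many repeated lines, it can help to collapse the log into counts per message. Below is a minimal Python sketch, assuming each line has the shape shown above (a syslog prefix followed by one JSON object with `level`, `msg`, and optionally `error`). The function name `summarize` is only illustrative, and lines whose JSON is truncated or missing are tallied separately rather than parsed.

```python
import json
from collections import Counter


def summarize(lines):
    """Count router log entries by (level, msg, error).

    Assumes each line looks like '<syslog prefix>: {json object}'.
    Lines without parseable JSON are tallied under 'unparsed'.
    """
    counts = Counter()
    for line in lines:
        start = line.find("{")
        if start == -1:
            # No JSON object on this line at all.
            counts[("unparsed", "", "")] += 1
            continue
        try:
            entry = json.loads(line[start:])
        except json.JSONDecodeError:
            # Truncated or otherwise damaged JSON payload.
            counts[("unparsed", "", "")] += 1
            continue
        key = (entry.get("level", ""), entry.get("msg", ""), entry.get("error", ""))
        counts[key] += 1
    return counts
```

Feeding it the saved router log, for example `summarize(open("router.log"))` with a hypothetical file name, and printing `most_common()` shows at a glance which errors dominate.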

zrok has not yet been updated to run against HA OpenZiti.

It works for about 5 minutes, then it stops and I need to restart the router. Ziti starts losing its terminators:
ziti-router's log:

Apr 28 13:03:52 ziti[1058949]: {"_context":"ch{edge}-\u003eu{classic}-\u003ei{eUSQm1C0R/YBkz}","chSeq":1,"connId":1,"edgeSeq":0,"file":"github.com/openziti/ziti/router/xgress_edge/listener.go:286","func":"github.com/openziti/ziti/router/xgress_edge.(*edgeClientConn).processBind","level":"error","msg":"no controller available, cannot create terminator","routerId":"lDR4m1zF1q","time":"2025-04-28T13:03:52.511Z","token"....type":"EdgeBindType"}
Apr 28 13:03:52 ziti[1058949]: {"connId":2,"file":"github.com/openziti/ziti/router/xgress_edge/listener.go:558","func":"github.com/openziti/ziti/router/xgress_edge.(*edgeClientConn).processUnbind","level":"info","msg":"no terminator found to unbind for token","time":"2025-04-28T13:03:52.516Z","token"....

It's not currently expected to operate with an HA controller.

I see that I need to copy the bbolt.db to another host every time I change the machine.
HA mode is not supported yet.

Yes, you may recover the controller database reliably from a snapshot: Controller Backup and Recovery | OpenZiti

Is it possible to switch to non-HA mode?

raft/ctrl-ha.db-20250428-082556
raft/ctrl-ha.db-20250428-115841
raft/ctrl-ha.db
raft/ctrl-ha.db.previous
raft/snapshots
raft/ctrl-ha.db-20250428-091648
raft/raft.db

It's possible, yes, but I recommend remaining in HA mode so there's no barrier to growing your controller cluster at a later date.

I assume the zrok error occurs only when the cluster has multiple members. If I'm wrong about that, and the zrok error occurs at all times in HA mode, then it will be necessary to migrate from HA to standalone. Will you let me know what you discover?

Unfortunately, it doesn't work even with a single-node cluster.
I changed the identity files to set "enableHa": true. It does not help.
/.zrok/identities/environment.json
./.zrok/identities/public.json
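To confirm what each identity file actually contains after an edit like this, a small Python sketch can list every place the key appears. It assumes only that the identity files are JSON and makes no assumption about where `enableHa` sits in the structure. The helper names `find_key` and `report` are hypothetical.

```python
import json


def find_key(obj, key, path=""):
    """Return (path, value) pairs for every occurrence of `key` in nested JSON data."""
    hits = []
    if isinstance(obj, dict):
        for k, v in obj.items():
            here = f"{path}/{k}"
            if k == key:
                hits.append((here, v))
            # Keep descending: the key could also appear deeper in the tree.
            hits.extend(find_key(v, key, here))
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            hits.extend(find_key(v, key, f"{path}/{i}"))
    return hits


def report(identity_file, key="enableHa"):
    """Load one identity file and list where `key` is set."""
    with open(identity_file) as f:
        return find_key(json.load(f), key)
```

Calling `report(...)` on each of the two files returns an empty list when the key is absent, which makes it easy to check that both files were changed consistently.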

The most common error is of this kind:
ziti-router's log:

Apr 28 12:36:38 hostname ziti[1051831]: {"ctrlId":"ovh","file":"github.com/openziti/ziti/router/state/dataState.go:43","func":"github.com/openziti/ziti/router/state.(*DataStateHandler).HandleReceive.func1","index":348,"level":"info","msg":"received full router data model state","time":"2025-04-28T12:36:38.642Z"}
Apr 28 12:36:38 hostname ziti[1051831]: {"file":"github.com/openziti/ziti/router/state/manager.go:510","func":"github.com/openziti/ziti/router/state.(*ManagerImpl).SetRouterDataModel","index":348,"level":"info","msg":"replacing router data model","time":"2025-04-28T12:36:38.643Z"}
Apr 28 12:36:38 hostname ziti[1051831]: {"existingIndex":342,"file":"github.com/openziti/ziti/router/state/manager.go:531","func":"github.com/openziti/ziti/router/state.(*ManagerImpl).SetRouterDataModel","index":348,"level":"info","msg":"router data model replacement complete, old: 0xc000dda700, new: 0xc00100bdc0","time":"2025-04-28T12:36:38.643Z"}
Apr 28 12:36:38 hostname ziti[1051831]: {"ctrlId":"ovh","file":"github.com/openziti/ziti/router/state/dataState.go:48","func":"github.com/openziti/ziti/router/state.(*DataStateHandler).HandleReceive.func1","index":348,"level":"info","msg":"finished processing full router data model state","time":"2025-04-28T12:36:38.643Z"}
Apr 28 12:36:38 hostname ziti[1051831]: {"file":"github.com/openziti/ziti/common/subscriber.go:446","func":"github.com/openziti/ziti/common.syncAllSubscribersEvent.process","level":"info","msg":"sync all subscribers","subs":1,"time":"2025-04-28T12:36:38.643Z"}
Apr 28 12:36:38 hostname ziti[1051831]: {"file":"github.com/openziti/ziti/common/subscriber.go:446","func":"github.com/openziti/ziti/common.syncAllSubscribersEvent.process","level":"info","msg":"sync all subscribers","subs":1,"time":"2025-04-28T12:36:38.721Z"}
Apr 28 12:36:39 hostname ziti[1051831]: {"file":"github.com/openziti/ziti/common/subscriber.go:446","func":"github.com/openziti/ziti/common.syncAllSubscribersEvent.process","level":"info","msg":"sync all subscribers","subs":1,"time":"2025-04-28T12:36:39.953Z"}
Apr 28 12:36:40 hostname ziti[1051831]: {"file":"github.com/openziti/ziti/common/subscriber.go:446","func":"github.com/openziti/ziti/common.syncAllSubscribersEvent.process","level":"info","msg":"sync all subscribers","subs":1,"time":"2025-04-28T12:36:40.871Z"}
Apr 28 12:36:49 hostname ziti[1051831]: {"_context":"ch{edge}-\u003eu{classic}-\u003ei{lY2W2RC0R/lzxN}","chSeq":1,"connId":1,"edgeSeq":0,"error":"service 1zfSRlT9KfwOCdVtnsTDRk has no terminators","file":"github.com/openziti/ziti/router/xgress_edge/listener.go:199","func":"github.com/openziti/ziti/router/xgress_edge.(*edgeClientConn).processConnect","level":"warning","msg":"failed to dial fabric","time":"2025-04-28T12:36:49.378Z","token":"eyJhbGciOiJSUzI1NiIsImtpZCI6ImVjZDUwYTg2MmJhNjEzYjBmYmU4YTU3NjVhNGI4MTc1ZTliOTMwNTgiLCJ0eXAiOiJKV1QifQ.eyJpc3MiOiJodHRwczovL292aC56aXRpLnJhbnRhbnBsYW4ucnU6OTkzIiwic3ViIjoiMXpmU1JsVDlLZndPQ2RWdG5zVERSayIsImF1ZCI6WyJvcGVueml0aSJdLCJleHAiOjE3NzczNzk4MDgsImlhdCI6MTc0NTg0MzgwOCwianRpIjoiNGE3NTdmYTEtODQ5ZS00MzRkLWI3MmUtNTRkODEzM2ZhYjRhIiwiel9hc2lkIjoiNDc4MTY3OTMtZmVmOS00N2UyLWFiNzMtM2FlODhmZmNmNjIxIiwiel9paWQiOiJsWTJXMlJDMFIiLCJ6X3QiOiJzIiwiel9zdCI6IkRpYWwifQ.uuNa2-IXxQfpN90YstrGTVwZew2bTZHSjNhefQKOoSFRXSYZ5dIo4DhaJbRmenle5wb-Q-duNJbESYk7JlplTKWL9zw4Qtwd1SP-iHLoFrKdCICCaO0JatTMjJfqPF5opp6nrqCDCzl_LP10fnFLeJlHqNm9iyPBia7FAjTOUOcOcROC_i2WUR9S2_8sId1ZOQAjJVfiXzDCVPImmIceuAXxeflwnhVmntnbxhv2dt-bEcft7T5vXCAwFuLlkvFNHZqQ12PWOvdwlUz4tU4f569HNRhlDQZCRLr9X6OB3D4N9YoOJZ9C2RJNDHu0mSVWJZMicN43pc7LqK_CR0rERJHSfX_H0HvKiDIfXXlQwa2UKiumq4gEKq7xKQH05JJSi1HZRaXq8xP-fl22yNf3-_MqBSGeYSKPYP0KKzo4kwHmWCFCAT78hTjenjv675of4TA5Y-1SWAVVh84RCGVGvMyol36WFI_QmM2dQWmGslRpj8lO-Q-6oNQ6OCVjshiymm0-7wsGYehsBkeINDcBLwD2ZLwdjxg0YqJ8u61w0dW9NhHyni8atblAUjhB_XZaoB6bfbKM_P5K12pZGOfGXZAiD22fjou0rQo-gFmYG8a5tCVXhxoAlpt58DmWpvPC064klh0v5YDO39HjB1AyTsn7SpzG3e5UrBC_QHnft3s","type":"EdgeConnectType"}
Apr 28 12:36:49 hostname ziti[1051831]: {"_context":"ch{edge}-\u003eu{classic}-\u003ei{lY2W2RC0R/lzxN}","chSeq":3,"connId":2,"edgeSeq":0,"error":"invalid api session id, expected 7e68ba1e-2ad3-4047-8177-e71528f58bb8, got 47816793-fef9-47e2-ab73-3ae88ffcf621","file":"github.com/openziti/ziti/router/xgress_edge/listener.go:199","func":"github.com/openziti/ziti/router/xgress_edge.(*edgeClientConn).processConnect","level":"warning","msg":"failed to dial fabric","time":"2025-04-28T12:36:49.470Z","token":"eyJhbGciOiJSUzI1NiIsImtpZCI6ImVjZDUwYTg2MmJhNjEzYjBmYmU4YTU3NjVhNGI4MTc1ZTliOTMwNTgiLCJ0eXAiOiJKV1QifQ.eyJpc3MiOiJodHRwczovL292aC56aXRpLnJhbnRhbnBsYW4ucnU6OTkzIiwic3ViIjoiMXpmU1JsVDlLZndPQ2RWdG5zVERSayIsImF1ZCI6WyJvcGVueml0aSJdLCJleHAiOjE3NzczNzk4MDgsImlhdCI6MTc0NTg0MzgwOCwianRpIjoiNGE3NTdmYTEtODQ5ZS00MzRkLWI3MmUtNTRkODEzM2ZhYjRhIiwiel9hc2lkIjoiNDc4MTY3OTMtZmVmOS00N2UyLWFiNzMtM2FlODhmZmNmNjIxIiwiel9paWQiOiJsWTJXMlJDMFIiLCJ6X3QiOiJzIiwiel9zdCI6IkRpYWwifQ.uuNa2-IXxQfpN90YstrGTVwZew2bTZHSjNhefQKOoSFRXSYZ5dIo4DhaJbRmenle5wb-Q-duNJbESYk7JlplTKWL9zw4Qtwd1SP-iHLoFrKdCICCaO0JatTMjJfqPF5opp6nrqCDCzl_LP10fnFLeJlHqNm9iyPBia7FAjTOUOcOcROC_i2WUR9S2_8sId1ZOQAjJVfiXzDCVPImmIceuAXxeflwnhVmntnbxhv2dt-bEcft7T5vXCAwFuLlkvFNHZqQ12PWOvdwlUz4tU4f569HNRhlDQZCRLr9X6OB3D4N9YoOJZ9C2RJNDHu0mSVWJZMicN43pc7LqK_CR0rERJHSfX_H0HvKiDIfXXlQwa2UKiumq4gEKq7xKQH05JJSi1HZRaXq8xP-fl22yNf3-_MqBSGeYSKPYP0KKzo4kwHmWCFCAT78hTjenjv675of4TA5Y-1SWAVVh84RCGVGvMyol36WFI_QmM2dQWmGslRpj8lO-Q-6oNQ6OCVjshiymm0-7wsGYehsBkeINDcBLwD2ZLwdjxg0YqJ8u61w0dW9NhHyni8atblAUjhB_XZaoB6bfbKM_P5K12pZGOfGXZAiD22fjou0rQo-gFmYG8a5tCVXhxoAlpt58DmWpvPC064klh0v5YDO39HjB1AyTsn7SpzG3e5UrBC_QHnft3s","type":"EdgeConnectType"}

The router repeats the last message until it stops working.

Apr 28 12:36:50 hostname ziti[1051831]: {"_context":"ch{edge}-\u003eu{classic}-\u003ei{lY2W2RC0R/N6QW}","chSeq":3,"connId":2,"edgeSeq":0,"error":"service 1zfSRlT9KfwOCdVtnsTDRk has no terminators","file":"github.com/openziti/ziti/router/xgress_edge/listener.go:199","func":"github.com/openziti/ziti/router/xgress_edge.(*edgeClientConn).processConnect","level":"warning","msg":"failed to dial fabric","time":"2025-04-28T12:36:50.388Z","token":"eyJhbGciOiJSUzI1NiIsImtpZCI6ImVjZDUwYTg2MmJhNjEzYjBmYmU4YTU3NjVhNGI4MTc1ZTliOTMwNTgiLCJ0eXAiOiJKV1QifQ.eyJpc3MiOiJodHRwczovL292aC56aXRpLnJhbnRhbnBsYW4ucnU6OTkzIiwic3ViIjoiMXpmU1JsVDlLZndPQ2RWdG5zVERSayIsImF1ZCI6WyJvcGVueml0aSJdLCJleHAiOjE3NzczNzk4MDksImlhdCI6MTc0NTg0MzgwOSwianRpIjoiMGZiYTZmNTUtZjFmOC00ZWJiLTg5ZDYtMmEyZmM0ZWU5OTI3Iiwiel9hc2lkIjoiN2U2OGJhMWUtMmFkMy00MDQ3LTgxNzctZTcxNTI4ZjU4YmI4Iiwiel9paWQiOiJsWTJXMlJDMFIiLCJ6X3QiOiJzIiwiel9zdCI6IkRpYWwifQ.rDZsYPp-f1l-Ker1qzM4oqkJR0jpXORTSsRX3CesbSddshXlSWe1EmsmdWvO6jk-Dgqz_Nest2dqHD3z9ssKZcSEvkBc9G-Iy7X1OAqqHV23MbU7Bhz1gTvBVPzLHrflTOFQumRygZP1APiaEvbsjrrYmjNXVS9F_9_4UhkdirXlsTC0egl_4LwPbFV5C94ASqEeX7v-O81OYmQcJoWlRTDSLZ2WYvOl39gOctsu0ybGCmaGMsjw3SjbgcqBAtlcP_IL1k9-BcIP8qvG2Zfq7RGFPYWBPKAKdPnjfFiGUstaC3VunLxB6gnraZCuj-tfx-7aAuQNbjU6rkzv3Qo-236ScQEKNrgsNhONsE8gHiAVHVj9YluwC-eyGpULeCb_3rRhIl8-MYDxqFpQnrKI4yqj-JhzkcT0iTXZqzFKziqfb183xN57MYt_N4UPsCBXL9ZIwjuHJbet4ZlOR8PRCSH_JVQNdzvLg-ZLbjwzYlsNwk6IuNDCDXVBUZqCz88qgUr_EZsqRJMvFuWpUwL9NLW7GCoGJicqw86XJQ9eB-KnMBj36VkiOIP7Ei9wc59OooryWD2Vy55AhK_f6LHPZlWpjPkDTfkWa-fZfiaLqIvIh6ZwkJhvXGMfYw5sRuVo1ZYVFwhq_tW7iJK9RKTxhYSErLXXWWjLOszVw4WRwjY","type":"EdgeConnectType"}
Apr 28 12:36:50 hostname ziti[1051831]: {"_context":"ch{edge}-\u003eu{classic}-\u003ei{lY2W2RC0R/3a8N}","chSeq":1,"connId":1,"edgeSeq":0,"error":"service 1zfSRlT9KfwOCdVtnsTDRk has no terminators","file":"github.com/openziti/ziti/router/xgress_edge/listener.go:199","func":"github.com/openziti/ziti/router/xgress_edge.(*edgeClientConn).processConnect","level":"warning","msg":"failed to dial fabric","time":"2025-04-28T12:36:50.587Z","token":"eyJhbGciOiJSUzI1NiIsImtpZCI6ImVjZDUwYTg2MmJhNjEzYjBmYmU4YTU3NjVhNGI4MTc1ZTliOTMwNTgiLCJ0eXAiOiJKV1QifQ.eyJpc3MiOiJodHRwczovL292aC56aXRpLnJhbnRhbnBsYW4ucnU6OTkzIiwic3ViIjoiMXpmU1JsVDlLZndPQ2RWdG5zVERSayIsImF1ZCI6WyJvcGVueml0aSJdLCJleHAiOjE3NzczNzk4MDksImlhdCI6MTc0NTg0MzgwOSwianRpIjoiMGZiYTZmNTUtZjFmOC00ZWJiLTg5ZDYtMmEyZmM0ZWU5OTI3Iiwiel9hc2lkIjoiN2U2OGJhMWUtMmFkMy00MDQ3LTgxNzctZTcxNTI4ZjU4YmI4Iiwiel9paWQiOiJsWTJXMlJDMFIiLCJ6X3QiOiJzIiwiel9zdCI6IkRpYWwifQ.rDZsYPp-f1l-Ker1qzM4oqkJR0jpXORTSsRX3CesbSddshXlSWe1EmsmdWvO6jk-Dgqz_Nest2dqHD3z9ssKZcSEvkBc9G-Iy7X1OAqqHV23MbU7Bhz1gTvBVPzLHrflTOFQumRygZP1APiaEvbsjrrYmjNXVS9F_9_4UhkdirXlsTC0egl_4LwPbFV5C94ASqEeX7v-O81OYmQcJoWlRTDSLZ2WYvOl39gOctsu0ybGCmaGMsjw3SjbgcqBAtlcP_IL1k9-BcIP8qvG2Zfq7RGFPYWBPKAKdPnjfFiGUstaC3VunLxB6gnraZCuj-tfx-7aAuQNbjU6rkzv3Qo-236ScQEKNrgsNhONsE8gHiAVHVj9YluwC-eyGpULeCb_3rRhIl8-MYDxqFpQnrKI4yqj-JhzkcT0iTXZqzFKziqfb183xN57MYt_N4UPsCBXL9ZIwjuHJbet4ZlOR8PRCSH_JVQNdzvLg-ZLbjwzYlsNwk6IuNDCDXVBUZqCz88qgUr_EZsqRJMvFuWpUwL9NLW7GCoGJicqw86XJQ9eB-KnMBj36VkiOIP7Ei9wc59OooryWD2Vy55AhK_f6LHPZlWpjPkDTfkWa-fZfiaLqIvIh6ZwkJhvXGMfYw5sRuVo1ZYVFwhq_tW7iJK9RKTxhYSErLXXWWjLOszVw4WRwjY","type":"EdgeConnectType"}
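The "invalid api session id, expected ..., got ..." warning can be cross-checked against the `token` field, which is a JWT whose payload is plain base64url-encoded JSON. Here is a minimal Python sketch that decodes the payload without verifying the signature. The function name `jwt_payload` is illustrative, and this is only an inspection aid, not how the router validates tokens.

```python
import base64
import json


def jwt_payload(token):
    """Decode the payload (middle) segment of a JWT without verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    segment = parts[1]
    # base64url segments in JWTs omit padding; restore it before decoding.
    segment += "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(segment))
```

Decoded this way, the tokens in the 12:36:49 entries appear to carry a `z_asid` claim of 47816793-fef9-47e2-ab73-3ae88ffcf621 (the "got" value), while the tokens in the 12:36:50 entries carry 7e68ba1e-2ad3-4047-8177-e71528f58bb8 (the "expected" value). Both have the same `z_iid`, so the same identity seems to have presented tokens from two different API sessions within about a second.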

It sounds like zrok self-hosters need to continue using ziti in standalone mode, not clustered mode (even with a single cluster member), until zrok implements ziti clustered mode.

Without HA mode there is only one ziti controller. So yes, it would be nice to have an HA implementation.

I have disabled HA in the router's config file and reverted the identity files (SDK side) to the initial value, enableHa=false.
It seems that zrok works in a single-node HA configuration :grinning_face:

By pure luck, zrok works in a single-node HA cluster, but adding a second node to the HA cluster stops the network:

Apr 29 11:50:53 ziti[1161676]: {"file":"github.com/openziti/ziti/router/xgress_edge/hosted.go:267","func":"github.com/openziti/ziti/router/xgress_edge.(*hostedServiceRegistry).RemoveTerminatorsRateLimited.func1","level":"error","msg":"terminator was replaced after being put into deleting state?!","terminatorId":"2n7c2xFkkNif7TuwKeHHDB","time":"2025-04-29T11:50:53.838Z"}
Apr 29 11:51:05 ziti[1155702]: {"error":"no api session found for token [69f62655-519a-4c9b-b2e2-4a2820e46ee9], fingerprint: [72230c5c2f41113d46a101ca49a5021bc0725f31], subjects [[CN=bXAXQYCyhW,...","file":"github.com/openziti/channel/v3@v3.0.39/impl.go:124","func":"github.com/openziti/channel/v3.AcceptNextChannel.func1","level":"error","msg":"failure accepting channel edge with underlay u{classic}-\u003ei{bXAXQYCyhW/YBLW}","time":"2025-04-29T11:51:05.093Z"}