Environment
- Host: Windows + WSL2 (Ubuntu). Kernel in logs: `6.6.87.2-microsoft-standard-WSL2`
- Docker/Compose running inside WSL.
- OpenZiti images:
  - `openziti/ziti-controller:1.6.12`
  - `openziti/ziti-router:1.6.12`
- Topology: a lab Docker Compose stack with multiple services, but the issue is specifically between `ziti-controller` and `ziti-router`.
- Controller persistence:
  - named volume `ziti_controller_data:/openziti/var` (controller DB)
  - named volume `ziti_pki:/openziti/pki` (PKI)
  - bind mount `./configs/ziti/controller:/openziti/config`
- Router config/identity persistence:
  - bind mount `./configs/ziti/router:/openziti/config`
  - router runs as `user: "0:0"` and uses a custom `entrypoint.s`
Project/orchestration code
- The lab is orchestrated by bash scripts + Docker Compose. Router/controller YAML files are generated by a bash provision script.
- `scripts/provision_ziti.sh` generates:
  - `configs/ziti/controller/controller.yaml` (controller config)
  - `configs/ziti/router/router.yaml` (router config)
- Router config generated by the script pins identity files to the bind mount (from `provision_ziti.sh`):

      identity:
        cert: /openziti/config/router.cert
        server_cert: /openziti/config/router.cert
        key: /openziti/config/router.key
        ca: /openziti/config/ctrl-ca.cert
      ctrl:
        endpoint: tls:ziti-controller:6262
        ca: /openziti/config/ctrl-ca.cert

- Router container entrypoint only validates the presence of `router.cert`/`router.key`/`ctrl-ca.cert`, then runs the router; it does not auto-enroll on start.
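
For reference, a presence check like the one described in the last bullet can be extended to confirm that `router.cert` and `router.key` actually belong together. This is only a minimal sketch: the function names are hypothetical, it is not the lab's real entrypoint, and it assumes `openssl` is available wherever it runs (the post does not say whether the router image ships it).

```shell
# Hypothetical pre-start checks; file names follow the router.yaml above.

# Return 0 only if all three identity files exist and are non-empty.
identity_files_present() {
  local dir="$1" f
  for f in router.cert router.key ctrl-ca.cert; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      return 1
    fi
  done
  return 0
}

# Return 0 only if the public key embedded in the certificate equals the
# public key derived from the private key (assumes openssl is installed).
cert_matches_key() {
  local cert="$1" key="$2" from_cert from_key
  from_cert="$(openssl x509 -in "$cert" -noout -pubkey)" || return 2
  from_key="$(openssl pkey -in "$key" -pubout)" || return 2
  [ "$from_cert" = "$from_key" ]
}
```

Calling `identity_files_present /openziti/config && cert_matches_key /openziti/config/router.cert /openziti/config/router.key` before starting the router would catch a cert left over from one enrollment sitting next to a key from another.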
What I’m trying to do (expected behavior)
- Start controller (healthy).
- `ziti controller edge init` (if fresh) and login.
- Recreate edge-router, export JWT.
- Run one-shot enroll via `ziti-router enroll ... -j router.jwt`, which ends with `registration complete`.
- Start the router container; it should go ONLINE in the controller.
What actually happens (problem)
- Enrollment frequently reports success:
  `ziti/router/enroll.(*RestEnroller).Enroll: registration complete`
- But the router never becomes online, and the provisioning script fails:
  `[lab][fatal] Edge-router did not become online`
- Controller logs show repeated fingerprint mismatch / unenrolled router errors on the control plane port `6262`:
  - `router fingerprint mismatch` with `routerId` and mismatched fingerprints
  - `incorrect fingerprint/unenrolled router, routerId: nRwU0xrBIW, given fingerprints: [...]`
- Example (same routerId, stable mismatch):
  - Controller side: `router fingerprint mismatch`, `routerId:"nRwU0xrBIW"`; field `fp:"75ed65..."`, `givenFps:["d82920..."]`
  - Router side: starts normally but fails to connect to the controller endpoint: `routerId":"nRwU0xrBIW"`, then `unable to connect controller ... (EOF)`

So it looks like the controller expects one fingerprint for this routerId, but the router presents a different identity certificate (or the controller DB has a different one saved).
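
One way to test that hypothesis is to compute the fingerprint of the certificate on disk and compare it with the `fp` and `givenFps` values in the controller log. The sketch below assumes the logged fingerprints are SHA-1 digests of the certificate written as lowercase hex without separators (please verify against your own logs) and that `openssl` is installed. The function names are hypothetical.

```shell
# Normalize `openssl x509 -fingerprint` output ("SHA1 Fingerprint=AB:CD:...")
# to lowercase hex without separators. Plain hex input passes through.
normalize_fp() {
  printf '%s\n' "$1" | sed -e 's/^.*=//' | tr -d ':' | tr 'A-F' 'a-f'
}

# Print the SHA-1 fingerprint of the first certificate in a PEM file.
cert_fp() {
  local raw
  raw="$(openssl x509 -in "$1" -noout -fingerprint -sha1)" || return 1
  normalize_fp "$raw"
}

# Return 0 if the on-disk cert ($1) has the fingerprint given in $2,
# where $2 is copied by hand from the controller log.
fp_matches() {
  [ "$(cert_fp "$1")" = "$(normalize_fp "$2")" ]
}
```

Running `cert_fp ./configs/ziti/router/router.cert` right after the one-shot enroll, and again just before the router container starts, shows whether the file changes in between. If the result matches `givenFps` but not `fp`, the router is presenting that file and the controller recorded a different certificate at enrollment.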
Repro steps (as implemented in the scripts)
From PowerShell/WSL I run the lab script, which does roughly:
- `docker compose up -d ...` and waits for the controller healthcheck.
- `provision_ziti.sh` does:
  - login to Edge Management API
  - delete/recreate the edge-router and export a fresh JWT
  - run one-shot enroll using the router image
  - then start the router container and wait for it to become ONLINE

The relevant parts from logs:
- “Recreating edge-router ziti-router and exporting JWT”, then the JWT is copied to `./configs/ziti/router/router.jwt`
- “One-shot router enrollment” runs and finishes with “registration complete”
- Router container starts, but provisioning ends with “Edge-router did not become online”
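
The final "wait for ONLINE" step can be sketched as a generic retry loop. This is not the actual code from `provision_ziti.sh`; `retry_until` and `router_online` are hypothetical names, and the probe assumes a prior `ziti edge login`, an installed `jq`, and that the edge-router JSON exposes an `isOnline` field.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: retry_until <attempts> <delay_seconds> <command> [args...]
retry_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Hypothetical probe: succeeds when the named edge-router reports online.
# Assumes an authenticated ziti CLI session and jq on the PATH.
router_online() {
  ziti edge list edge-routers "name = \"$1\"" -j \
    | jq -e '.data[0].isOnline == true' >/dev/null
}
```

For example, `retry_until 30 2 router_online ziti-router || echo "[lab][fatal] Edge-router did not become online"` polls for about a minute. With a fingerprint mismatch the loop will always time out, so it only reports the symptom and does not diagnose the cause.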
Cleanup attempts / why this is not “old images”
- Images are pinned to `1.6.12`.
- Between iterations I delete the iteration folder state on disk and restart containers.
- However, compose repeatedly prints warnings that the named volumes already exist and were not created by this compose project:
  - `volume "..._ziti_pki" already exists but was not created by Docker Compose`
  - `volume "..._ziti_controller_data" already exists but was not created by Docker Compose`

  (I’m mentioning this because it may be relevant to persistence/DB/PKI state across runs.)
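
Since deleting the iteration folder does not touch named volumes, the controller DB and PKI can outlive the bind-mounted router files. A minimal sketch of resetting them explicitly follows. The function names are hypothetical, and the suffix match is an assumption based on the two volume names in the warnings above, because the project prefix is elided there.

```shell
# Keep only volume names ending in the two controller-state suffixes from
# the compose warnings. Reads names on stdin, one per line.
select_ziti_volumes() {
  grep -E '_(ziti_pki|ziti_controller_data)$' || true
}

# Remove the controller DB and PKI volumes so the next run starts clean.
# Destructive: run only after `docker compose down`, when nothing uses them.
reset_ziti_volumes() {
  local v
  docker volume ls -q | select_ziti_volumes | while IFS= read -r v; do
    echo "removing volume: $v"
    docker volume rm "$v"
  done
}
```

Resetting both volumes together with the router's bind-mounted identity files keeps the controller DB, the PKI, and the router cert from the same run. Wiping only one side is one way state from different iterations could end up mixed.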
Additional observations that may be relevant
- Router sometimes loads a cached router model file:
  - `loaded router model from file ... /openziti/config/router.yaml.proto.gzip`

  The cleanup in my provision script removes `router.yaml.json.gzip` (note: different name); I am unsure whether this mismatch means the old `router.yaml.proto.gzip` can persist unintentionally.
- Router uses `endpoints.yml` but reports it empty and falls back to the initial endpoint from config:
  - `empty endpoint list in endpoints file, falling back to initial endpoints from config`

  Endpoint used: `tls:ziti-controller:6262`
- Controller also logged a TLS “bad certificate” handshake on `1280` from localhost earlier, but the main issue is the router control plane mismatch on `6262`.
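
To rule out leftovers in the bind mount, the cleanup could remove both cache file names together with the per-enrollment identity files. The sketch below is an assumption about what such a cleanup might look like, not the current contents of `provision_ziti.sh`. `clean_router_state` is a hypothetical name, and the file list comes only from the names mentioned in this post.

```shell
# Remove per-enrollment router state from a config directory, leaving the
# generated router.yaml in place. Run it before the one-shot enroll so the
# enroll step writes fresh files.
clean_router_state() {
  local dir="$1" f
  for f in router.cert router.key ctrl-ca.cert \
           router.yaml.proto.gzip router.yaml.json.gzip; do
    rm -f -- "$dir/$f"
  done
}
```

After `clean_router_state ./configs/ziti/router`, any `router.cert` found in that directory must have been written by the enroll step that followed, which makes a stale-file explanation easy to confirm or exclude.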
What I need help with (questions for OpenZiti devs)
- In this setup (Docker Compose + WSL2 + bind-mounted router identity files), what are the most likely causes for:
  - enroll reports “registration complete”
  - but the controller then rejects the control plane connection with a fingerprint mismatch for the same `routerId`?
- Is it possible that:
  - the router is using a different cert/key than the ones produced by the enroll step (e.g., stale files, wrong path, cached model)?
  - the controller has a stale router record in its DB due to persistent volumes / timeline / router model caching?
- What is the recommended fully deterministic storage strategy for router identity (`router.cert`/`router.key`) in a Docker/WSL context?
  - Should identity be stored in a named volume rather than a bind mount?
- Any known pitfalls with the generated/implicit trust domain warnings on controller startup affecting enrolled components?
docker-console-logs2.txt (627.4 KB)
help-please.zip (115.7 KB)
ziti-router-logs2.txt (208.5 KB)
ziti-controller-logs2.txt (268.9 KB)
power-shell-logs2.txt (5.1 KB)