"CONTROLLER_UNAVAILABLE" error when enrolling

Hi,

I set up OpenZiti following the Controller deployment and Router deployment guides.
The one unusual thing I did, to get the bootstrap.bash script to accept my DNS name for the controller, was to delete this condition:

elif [[ "${ZITI_CTRL_ADVERTISED_ADDRESS}" =~ ^[:0-9] ]]; then

so that I could pass my DNS name (which looks like 8a1db54a85q97d.example.com). See the bootstrap.bash issue here.
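For anyone hitting the same thing: the pattern ^[:0-9] only looks at the first character, so a host name that starts with a digit (like mine) takes the same branch as an IP literal such as 192.0.2.10 or ::1. Below is a minimal sketch to reproduce this outside the script. The function name looks_like_ip_literal and the sample addresses other than my own host name are made up for illustration, and I am assuming the branch is meant to catch IP addresses, since the rest of the script is not shown here.

```shell
#!/usr/bin/env bash
# Reproduces the quoted test from bootstrap.bash in isolation.
# looks_like_ip_literal is a hypothetical name, not part of the script.
looks_like_ip_literal() {
  local re='^[:0-9]'   # first character is a colon or a digit
  [[ "$1" =~ $re ]]
}

# Sample inputs: my DNS name, a name starting with a letter,
# an IPv4 literal, and an IPv6 literal.
for addr in 8a1db54a85q97d.example.com ctrl.example.com 192.0.2.10 ::1; do
  if looks_like_ip_literal "$addr"; then
    echo "$addr: matches (handled like an IP literal)"
  else
    echo "$addr: does not match (handled like a DNS name)"
  fi
done
```

Only ctrl.example.com falls through to the DNS-name path, so a DNS name that starts with a letter would not have needed the condition removed.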

Environment:

  • Controller (ubuntu lxc)
  • Edge-Router (ubuntu lxc)

Both are reachable from the internet via the DNS name I provided in the controller's bootstrap.bash.

After the installation, I migrated from an old OpenZiti infrastructure by copying my database file to the new one.
I also installed ziti-console, which shows that the edge router is online after re-enrolling:
[image]

So in this case the enrollment worked (I entered the controller's DNS name when running the router's bootstrap.bash).

Topic issue:
I tried to do the same for my other identities (on my Android and Windows devices) and got an unexpected error:

Android:
java.lang.Exception: CONTROLLER_UNAVAILABLE

Windows:

I tried two different enrollment tokens taken directly from the ziti console (QR code and JWT file), but neither works.

I checked the application and tunneler logs on both devices, but nothing was logged except the following (JWT obfuscated):

[2024-11-03T14:52:00.775Z] TRACE	ZitiDesktopEdge.ServiceClient.DataClient	{"Data":{"JwtFileName":"Damien's laptop.jwt","JwtContent":"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"},"Command":"AddIdentity"}	
[2024-11-03T14:52:00.775Z] ERROR	ZitiDesktopEdge.ServiceClient.DataClient	Unexpected error	System.IO.IOException: Unexpected error when sending data to service. the monitor service appears to be offline?
   at ZitiDesktopEdge.ServiceClient.AbstractClient.<sendAsync>d__36.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at ZitiDesktopEdge.ServiceClient.DataClient.<AddIdentityAsync>d__49.MoveNext()

There was nothing in the logs of either the controller or the router service.

The DNS name is reachable from the Internet (I can access the API and ziti console) from different browsers, different networks, and different DNS servers.

I also tried creating a new identity and enrolling it on a device, but got the same problem.

How is that possible?

Thanks in advance

Additional information:
I got my domain name from the administration panel of my router (ISP proprietary), which can provide a free DNS name with an associated Let's Encrypt certificate.
Could there be a conflict between the DNS name's certificate and the OpenZiti certificate?

This means the ziti-monitor service is not available. The ziti-monitor service is what stops and starts the ziti service on behalf of users. Check to make sure it's running. After a fresh restart, it takes the service a moment to come online because it's set to "delayed start", meaning it starts after all the other Windows services are running. If it's not running, try starting it via 'services' or with net start ziti-monitor from an admin command prompt. This should not affect enrollment, but it will affect turning the tunneler on/off until the service is started.

If the service is not starting, look in the logs, generally found at C:\Program Files (x86)\NetFoundry Inc\Ziti Desktop Edge\logs\ZitiMonitorService, or use Main Menu -> Feedback to email me your logs and I'll look at them for you. Sometimes the Windows Event Viewer is necessary to understand what happened to that service. My guess is that it was a fresh restart?

As for the extra bit of detail you supplied:

It's actually quite important you don't use this cert/key unless you put it only into the "alt server cert" section, and even if you DO THAT, it's important that the alternative server name isn't exactly the same. If the controller has two sets of certs for the exact same name, you will get non-deterministic behavior. The controller uses server name indication (SNI) to locate the proper certificate to present. If there is more than one (one from the OpenZiti setup process, one from Let's Encrypt), the selection of that certificate is non-deterministic.
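To see which certificate the controller actually presents for a given name, something like the following can help. This is only a sketch, not an OpenZiti tool: it relies on openssl s_client with -servername to set the SNI value, and the function name show_presented_cert, the script name, and the host and port in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Print subject, issuer, and validity dates of the certificate a server
# presents for a given SNI name.
# show_presented_cert is a hypothetical helper name.
show_presented_cert() {
  local host="$1" port="$2" sni="$3"
  # </dev/null closes stdin so s_client exits right after the handshake.
  openssl s_client -connect "${host}:${port}" -servername "${sni}" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -dates
}

# Run only when arguments are supplied, for example:
#   ./show-cert.sh ctrl.example.com 1280 ctrl.example.com
if [ "$#" -eq 3 ]; then
  # Repeat a few times: if two certs are registered for the same name,
  # the issuer may differ between runs.
  for i in 1 2 3; do
    show_presented_cert "$1" "$2" "$3"
  done
fi
```

If the issuer alternates between the OpenZiti-generated CA and Let's Encrypt for the same name, that would point to overlapping certs.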

The best place to look for these logs is in Ziti Desktop Edge for Windows, after you get the service up and running.

Let me know how that goes and make sure you don't have overlapping certs.

Thanks for your reply.

I regenerated a DNS name without a Let's Encrypt certificate, fully uninstalled ziti-controller and ziti-router from both devices, and then reinstalled them with the new DNS name, which contains only lowercase letters.

After the installation, I did not import the old database. Instead, I created an identity on an empty database and tried to enroll it on Android and Windows, but it failed with the same error.
I tried on iOS too.
I also uninstalled and reinstalled both tunnelers (Android and Windows), but that didn't resolve the issue.

I also tried creating an identity from the command line instead of using ziti console.

I will send you the full Windows logs.

The ziti CLI now ships with a function: ziti ops verify-traffic

Can you also run that?

ziti ops verify-traffic --host localhost --port 1280 --username admin --password admin

A successful run will look like this:

[image]

If there are any errors, those will help me diagnose what's going on too.

So,

After a few discussions with @TheLumberjack, I found the bug by capturing the network traffic:


As you can see, my tunneler resolved my domain name to an IPv6 address and tried to communicate with the controller on port 48440, and the connection was instantly reset because the controller doesn't support IPv6 in my setup.
That's why the tunneler raises "CONTROLLER_UNAVAILABLE".
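A quick way to check for this without a packet capture is to force each address family in turn. Here is a minimal sketch, assuming curl is installed. probe_family and the script name are made-up names, and -k is there only on the assumption that the controller's certificate is not signed by a public CA.

```shell
#!/usr/bin/env bash
# Probe a controller address over IPv4 and IPv6 separately.
# probe_family is a hypothetical helper name.
probe_family() {
  local family="$1" host="$2" port="$3"
  # -4 / -6 force the address family used for name resolution and connect.
  # -k skips certificate verification (assumption: private PKI).
  # Without -f, any HTTP response counts as success (exit code 0).
  curl "-${family}" -sk -o /dev/null --connect-timeout 5 "https://${host}:${port}/"
}

# Run only when arguments are supplied, for example:
#   ./probe.sh <controller-host> <port>
if [ "$#" -eq 2 ]; then
  for family in 4 6; do
    if probe_family "$family" "$1" "$2"; then
      echo "IPv${family}: reachable"
    else
      echo "IPv${family}: NOT reachable (curl exit code $?)"
    fi
  done
fi
```

If IPv4 is reachable and IPv6 is not, while the name also resolves to an IPv6 address, you are likely in the same situation I was.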

To resolve this issue, if your domain name resolves to both IPv4 and IPv6 addresses:

  • Your controller is internet facing: enable IPv6 and it should work
  • Your controller is behind a NAT: enable DHCPv6 on your internet-facing router and make sure that the NAT rules for both IPv4 and IPv6 are set

Otherwise, if you don't want to use IPv6 on a specific tunneler, you can simply disable it in your device settings and the tunneler will be forced to use IPv4.