I have a VM on Azure Cloud and I've performed the ExpressInstall to install OpenZiti on Windows 11, but I encountered an error at this step:
"waiting for the controller to come online to allow the edge router to enroll
waiting for https://at170114.southindia.cloudapp.azure.com:8441
waiting for https://at170114.southindia.cloudapp.azure.com:8441
waiting for https://at170114.southindia.cloudapp.azure.com:8441
waiting for https://at170114.southindia.cloudapp.azure.com:8441"
I've also tried opening port 8441/TCP in the Cloud Firewall, but it's still not working. Can you help me fix this issue?
Hi @CQDet2803, welcome to the community and to OpenZiti (and zrok and BrowZer)!
When the quickstart waits for the controller to start, the external address needs to be resolvable and the controller process needs to have started. It's possible that the process is running but the advertised address assigned to it is not reachable.
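For illustration, the kind of wait the quickstart performs can be sketched as a bounded retry loop. This is not the quickstart's actual code. The helper name wait_until, the retry counts, and the curl flags are assumptions made for this sketch, and -k is used on the assumption that the controller's certificate comes from a private PKI that curl does not trust by default.

```shell
# Retry a command up to TRIES times, sleeping DELAY seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if it never does.
# Plain POSIX sh, so variables are global rather than local.
wait_until() {
  tries="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$tries" ]; do
    if "$@"; then
      return 0
    fi
    echo "waiting ($i/$tries)" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage (commented out so the snippet has no side effects):
# wait_until 30 3 curl -sk -o /dev/null --connect-timeout 5 \
#   https://at170114.southindia.cloudapp.azure.com:8441 \
#   || echo "controller never answered; check the process, port and firewall"
```

Unlike an endless "waiting for ..." loop, a bounded version gives up after a fixed number of attempts, which makes it clearer when it is time to start checking the process and the network path.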
When you get into this state, first check that the controller is running. You say it's running on Windows, but I'm not exactly sure what you mean because the quickstart is exclusively a bash script. I assume you're using WSL? You should run ps and grep, or check systemd (if you used systemd), to make sure the controller is running. Something like: ps -ef | grep ziti
Make sure a process is running.
After that you need to verify it's listening on the proper port (8441 in this case). You'd do that with netstat (or if you know ss you could use that). Something like: netstat -ano | grep 8441 | grep LIST
If both of those things are true, then you should be able to connect to the port locally at localhost:8441. A browser will be fine for this task. If you get JSON back, the controller is up and the problem is definitely firewall/routing related. Unfortunately, we won't be able to help with that.
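The three checks above (process, listening port, local connection) can be scripted roughly as follows. The helper name has_listener is made up for this sketch. It only parses text piped into it, so it works with either netstat or ss output. The -k flag on curl assumes the controller's certificate is not trusted by the local CA store.

```shell
# Succeeds if stdin (output of `netstat -ano` or `ss -ltn`) contains a
# LISTEN line with a field that ends in ":PORT".
has_listener() {
  awk -v p=":$1" '
    /LISTEN/ {
      for (i = 1; i <= NF; i++)
        if (length($i) >= length(p) &&
            substr($i, length($i) - length(p) + 1) == p)
          found = 1
    }
    END { exit(found ? 0 : 1) }
  '
}

# Usage on the controller host (commented out so the snippet has no side effects):
# ps -ef | grep '[z]iti'                                 # 1. is a ziti process running?
# sudo netstat -ano | has_listener 8441 && echo "listening on 8441"   # 2. port check
# curl -sk https://localhost:8441                        # 3. does it answer locally?
```

Matching on ":PORT" at the end of a field avoids false hits such as port 18441 or a process ID that happens to contain 8441.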
What version of ziti do you have running? If you have a version greater than 1.1.10, you could try using the new ziti ops verify-network command. It'll show you output like:
ziti ops verify-network --controller-config-file $HOME/.ziti/quickstart/$(hostname)/$(hostname).yaml
INFO Verifying controller config: /home/ubuntu/.ziti/quickstart/ip-172-31-47-200/ip-172-31-47-200.yaml
INFO controller advertise address at ip-172-31-47-200:8440 is available.
INFO verifying 1 web entries
INFO verifying 1 web bindPoints
INFO web entry[client-management], bindPoint[0] address at ec2-3-18-113-172.us-east-2.compute.amazonaws.com:24882 is available.
INFO web entry[client-management], bindPoint[0] is valid
INFO All requested checks passed.
AWS hairpins your DNS entry, so in my case, although the test shows 24882 is available, it's actually not allowed through the firewall.
It might help if you also shared more of the logs, so we can see if there are any errors before this section.
After encountering the error, I tried running it again and found that the Controller is still running but listening on IPv6.
"sudo netstat -ano | grep 8441
tcp6 0 0 :::8441 :::* LISTEN off (0.00/0/0)"
Could this be the main cause?
Interesting, yes that definitely can cause problems. Is IPv4 disabled on the box? Using dig I can see only an IPv4 address:
dig AAAA at170114.southindia.cloudapp.azure.com +short
dig A at170114.southindia.cloudapp.azure.com +short
20.235.157.136
My guess is that if you had IPv4 working, it would have been fine. Or possibly if DNS returned an AAAA record for the entry? I don't have a ton of experience with IPv4 -> IPv6 listeners, but my guess is that yes, that's the issue.
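One way to test that guess: a tcp6 line in netstat does not by itself mean IPv4 clients are shut out. On Linux, a wildcard IPv6 listener (shown as :::8441) normally accepts IPv4 connections too when net.ipv6.bindv6only is 0, unless the application sets IPV6_V6ONLY on the socket itself. A rough sketch, assuming a Linux host and netstat -n style addresses (the helper name listener_scope is made up here):

```shell
# Classify the local-address column of a LISTEN line from `netstat -ano`.
listener_scope() {
  case "$1" in
    :::*)         echo "ipv6-wildcard" ;;     # e.g. :::8441
    0.0.0.0:*)    echo "ipv4-wildcard" ;;     # e.g. 0.0.0.0:8441
    127.*|::1:*)  echo "loopback-only" ;;     # e.g. 127.0.0.1:8441
    *)            echo "specific-address" ;;
  esac
}

# Usage on the controller host (commented out so the snippet has no side effects):
# listener_scope ":::8441"                # address copied from the netstat output
# cat /proc/sys/net/ipv6/bindv6only       # 0 means :::PORT usually serves IPv4 too
# curl -4 -sk https://127.0.0.1:8441      # force an IPv4 connection locally
```

If the forced IPv4 request gets a response locally, the tcp6 listener is probably not the cause and the firewall or routing path is the next thing to look at.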
I'm still facing the same error. I tried to skip it and continue with zitiLogin, and I received this message:
"zitiLogin
[ 0.126] INFO ziti/ziti/cmd/helpers.StandardErrorMessage: Connection error: Get https://at170114.southindia.cloudapp.azure.com:8441/.well-known/est/cacerts: dial tcp 20.235.157.136:8441: connect: connection refused
The connection to the server at170114.southindia.cloudapp.azure.com:8441 was refused - did you specify the right host or port?"
Do you know anything about it? I'm not really sure because I've checked the DNS name, port, and firewall, and everything seems to be correct.
Yes, I'm familiar with that endpoint. It's still a symptom of the same problem, though: not being able to access the controller itself. This endpoint returns the bundle of CAs that is to be trusted and is most often used during device enrollment or, in your case, when logging in.
Are you sure the process is still running? This still seems like a case of the controller just not being online or not listening on the proper port (or, in this case, maybe on the proper IP address, v4 vs v6).
EDIT:
For what it's worth, I still cannot connect to your controller from my location either with the browser or with openssl:
openssl s_client -connect at170114.southindia.cloudapp.azure.com:8441
40F75A950B7F0000:error:8000006F:system library:BIO_connect:Connection refused:../crypto/bio/bio_sock2.c:114:calling connect()
40F75A950B7F0000:error:10000067:BIO routines:BIO_connect:connect error:../crypto/bio/bio_sock2.c:116:
connect:errno=111
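On Linux, errno 111 is ECONNREFUSED: the connection attempt was actively rejected rather than silently dropped. The same distinction can be read from curl's exit code. The helper below is a sketch with a made-up name (explain_curl_exit) and covers only a handful of exit codes. The wording of each message is an interpretation, not official curl text.

```shell
# Map a few curl exit codes to a likely cause when probing the controller.
explain_curl_exit() {
  case "$1" in
    0)     echo "connected: something answered on that port" ;;
    6)     echo "dns: the host name did not resolve" ;;
    7)     echo "refused: could not connect, usually nothing is listening on that address and port" ;;
    28)    echo "timeout: packets are probably being dropped by a firewall or routing" ;;
    35|60) echo "tls: the port is open but the TLS handshake or certificate check failed" ;;
    *)     echo "other: see the exit code list in the curl man page" ;;
  esac
}

# Usage (commented out so the snippet has no side effects):
# curl -sk -o /dev/null --connect-timeout 5 \
#   https://at170114.southindia.cloudapp.azure.com:8441
# explain_curl_exit $?
```

A refusal points at the process or the address it is bound to, while a timeout points more at the firewall.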
I have a question: if I can connect to the server on the Cloud using this command: "openssl s_client -connect at170114.southindia.cloudapp.azure.com:8441", then I should be able to run ExpressInstall normally without encountering that previous error, right?
Generally I'd turn that around and say after running the controller process, you should be able to connect to it using openssl. But if you stand up some other process on the port and can connect to it first, then yes, I would expect the express install script to succeed... You will need some server providing TLS on that port though for openssl to function as expected. To just port probe, nmap works but openssl specifically looks at the certificates being returned.
What I was trying to see was the certificates returned by that port (if any). The full command would be something like:
openssl s_client -connect at170114.southindia.cloudapp.azure.com:8441 | openssl x509 -text
From that output, I'd be looking for a certificate with the address in the Subject Alternative Name (SAN) field.
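That SAN check can be scripted roughly as below. The helper name san_lists_host is made up for this sketch. It only inspects text piped into it and assumes the "DNS:name, DNS:other, IP Address:..." layout that openssl x509 -text prints for the SAN extension.

```shell
# Succeeds if the `openssl x509 -text` output on stdin lists HOST as a DNS
# entry. Entries are comma-separated, so split them one per line and compare
# whole lines to avoid partial matches.
san_lists_host() {
  tr ',' '\n' | sed 's/^[[:space:]]*//' | grep -Fx "DNS:$1" >/dev/null
}

# Usage (commented out so the snippet has no side effects). Redirecting
# stdin from /dev/null makes s_client exit after the handshake.
# host=at170114.southindia.cloudapp.azure.com
# openssl s_client -connect "$host:8441" </dev/null 2>/dev/null \
#   | openssl x509 -noout -text \
#   | san_lists_host "$host" && echo "SAN contains $host"
```

This only works once something is actually serving TLS on the port. With a refused connection there is no certificate to inspect.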
Hello, sorry for not replying to you earlier. I’ve been struggling with this issue for quite some time, but today I tried SSHing into the Azure VM and running the Express Install there, and surprisingly, it worked. I’m planning to use it as a proxy to access cloud resources. Could I ask for some advice from you?