I'm getting "failed to connect: -111/connection refused" on one host, even though I'm running the command as root. On the other host I get a success message, but no output is created:
dmuensterer@zabbix:~$ sudo ziti-edge-tunnel dump -p /tmp/ziti-edge-tunnel-dump-bind/
received response <{"Success":true,"Code":0}
>
We definitely need to clean up the dump command. In the meantime, which user/group is the ziti-edge-tunnel process running as? Does that user have write permission to the dump directory (drwxr-xr-x 2 dmuensterer dmuensterer)?
Aha, I'm now able to dump on the bind host. I thought running the ziti-edge-tunnel command as root was sufficient, but only after chowning the directory to ziti:ziti did it create a file successfully. @ekoby I've sent you the dump via PM!
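For anyone else who lands here: going by the exchange above, the dump file appears to be written by the running ziti-edge-tunnel service rather than by the CLI call, so it is the service account that needs write access to the target directory. Here is a rough sketch of a check, assuming the service runs as user ziti as it did in this thread. check_dump_dir is only an illustrative helper name, not part of ziti-edge-tunnel.

```shell
# check_dump_dir DIR
# Prints DIR's owner/group/mode, then reports whether the *calling* user
# can create files in it. Returns non-zero if DIR is missing or not writable.
check_dump_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    ls -ld "$dir"
    # Creating a file in a directory needs both write and search (x) permission.
    if [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "writable: $dir"
    else
        echo "not writable: $dir"
        return 1
    fi
}

# To check from the service account's point of view (assumes a "ziti" user exists):
#   sudo -u ziti test -w /tmp/ziti-edge-tunnel-dump-bind/ && echo writable || echo "not writable"
```

Note that running the helper as root will almost always report "writable", which is exactly the trap described above, so the check only tells you something when it runs as the service user.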
For the dial side, the same command and permissions still leave me with
Correct, 4242 is for the ssh sessions.
Any further data I can provide to troubleshoot?
Were you able to tell from the pcap what issue in the traffic is causing this behaviour?
$ sudo ls -al /tmp/.ziti/
total 0
drwxr-x---. 2 root ziti 80 Dec 21 08:11 .
drwxrwxrwt. 24 root root 640 Dec 21 08:11 ..
srwxrwxrwx. 1 root ziti 0 Dec 21 08:11 ziti-edge-tunnel-event.sock
srwxrwxrwx. 1 root ziti 0 Dec 21 08:11 ziti-edge-tunnel.sock
These would be the domain sockets that are created by the ziti-edge-tunnel server process. It will only create this directory and the domain sockets within it if they can be created with the ziti group. It insists on using the ziti group to avoid requiring things like Electron UIs to run as root strictly for access to the domain sockets.
I'm guessing you have a ziti group on the intercepting system, since you say you're using the same permissions (and I assume ownership) of the dump directory as on the hosting tunneler's host. Did the ziti group on this system exist when the server ziti-edge-tunnel was started? If not, you'll need to restart the process to have the domain sockets created.
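A quick way to check both conditions mentioned here, as a sketch only: the group name and socket path are taken from the listing above, and check_zet_socket is a made-up helper name.

```shell
# check_zet_socket GROUP SOCKET
# Returns 1 if GROUP does not exist, 2 if SOCKET is not a Unix domain socket,
# and prints "ok" when both checks pass.
check_zet_socket() {
    group=$1
    sock=$2
    if ! getent group "$group" >/dev/null; then
        echo "group '$group' does not exist"
        return 1
    fi
    if [ ! -S "$sock" ]; then
        echo "no domain socket at $sock"
        return 2
    fi
    echo "ok"
}

# Example, run as root because /tmp/.ziti is only readable by root and the ziti group:
#   check_zet_socket ziti /tmp/.ziti/ziti-edge-tunnel.sock
```

If the group exists but the socket is missing, restarting the tunneler should recreate it. On a systemd host that would be something like "sudo systemctl restart ziti-edge-tunnel", assuming the unit is named that way on your install.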
The packet capture looked normal -- no leaked/stale connections -- which is consistent with the output from ziti-edge-tunnel dump. These findings narrow the cause of the problems to communication between ZET and ER.
In the normal flow, ZET sends a ConnectionClosed message to ER and ER tears down the circuit. So it's either ZET failing to send the message or ER failing to process it.
Have you updated both sides to the latest ZET release?
removing idle circuit, idle time of X exceedes max idle time of Y"
If not, did you upgrade the routers, or just the controller? If you upgraded both, would you be willing to try running the controller with verbose output? There are several debug messages that would tell us more about why idle circuits aren't being terminated.
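If you have captured the output to a file, a small sketch like this can tell you whether any idle-circuit removals were logged at all. The search string is the fragment quoted above; the log path is hypothetical and count_idle_removals is just an illustrative name.

```shell
# count_idle_removals LOGFILE
# Prints how many lines in LOGFILE contain the idle-circuit removal message.
count_idle_removals() {
    # grep -c prints the count but exits non-zero when the count is 0,
    # so the status is masked to keep the helper usable under "set -e".
    grep -c 'removing idle circuit' "$1" || true
}

# Example (hypothetical path to wherever you redirected the verbose output):
#   count_idle_removals /path/to/captured.log
```

A count of 0 after circuits have been idle for a while would suggest the cleanup is not running at all, rather than running and failing.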
Ah, sorry about that - I missed the routers and only upgraded the controller!
Works now, the circuits seem to be getting closed - no more idle circuits building up... Thanks for the help.
As for the ZET/ZR connection, if I can help further narrow down the problem, please let me know.
Every one of you at NetFoundry does such awesome work, and I'd love to help!
That's great! I'm assuming ekoby will continue to dig into the ZET/router disconnect and will reach out if necessary, but likely not until the new year. Appreciate your persistence and assistance.