I have been restarting my lab controller A LOT in testing, and I find I am too impatient to wait and see how long it will take before the clients know the controller is back up... or perhaps it's the router that I am waiting on. In any case, I get impatient and run sudo systemctl restart ziti-edge-tunnel, and I am gratified instantly. Part of this is trying to get the tunneler to work on the docker host hosting ziti... and the race condition of the tunnel service starting before the docker services do.
If I were to just wait it out, is there a polling cycle for checking in with the controller/router if either one goes offline?
Can you provide the steps you're taking and what you mean by "retry"? We'll want to see your exact steps to understand what you mean. It also depends on the tunneler, but I'm assuming you mean ziti-edge-tunnel?
A tunneler will try to connect to the controller on a set interval to check for any updates to the network. I think the default is 10 or 15s; I didn't find the exact value, but if you need the precise number we can find it. Outside of that, every time the client tries to dial a service, it'll need to connect to a controller to authorize the service dial.
If the controller goes offline "forever", existing sessions will be fine but as mentioned, no new sessions will be able to be established (until the upcoming HA release)
I will provide some more context... but don't get distracted as I have what I am describing in a separate thread ;-).
I was restarting the docker host, over and over, to try and reproduce an error I was having on restart with the Router container. My testing meant a reboot, then a test of one of the ziti services, such as ssh root@myhost.ziti. The command would fail until I restarted the tunneler service on my mac and/or on the debian hosts. I wouldn't proceed to my next test until I verified that I could DIAL and BIND as expected.
Since it was the docker host restarting, I expected there to be time for the containers to spin up again and start communicating. I would expect that by the time the ZAC was up and I could log in to the controller, that should have been sufficient time for the tunnels to start working again.
So, assuming the host for the controller and edgerouter were to restart, should it take more than a couple of minutes for the DIAL and BIND to start working? Or should I wait a specified time before trying? Again, restarting the tunneler always restores it within seconds.
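One way to measure how long recovery actually takes, rather than guessing at a wait time, is to poll a service until it answers. The following is a minimal sketch, assuming a POSIX shell where date +%s is available. wait_until is a hypothetical helper, not part of ziti-edge-tunnel, and the probe in the usage comment simply reuses the ssh example from this post with standard OpenSSH options.

```shell
# wait_until TIMEOUT INTERVAL CMD [ARGS...]
# Hypothetical helper: re-runs CMD every INTERVAL seconds until it succeeds
# or until TIMEOUT seconds have elapsed since the first attempt.
wait_until() {
  wu_timeout=$1
  wu_interval=$2
  shift 2
  wu_start=$(date +%s)
  while :; do
    # Run the probe quietly; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      echo "recovered after $(( $(date +%s) - wu_start ))s"
      return 0
    fi
    # Give up once the time budget is spent.
    if [ $(( $(date +%s) - wu_start )) -ge "$wu_timeout" ]; then
      echo "still failing after ${wu_timeout}s" >&2
      return 1
    fi
    sleep "$wu_interval"
  done
}

# Example (assumed host name from this thread): probe every 5s for up to 3 minutes.
# wait_until 180 5 ssh -o BatchMode=yes -o ConnectTimeout=5 root@myhost.ziti true
```

Running this right after the reboot, without touching the tunneler, would show whether the service comes back on its own and roughly how many seconds that takes, which is useful detail for a bug report.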
That is definitely not what I would expect. You most definitely should not need to restart the tunneler in order to connect. That sounds like you've discovered a bug to fix. If you wait "a minute or two", that's much longer than I'd expect you to need to wait. @scareything, this isn't some kind of known issue, right?
@jptechnical, clear steps to reproduce are always appreciated if it happens to you every time. I worry that we won't be able to reproduce this one either, though.
JP-as-a-Service, at your service.
I think the specific scenario is my not being patient enough. It also could be the mac agent being dumb... it's not nearly as responsive as the linux one (IMO). Knowing that there is a 15-30 second polling cycle and my need for dopamine is sufficient. Thanks!