I know one place we've seen it is with some load balancer software. I think maybe it was closing the connection but not sending a TCP FIN message, so tunneler was never notified that the connection was closed. @scareything will hopefully correct me if I've got the details wrong.
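To make the difference concrete, here is a minimal Python sketch (not OpenZiti code; the function name `peer_state` is made up for illustration). When the peer closes in an orderly way, a read returns an empty result. When the peer disappears without a FIN, a read just times out, and the application cannot tell an idle peer from a dead one.

```python
import socket


def peer_state(sock: socket.socket, timeout: float = 0.2) -> str:
    """Classify a connected socket with one short, non-consuming read.

    Returns "closed" if the peer shut down cleanly (recv yields b""),
    "data" if bytes are waiting, and "silent" if nothing happened before
    the timeout. "silent" is ambiguous: the peer may be idle or gone.
    """
    sock.settimeout(timeout)
    try:
        # MSG_PEEK looks at pending data without removing it from the buffer.
        chunk = sock.recv(1, socket.MSG_PEEK)
    except socket.timeout:
        return "silent"
    return "closed" if chunk == b"" else "data"


if __name__ == "__main__":
    a, b = socket.socketpair()
    print(peer_state(a))  # nothing sent and no close seen yet
    b.close()             # orderly close; on a TCP socket this sends the FIN
    print(peer_state(a))
    a.close()
```

The "silent" case is the one described above: without a FIN (or a keepalive probe failing), nothing ever tells the tunneler that the connection is gone.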
I'm unsure why the number of idle circuits keeps growing. Using tcpdump, I can see that the connections are closed correctly via TCP FIN messages. Any ideas as to why this happens?
dmuensterer@bastion:~$ sudo tcpdump -i any port 10051
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
19:35:03.268283 ziti0 Out IP bastion.45818 > 100.64.0.4.zabbix-trapper: Flags [S], seq 2908279969, win 64240, options [mss 1460,sackOK,TS val 2170538484 ecr 0,nop,wscale 7], length 0
19:35:03.277572 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.45818: Flags [S.], seq 396031900, ack 2908279970, win 65535, options [mss 32768,nop,wscale 14], length 0
19:35:03.277579 ziti0 Out IP bastion.45818 > 100.64.0.4.zabbix-trapper: Flags [.], ack 1, win 502, length 0
19:35:03.277608 ziti0 Out IP bastion.45818 > 100.64.0.4.zabbix-trapper: Flags [P.], seq 1:75, ack 1, win 502, length 74
19:35:03.285306 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.45818: Flags [.], seq 1:1461, ack 75, win 3, length 1460
19:35:03.285311 ziti0 Out IP bastion.45818 > 100.64.0.4.zabbix-trapper: Flags [.], ack 1461, win 501, length 0
19:35:03.285314 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.45818: Flags [.], seq 1461:2921, ack 75, win 3, length 1460
19:35:03.285316 ziti0 Out IP bastion.45818 > 100.64.0.4.zabbix-trapper: Flags [.], ack 2921, win 494, length 0
19:35:03.285318 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.45818: Flags [.], seq 2921:4381, ack 75, win 3, length 1460
19:35:03.285320 ziti0 Out IP bastion.45818 > 100.64.0.4.zabbix-trapper: Flags [.], ack 4381, win 485, length 0
19:35:03.285328 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.45818: Flags [P.], seq 4381:5509, ack 75, win 3, length 1128
19:35:03.285330 ziti0 Out IP bastion.45818 > 100.64.0.4.zabbix-trapper: Flags [.], ack 5509, win 477, length 0
19:35:03.285531 ziti0 Out IP bastion.45818 > 100.64.0.4.zabbix-trapper: Flags [F.], seq 75, ack 5509, win 501, length 0
19:35:03.285903 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.45818: Flags [.], ack 76, win 3, length 0
19:35:03.286031 ziti0 Out IP bastion.45832 > 100.64.0.4.zabbix-trapper: Flags [S], seq 1523320303, win 64240, options [mss 1460,sackOK,TS val 2170538502 ecr 0,nop,wscale 7], length 0
19:35:03.288265 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.45818: Flags [F.], seq 5509, ack 76, win 3, length 0
19:35:03.288269 ziti0 Out IP bastion.45818 > 100.64.0.4.zabbix-trapper: Flags [.], ack 5510, win 501, length 0
19:35:03.295162 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.45832: Flags [S.], seq 396128851, ack 1523320304, win 65535, options [mss 32768,nop,wscale 14], length 0
19:35:03.295168 ziti0 Out IP bastion.45832 > 100.64.0.4.zabbix-trapper: Flags [.], ack 1, win 502, length 0
19:35:03.295190 ziti0 Out IP bastion.45832 > 100.64.0.4.zabbix-trapper: Flags [P.], seq 1:259, ack 1, win 502, length 258
19:35:03.301504 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.45832: Flags [P.], seq 1:104, ack 259, win 3, length 103
19:35:03.301509 ziti0 Out IP bastion.45832 > 100.64.0.4.zabbix-trapper: Flags [.], ack 104, win 502, length 0
19:35:03.301526 ziti0 Out IP bastion.45832 > 100.64.0.4.zabbix-trapper: Flags [F.], seq 259, ack 104, win 502, length 0
19:35:03.301989 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.45832: Flags [.], ack 260, win 3, length 0
19:35:03.308321 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.45832: Flags [F.], seq 104, ack 260, win 3, length 0
19:35:03.308325 ziti0 Out IP bastion.45832 > 100.64.0.4.zabbix-trapper: Flags [.], ack 105, win 502, length 0
19:35:08.312165 ziti0 Out IP bastion.53790 > 100.64.0.4.zabbix-trapper: Flags [S], seq 3785237650, win 64240, options [mss 1460,sackOK,TS val 2170543528 ecr 0,nop,wscale 7], length 0
19:35:08.322980 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.53790: Flags [S.], seq 396225803, ack 3785237651, win 65535, options [mss 32768,nop,wscale 14], length 0
19:35:08.322988 ziti0 Out IP bastion.53790 > 100.64.0.4.zabbix-trapper: Flags [.], ack 1, win 502, length 0
19:35:08.323022 ziti0 Out IP bastion.53790 > 100.64.0.4.zabbix-trapper: Flags [.], seq 1:1461, ack 1, win 502, length 1460
19:35:08.323023 ziti0 Out IP bastion.53790 > 100.64.0.4.zabbix-trapper: Flags [P.], seq 1461:2921, ack 1, win 502, length 1460
19:35:08.323027 ziti0 Out IP bastion.53790 > 100.64.0.4.zabbix-trapper: Flags [.], seq 2921:4381, ack 1, win 502, length 1460
19:35:08.323028 ziti0 Out IP bastion.53790 > 100.64.0.4.zabbix-trapper: Flags [P.], seq 4381:5841, ack 1, win 502, length 1460
19:35:08.323029 ziti0 Out IP bastion.53790 > 100.64.0.4.zabbix-trapper: Flags [P.], seq 5841:6497, ack 1, win 502, length 656
19:35:08.323047 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.53790: Flags [.], ack 2921, win 3, length 0
19:35:08.323061 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.53790: Flags [.], ack 5841, win 3, length 0
19:35:08.334556 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.53790: Flags [P.], seq 1:106, ack 6497, win 3, length 105
19:35:08.334563 ziti0 Out IP bastion.53790 > 100.64.0.4.zabbix-trapper: Flags [.], ack 106, win 502, length 0
19:35:08.334593 ziti0 Out IP bastion.53790 > 100.64.0.4.zabbix-trapper: Flags [F.], seq 6497, ack 106, win 502, length 0
19:35:08.334627 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.53790: Flags [.], ack 6498, win 3, length 0
19:35:08.341659 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.53790: Flags [F.], seq 106, ack 6498, win 3, length 0
19:35:08.341668 ziti0 Out IP bastion.53790 > 100.64.0.4.zabbix-trapper: Flags [.], ack 107, win 502, length 0
19:35:33.338740 ziti0 Out IP bastion.42338 > 100.64.0.4.zabbix-trapper: Flags [S], seq 1821337844, win 64240, options [mss 1460,sackOK,TS val 2170568554 ecr 0,nop,wscale 7], length 0
19:35:33.348655 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.42338: Flags [S.], seq 396322755, ack 1821337845, win 65535, options [mss 32768,nop,wscale 14], length 0
19:35:33.348663 ziti0 Out IP bastion.42338 > 100.64.0.4.zabbix-trapper: Flags [.], ack 1, win 502, length 0
19:35:33.348743 ziti0 Out IP bastion.42338 > 100.64.0.4.zabbix-trapper: Flags [P.], seq 1:245, ack 1, win 502, length 244
19:35:33.355690 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.42338: Flags [P.], seq 1:104, ack 245, win 3, length 103
19:35:33.355695 ziti0 Out IP bastion.42338 > 100.64.0.4.zabbix-trapper: Flags [.], ack 104, win 502, length 0
19:35:33.355716 ziti0 Out IP bastion.42338 > 100.64.0.4.zabbix-trapper: Flags [F.], seq 245, ack 104, win 502, length 0
19:35:33.355735 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.42338: Flags [.], ack 246, win 3, length 0
19:35:33.362821 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.42338: Flags [F.], seq 104, ack 246, win 3, length 0
19:35:33.362825 ziti0 Out IP bastion.42338 > 100.64.0.4.zabbix-trapper: Flags [.], ack 105, win 502, length 0
19:36:03.360547 ziti0 Out IP bastion.42010 > 100.64.0.4.zabbix-trapper: Flags [S], seq 541629275, win 64240, options [mss 1460,sackOK,TS val 2170598576 ecr 0,nop,wscale 7], length 0
19:36:03.371208 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.42010: Flags [S.], seq 396419708, ack 541629276, win 65535, options [mss 32768,nop,wscale 14], length 0
19:36:03.371219 ziti0 Out IP bastion.42010 > 100.64.0.4.zabbix-trapper: Flags [.], ack 1, win 502, length 0
19:36:03.371258 ziti0 Out IP bastion.42010 > 100.64.0.4.zabbix-trapper: Flags [P.], seq 1:259, ack 1, win 502, length 258
19:36:03.377822 ziti0 In IP 100.64.0.4.zabbix-trapper > bastion.42010: Flags [P.], seq 1:104, ack 259, win 3, length 103
19:36:03.377828 ziti0 Out IP bastion.42010 > 100.64.0.4.zabbix-trapper: Flags [.], ack 104, win 502, length 0
19:36:03.377853 ziti0 Out IP bastion.42010 > 100.64.0.4.zabbix-trapper: Flags [F.], seq 259, ack 104, win 502, length 0
I'm really interested in why Ziti thinks those circuits should be kept open and what I can do to prevent this problem.
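For longer captures, checking the FINs by eye gets tedious. The following is a rough Python sketch that reads tcpdump's text output in the format shown above and lists connections where fewer than two endpoints sent a FIN. The function names are hypothetical, and it deliberately ignores RST, so treat it as a starting point only.

```python
import re
from collections import defaultdict

# Matches e.g. "IP bastion.45818 > 100.64.0.4.zabbix-trapper: Flags [F.]"
LINE_RE = re.compile(r"IP (\S+) > (\S+): Flags \[([^\]]+)\]")


def fin_senders(lines):
    """Map each connection to the set of endpoints that sent a FIN.

    A connection is keyed by its unordered endpoint pair, so both
    directions of one flow land in the same entry.
    """
    seen = defaultdict(set)
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # banner lines, prompts, non-IP lines
        src, dst, flags = m.groups()
        conn = frozenset((src, dst))
        seen[conn]  # register the connection even if no FIN is seen
        if "F" in flags:
            seen[conn].add(src)
    return dict(seen)


def half_closed(lines):
    """Return connections where fewer than two endpoints sent a FIN."""
    return [sorted(c) for c, fins in fin_senders(lines).items() if len(fins) < 2]
```

Run against the capture above, this would flag only the last connection (bastion.42010), because the paste ends right after the client's FIN; the other four connections show a FIN from each side.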
Are you using ziti-edge-tunnel or an edge router/tunneler to intercept/host traffic? I'm hoping we can try a different client and see if we can narrow it down to a specific code base.
Let us know what you're using, and if you're willing to try a different tunneler to try and narrow things down.
Thank you,
Paul
Thanks, I'm using ziti-edge-tunnel on both sides, and of course I'd like to try a different tunneler to see if we can prevent the problem. Just tell me what I should try.
The simplest thing to try would be ziti tunnel. It's a standalone version of the tunneling code in the edge-router/tunneler. ziti-edge-tunnel is more full featured and lighter weight, so we don't generally recommend ziti tunnel, but in this case it should be a simple drop-in replacement, at least if you're on Linux.
If you can try ziti tunnel tproxy -i <identity file.json>, let me know how that works.
FYI, the per-service max idle time is almost done. Settings for TCP keepalives in intercept and host configs are up next. See:
- Configurable Timer needed to close idle circuits · Issue #1496 · openziti/ziti · GitHub
- Add config option for tcp keep-alive in intercept.v1, host.v1 and host.v2 · Issue #1567 · openziti/ziti · GitHub
Appreciate your help tracking this down!
Paul
On both the dial and bind side?
Very cool, I very much appreciate it! Thanks
Sure, let's start with that, and then we can try variations depending on the outcome of the first test.
Alrighty, running ziti tunnel on both support.mycompany.ziti (dial) and zabbix.mycompany.ziti (bind):
ziti@zt:~/.ziti/quickstart/zt/ziti-bin$ ziti edge list identities "limit 100" | grep zabbix
│ fSBYXsTxXr │ support.mycompany.ziti │ Default │ ssh,zabbix_agent │ Default │
│ yBJ68BYIL │ zabbix.mycompany.ziti │ Default │ admin,bind_web80,ssh │ Default │
Now created the following service:
ziti edge create config "mycompanyAllowZabbixAgent10051-Server" "cea49285-6c07-42cf-9f52-09a9b115c783" {"hostname":"127.0.0.1","port":10051,"protocol":"tcp"}
ziti edge create config "mycompanyAllowZabbixAgent10051-Client" "f2dd2df0-9c04-4b84-a91e-71437ac229f1" {"hostname":"zabbix.mycompany.ziti","port":10051}
ziti edge create service "mycompany_Allow_Zabbix_Agent_10051" --configs [3bCieUa19ltK2Zy8WIoSbc, 7e1QFnfonj72kEbEvSssUL]
ziti edge create service-policy "mycompanyAllowZabbixAgent10051-BindPolicy" Bind --semantic AnyOf --serviceRoles [@5hGZLFMoqtAjKeOZmvbFIn] --identityRoles [@yBJ68BYIL]
ziti edge create service-policy "mycompanyAllowZabbixAgent10051-DialPolicy" Dial --semantic AnyOf --serviceRoles [@5hGZLFMoqtAjKeOZmvbFIn] --identityRoles [#zabbix_agent]
Network connection is possible:
dmuensterer@support:~$ echo "" > /dev/tcp/zabbix.mycompany.ziti/10051
dmuensterer@support:~$
No circuit there yet:
ziti@zt:~/.ziti/quickstart/zt/ziti-bin$ ziti fabric list circuits | grep "mycompany_Allow_Zabbix_Agent_10051" | wc -l
0
Let's start the zabbix-agent and check again:
ziti@zt:~/.ziti/quickstart/zt/ziti-bin$ ziti fabric list circuits | grep "mycompany_Allow_Zabbix_Agent_10051" | wc -l
0
Huh, still 0? The scheduled data transmission works, though (approx. once a minute), so it looks like the circuit is created and closed immediately afterwards, and there is no problem using ziti tunnel on both sides?
Running ziti tunnel on support.mycompany.ziti (dial) and ziti-edge-tunnel on zabbix.mycompany.ziti (bind):
ziti@zt:~/.ziti/quickstart/zt/ziti-bin$ ziti fabric list circuits | grep "mycompany_Allow_Zabbix_Agent_10051" | wc -l
0
Still 0.
Running ziti-edge-tunnel on support.mycompany.ziti (dial) and ziti tunnel on zabbix.mycompany.ziti (bind):
ziti@zt:~/.ziti/quickstart/zt/ziti-bin$ ziti fabric list circuits | grep "mycompany_Allow_Zabbix_Agent_10051" | wc -l
0
Still 0.
Running ziti-edge-tunnel on support.mycompany.ziti (dial) and ziti-edge-tunnel on zabbix.mycompany.ziti (bind):
ziti@zt:~/.ziti/quickstart/zt/ziti-bin$ date
Fri Dec 8 13:55:39 CET 2023
ziti@zt:~/.ziti/quickstart/zt/ziti-bin$ ziti fabric list circuits | grep "mycompany_Allow_Zabbix_Agent_10051" | wc -l
4
Let's wait 5 minutes:
ziti@zt:~/.ziti/quickstart/zt/ziti-bin$ date
Fri Dec 8 14:00:03 CET 2023
ziti@zt:~/.ziti/quickstart/zt/ziti-bin$ ziti fabric list circuits | grep "mycompany_Allow_Zabbix_Agent_10051" | wc -l
20
Aha! Looks like the problem only occurs with ziti-edge-tunnel on both sides?
If it helps, I can provide a 5 minute PCAP of the traffic that's sent over the tunneler?
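The before-and-after check can also be automated. Below is a small Python sketch that repeats the same `ziti fabric list circuits | grep ... | wc -l` count at a fixed interval. It assumes the ziti CLI is on the PATH, is already logged in, and that the listing is not cut off by a page limit; the function names are made up for this example.

```python
import subprocess
import time


def count_circuits(listing: str, service: str) -> int:
    """Count listing lines that mention the service (like grep | wc -l)."""
    return sum(1 for line in listing.splitlines() if service in line)


def sample_counts(service: str, samples: int = 5, interval_s: float = 60.0):
    """Run the listing command repeatedly; return (timestamp, count) pairs."""
    results = []
    for i in range(samples):
        out = subprocess.run(
            ["ziti", "fabric", "list", "circuits"],  # same command as above
            capture_output=True,
            text=True,
            check=True,
        ).stdout
        results.append((time.strftime("%H:%M:%S"), count_circuits(out, service)))
        if i < samples - 1:
            time.sleep(interval_s)
    return results
```

Calling `sample_counts("mycompany_Allow_Zabbix_Agent_10051")` would give one count per minute; a steadily rising series matches the 4 to 20 growth seen here, while a flat zero matches the ziti tunnel runs.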
This is out of my area of expertise, so I've asked the folks who work on ziti-edge-tunnel to take a look and hop on this thread. I know I'm repeating myself, but we really appreciate you helping us track this down!
any chance you are willing to try the following setups:
- ziti-tunnel -> ziti-edge-tunnel
- ziti-edge-tunnel -> ziti-tunnel
and report your finding?
that would really help find the culprit. also, when you see idle circuits, can you check connections on the terminating side with netstat? maybe tcpdump on the terminating side as well
edit:
Sorry, I just noticed you already did that
is there any chance we can get packet capture on the terminating side -- behind hosting ziti-edge-tunnel?
just filtered to ziti-edge-tunnel to the service if possible
Absolutely. Any email address to where I can send the pcap?
I'm not sure if you've figured this out yet, but just to be sure: you can DM @ekoby on this Discourse group by clicking on his avatar.
@ekoby I've sent you the packet capture via DM.
Sure, here's the output:
dmuensterer@zabbix:~$ netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhos:zabbix-trapper localhost:46502 TIME_WAIT
tcp 0 0 localhos:zabbix-trapper localhost:51674 TIME_WAIT
tcp 0 0 localhost:ssh localhost:42688 ESTABLISHED
tcp 0 36 localhost:42688 localhost:ssh ESTABLISHED
tcp 0 0 localhost:55752 localhost:http ESTABLISHED
tcp 0 0 localhos:zabbix-trapper localhost:47748 TIME_WAIT
tcp 0 0 localhos:zabbix-trapper localhost:51670 TIME_WAIT
tcp 0 0 zabbix:45796 zt-router-1.mycomp:8442 ESTABLISHED
tcp 0 0 zabbix:36682 zt.mycompany.de:8442 ESTABLISHED
tcp 0 0 localhos:zabbix-trapper localhost:52954 TIME_WAIT
tcp 0 0 localhos:zabbix-trapper localhost:47750 TIME_WAIT
tcp6 0 0 127.0.0.1:33154 127.0.0.1:5355 TIME_WAIT
tcp6 0 0 127.0.0.1:44454 127.0.0.1:5355 TIME_WAIT
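To summarize output like this without counting rows by hand, here is a short Python sketch that tallies the State column for rows mentioning a given service or port. The function name is hypothetical, and it assumes the six-column layout shown above (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State).

```python
from collections import Counter


def states_for(netstat_text: str, needle: str) -> Counter:
    """Tally the State column for TCP rows whose addresses contain needle."""
    tally = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected TCP row: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) == 6 and fields[0].startswith("tcp"):
            local, foreign, state = fields[3], fields[4], fields[5]
            if needle in local or needle in foreign:
                tally[state] += 1
    return tally
```

For the output above, `states_for(text, "zabbix-trapper")` gives six TIME_WAIT rows and no ESTABLISHED ones. With `netstat -n` the needle would be the port number (10051) instead of the service name.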
Hi, were you already able to find anything?
I'd love to use Ziti with Zabbix but it looks like it's currently not possible?
A new version of ziti-edge-tunnel was released today.
One of the changes was to enable TCP keepalive on connections from the hosting ZET. This should clean up abandoned TCP connections fairly quickly.
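For readers unfamiliar with the mechanism, this is roughly what enabling TCP keepalive looks like at the socket level. It is a generic Python sketch of the OS feature, not the ziti-edge-tunnel change itself; the function name and the timer values are illustrative assumptions.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive and tune the timers where the platform allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These tuning options use the Linux names; other platforms may
    # lack some of them, so each one is applied only if it exists.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval_s),  # gap between unanswered probes
        ("TCP_KEEPCNT", probes),        # failed probes before giving up
    )
    for name, value in tuning:
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
```

With the example values, a peer that vanished without sending a FIN would be noticed after roughly 60 + 5 × 10 = 110 seconds of silence, at which point the kernel reports an error on the socket and the application can tear the connection down.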
Thanks! I’ll test the new release and report back if the problem was solved!
I just had a very quick look at the commit and it looks like there’s nothing I need to configure here additionally for it to work?
Correct, just drop in the new version and give it a go