Hello and thanks for using OpenZiti for this interesting use case!
You previously sent me the log from a tunneler that was intercepting UDP connections, so I'll start there. I think there are two issues: packet loss, and a crash on the intercepting tunneler.
The packet loss is indicated by this log message:
to_ziti() ziti_write stalled: dropping UDP packet
What's happening here is that the tunneler has received more data from the UDP client than it is able to send to the ziti network. In your case (based on the logs you sent) the tunneler isn't sending packets at all for this connection yet, because the other end of the ziti connection hasn't been established. The ziti SDK has a feature that allows it to buffer data while waiting for the connection to be completed. If it weren't for this feature, we would have no choice but to drop UDP packets that arrive before the end-to-end connection is established. Let's look at the stages of your connection that lead to the crash.
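To make the buffering behavior concrete, here is a minimal sketch of a bounded pending queue that accepts writes while a connection is still being established and drops them once a byte limit is reached. This is not the ziti SDK's implementation: `demo_conn`, `demo_write`, `demo_on_connected`, and the 128 KiB `MAX_PENDING_BYTES` limit are all hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit for this sketch; the tunneler's real threshold may differ. */
#define MAX_PENDING_BYTES ((size_t)128 * 1024)

/* Hypothetical stand-in for a connection that is still "Connecting". */
typedef struct {
    size_t pending_bytes; /* bytes queued but not yet sent */
    size_t dropped;       /* number of datagrams dropped so far */
} demo_conn;

/* Queue a datagram of len bytes. Returns true if it was queued,
 * false if it was dropped because the pending limit would be exceeded. */
static bool demo_write(demo_conn *c, size_t len) {
    if (c->pending_bytes + len > MAX_PENDING_BYTES) {
        c->dropped++; /* UDP has no retransmission, so the packet is lost */
        return false;
    }
    c->pending_bytes += len;
    return true;
}

/* Simulate the connection completing: everything queued is flushed.
 * Returns the number of bytes that were pending. */
static size_t demo_on_connected(demo_conn *c) {
    size_t flushed = c->pending_bytes;
    c->pending_bytes = 0;
    return flushed;
}
```

Under these assumed numbers, a client sending 1272-byte datagrams fills the queue after 103 packets, so a fast sender can exhaust the buffer well before the dial completes.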
The first packet is intercepted. Notice that this causes the ziti service to be "dialed", and the connection state for conn "0.11/7PZyd4nD" is Connecting:
(1058)[ 16.498] DEBUG tunnel-sdk:tunnel_udp.c:231 recv_udp() intercepted address[udp:172.16.117.3:30134] client[udp:172.16.7.151:15060] service[video-gbs-svc]
(1058)[ 16.498] VERBOSE tunnel-cbs:ziti_tunnel_cbs.c:287 ziti_sdk_c_dial() ziti_dial(name=video-gbs-svc)
(1058)[ 16.498] DEBUG tunnel-cbs:ziti_tunnel_cbs.c:354 ziti_sdk_c_dial() service[video-gbs-svc] app_data_json[145]='{"connType":null,"dst_protocol":"udp","dst_ip":"172.16.117.3","dst_port":"30134","src_protocol":"udp","src_ip":"172.16.7.151","src_port":"15060"}'
(1058)[ 16.498] VERBOSE ziti-sdk:connect.c:127 conn_set_state() conn[0.11/7PZyd4nD/Initial](video-gbs-svc) transitioning Initial => Connecting
(1058)[ 16.498] DEBUG ziti-sdk:connect.c:430 connect_get_service_cb() conn[0.11/7PZyd4nD/Connecting](video-gbs-svc) got service[video-gbs-svc] id[7BXVZa3bGCOZhmpOdjEwrm]
(1058)[ 16.498] DEBUG ziti-sdk:connect.c:551 process_connect() conn[0.11/7PZyd4nD/Connecting](video-gbs-svc) starting Dial connection for service[video-gbs-svc] with session[cm7ttdcab6wlgjin1uicdvjp1]
The initial packet and any subsequent packets from this client are queued. Notice that the ziti connection state is still Connecting:
(1058)[ 16.498] VERBOSE tunnel-sdk:tunnel_udp.c:84 on_udp_client_data() 1272 bytes from 172.16.7.151:15060
(1058)[ 16.498] TRACE tunnel-sdk:tunnel_udp.c:54 to_ziti() writing 1272 bytes to ziti src[udp:172.16.7.151:15060] dst[udp:172.16.117.3:30134] service[video-gbs-svc]
(1058)[ 16.498] TRACE ziti-sdk:connect.c:1282 ziti_write() conn[0.11/7PZyd4nD/Connecting](video-gbs-svc) write 1272 bytes
(1058)[ 16.498] TRACE ziti-sdk:connect.c:811 flush_connection() conn[0.11/7PZyd4nD/Connecting](video-gbs-svc) starting flusher
(1058)[ 16.498] TRACE tunnel-sdk:tunnel_udp.c:54 to_ziti() writing 140 bytes to ziti src[udp:172.16.7.151:15060] dst[udp:172.16.117.3:30134] service[video-gbs-svc]
(1058)[ 16.498] TRACE ziti-sdk:connect.c:1282 ziti_write() conn[0.11/7PZyd4nD/Connecting](video-gbs-svc) write 140 bytes
The tunneler continues intercepting packets for the "Connecting" connection (for about 0.025 seconds, going by the timestamps), and then it runs out of buffer space for the pending packets:
(1058)[ 16.522] TRACE tunnel-sdk:tunnel_udp.c:156 recv_udp() received datagram src[172.16.7.151:15060] dst[172.16.117.3:30134]
(1058)[ 16.522] VERBOSE tunnel-sdk:tunnel_udp.c:84 on_udp_client_data() 1272 bytes from 172.16.7.151:15060
(1058)[ 16.522] TRACE tunnel-sdk:tunnel_udp.c:54 to_ziti() writing 1272 bytes to ziti src[udp:172.16.7.151:15060] dst[udp:172.16.117.3:30134] service[video-gbs-svc]
(1058)[ 16.522] VERBOSE tunnel-cbs:ziti_tunnel_cbs.c:395 ziti_sdk_c_write() applying backpressure 129904 pending bytes
(1058)[ 16.522] WARN tunnel-sdk:tunnel_udp.c:66 to_ziti() ziti_write stalled: dropping UDP packet service=video-gbs-svc, client=udp:172.16.7.151:15060, ret=-7
Finally the ziti connection (with the hosting tunneler) is established:
(1058)[ 16.601] VERBOSE ziti-sdk:connect.c:127 conn_set_state() conn[0.11/7PZyd4nD/Connecting](video-gbs-svc) transitioning Connecting => Connected
(1058)[ 16.601] VERBOSE tunnel-cbs:ziti_tunnel_cbs.c:93 on_ziti_connect() on_ziti_connect status: 0
(1058)[ 16.601] DEBUG tunnel-sdk:ziti_tunnel.c:221 ziti_tunneler_dial_completed() ziti dial succeeded: client[udp:172.16.7.151:15060] service[video-gbs-svc]
At this point the tunneler starts sending the queued data to the hosting tunneler:
(1058)[ 16.601] TRACE ziti-sdk:connect.c:312 send_message() conn[0.11/7PZyd4nD/Connected](video-gbs-svc) => ct[ED72] uuid[53cd8b78:00000000:20e27] edge_seq[0] len[24] hash[53cd8b78:4646f9a9:b8afd0af:597ac30d:1641f46b:bd40a1a6:6b94fd82:503e939d]
(1058)[ 16.601] TRACE ziti-sdk:channel.c:420 ziti_channel_send_message() ch[0] => ct[ED72] seq[482] len[24]
(1058)[ 16.601] TRACE ziti-sdk:channel.c:391 on_channel_send() ch[0] write delay = 0.000d q=1 qs=104
(1058)[ 16.601] TRACE ziti-sdk:connect.c:240 on_write_completed() conn[0.11/7PZyd4nD/Connected](video-gbs-svc) status 0
(1058)[ 16.601] TRACE ziti-sdk:connect.c:312 send_message() conn[0.11/7PZyd4nD/Connected](video-gbs-svc) => ct[ED72] uuid[12305f2c:00000001:20e27] edge_seq[1] len[1289] hash[12305f2c:8a4baa58:1e8cc29c:92304a78:100a4b0e:917b026e:af3e22b9:c597549e]
(1058)[ 16.601] TRACE ziti-sdk:channel.c:420 ziti_channel_send_message() ch[0] => ct[ED72] seq[483] len[1289]
(1058)[ 16.601] TRACE ziti-sdk:channel.c:391 on_channel_send() ch[0] write delay = 0.000d q=1 qs=1357
(1058)[ 16.601] TRACE ziti-sdk:connect.c:240 on_write_completed() conn[0.11/7PZyd4nD/Connected](video-gbs-svc) status 0
(1058)[ 16.601] TRACE ziti-sdk:connect.c:312 send_message() conn[0.11/7PZyd4nD/Connected](video-gbs-svc) => ct[ED72] uuid[e6166c10:00000002:20e27] edge_seq[2] len[157] hash[e6166c10:a34ce0b2:377b229c:da788642:b75bfb40:cccc183a:e317e3d6:777c2e31]
(1058)[ 16.601] TRACE ziti-sdk:channel.c:420 ziti_channel_send_message() ch[0] => ct[ED72] seq[484] len[157]
(1058)[ 16.601] TRACE ziti-sdk:channel.c:391 on_channel_send() ch[0] write delay = 0.000d q=1 qs=225
(1058)[ 16.601] TRACE ziti-sdk:connect.c:240 on_write_completed() conn[0.11/7PZyd4nD/Connected](video-gbs-svc) status 0
Assertion "pbuf_free: p->ref > 0" failed at line 755 in /github/workspace/build/_deps/lwip-src/src/core/pbuf.c
As the data is sent over the OpenZiti overlay, the underlying storage (packet buffers, or "pbufs") for each of the pending packets is released. The assertion that is failing here suggests that one of the packet buffers is somehow being freed twice. This is the first time I've seen that happen, but clearly something isn't working as it should. Let me reacquaint myself with the relevant portions of code and get back to you on this.
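For context on what that assertion guards against, here is a small sketch of a reference-counted buffer. lwIP pbufs carry a reference count that `pbuf_free` decrements, releasing the storage when it reaches zero; the types and functions below (`demo_pbuf`, `demo_pbuf_free`, and so on) are hypothetical stand-ins rather than lwIP's code, and the sketch returns a result code where lwIP would hit the assertion.

```c
#include <assert.h>

/* Outcome of a free call in this sketch. */
typedef enum {
    DEMO_STILL_REFERENCED, /* count decremented, another owner remains */
    DEMO_RELEASED,         /* count reached zero, storage would be released */
    DEMO_DOUBLE_FREE       /* count was already zero: lwIP would assert here */
} demo_free_result;

/* Hypothetical stand-in for a packet buffer; only the ref count is modeled. */
typedef struct {
    unsigned ref;
} demo_pbuf;

/* A newly allocated buffer starts with a single owner. */
static void demo_pbuf_init(demo_pbuf *p) { p->ref = 1; }

/* Each additional owner takes its own reference. */
static void demo_pbuf_ref(demo_pbuf *p) { p->ref++; }

/* Drop one reference and report what happened. */
static demo_free_result demo_pbuf_free(demo_pbuf *p) {
    if (p->ref == 0) {
        return DEMO_DOUBLE_FREE; /* corresponds to "p->ref > 0" failing */
    }
    p->ref--;
    return p->ref == 0 ? DEMO_RELEASED : DEMO_STILL_REFERENCED;
}
```

With two owners, the first two frees are legitimate and a third reports `DEMO_DOUBLE_FREE`, which is the situation the failed assertion points to: more frees than references taken.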
In the meantime, it would be informative to see if you encounter this assertion when you give the ziti connection enough time to complete before hitting it with this high rate of data (if that's possible).
Edit: I'm not able to reproduce the problem you're seeing here (with UDP), and the code seems correct to me, so I must be missing something. If you're building ziti-edge-tunnel from source (I suspect you were at one point at least), could you add some options to the build to enable additional debug messages? The following preprocessor symbols will enable the debugging that I'm hoping to see:
#define LWIP_DEBUG 1
#define PBUF_DEBUG LWIP_DBG_ON
I achieved this by adding the following to my CMake configure preset in CMakeUserPresets.json:
"cacheVariables": {
"CMAKE_C_FLAGS": "-DLWIP_DEBUG=1 -DPBUF_DEBUG=LWIP_DBG_ON"
}
You should see lines like this when running ziti-edge-tunnel after rebuilding:
pbuf_alloc(length=40)
pbuf_alloc(length=40) == 0x10668a258
pbuf_free(0x105575d84)
pbuf_free: deallocating 0x105575d84
Regarding the issue that you're seeing when using TCP, I'll need to see logs from the tunnelers (both intercepting and hosting) that were handling the TCP connection to help with that.
Thanks.