Intermittent Connectivity Issues: ziti-edge-tunnel

Hello Again,

This is a continuation of a previous thread, as the scope of the discussion began to exceed that thread's original request. To boil it down, I believe I am having intermittent connectivity issues when using a service that simulates UDP communication between two edge nodes over a specific port. For background, my last reply in the previous thread gives a rough overview of where I am.

I'll go into more depth on how I created my network and how I have the ziti-edge-tunnels configured on the respective edge nodes. To start:

  • My controller and edge router are hosted on an Ubuntu EC2 instance. I followed these instructions to get the network up and running.
  • After creating two identities, I first enroll my laptop running Windows 11 by adding the JWT through Ziti Desktop Edge for Windows. This goes as expected, and through ZAC I can see the API session created and the edge router connected.
  • The second identity is hosted on a different EC2 instance, also running Ubuntu. I copy the JWT from my local machine to the Ubuntu server using scp. I then run this sequence of commands to install ziti-edge-tunnel and add my node to the network with the JWT. This is all done on a fresh EC2 instance.
1. sudo apt update 
2. curl -sSLf https://get.openziti.io/tun/scripts/install-ubuntu.bash | bash
3. sudo systemctl enable --now ziti-edge-tunnel.service
4. sudo chown -cR :ziti        /opt/openziti/etc/identities
5. sudo chmod -cR ug=rwX,o-rwx /opt/openziti/etc/identities
6. sudo ziti-edge-tunnel add --jwt "$(< ./in-file.jwt)" --identity myIdentityName
7. sudo systemctl restart ziti-edge-tunnel.service

After running command 6, I get something similar to:

{
       success: true
       some other info (Sorry dont have it on hand)
}

I then am able to confirm in ZAC that the API session was created and the edge router connected. Now with both identities added and configured on the network, I'll get into making my first service. I've initially done this through the CLI, but I will be showing screenshots through ZAC.

Below is host.v1 config.


Here is the intercept.v1 config.

My Bind policy has the identity attribute of the EC2 instance set, while the Dial policy has the identity attribute of my laptop set. With these in place, my service is created and we can begin testing. Here is the step-by-step process.

  1. SSH into the EC2 instance and run this command to get the tunneler running.
sudo ziti-edge-tunnel run -i /opt/openziti/etc/identities/myIdentityName.json
  2. Then, in a different shell, SSH into the same EC2 instance and run this command.
nc -u -l 14550

This listens for incoming UDP datagrams on port 14550.

  3. Then, in a plain shell on my local machine, run this command.
nc -u EC2.ziti 14550

This sends UDP packets to EC2.ziti on port 14550, the exact address noted in my intercept.v1 config.

As stated in the other thread, I've been able to simulate communication between the two by typing messages on my local machine that then show up on the EC2 instance. In the first SSH session, which is running the tunneler, there have been times when I can see the successful incoming connection requests, as seen below.

(2988)[       39.471]    INFO tunnel-cbs:ziti_hosting.c:637 on_hosted_client_connect() hosted_service[Drone] client[Me] client_src_addr[udp:100.64.0.1:59326] dst_addr[udp:3.90.72.176:14550]: incoming connection
(2988)[       44.497]    INFO tunnel-cbs:ziti_hosting.c:637 on_hosted_client_connect() hosted_service[Drone] client[Me] client_src_addr[udp:100.64.0.1:59326] dst_addr[udp:3.90.72.176:14550]: incoming connection
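
For anyone comparing many of these lines, here is a minimal Python sketch that pulls the timestamp, client, and source port out of an "incoming connection" line. The regular expression is an assumption inferred only from the two lines quoted above, and `parse_hosted_connect` is a hypothetical helper name, not part of ziti-edge-tunnel.

```python
import re

# Pattern inferred from the two log lines quoted above (assumption);
# other tunneler versions may format this message differently.
CONNECT_RE = re.compile(
    r"\[\s*(?P<seconds>\d+\.\d+)\].*?"
    r"hosted_service\[(?P<service>[^\]]+)\]\s+"
    r"client\[(?P<client>[^\]]+)\]\s+"
    r"client_src_addr\[(?P<src>[^\]]+)\]\s+"
    r"dst_addr\[(?P<dst>[^\]]+)\]"
)


def parse_hosted_connect(line):
    """Return a dict of fields from one 'incoming connection' line, or None."""
    match = CONNECT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Seconds since the tunneler started, as printed in the log prefix.
    fields["seconds"] = float(fields["seconds"])
    # src looks like "udp:100.64.0.1:59326"; the last part is the source port.
    fields["src_port"] = int(fields["src"].rsplit(":", 1)[1])
    return fields
```

Both lines above parse to source port 59326, which suggests they came from the same client-side socket; a changed source port between lines would point to a restarted client.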

But there are other times when I begin testing where I:

  1. Start the tunneler
  2. Start netcat in the other two shell sessions
  3. Successfully simulate communication between the two, but my tunneler output shows exactly the same information it did at startup, with no incoming connections logged:
EC2 Instance:~$ sudo ziti-edge-tunnel run -i /opt/openziti/etc/identities/EdgeEC2.json
About to run tunnel service... ziti-edge-tunnel
(2988)[        0.000]    INFO ziti-sdk:utils.c:198 ziti_log_set_level() set log level: root=3/INFO
(2988)[        0.000]    INFO ziti-sdk:utils.c:167 ziti_log_init() Ziti C SDK version 1.1.5 @g2120296(HEAD) starting at (2024-11-19T22:36:06.260)
RTNETLINK answers: File exists
(2988)[        0.000]   ERROR ziti-edge-tunnel:utils.c:31 run_command_va() cmd{ip route add 100.64.0.0/10 dev ziti1} failed: 512/0/Success

(2988)[        0.000]    INFO tunnel-sdk:ziti_tunnel.c:60 create_tunneler_ctx() Ziti Tunneler SDK (v1.2.6)
(2988)[        0.000]    INFO tunnel-cbs:ziti_dns.c:173 seed_dns() DNS configured with range 100.64.0.0 - 100.127.255.255 (4194302 ips)
(2988)[        0.000]    INFO ziti-edge-tunnel:ziti-edge-tunnel.c:887 make_socket_path() effective group set to 'ziti' (gid=988)
(2988)[        0.014]    WARN ziti-edge-tunnel:instance.c:39 find_tunnel_identity() Identity ztx[/opt/openziti/etc/identities/EdgeEC2.json] is not loaded yet or already removed.
(2988)[        0.014]    INFO ziti-edge-tunnel:resolvers.c:68 init_libsystemd() Initializing libsystemd
(2988)[        0.014]    INFO tunnel-cbs:ziti_tunnel_ctrl.c:1121 load_ziti_async() attempting to load ziti instance[/opt/openziti/etc/identities/EdgeEC2.json]
(2988)[        0.014]    INFO tunnel-cbs:ziti_tunnel_ctrl.c:1128 load_ziti_async() loading ziti instance[/opt/openziti/etc/identities/EdgeEC2.json]
(2988)[        0.014]    INFO ziti-edge-tunnel:ziti-edge-tunnel.c:402 load_id_cb() identity[/opt/openziti/etc/identities/EdgeEC2.json] loaded
(2988)[        0.017]    INFO ziti-sdk:ziti.c:437 ziti_start_internal() ztx[0] using tlsuv[v0.32.6/OpenSSL 3.3.1 4 Jun 2024]
(2988)[        0.017]    INFO ziti-sdk:ziti_ctrl.c:593 ziti_ctrl_init() ctrl[(null):] using https://40.201.235.56:8441
(2988)[        0.017]    INFO ziti-sdk:ziti.c:507 ztx_init_controller() ztx[0] Loading ziti context with controller[https://40.201.235.56:8441]
(2988)[        0.058]    INFO ziti-sdk:ziti.c:1759 version_pre_auth_cb() ztx[0] connected to Legacy controller https://40.201.235.56:8441 version v1.1.15(0eec47ce3c80 2024-10-02T12:59:41Z)
(2988)[        0.073]    INFO tunnel-cbs:ziti_tunnel_ctrl.c:968 on_ziti_event() ziti_ctx[EdgeEC2] connected to controller
(2988)[        0.073]    INFO ziti-edge-tunnel:ziti-edge-tunnel.c:440 on_event() ztx[/opt/openziti/etc/identities/EdgeEC2.json] context event : status is OK
(2988)[        0.100]    INFO ziti-sdk:channel.c:270 new_ziti_channel() ch[0] (ip-172-31-8-111-edge-router) new channel for ztx[0] identity[EdgeEC2]
(2988)[        0.100]    INFO tunnel-cbs:ziti_tunnel_ctrl.c:1039 on_ziti_event() ztx[EdgeEC2] added edge router ip-172-31-8-111-edge-router@40.201.235.566
(2988)[        0.100]    INFO ziti-sdk:channel.c:799 reconnect_channel() ch[0] reconnecting NOW
(2988)[        0.119]    INFO tunnel-cbs:ziti_tunnel_ctrl.c:940 on_service() hosting server_address[udp:3.90.72.176:14550] service[EC2]
(2988)[        0.119]    INFO ziti-edge-tunnel:ziti-edge-tunnel.c:563 on_event() =============== service event (added) - EC2:67d3ZGVJpEQZQYRuDC6VaG ===============
(2988)[        0.119]    INFO ziti-edge-tunnel:tun.c:196 tun_commit_routes() starting 1 route updates
RTNETLINK answers: File exists
Command failed /tmp/ziti-tunnel-routes.oAeLGm:1
(2988)[        0.121]   ERROR ziti-edge-tunnel:utils.c:31 run_command_va() cmd{ip -force -batch /tmp/ziti-tunnel-routes.oAeLGm} failed: 256/0/Success

(2988)[        0.121]    INFO ziti-edge-tunnel:tun.c:118 route_updates_done() route updates[1]: 0/OK
(2988)[        0.135]    INFO ziti-sdk:channel.c:697 hello_reply_cb() ch[0] connected. EdgeRouter version: v1.1.15|0eec47ce3c80|2024-10-02T12:59:41Z|linux|amd64
(2988)[        0.135]    INFO tunnel-cbs:ziti_tunnel_ctrl.c:1043 on_ziti_event() ztx[DroneEC2] router ip-172-31-8-111-edge-router connected
(2988)[        0.528]    INFO ziti-edge-tunnel:resolvers.c:402 try_libsystemd_resolver() systemd-resolved selected as DNS resolver manager
(2988)[        1.069]    INFO ziti-sdk:posture.c:206 ziti_send_posture_data() ztx[0] first run or potential controller restart detected

I would be inclined to think that I can still simulate communication between the two, even though my tunneler is not logging any incoming connections, because the traffic isn't actually going over the overlay. However, I am only able to communicate when the host is set to 'EC2.ziti', as seen below.

Sorry for the long post.... again, but I hope this can clear things up in terms of how I've gotten to where I am at. Thank you all for taking the time to help. I really appreciate it :slight_smile:

Thank you for the repost and the level set. This really did help!

Are you stopping the "client-side" netcat? On the server, can you try this with -k?

nc -u -k -l 14550

I suspect the problem is with the netcat listener, and it is not obvious why it stops working. Taking ziti out of the picture entirely, as I did below, you can see that you can only make one connection to that netcat server when not using -k. But when using -k, you can connect over and over. Could this be your problem?

[screenshot: nc]

Sorry, what I meant by this is that it's hard to understand what's going on the first time you hit this kind of problem, not that it's unclear in general.

If you aren't using -k, I think you'll have this problem.

You could always try switching to TCP first. With TCP, if you disconnect the client, you'll see netcat actually exit. That is the difference between TCP and UDP here.

I think that's the issue here, so if you can confirm, that'll be great :slight_smile:

Thank you so much, I believe that was my problem! I just ran the test again and none of my messages are dropping, plus my tunneler is logging the incoming connections. Looks like I need to brush up on my basic networking knowledge! UDP was working as intended, just not how I intended :grin:. Now I can work on getting my ziti network configured with actual hardware. Thank you again so much for all the help, I really appreciate it. I'll be sure to reach out if I have any more questions :slight_smile:

Glad to hear! Yeah, UDP can be "tricky" like this. There's no way for the nc server side to "know" that the connection was closed. TCP can/does, which is why as soon as you shut down the client, the server exits because it knows the client has stopped... Have fun!
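
To make the behavior discussed in this thread concrete, here is a minimal Python sketch that runs entirely on loopback, with no ziti or netcat involved. It assumes that an nc listener started without -k pins its socket to the first peer it hears from by calling connect() (my understanding of OpenBSD netcat, so treat that as an assumption), while -k leaves the socket unconnected. `run_demo` and `lock_to_first_peer` are hypothetical names used only for illustration.

```python
import socket

LOOPBACK = "127.0.0.1"


def run_demo(lock_to_first_peer):
    """Send one datagram from each of two client sockets.

    Returns the list of payloads the listener actually received.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listener.bind((LOOPBACK, 0))  # port 0: let the OS pick a free port
    listener.settimeout(0.5)
    port = listener.getsockname()[1]

    received = []
    clients = []  # keep both sockets open so they get distinct source ports
    try:
        for payload in (b"first client", b"second client"):
            client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            clients.append(client)
            client.sendto(payload, (LOOPBACK, port))
            try:
                data, peer = listener.recvfrom(1024)
            except socket.timeout:
                # Nothing arrived: the datagram was not delivered to us.
                continue
            received.append(data)
            if lock_to_first_peer and len(received) == 1:
                # Assumed equivalent of nc without -k: from now on the
                # kernel only delivers datagrams sent by this exact peer.
                listener.connect(peer)
    finally:
        for client in clients:
            client.close()
        listener.close()
    return received


if __name__ == "__main__":
    print("locked to first peer:", run_demo(lock_to_first_peer=True))
    print("unconnected listener:", run_demo(lock_to_first_peer=False))
```

In the locked case the second client uses a new source port, so its datagram never reaches the listener; that is the same situation as restarting the client-side netcat against a listener that was started without -k.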