Trouble after updating v2.41 to v2.47

Hey everyone, long time listener, first time caller.

This morning I updated my Ziti Desktop Client on my M4 MacBook Pro and while the tunnel shows green and all of the services show green, I am unable to access the endpoints.

I was previously using ZitiPacketTunnel Version: 2.41 (512) and, through the Mac App Store, upgraded to the latest ZitiPacketTunnel Version: 2.47 (525). I'm on a MacBook Pro with the M4 Max chip running macOS Sequoia Version 15.1 (Build 24B2082).

The log is showing this upon initial startup, repeating with (attempt n) where n is the set of all integers from 1 to 16 (so far).

(30368)[2024-11-27T18:45:19.170Z]    INFO ziti-sdk:posture.c:206 ziti_send_posture_data() ztx[0] first run or potential controller restart detected
(30368)[2024-11-27T18:45:19.170Z]   DEBUG ziti-sdk:posture.c:213 ziti_send_posture_data() ztx[0] posture checks must_send set to TRUE, new_session_id[TRUE], must_send_every_time[TRUE], new_controller_instance[TRUE]
(30368)[2024-11-27T18:45:20.431Z]   DEBUG ziti-sdk:channel.c:765 reconnect_cb() ch[0] connecting to tls://ziti.actieve.com:8442
(30368)[2024-11-27T18:45:20.481Z]   ERROR ziti-sdk:channel.c:943 on_tls_connect() ch[0] failed to connect to ER[actieve-public] [-53/software caused connection abort]
(30368)[2024-11-27T18:45:20.481Z]   DEBUG ziti-sdk:channel.c:99 close_connection() ch[0] closing TLS[0x125f3cfc0]

Do you have any suggestions for next steps in further debugging?

edit: Oh, and the controller is v0.30.5

Hi @gberl001, welcome to the community and to OpenZiti! :slight_smile:

I believe a network that old is likely to have been created with an incomplete PKI. OpenZiti used to support this deployment model, but since then, the deployment has required a full and complete PKI for clients to connect.

I expect the network will need to have the PKI rebuilt. There are forum posts on the topic. I'll try to find the other forum post describing the change and post back.

Thanks @TheLumberjack, sounds like a great holiday task lol. I'll check back (failure or not) when it's complete.

Yes indeed!

I was just on a call with @jeremy.tellier, and his controller certs were OK. While troubleshooting his post (Update Controller - Terminators are terminated) live on the call, we stopped and started the controller and both routers. After that, connectivity appears to have resumed and things appear functional again.

I was able to enroll an identity on my macmini, so hopefully your connectivity is restored?

Unfortunately, I'm still experiencing the same issue.

(21383)[2024-12-02T17:50:18.774Z]    INFO ziti-sdk:channel.c:797 reconnect_channel() ch[0] reconnecting in 2047ms (attempt = 3)
(21383)[2024-12-02T17:50:20.821Z]   DEBUG ziti-sdk:channel.c:765 reconnect_cb() ch[0] connecting to tls://ziti.actieve.com:8442
(21383)[2024-12-02T17:50:20.876Z]   ERROR ziti-sdk:channel.c:943 on_tls_connect() ch[0] failed to connect to ER[actieve-public] [-53/software caused connection abort]
(21383)[2024-12-02T17:50:20.876Z]   DEBUG ziti-sdk:channel.c:99 close_connection() ch[0] closing TLS[0x13e81cab0]
(21383)[2024-12-02T17:50:20.876Z]    INFO ziti-sdk:channel.c:797 reconnect_channel() ch[0] reconnecting in 37439ms (attempt = 4)

I'm also seeing this different error repeating. I don't recall whether it was in the previous logs.

(21383)[2024-12-02T17:50:35.494Z]   DEBUG ziti-sdk:connect.c:701 ziti_disconnect_async() conn[0.0/ojRafjXU/Connecting] no channel -- no disconnect
(21383)[2024-12-02T17:50:35.494Z]   DEBUG ziti-sdk:connect.c:721 ziti_disconnect_async() conn[0.0/ojRafjXU/Closed] can't send StateClosed in state[Closed]
(21383)[2024-12-02T17:50:35.494Z]   DEBUG ziti-sdk:connect.c:182 close_conn_internal() conn[0.3/swh7oesX/Closed] removing
(21383)[2024-12-02T17:50:35.494Z]   DEBUG tunnel-sdk:ziti_tunnel.c:435 ziti_tunneler_close() closing connection: client[tcp:101.60.0.1:49814] service[Sql-Server]
(21383)[2024-12-02T17:50:35.494Z]   DEBUG tunnel-sdk:tunnel_tcp.c:248 tunneler_tcp_close() null pcb
(21383)[2024-12-02T17:50:35.494Z]   DEBUG ziti-sdk:connect.c:182 close_conn_internal() conn[0.1/wZ6IQPBI/Closed] removing
(21383)[2024-12-02T17:50:35.494Z]   DEBUG tunnel-sdk:ziti_tunnel.c:435 ziti_tunneler_close() closing connection: client[tcp:101.60.0.1:49810] service[Sql-Server]
(21383)[2024-12-02T17:50:35.494Z]   DEBUG tunnel-sdk:tunnel_tcp.c:248 tunneler_tcp_close() null pcb
(21383)[2024-12-02T17:50:35.494Z]   DEBUG ziti-sdk:connect.c:182 close_conn_internal() conn[0.0/ojRafjXU/Closed] removing
(21383)[2024-12-02T17:50:35.494Z]   DEBUG tunnel-sdk:ziti_tunnel.c:435 ziti_tunneler_close() closing connection: client[tcp:101.60.0.1:49808] service[Sql-Server]
(21383)[2024-12-02T17:50:35.494Z]   DEBUG tunnel-sdk:tunnel_tcp.c:248 tunneler_tcp_close() null pcb
(21383)[2024-12-02T17:50:35.494Z]   DEBUG ziti-sdk:ziti.c:1502 grim_reaper() ztx[0] reaped 3 closed (out of 7 total) connections
(21383)[2024-12-02T17:50:36.679Z]   DEBUG tunnel-sdk:tunnel_tcp.c:116 new_tcp_pcb() snd_wnd: 65535, snd_snd_max: 65535, mss: 3960
(21383)[2024-12-02T17:50:36.679Z]   DEBUG tunnel-sdk:tunnel_tcp.c:438 recv_tcp() intercepted address[tcp:101.60.0.1:1433] client[tcp:101.60.0.1:49829] service[Sql-Server]
(21383)[2024-12-02T17:50:36.679Z]   DEBUG tunnel-cbs:ziti_tunnel_cbs.c:354 ziti_sdk_c_dial() service[Sql-Server] app_data_json[169]='{"connType":null,"dst_protocol":"tcp","dst_hostname":"sql","dst_ip":"101.60.0.1","dst_port":"1433","src_protocol":"tcp","src_ip":"101.60.0.1","src_port":"49829"}'
(21383)[2024-12-02T17:50:36.679Z]   DEBUG ziti-sdk:connect.c:428 connect_get_service_cb() conn[0.7/ebYZvlgw/Connecting] got service[Sql-Server] id[2oeACdzT68ezDaZng39Qy5]
(21383)[2024-12-02T17:50:36.679Z]   DEBUG ziti-sdk:posture.c:213 ziti_send_posture_data() ztx[0] posture checks must_send set to TRUE, new_session_id[FALSE], must_send_every_time[TRUE], new_controller_instance[FALSE]
(21383)[2024-12-02T17:50:36.679Z]   DEBUG ziti-sdk:connect.c:549 process_connect() conn[0.7/ebYZvlgw/Connecting] starting Dial connection for service[Sql-Server] with session[cm47bses48timo8ehc0atj7u2]
(21383)[2024-12-02T17:50:36.679Z]   DEBUG ziti-sdk:connect.c:416 ziti_connect() conn[0.7/ebYZvlgw/Connecting] waiting for suitable channel
(21383)[2024-12-02T17:50:36.679Z]   DEBUG ziti-sdk:connect.c:551 process_connect() conn[0.7/ebYZvlgw/Connecting] no active edge routers, pending ER connection
(21383)[2024-12-02T17:50:41.013Z]   ERROR tunnel-sdk:tunnel_tcp.c:190 on_tcp_client_err() client=tcp:101.60.0.1:49817 err=-13, terminating connection
(21383)[2024-12-02T17:50:41.013Z]   DEBUG tunnel-cbs:ziti_tunnel_cbs.c:150 ziti_sdk_c_close() closing ziti_conn tnlr_eof=0, ziti_eof=0

Any suggestions?

Hrmmm. It looks to me like the router is not returning a full set of certificates. Probing it:

openssl s_client -connect ziti.actieve.com:8442 -showcerts

shows a single certificate returned. I believe it should be returning a chain instead of a single cert now.
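If it helps, here is a small sketch for turning that probe into a number. It assumes that every certificate the server presents shows up as one PEM block in the -showcerts output, and count_certs is just a helper name made up for this post. A result of 1 would match what I'm seeing (leaf only, no chain).

```shell
# Count the PEM certificates in whatever is piped in on stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function usable under "set -e" while still printing 0.
count_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' || true
}

# Usage against the router from this thread (needs network access):
#   openssl s_client -connect ziti.actieve.com:8442 -showcerts </dev/null 2>/dev/null | count_certs
```

Redirecting stdin from /dev/null makes s_client exit after the handshake instead of waiting for input.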

Can you check the router config and confirm that server_cert references a chain? For example, with a recent deployment, the router config's identity block should look like this:

identity:
  cert:             "/home/ubuntu/.ziti/quickstart/ip-172-31-47-200/ip-172-31-47-200-edge-router.cert"
  server_cert:      "/home/ubuntu/.ziti/quickstart/ip-172-31-47-200/ip-172-31-47-200-edge-router.server.chain.cert"
  key:              "/home/ubuntu/.ziti/quickstart/ip-172-31-47-200/ip-172-31-47-200-edge-router.key"
  ca:               "/home/ubuntu/.ziti/quickstart/ip-172-31-47-200/ip-172-31-47-200-edge-router.cas"

My guess is that the server cert references a single cert, not a chain. Would you update that router and see if that solves the issue?
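A quick way to check this on the router host is sketched below. It assumes the server_cert path is double-quoted as in the example identity block, and both function names and the router.yml path are placeholders I made up for illustration. A result of 1 would mean the file holds only the leaf certificate, while 2 or more suggests a chain.

```shell
# Print the quoted path that follows "server_cert:" in a router config.
# Assumes the path is wrapped in double quotes, as in the example above.
server_cert_path() {
  awk -F'"' '/^[ \t]*server_cert:/ { print $2; exit }' "$1"
}

# Count the PEM certificates in the file that server_cert points at.
server_cert_chain_length() {
  cert_file=$(server_cert_path "$1")
  [ -n "$cert_file" ] || { echo "no server_cert entry found in $1" >&2; return 1; }
  # grep -c exits non-zero on a count of 0, so "|| true" keeps this set -e safe.
  grep -c -- '-----BEGIN CERTIFICATE-----' "$cert_file" || true
}

# Usage (router.yml stands in for the real router config path):
#   server_cert_chain_length router.yml
```

To see which certificates are in the file, rather than just how many, piping it through openssl crl2pkcs7 -nocrl -certfile FILE | openssl pkcs7 -print_certs -noout should list each subject and issuer.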