Issue:
I am seeing repeated "handshake failed" errors in the logs of my Ziti router pods. The routers themselves seem to be running fine and all of the services I created are reachable, but the errors persist.
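For reference, this is roughly how I am pulling these logs (the namespace is the one from my transport service FQDN below; the deployment name is assumed from my release name and may differ):

# Tail the router pod logs and filter for the handshake errors
kubectl logs -n ziti-router deploy/ziti-router-public-1 --since=10m | grep "handshake failed"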
Environment:
- Platform: AWS EKS
- Helm Chart: Ziti Router
- StorageClass: EBS
- Network: Internal routers with ClusterIP, public edge access via LoadBalancer
- Ingress: Disabled
- Persistent Storage: Enabled
Error Logs:
{"_context":"tls:0.0.0.0:3022","error":"EOF","file":"github.com/openziti/transport/v2@v2.0.153/tls/listener.go:257","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","remote":"172.25.181.109:17981","time":"2025-02-07T22:54:39.274Z"}
{"_context":"tls:0.0.0.0:3022","error":"EOF","file":"github.com/openziti/transport/v2@v2.0.153/tls/listener.go:257","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","remote":"172.25.89.248:49070","time":"2025-02-07T22:54:39.375Z"}
{"_context":"tls:0.0.0.0:3022","error":"EOF","file":"github.com/openziti/transport/v2@v2.0.153/tls/listener.go:257","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","remote":"172.25.158.72:40193","time":"2025-02-07T22:54:39.742Z"}
{"_context":"tls:0.0.0.0:3022","error":"EOF","file":"github.com/openziti/transport/v2@v2.0.153/tls/listener.go:257","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","remote":"172.25.107.42:60677","time":"2025-02-07T22:54:41.017Z"}
Router Values.yaml Configuration:
ctrl:
  endpoint: ziti-controller.example.com:443
advertisedHost: ziti-router-public-1.example.com
edge:
  advertisedHost: ziti-router-public-1.example.com
  advertisedPort: 443
  service:
    type: LoadBalancer
    annotations:
      external-dns.alpha.kubernetes.io/hostname: ziti-router-public-1.example.com
      service.beta.kubernetes.io/aws-load-balancer-internal: "false"
      service.beta.kubernetes.io/aws-load-balancer-security-groups: "sg-0158c2bd5d277c65b"
      service.beta.kubernetes.io/aws-load-balancer-manage-backend-security-group-rules: "true"
  ingress:
    enabled: false
linkListeners:
  transport:
    advertisedHost: ziti-router-public-release-1-transport.ziti-router.svc.cluster.local
    advertisedPort: 443
    service:
      enabled: true
      type: ClusterIP
    ingress:
      enabled: false
image:
  additionalArgs:
    - '--extend'
persistence:
  enabled: true
  accessMode: ReadWriteOnce
  size: 1Gi
  storageClass: ebs-sc
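As a sanity check on how these values rendered, I also looked at the Services the chart created (the namespace comes from the transport FQDN above; the edge service name is my guess from the release name and may differ):

# List the public edge LoadBalancer and the internal transport ClusterIP service
kubectl get svc -n ziti-router -o wide
# Check which container port the advertised port 443 is mapped to on the edge service
kubectl describe svc -n ziti-router ziti-router-public-release-1-edge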
What I've Checked So Far:
- All created services are reachable.
- The routers are running without crashes.
- The certificates should be valid, as they were generated correctly (see the check below).
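This is roughly how I sanity-checked the certificate served on the public edge address (the hostname and port are taken from the values above):

# Inspect the certificate chain presented on the advertised edge address
openssl s_client -connect ziti-router-public-1.example.com:443 -servername ziti-router-public-1.example.com -showcerts </dev/null
# Print subject, issuer, and validity dates of the leaf certificate
openssl s_client -connect ziti-router-public-1.example.com:443 -servername ziti-router-public-1.example.com </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer -dates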
Questions:
- What could be causing these TLS handshake failures?
- Are these errors expected behavior, or do they indicate a misconfiguration?
- Could this be due to mismatched certificates or an issue with the advertised hosts?
- Any debugging tips for identifying which service is attempting the failed handshake?
Would appreciate any insights from the community! Thanks in advance!