[k8s] router is unable to connect to controller: no such host

Hello again,

I'm trying to get a working k8s OpenZiti cluster running. I mainly followed the k8s quickstart guide: Kubernetes Quickstart | OpenZiti

Before that, I installed cert-manager and trust-manager like this:

#installed cert-manager with k apply:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.yaml

#installed trust-manager with helm and the controller namespace as app-namespace:
helm repo add jetstack https://charts.jetstack.io --force-update
helm upgrade -i -n cert-manager trust-manager jetstack/trust-manager --wait --set app.trust.namespace=ziti-controller 

Then I went through every step in the k8s quickstart guide except installing the tunneler app. When I try to install the router, it fails because it can't find the controller's host. See the pod logs:

 ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262] error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)]} unable to connect controller
[   0.065]   ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262] error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)]} unable to connect controller
[   0.127]   ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262] error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)]} unable to connect controller
[   0.281]   ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262] error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)]} unable to connect controller
[   0.503]   ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262] error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)]} unable to connect controller
[   0.839]   ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262] error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)]} unable to connect controller
[   1.172]   ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)] endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262]} unable to connect controller
[   1.749]   ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)] endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262]} unable to connect controller
[   2.934]   ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)] endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262]} unable to connect controller
[   4.211]   ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)] endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262]} unable to connect controller
[   6.188]   ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262] error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)]} unable to connect controller
[  10.481]   ERROR ziti/router/env.(*networkControllers).connectToControllerWithBackoff.func2: {endpoint=[tls:ziti-controller-ctrl.miniziti.svc:6262] error=[error connecting ctrl (dial tcp: lookup ziti-controller-ctrl.miniziti.svc on 10.96.0.10:53: no such host)]} unable to connect controller
[  15.007]   FATAL ziti/router.(*Router).startControlPlane.func1: unable to connect to any controllers before timeout

Next I made the mentioned edit for the names in /etc/hosts and edited the CoreDNS config of minikube. I also deleted the pod. Here is the dnstest output:

kubectl run "dnstest" --rm --tty --stdin --image=busybox --restart=Never -- \
      nslookup miniziti-controller.miniziti.internal
Server:         10.96.0.10
Address:        10.96.0.10:53

Non-authoritative answer:
Name:   miniziti-controller.miniziti.internal
Address: 192.168.49.2

Non-authoritative answer:
Name:   miniziti-controller.miniziti.internal
Address: 192.168.49.2

pod "dnstest" deleted

For now I'm out of ideas. Please help, and thank you!

Hello @fkaute!

Let's check the cluster namespaces. It looks like you're creating the OpenZiti Controller's Helm release in namespace miniziti, which is the default namespace used by miniziti.bash, but the Trust Manager release has input value app.trust.namespace=ziti-controller, which should be app.trust.namespace=miniziti so that Trust Manager can compose trust Bundle resources from the Certificate resources in the Controller's namespace.

I see the error from the OpenZiti Router's pod log too. I'm assuming this means the Router's Helm release has input value ctrl.endpoint=ziti-controller-ctrl.miniziti.svc:443. I think it's likely that Kubernetes service is not yet available if the OpenZiti Controller pod is still waiting for Trust Manager, so this problem may resolve itself within a few seconds of the OpenZiti Controller becoming ready.

What's the status of the Controller's pod?

kubectl get pods --selector app.kubernetes.io/component=ziti-controller

In summary, it looks like the Router is waiting for the Controller, and the Controller is waiting for Trust Manager to provide the trust Bundle resource.

Hi @qrkourier,
I'm not sure why you assume the namespace must be miniziti, since the quickstart documentation mentioned above uses the ziti-controller namespace when installing the controller with Helm:

helm install "ziti-controller" openziti/ziti-controller \
   --namespace ziti-controller --create-namespace \
   --set clientApi.advertisedHost="miniziti-controller.miniziti.internal" \
   --values https://openziti.io/helm-charts/charts/ziti-controller/values-ingress-nginx.yaml

Anyway, the controller is up and running, and the trust-manager logs look good to me too:

fkaute@jammy02:~$ k get pods -A
NAMESPACE         NAME                                        READY   STATUS             RESTARTS          AGE
cert-manager      cert-manager-7d75f47cc5-jjwtq               1/1     Running            0                 22h
cert-manager      cert-manager-cainjector-c778d44d8-fw2d7     1/1     Running            0                 22h
cert-manager      cert-manager-webhook-55d76f97bb-jfdwh       1/1     Running            0                 22h
cert-manager      trust-manager-547c94ddc9-tw9p6              1/1     Running            0                 19h
ingress-nginx     ingress-nginx-admission-create-dpq8s        0/1     Completed          0                 20h
ingress-nginx     ingress-nginx-admission-patch-nm26z         0/1     Completed          0                 20h
ingress-nginx     ingress-nginx-controller-5f6c78c7f5-zklrr   1/1     Running            0                 20h
kube-system       coredns-5dd5756b68-58ptz                    1/1     Running            0                 19h
kube-system       etcd-miniziti                               1/1     Running            1 (22h ago)       22h
kube-system       kube-apiserver-miniziti                     1/1     Running            1 (22h ago)       22h
kube-system       kube-controller-manager-miniziti            1/1     Running            1 (22h ago)       22h
kube-system       kube-ingress-dns-minikube                   1/1     Running            0                 20h
kube-system       kube-proxy-bvpdk                            1/1     Running            1 (22h ago)       22h
kube-system       kube-scheduler-miniziti                     1/1     Running            1 (22h ago)       22h
kube-system       storage-provisioner                         1/1     Running            3 (22h ago)       22h
ziti-controller   ziti-controller-55fcf485bf-kztvn            1/1     Running            0                 19h
ziti-router       ziti-router7-6b459db77-29pd7                0/1     CrashLoopBackOff   215 (4m24s ago)   18h
fkaute@jammy02:~$ k logs -n cert-manager trust-manager-547c94ddc9-tw9p6
Defaulted container "trust-manager" out of: trust-manager, cert-manager-package-debian (init)
I1205 12:14:49.208975       1 controller.go:79] trust/bundle "msg"="successfully loaded default package from filesystem" "path"="/packages/cert-manager-package-debian.json"
I1205 12:14:49.209236       1 webhook.go:36] trust/webhook "msg"="registering webhook endpoints"
I1205 12:14:49.209330       1 webhook.go:173] trust/manager/controller-runtime/builder "msg"="skip registering a mutating webhook, object does not implement admission.Defaulter or WithDefaulter wasn't called" "GVK"={"Group":"trust.cert-manager.io","Version":"v1alpha1","Kind":"Bundle"}
I1205 12:14:49.209458       1 webhook.go:189] trust/manager/controller-runtime/builder "msg"="Registering a validating webhook" "GVK"={"Group":"trust.cert-manager.io","Version":"v1alpha1","Kind":"Bundle"} "path"="/validate-trust-cert-manager-io-v1alpha1-bundle"
I1205 12:14:49.209614       1 server.go:183] trust/manager/controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-trust-cert-manager-io-v1alpha1-bundle"
I1205 12:14:49.209760       1 server.go:185] trust/manager/controller-runtime/metrics "msg"="Starting metrics server"
I1205 12:14:49.209905       1 server.go:224] trust/manager/controller-runtime/metrics "msg"="Serving metrics server" "bindAddress"="0.0.0.0:9402" "secure"=false
I1205 12:14:49.209902       1 server.go:50] trust/manager "msg"="starting server" "addr"={"IP":"::","Port":6060,"Zone":""} "kind"="health probe"
I1205 12:14:49.209966       1 server.go:191] trust/manager/controller-runtime/webhook "msg"="Starting webhook server"
I1205 12:14:49.210148       1 leaderelection.go:250] attempting to acquire leader lease ziti-controller/trust-manager-leader-election...
I1205 12:14:49.210509       1 certwatcher.go:161] trust/manager/controller-runtime/certwatcher "msg"="Updated current TLS certificate"
I1205 12:14:49.211257       1 server.go:242] trust/manager/controller-runtime/webhook "msg"="Serving webhook server" "host"="0.0.0.0" "port"=6443
I1205 12:14:49.211518       1 certwatcher.go:115] trust/manager/controller-runtime/certwatcher "msg"="Starting certificate watcher"
I1205 12:14:49.216844       1 leaderelection.go:260] successfully acquired lease ziti-controller/trust-manager-leader-election
I1205 12:14:49.218502       1 recorder.go:104] trust/manager/events "msg"="trust-manager-547c94ddc9-tw9p6_1007dd08-e28c-4668-9bfb-00913f29850e became leader" "object"={"kind":"Lease","namespace":"ziti-controller","name":"trust-manager-leader-election","uid":"aa198db1-ebb2-49c3-b579-1c33b96750b1","apiVersion":"coordination.k8s.io/v1","resourceVersion":"18974"} "reason"="LeaderElection" "type"="Normal"
I1205 12:14:49.219053       1 controller.go:178] trust/manager "msg"="Starting EventSource" "controller"="bundles" "source"="kind source: *v1.PartialObjectMetadata"
I1205 12:14:49.228903       1 controller.go:178] trust/manager "msg"="Starting EventSource" "controller"="bundles" "source"="kind source: *v1alpha1.Bundle"
I1205 12:14:49.228924       1 controller.go:178] trust/manager "msg"="Starting EventSource" "controller"="bundles" "source"="kind source: *v1.Namespace"
I1205 12:14:49.229010       1 controller.go:178] trust/manager "msg"="Starting EventSource" "controller"="bundles" "source"="kind source: *v1.ConfigMap"
I1205 12:14:49.229029       1 controller.go:178] trust/manager "msg"="Starting EventSource" "controller"="bundles" "source"="kind source: *v1.Secret"
I1205 12:14:49.229036       1 controller.go:186] trust/manager "msg"="Starting Controller" "controller"="bundles"
I1205 12:14:49.331325       1 controller.go:220] trust/manager "msg"="Starting workers" "controller"="bundles" "worker count"=1
I1205 12:15:06.524664       1 recorder.go:104] trust/manager/events "msg"="Successfully synced Bundle to all namespaces" "object"={"kind":"Bundle","name":"ziti-controller-ctrl-plane-cas","uid":"65e072a8-ddde-41ff-a74b-2d8651e3d025","apiVersion":"trust.cert-manager.io/v1alpha1","resourceVersion":"19066"} "reason"="Synced" "type"="Normal"
I1205 12:15:06.524717       1 recorder.go:104] trust/manager/events "msg"="Successfully synced Bundle to all namespaces" "object"={"kind":"Bundle","name":"ziti-controller-ctrl-plane-cas","uid":"65e072a8-ddde-41ff-a74b-2d8651e3d025","apiVersion":"trust.cert-manager.io/v1alpha1","resourceVersion":"19066"} "reason"="Synced" "type"="Normal"

Thanks for confirming your Controller and Trust Manager are ready. This leads me to believe the Helm input value ctrl.endpoint to the Router's Helm release is not valid for the "ctrl" (Router control plane) service provided by the OpenZiti Controller.

The error logs indicate this value is currently ziti-controller-ctrl.miniziti.svc:6262, and it must be changed to match the Kubernetes service, named like "ziti-controller-ctrl", that the OpenZiti Controller provides in its own namespace.
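One way to find the right name is to list the services with that name in every namespace, then confirm the name resolves from inside the cluster. This is only a sketch: the function names are made up for illustration, the KUBECTL variable exists just so the commands can be previewed, and the throwaway pod reuses the "dnstest" pattern from earlier in this thread.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# List every service named "ziti-controller-ctrl" and the namespace it lives in.
find_ctrl_service() {
  $KUBECTL get services --all-namespaces \
    --field-selector metadata.name=ziti-controller-ctrl
}

# Resolve a service name from inside the cluster with a throwaway busybox pod.
# $1 is the host part of ctrl.endpoint, without the port.
check_ctrl_dns() {
  $KUBECTL run dnstest --rm --tty --stdin --image=busybox --restart=Never -- \
    nslookup "$1"
}
```

Run `find_ctrl_service` first, then something like `check_ctrl_dns ziti-controller-ctrl.ziti-controller.svc`. Some busybox builds do not apply the pod's DNS search domains in nslookup, so if the short name fails there, try appending `.cluster.local` (assuming the default cluster domain) before concluding the service is missing.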

For example, if your Controller is deployed in namespace "ziti-controller" and the Controller's Helm release has input value ctrlPlane.advertisedPort=443 (the default), then a valid value for the Router's release is ctrl.endpoint=ziti-controller-ctrl.ziti-controller.svc:443.
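The endpoint follows the pattern of service name, namespace, and advertised port. As a small illustration, the helper below composes it from those parts. The function name is hypothetical, and it assumes the service is named after the Controller's release with a "-ctrl" suffix, as the name in the logs suggests.

```shell
# ctrl_endpoint RELEASE NAMESPACE PORT
# Compose the in-cluster address of the Controller's "ctrl" service.
# Assumes the service is named "<release>-ctrl" (inferred from the logs above).
ctrl_endpoint() {
  printf '%s-ctrl.%s.svc:%s\n' "$1" "$2" "$3"
}

# Controller release "ziti-controller" in namespace "ziti-controller", port 443.
ctrl_endpoint ziti-controller ziti-controller 443
```

Swapping the namespace argument for "miniziti" gives the value for a cluster created with the miniziti.bash defaults.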

You can upgrade your Router release with this value by re-running helm upgrade with a complete set of input values. You can extract the current values with helm get values.
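Those two steps could look roughly like the sketch below. Several things here are assumptions: the release name "ziti-router7" and namespace "ziti-router" are guessed from the pod listing above, the chart reference openziti/ziti-router assumes the same "openziti" repo alias used for the Controller, and the function name and HELM variable are only for illustration and previewing.

```shell
# HELM can be overridden (for example HELM=echo) to preview the commands.
HELM="${HELM:-helm}"

# upgrade_router RELEASE NAMESPACE ENDPOINT
# Re-apply the release's current user-supplied values, overriding ctrl.endpoint.
upgrade_router() {
  values_file="$(mktemp)" || return 1
  # Save the current values, then upgrade with them plus the corrected endpoint.
  $HELM get values "$1" --namespace "$2" --output yaml > "$values_file" &&
    $HELM upgrade "$1" openziti/ziti-router --namespace "$2" \
      --values "$values_file" --set ctrl.endpoint="$3"
  status=$?
  # The values may contain secrets, so do not leave the file behind.
  rm -f "$values_file"
  return "$status"
}
```

With the guessed names it would be called as `upgrade_router ziti-router7 ziti-router ziti-controller-ctrl.ziti-controller.svc:443`. The `--set` flag takes precedence over the values file, so only ctrl.endpoint changes.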


I see what you mean about the incongruity between the manual steps that use "ziti-controller" as an example and the scripted alternative that uses namespace "miniziti" as a default.

I will correct those manual steps to use miniziti.bash default values as an example. That way, there won't be a discrepancy between the namespaces if you follow the manual steps or run the script.

Sorry that made you stumble! Will you let me know if correcting the value of ctrl.endpoint gets your Router up and running?

Hi @qrkourier, that worked like a charm. Thanks for helping me out. The router is now in the Running state.

The documentation fixes can be previewed here: Kubernetes Quickstart | OpenZiti, and are proposed in this pull request.