Followed quickstart, services not connecting :(

I followed the quickstart, got a controller, router, and console running just fine.

From the ‘zitifying kubectl’ article, I found ziti commands to set up a service. I changed the service name and port, and ran them. In the console, everything looks OK (as far as I can tell), but the services don’t connect. I tried to replicate this with the NF free tier, and that one works just fine.

One thing I’m wondering: if I run ‘ziti fabric list’, there are no links or routers there. Where are those set up from? Are there any steps I might have missed?

@punasusi, Welcome to the community! I’m glad you had success with the NF free tier.

What service did you set up? Could you provide the ziti commands you ran? Did you look at the logs from both ends? Any hints in either log as to the problem? I suspect there will be.

With the ‘host it anywhere’ quickstart, ziti fabric list produces a single router. A single router won’t have any links, because all traffic flows through that one router and no mesh is formed. You’d need two routers before you see a single link.

My guess as to the problem would be that you didn’t authorize the services quite correctly. That’s a common mistake. If you could please describe your overall setup you’re trying to accomplish and share the relevant logs, that’d be helpful for us to give you better assistance.

Welcome!


Here are the commands I ran. I also tried editing the service config for the host to use the laptop’s local name instead of the localhost IP address, but didn’t have any better luck.

For this test, I have an echo HTTP service running on one laptop, and I try to access it from the other. I’ve also tried with postgres, but this echo is easier to set up. Both echo HTTP and postgres work just fine in the NF network. I’m using the Ziti Desktop Edge tunneler on both laptops.

And what logs ‘from each end’ do you mean? From the tunneler?

service_name=echo
the_port=8080
client_identity="${service_name}"Client
server_identity="${service_name}"Server

ziti edge create identity device "${server_identity}" -a "${service_name}"ServerEndpoints -o "${server_identity}".jwt
ziti edge create identity device "${client_identity}" -a "${service_name}"ClientEndpoints -o "${client_identity}".jwt

ziti edge create config "${service_name}"-host.v1 host.v1 '{"protocol":"tcp", "address":"127.0.0.1","port":'"${the_port}"'}'
ziti edge create config "${service_name}"-client-config intercept.v1 '{"protocols":["tcp"],"addresses":["'"${service_name}.ziti"'"], "portRanges":[{"low":'"${the_port}"', "high":'"${the_port}"'}]}'
ziti edge create service "${service_name}" --configs "${service_name}"-client-config,"${service_name}"-host.v1
ziti edge create service-policy "${service_name}"-binding Bind --service-roles '@'"${service_name}" --identity-roles '#'"${service_name}"'ServerEndpoints'
ziti edge create service-policy "${service_name}"-dialing Dial --service-roles '@'"${service_name}" --identity-roles '#'"${service_name}"'ClientEndpoints'
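The nested single and double quotes in the two config commands are easy to break when copying from a web page. One way to sidestep that is to build the JSON with printf first and pass it to ziti as a single quoted variable. This is only a sketch; the variable names host_cfg and intercept_cfg are made up for illustration and are not part of the article's commands.

```shell
#!/usr/bin/env bash
service_name=echo
the_port=8080

# Build each config body once; printf does the substitution, so the
# ziti command line itself needs only one pair of double quotes.
host_cfg=$(printf '{"protocol":"tcp","address":"127.0.0.1","port":%d}' "$the_port")
intercept_cfg=$(printf '{"protocols":["tcp"],"addresses":["%s.ziti"],"portRanges":[{"low":%d,"high":%d}]}' \
  "$service_name" "$the_port" "$the_port")

# Print the JSON so it can be checked by eye before anything is created.
echo "$host_cfg"
echo "$intercept_cfg"

# With a logged-in ziti CLI, the same two configs as above would then be:
# ziti edge create config "${service_name}-host.v1" host.v1 "$host_cfg"
# ziti edge create config "${service_name}-client-config" intercept.v1 "$intercept_cfg"
```

The ziti lines are left commented out so the snippet can be run anywhere to inspect the generated JSON.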

Thanks, I checked your commands and they look fine to me (other than the copy/paste/autocorrect of straight quotes to smart quotes :slight_smile:)

Yes, when I wrote "from each end", I meant the tunneler logs. Are you running Ziti Desktop Edge for Mac or Windows? The logs are in different locations depending on which one you use.

How did you deploy the openziti quickstart? Is it docker, host it anywhere, or everything local? I am wondering if one of the machines cannot address the edge router. That error shows up in the logs as "no suitable edge router" (I think, exact wording I can't remember right now).

Sounds like you're deploying this sort of scenario. The HTTP server is set to bind on any ip address (0.0.0.0) right? From your commands I expect it's listening on 8080 as well.

Can you find the logs for both ziti desktop edge and see if there are any hints in there as to the problem?

If you're using Windows, the logs are found in %ProgramFiles(x86)%\NetFoundry, Inc\Ziti Desktop Edge\logs\service. For macOS you can open the Z menu at the top. (My Mac is acting up, I can't get a screen cap for you)

Let's peek in your logs and make sure there's no error.

Controller/router installed with the ‘host it yourself’, VM in Azure. Ports for controller and router open to world. Console also deployed, console and SSH ports open only to my IP.

Setup as your diagram, other than both remote and local machines are sitting on my desk.

Edge desktop running on Mac, installed from app store, and I don’t see any Z or any menu options at all…

If you don’t see the ‘z’ that might be “the problem” I suppose :slight_smile: My mac finally came to life. Here’s the Z and how you find ‘the logs’

image

(facepalm)… oh, that ‘z’… yes, I see that, and found the logs…

ERROR ziti-sdk:channel.c:845 on_channel_connect_internal() ch[0] failed to connect [-3008/unknown node or service]
[2022-05-03T12:27:05.847Z] ERROR ziti-sdk:connect.c:284 on_channel_connected() ztx[1] ch[0] failed to connect [-3008/unknown node or service]
[2022-05-03T12:27:05.847Z] INFO ziti-sdk:channel.c:731 reconnect_channel() ch[0] reconnecting in 49824 ms (attempt = 3568)
[2022-05-03T12:27:10.845Z] WARN ziti-sdk:connect.c:331 connect_timeout() conn[1.0] bind timeout: no suitable edge router

I guess somewhat obvious where to look next :slight_smile:
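For anyone hitting the same symptoms, a quick way to check a tunneler log for these two messages is a grep over the log file. The sketch below uses the log lines from this post as sample input; the function name scan_ziti_log is hypothetical, and the real log path depends on your platform.

```shell
#!/usr/bin/env bash
# Count the lines in a tunneler log that contain either of the two
# failure messages quoted in this thread.
scan_ziti_log() {
  grep -c -E 'no suitable edge router|failed to connect' "$1"
}

# Sample input: log lines copied from the post above.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
[2022-05-03T12:27:05.847Z] ERROR ziti-sdk:connect.c:284 on_channel_connected() ztx[1] ch[0] failed to connect [-3008/unknown node or service]
[2022-05-03T12:27:05.847Z] INFO ziti-sdk:channel.c:731 reconnect_channel() ch[0] reconnecting in 49824 ms (attempt = 3568)
[2022-05-03T12:27:10.845Z] WARN ziti-sdk:connect.c:331 connect_timeout() conn[1.0] bind timeout: no suitable edge router
EOF

matches=$(scan_ziti_log "$sample_log")
echo "matching lines: $matches"
rm -f "$sample_log"
```

Point scan_ziti_log at your own log file instead of the sample to see whether the router is reachable from that endpoint.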

“no suitable edge router” - I think you’ve hit a bug with the current “host it anywhere” deployment… I’m trying to get that fixed, but the release hasn’t come out yet. I can walk you through how to fix that. This was what I was worried about. Let me get my notes together and I’ll provide you the steps to perform.

Ah, I changed to DEBUG logs, and it’s trying to connect to the OS hostname:8442, instead of the public DNS name. Is that the bug you were mentioning?

Yes. Specifically the edge router config file at ~/.ziti/quickstart/$(hostname)/$(hostname)-edge-router.yaml needs to advertise the DNS name:

listeners:
# bindings of edge and tunnel requires an "edge" section below
  - binding: edge
    address: tls:0.0.0.0:8442
    options:
      advertise: ec2-3-19-75-37.us-east-2.compute.amazonaws.com:8442
      connectTimeoutMs: 1000
      getSessionTimeout: 60s
  - binding: tunnel
    options:
      mode: host #tproxy|host

edge:
  csr:
    country: US
    province: NC
    locality: Charlotte
    organization: NetFoundry
    organizationalUnit: Ziti
    sans:
      dns:
        - ec2-3-19-75-37.us-east-2.compute.amazonaws.com
        - localhost
      ip:
        - "127.0.0.1"
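To see quickly what a router config currently advertises, the value can be pulled out with awk and compared against the public DNS name. This is a sketch under the assumption that the file has a single advertise key, as in the config above; the function name advertised_address and the sample values are for illustration only.

```shell
#!/usr/bin/env bash
# Print the host:port that the edge listener advertises in a router config.
# Assumes exactly one "advertise:" key in the file.
advertised_address() {
  awk '$1 == "advertise:" { print $2; exit }' "$1"
}

# Sample config fragment, taken from the listing above.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
listeners:
  - binding: edge
    address: tls:0.0.0.0:8442
    options:
      advertise: ec2-3-19-75-37.us-east-2.compute.amazonaws.com:8442
EOF

# Replace with your own public DNS name.
external_dns=ec2-3-19-75-37.us-east-2.compute.amazonaws.com

advertised=$(advertised_address "$cfg")
# Strip the trailing :port before comparing host names.
if [ "${advertised%:*}" = "$external_dns" ]; then
  result=ok
else
  result=mismatch
fi
echo "advertise=$advertised ($result)"
rm -f "$cfg"
```

A mismatch here would mean clients are being told to dial a name they may not be able to resolve, which is what the DEBUG log showed.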

You need to modify the config and recreate the edge router so that the proper certificates are generated. I’m working on those steps for you now. I also don’t run this in Azure - I use AWS and AWS has an external DNS name provisioned for all machines out of the gate. I don’t recall if Azure does this.

During the expressInstall did you set the EXTERNAL_DNS and EXTERNAL_IP as shown in the quickstart? I expect you have. I just ran a quickstart on a new VM and it seems like it worked for AWS. I can try running on Azure to see if there’s a difference.

You can’t just edit the config values and restart the router though. OpenZiti, being a zero trust overlay, creates certificates based on values provided during the expressInstall phase. The router’s certificate needs a SAN entry for the DNS name/external IP - not ‘the hostname’.

I’m working on the steps you need to ‘fix’ this right now - won’t be long

I see that v0.25.5 just came out! That might be why my expressInstall seemed fine and yours had the bug in it.

Here are the steps necessary to correct the problem:

# stop the edge router somehow...
systemctl stop ziti-router   # or: killall ziti-router, etc.

# edit the router config file
#  - change listeners.binding.options.advertise to the external DNS name (keep the port)
#  - change edge.csr.sans.dns to include the external DNS name
#  - change edge.csr.sans.ip to include the external IP address
vi ~/.ziti/quickstart/$(hostname)/$(hostname)-edge-router.yaml

# verify edge router exists
ubuntu@ip-172-31-32-5:~$ ziti edge list edge-routers
╭────────────┬────────────────────────────┬────────┬───────────────┬──────┬────────────╮
│ ID         │ NAME                       │ ONLINE │ ALLOW TRANSIT │ COST │ ATTRIBUTES │
├────────────┼────────────────────────────┼────────┼───────────────┼──────┼────────────┤
│ F0h1Eo07Rv │ ip-172-31-32-5-edge-router │ false  │ true          │    0 │            │
╰────────────┴────────────────────────────┴────────┴───────────────┴──────┴────────────╯
results: 1-1 of 1

# delete that edge router
ubuntu@ip-172-31-32-5:~$ ziti edge delete edge-router $(hostname)-edge-router
delete of edge-router with id F0h1Eo07Rv: OK

# create the edge router so we can enroll it again
ubuntu@ip-172-31-32-5:~$ ziti edge create edge-router $(hostname)-edge-router -t -o $ZITI_HOME/$(hostname)-edge-router.jwt
New edge router ip-172-31-32-5-edge-router created with id: GTTCEGhkRv
Enrollment expires at 2022-05-03T12:58:11.207Z

# use the ziti-router binary to enroll the edge router
ubuntu@ip-172-31-32-5:~$ ziti-router enroll "$ZITI_HOME/$(hostname)-edge-router.yaml" --jwt "$ZITI_HOME/$(hostname)-edge-router.jwt"
[   1.474]    INFO edge/router/enroll.(*RestEnroller).Enroll: registration complete

# grant all endpoints access to this edge router
ziti edge update edge-router $(hostname)-edge-router -a "public"

# start the edge router
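The config edit in the second step can also be scripted rather than done by hand in vi. The following is a rough sketch, assuming the generated config contains the machine's internal hostname in the advertise line and in the sans.dns list; old_name and new_name are placeholders, and the sample file stands in for the real router config. It does not touch sans.ip.

```shell
#!/usr/bin/env bash
# Placeholders: substitute your internal hostname and public DNS name.
old_name=ip-172-31-32-5
new_name=ec2-3-19-75-37.us-east-2.compute.amazonaws.com

# Sample fragment standing in for the real router config.
cfg=$(mktemp)
cat > "$cfg" <<EOF
      advertise: ${old_name}:8442
    sans:
      dns:
        - ${old_name}
        - localhost
EOF

# Rewrite the advertise line and the matching sans.dns entry.
# Output goes to a new file that is then moved into place, which is a
# portable alternative to sed -i.
sed -e "s/advertise: ${old_name}:/advertise: ${new_name}:/" \
    -e "s/^\( *- \)${old_name}\$/\1${new_name}/" "$cfg" > "$cfg.new"
mv "$cfg.new" "$cfg"

updated=$(cat "$cfg")
echo "$updated"
rm -f "$cfg"
```

Editing the file is still only half of the fix: the router has to be deleted, recreated and enrolled again as in the steps above so that its certificate carries the new names.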

The instructions worked (after restart of the edge client also).

I guess I just missed the fix in 0.25.5, as I was on 0.25.4… but I learned a lot about where/how to find these things.

Thanks again for the help!
