Ziti Routing Question

Hello everyone,

Currently I am playing with OpenZiti, thinking about replacing my current (Tailscale) mesh setup.
Getting things up and running took some time, but I finally managed to get connections to my local subnet 192.168.100.0/24 working from an outside Windows system.
All internal clients are Ubuntu 22.04 with the tunneler installed, and the edge router and the controller are also located inside this network.
Now my two questions:

Having a policy with an intercept.v1 allowing 192.168.100.0/24 and the same for the host.v1 config leads to SSH connections from the outside Windows machine being created from different internal hosts (which are all running the tunneler).
For example: starting an SSH session from the Windows machine to 192.168.100.5 several times shows different origins (192.168.100.4, 192.168.100.7, and sometimes even 192.168.100.5) in the logs.
I would expect it to always stay within the encrypted Ziti network, but it seems that it doesn't?
How could I restrict this so it stays within the Ziti network without going through different tunnelers, given that all systems are equipped with one?

All systems are named $host.ziti.
Building policies with *.ziti for the intercept and localhost for the host.v1 does not work.
How do I have to build the policies so that I can SSH to the FQDN and have it resolved correctly?
Even setting 100.64.0.0/12 in the host.v1 does not work.

Any help would be great. It seems that I am a little lost.
Thanks!

Hi @strand, welcome to the community and to OpenZiti!

Would you mind explaining exactly what you're trying to do just 'overall'? OpenZiti has some different paradigms available and it can lead to different setups than "regular IP networking". Everyone starts with that knowledge of IP though. It can help me and others give you a better answer if you can describe what the overall solution is... I'll still see if I can give you answers to your questions though...

When you say "shows different origins", can I assume you mean via tcpdump on the remote ssh server? It kind of sounds to me like you might have allowed "all" identities to bind the particular ssh service? If that's the case, OpenZiti round-robins traffic to the "termination point". This is the identity the traffic should be sent to on the overlay network, and then 'processed' in some way at that remote node. When using a tunneling app, and offloading the traffic from the overlay back to the underlay (like in this case for ssh), the traffic is 'processed' by effectively being forwarded on to the final destination on the underlay (ip) network. It sounds like more than one device is able to do that forwarding, and it sounds like the traffic is then coming from a random node in the destination network? Without having a great overview, that'd be my guess there...
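
Just to illustrate (the service name and identity attribute here are made up), a bind policy like the one below is the kind of thing that produces this: every identity matching it may bind the service, so every one of them becomes a valid termination point and the round-robin kicks in:

ziti edge create service-policy "ssh-binding-all" Bind --service-roles "@ssh" --identity-roles "#all"

Narrowing the identity roles on the Bind policy (and/or the host.v1 config, as discussed below) is what pins the offload point down.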

You mean you haven't gotten it working yet, right? :slight_smile: not that it "doesn't work"??? Again, the overall idea would help me answer better here, but here's what it sounds like you're trying to do. It sounds like you have, say, three machines in the remote network. I'll assume these three machines are named:

  • host1.my.domain
  • host2.my.domain
  • host3.my.domain

If that's the case, you would want to use a feature of the tunnelers where we can grab the intercepted domain name, and simply "send it to the far side", where it is effectively placed back onto the underlay at the domain name that was intercepted. It seems like that's what you want to do?
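
Purely as a sketch of that feature (hypothetical config names and domain; the video mentioned below shows a variant of this using $dst_hostname instead), the "grab the name and send it along" behavior comes from forwardAddress on the host.v1 side paired with a matching intercept:

ziti edge create config "byname.host.v1" host.v1 '{"protocol":"tcp", "port":22, "forwardAddress":true, "allowedAddresses":["*.my.domain"]}'
ziti edge create config "byname.intercept.v1" intercept.v1 '{"protocols":["tcp"], "addresses":["*.my.domain"], "portRanges":[{"low":22,"high":22}]}'

The intercepted name (host1.my.domain, etc.) rides across the overlay and is resolved again on the far side by the hosting tunneler.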

Assuming that's the case, that sounds a lot like the post we had recently where a user was trying to access a bunch of prometheus scrape targets, similarly to how you're using it for ssh. That thread is here: Reduce number of Service Policies for Monitoring

For that thread, I made a video that shows you how to use the $dst_hostname variables to accomplish what you're trying to do. It demonstrates the concept using prometheus but I think it'll show you the ideas needed properly... If not, correct my understanding and we'll go from there :slight_smile:

You can see the ziti CLI commands I used out at GitHub - dovholuknf/hello-prometheus

I'm sure we'll get you sorted. If you have a request for more/different docs or a different, more targeted video, let me know. I'll see if I can make one. You might also be interested in zssh. Perhaps: GitHub - openziti-test-kitchen/zssh: Ziti SSH

Hi @TheLumberjack ,

thanks for the welcome and your lightning-fast response!
Perhaps I did not give you the full picture, but it seems that your answers hit it to some extent.
As I need to watch the video for question 2, I will just go step by step and try to give you more background on question 1:
I have worked with different overlay networks in the past: Tailscale, Nebula, ZeroTier.
Taking the SSH example, I am used to seeing my client's IP in the auth.log when I start an SSH session to a remote host.
For OpenZiti I understand so far that I would see the IP of the tunneler at the destination in the auth.log at the SSH server's end, correct?
What I am trying to achieve is the following:

  • All hosts have the edge tunneler installed, all systems are Ubuntu 22.04
  • starting an SSH session to one of these hosts, I want to see the local IP in the auth.log, as this should be the point where the traversal from overlay to underlay network takes place.
  • I don't want other edge tunnelers to be used to access hosts on the network, just the locally installed one, since all hosts will have the edge tunneler. I don't want the trust to sit on the local network, and in the best case no ports are open on the hosts.

Does this make sense so far?
What I did so far to achieve this:

  • I have a bind (host.v1) and a dial (intercept.v1) config, each including 192.168.100.0/24 with all ports forwarded for testing purposes.
  • I have a service policy combining these.

The result is described above. It works, but not as I am expecting. It seems as if my bind policy is too wide.
I tried to set the host.v1 to localhost, so that just the local services are exposed, but this does not work.

Thanks and cheers

It truly depends on the overall setup. For example, let's say you had those three hosts from before (host1: 192.168.1.1, host2: 192.168.1.2, host3: 192.168.1.3), and through service policies you allow ALL THREE to bind a service that sends ssh traffic to some other target, like your plex server or whatever, which is at (remote) 192.168.1.4... In that situation, when you are on your local computer (not on the 192.168.1.0/24 network) and you ssh through OpenZiti, it's possible for the traffic to go through any of those three remote hosts, because you authorized it to do that. That's totally fine if that's what you intend, but that might explain why you're seeing the traffic from different nodes.

  • All hosts have the edge tunneler installed, all systems are Ubuntu 22.04

Perfect! This is the best solution too. Right here we can start to focus on the service's host.v1 config. If this is the case, and if it were me, I would make a single service with a single host.v1 entry and offload from that local tunneler to 127.0.0.1 (I'd suggest you use an ipv4 address like this and avoid 'localhost' as it can cause some ipv4/ipv6 confusion since localhost might resolve to ::1 too...). If you do that, well then there's no way that any of those machines could possibly enter the scenario as I outlined above. :slight_smile: ALL the traffic would exclusively come from the same node.

I already answered this so it's probably obvious, but this is where you're going "wrong" (I mean, it works like you discovered, it's just confusing lol). You're allowing the intercepted IP to 'forward', a cool feature of OpenZiti, but you don't want that in this particular scenario. Instead of capturing the ip and forwarding the ip you captured, simply offload to "127.0.0.1:22" and you'll get the result you desire. (assuming I've understood, but I think I do)
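
As a minimal sketch (made-up config name), the bind side of that single service can be as small as:

ziti edge create config "ssh.local.host.v1" host.v1 '{"protocol":"tcp", "address":"127.0.0.1", "port":22}'

No forwarding options at all, so whichever identity binds the service can only ever offload to its own local sshd.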

I wrote all that and then read this:

And that might be due to the "ipv4 vs ipv6" confusion I mentioned... It will work, but I suspect localhost resolved to ::1 and your sshd is only listening on ipv4? I have an Ubuntu box in AWS and I see that it thinks it's listening on both:

ss -lntp | grep 22
LISTEN 0      128          0.0.0.0:22         0.0.0.0:*
LISTEN 0      128             [::]:22            [::]:*

If you put it back to a 127.0.0.1 offload and it's still not working, let's chat some more and I'll make a video along with every ziti CLI command I ran and demonstrate it... Maybe some small thing is going wrong for you (it's always something small, right?) :slight_smile:

Good @TheLumberjack ,

thanks again for your feedback, it helps me understand things further and get a better idea of how to form my network.
Before I dive deeper into the challenges of topic 2, I have tested setting my host.v1 config to 127.0.0.1, which leads to a non-working configuration.
My client logs are showing:

[2023-09-11T05:13:02.386Z]   ERROR ziti-sdk:connect.c:963 connect_reply_cb() conn[0.2/Connecting] failed to connect, reason=exceeded maximum [2] retries creating circuit [c/7y-Zrfr1ne]: error creating route for [s/7y-Zrfr1ne] on [r/XLZsNZ3jl3] (error creating route for [c/7y-Zrfr1ne]: failed to establish connection with token 58c0ee83-1eb6-4070-8bcb-08108b80ffc5. error: (rejected by application))
[2023-09-11T05:13:02.386Z]   ERROR tunnel-cbs:ziti_tunnel_cbs.c:103 on_ziti_connect() ziti dial failed: connection is closed

Could this be because I am putting a whole network (192.168.100.0/24) in the intercept while the host side is just a single host?

This is my config for reference:

{
  "name":"StrandUndMeer.net.host.v1",
  "data":{
    "allowedAddresses":[
      "127.0.0.1"],
    "allowedPortRanges":[
      {
        "high":65535,
        "low":1
      }],
    "allowedProtocols":[
      "tcp",
      "udp"],
    "allowedSourceAddresses":[
    ],
    "forwardAddress":true,
    "forwardPort":true,
    "forwardProtocol":true,
    "listenOptions":{
      "bindUsingEdgeIdentity":false,
      "identity":"",
      "precedence":"default"
    }
  }
}
{
  "name":"StrandUndMeer.net.intercept.v1",
  "data":{
    "addresses":[
      "192.168.100.0/23"],
    "dialOptions":{
      "identity":""
    },
    "portRanges":[
      {
        "high":65535,
        "low":1
      }],
    "protocols":[
      "tcp",
      "udp"],
    "sourceIp":""
  }
}

Thanks for taking the time to look into this.

Kai

Just something to add on top:
If I change the intercept.v1 to 192.168.100.7 and change the host.v1 to just use 127.0.0.1 without forwarding, it works.
Unfortunately it would be much easier to have one config allowing SSH to all hosts of a subnet without forwarding involved.
Is there any way to achieve this?

Thanks
Kai

Thanks for sharing your configs. I believe I can see clearly what "the problem" is. You're intercepting the range, like you want (192.168.100.0/23) but you left the host.v1 config with "forwardAddress: true" and you set the "allowedAddresses" to just be 127.0.0.1... That's not quite what you want.

Instead, all you should need to do is change the host.v1 to:

  • forwardAddress: false (or just remove it entirely)
  • allowedAddresses: null (or just remove it entirely)
  • address: '127.0.0.1'

So your full config should look like this:

ziti edge update config StrandUndMeer.net.host.v1 -d '{"protocol":"tcp", "address":"127.0.0.1","port":22}'

This way you're configuring the service to ONLY use TCP (it's ssh), ONLY send traffic to 127.0.0.1, and only use port 22 (instead of permitting all ports to be forwarded).
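
For reference, the data stored in the host.v1 config after that update boils down to just this:

{
  "protocol":"tcp",
  "address":"127.0.0.1",
  "port":22
}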

If you want to use "forwarding" and you WANT to forward all ports (not the most zero trust approach, but some people are fine with that), then you would change your bind-side config like this:

ziti edge update config StrandUndMeer.net.host.v1 -d '{"address":"127.0.0.1","forwardPort":true, "allowedPortRanges":[{"low":1,"high":65535}], "forwardProtocol":true, "allowedProtocols":["tcp","udp"]}'

But, I'd start off with that first example, just to get a 'win' :slight_smile:

Thanks @TheLumberjack for the fast help.

Now it seems that we are getting into the funny part, as this is totally screwing things up:
having the following host.v1

with the intercept policy set to 192.168.100.0/23 is randomizing my connections.

Doing an ssh to 192.168.100.12 brings me to the correct host.
Doing an ssh to 192.168.100.7 brings me to the real host 192.168.100.12 again; I then get an SSH key change warning.
Doing multiple ssh sessions in a row seems to randomize my destinations; I never know where I will end up connected and only notice that it is wrong when the ssh key change warning appears...

Whatever happens here, it is very strange...

cheers
Kai

OH MY GOODNESS... I totally messed up. Give me a moment and I'll get you the proper configs. I'll blame it on my early morning. I'm sorry.

:slight_smile:
Just for reference, the configs.
Don't be irritated by the naming; I changed the names during cleanup:

{
  "name":"SSH_Proxmox",
  "configs":[
    "38ORl5g8Pp6do1afKetKUj",
    "nJhGKlRn0bZr1aXOOpe1a"],
  "tags":{
  },
  "terminatorStrategy":"smartrouting",
  "encryptionRequired":true,
  "roleAttributes":[
  ]
}
{
  "name":"SSH_Proxmox.host.v1",
  "data":{
    "address":"127.0.0.1",
    "allowedSourceAddresses":[
    ],
    "listenOptions":{
      "bindUsingEdgeIdentity":false,
      "identity":"",
      "precedence":"default"
    },
    "port":22,
    "protocol":"tcp"
  }
}
{
  "name":"SSH_Proxmox.intercept.v1",
  "data":{
    "addresses":[
      "192.168.100.0/23"],
    "dialOptions":{
      "identity":""
    },
    "portRanges":[
      {
        "high":22,
        "low":22
      }],
    "protocols":[
      "tcp"],
    "sourceIp":""
  }
}
{
  "name":"SSH_Proxmox.bind",
  "type":"Bind",
  "serviceRoles":[
    "@56NwCogJnruS2KKtyc7OL5"],
  "identityRoles":[
    "#PVE_Server"],
  "postureCheckRoles":[
  ],
  "semantic":"AnyOf",
  "tags":{
  }
}
{
  "name":"SSH_Proxmox.dial",
  "type":"Dial",
  "serviceRoles":[
    "@56NwCogJnruS2KKtyc7OL5"],
  "identityRoles":[
    "#KaisClients"],
  "postureCheckRoles":[
  ],
  "semantic":"AnyOf",
  "tags":{
  }
}

Ok. Here's the precise list of commands that you want. I forgot that you're using a single service, and that needs to use the $dst_ip token, and it requires you to create your identities with names corresponding to the IPs you want to ssh to...

I've used 172.16.0.0/24 in my example:

Create Four Identities

ziti edge create identity "clint_tunneler" -a "ssh.dialers" -o clint_tunneler.jwt
ziti edge create identity "172.16.0.1" -a "ssh.binders" -o "172.16.0.1.jwt"
ziti edge create identity "172.16.0.2" -a "ssh.binders" -o "172.16.0.2.jwt"
ziti edge create identity "172.16.0.3" -a "ssh.binders" -o "172.16.0.3.jwt"

Configure the Overlay

ziti edge create config "ssh.host.v1" host.v1 '{"protocol":"tcp", "address":"localhost","port":22, "listenOptions": {"bindUsingEdgeIdentity":true}}'
ziti edge create config "ssh.intercept.v1" intercept.v1 '{"protocols":["tcp"],"addresses":["172.16.0.0/24"], "portRanges":[{"low":22, "high":22}], "dialOptions": {"identity": "$dst_ip"}}'
ziti edge create service "ssh" --configs "ssh.intercept.v1","ssh.host.v1"
ziti edge create service-policy "ssh-binding" Bind --service-roles "@ssh" --identity-roles "#ssh.binders"
ziti edge create service-policy "ssh-dialing" Dial --service-roles "@ssh" --identity-roles "#ssh.dialers"

Important Notes

Notice here that the intercept config uses dialOptions with $dst_ip as the identity to dial. That is why your identity, in this case, must be named 172.16.0.1 (or 2, or 3 as shown above). We're instructing the tunneler to dial an identity named the same as the IP it just intercepted.

Also notice the bind identity is set to use listenOptions and bindUsingEdgeIdentity. This lets the overlay network create a terminator that the client can then send traffic to using the dialOptions as shown above...
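
As a usage sketch (assuming an "ubuntu" user exists on the target), from the machine running the clint_tunneler identity you'd then just run:

ssh ubuntu@172.16.0.1

The tunneler intercepts 172.16.0.1, dials the terminator created by the identity named "172.16.0.1", and that identity's tunneler offloads the connection to its own sshd on localhost:22.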

That's what I should've sent you before. Sorry about that!

Thanks @TheLumberjack ,

you don't have to apologize while helping me out :slight_smile:
So I got it that I can't have both: my current naming scheme for my hosts (name.domain.ziti) and access to all of them using ssh by IP with just one policy, right?
It seems that I need to make a decision on how to go on with my current setup without ending up having hundreds of policies for each service on different hosts...

Thanks again

Oh you could do that as well.

ziti edge create config "ssh.by.name.host.v1" host.v1 '{"protocol":"tcp", "address":"127.0.0.1","port":22, "listenOptions": {"bindUsingEdgeIdentity":true}}'
ziti edge create config "ssh.by.name.intercept.v1" intercept.v1 '{"protocols":["tcp"],"addresses":["*.ziti"], "portRanges":[{"low":22, "high":22}], "dialOptions": {"identity": "$dst_hostname"}}'
ziti edge create service "ssh.by.name" --configs "ssh.by.name.intercept.v1","ssh.by.name.host.v1"
ziti edge create service-policy "ssh.by.name-binding" Bind --service-roles "@ssh.by.name" --identity-roles "#ssh.by.name.binders"
ziti edge create service-policy "ssh.by.name-dialing" Dial --service-roles "@ssh.by.name" --identity-roles "#ssh.by.name.dialers"

The problem (if you consider it a problem) is that in this configuration, OpenZiti will most likely assign non-deterministic IPs to the domain names, and that can be "a problem" for ssh. Ssh (as I'm sure you know) will save the host key of the host you're ssh'ing to, along with its name and IP, into a "known_hosts" file. If the IP changes, ssh will warn you with something like "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!"

Since the IPs are non-deterministic, this will happen easily if you change the order you access the services (it's nearly a guarantee)...

Now, if you control DNS on the client end, you could probably match them up manually, but that seems like "effort" to me; maybe that's something you want to do?

The other option is to just work around that "small issue" via ssh command:

ssh ubuntu@identity.1.ziti -o StrictHostKeyChecking=no

or via a rule in your .ssh config...

Hope that helps

Thanks @TheLumberjack

I need to think about whether this is going to work for me, as I will not lower the security level by disabling the ssh keys.
Perhaps this is just not the right solution for my use case, as it seems that I am not thinking in the correct terms and ways to adapt this to my needs!

Thanks
Kai

I don't blame you! I do want to be clear that this is not "disabling the ssh keys" though. Not at all. Ssh is just caching the address and host key it first saw when connecting to the server. The "key" that's referenced here is the IP address and the server's signature. You're still using ssh keys to authenticate to the sshd on the far side, but you probably know that. It's no different than just deleting the known_hosts file before ssh'ing to the target. Do you always pre-seed your clients with known_hosts, or do you allow them to connect and add the key the first time? If it's the latter, it feels ok to me to tell ssh not to reference the known_hosts file, but that's just me maybe. :smile:
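
If you'd rather keep strict checking on and only clear a stale entry when it bites you, plain old ssh-keygen can do that (nothing OpenZiti specific about it):

ssh-keygen -R 192.168.100.7

That removes the cached entry for that host from known_hosts, so the next connection simply prompts you to accept the key again.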

If you used the IP approach, ssh wouldn't end up with this error because the IP never changes.

I totally understand your perspective though. For this one use case alone (ssh is the only application I've found so far that does this), I wonder if we should consider adding determinism to the IP assignments for DNS entries... I just end up telling ssh "it's fine" when ssh'ing to wildcard-based addresses.

I have been ssh'ing via OpenZiti to a wildcard domain and it made me remember a setting I had added to my $HOME/.ssh/config file that I thought I'd pass along to you @strand and anyone else who finds this post.

To keep ssh from being grumpy about ssh'ing via OpenZiti, I had added this to the config file (and forgot about it):

 host *.ziti
    StrictHostKeyChecking no

I made a new wildcard domain recently called "cdzrok" and today I added:

 host *.cdzrok
    StrictHostKeyChecking no

I'm using fictitious DNS entries (*.ziti, *.cdzrok) so this doesn't feel like a security compromise to me whatsoever. Anyway, thought I'd just pass that along to anyone else using OpenZiti for wildcard ssh'ing needs!