Ziti router from quickstart conflicting with existing local DNS after update

Hello!

I have a setup built originally from the quickstart on a raspberry pi that has been working wonderfully for a few months.

Recently (about a week ago) I updated the ziti binaries, and a new problem has come up: the ziti-router service (originally created as part of the quickstart) appears to be binding port 53, which now blocks pihole from working correctly. Stopping ziti-router and running pihole restartdns gets my local DNS working again, but of course my ziti stuff then stops working, since it was basically handling access to all the internal local stuff.

Oddly, doing a systemctl start ziti-router after pihole is up and running seems to work fine, and both appear to be coexisting.

I tried adding an After=pihole-FTL.service to the generated ziti-router systemd file but that doesn't appear to have helped, though admittedly I haven't done much with systemd unit files before.

Is there any proper way I should be handling this, or any particular configuration I should look at?

Hi @Himekaidou,

I wouldn't have expected the router to hold port 53 open in that way unless the router was configured for "tproxy" mode. This allows the router to intercept traffic like other tunnelers. Can you confirm your tunnel binding?

  - binding: tunnel
    options:
      mode: host #tproxy|host

It should be 'host' by default, but maybe you're running tproxy mode?

I checked the argument listed in the ExecStart section of the service's unit file, and I see:

listeners:
# bindings of edge and tunnel requires an "edge" section below
  - binding: edge
    address: tls:0.0.0.0:8442
    options:
      advertise: ziti.(hostname removed here):8442
      connectTimeoutMs: 5000
      getSessionTimeout: 60
  - binding: tunnel
    options:
      mode: host #tproxy|host

Which does appear to be set to host, hrm.

I also noticed, in case it helps, that based on the path in the unit file it was originally created by the quickstart with version 0.32.2; the currently installed version is 1.1.15. It has been updated a few times in between, and I did not notice issues at those times. To be fair, though, this only seems to be an issue when the pi is restarted fully, as starting ziti-router after pihole is up (i.e., after an update of the binaries) seems to be fine.

I'll have to ask others (so it prolly won't be for a few days) but I don't expect it to be listening. Can you get the pid of the router and run a netstat or ss and verify the router is actually listening on that port?

No worries about the time, it works as long as I manually start it after pihole so nothing's straight broken at the moment, haha.

Tried looking up ss and I think I got it right.

ziti-router started before pihole-FTL:

sudo ss -lp | grep ziti
u_str LISTEN 0      0                                   /tmp/gops-agent.724.sock 7295                          * 0    users:(("ziti",pid=724,fd=8))
u_str LISTEN 0      0                                 /tmp/gops-agent.35451.sock 269215                        * 0    users:(("ziti",pid=35451,fd=6))
udp   UNCONN 0      0                                                  127.0.0.1:domain                  0.0.0.0:*    users:(("ziti",pid=35451,fd=10))
tcp   LISTEN 0      4096                                                       *:amanda                        *:*    users:(("ziti",pid=35451,fd=7))
tcp   LISTEN 0      4096                                                       *:8442                          *:*    users:(("ziti",pid=35451,fd=8))
tcp   LISTEN 0      4096                                                       *:8440                          *:*    users:(("ziti",pid=724,fd=9))
tcp   LISTEN 0      4096                                                       *:8441                          *:*    users:(("ziti",pid=724,fd=10))

ziti-router started after pihole-FTL:

sudo ss -lp | grep ziti
u_str LISTEN 0      0                                   /tmp/gops-agent.724.sock 7295                          * 0    users:(("ziti",pid=724,fd=8))
u_str LISTEN 0      0                                 /tmp/gops-agent.35528.sock 271054                        * 0    users:(("ziti",pid=35528,fd=6))
tcp   LISTEN 0      4096                                                       *:amanda                        *:*    users:(("ziti",pid=35528,fd=7))
tcp   LISTEN 0      4096                                                       *:8442                          *:*    users:(("ziti",pid=35528,fd=8))
tcp   LISTEN 0      4096                                                       *:8440                          *:*    users:(("ziti",pid=724,fd=9))
tcp   LISTEN 0      4096                                                       *:8441                          *:*    users:(("ziti",pid=724,fd=10))

So I think it's binding to 53 (the 127.0.0.1:domain line) when it starts first, and sits on it, preventing pihole from starting properly.

EDIT: To clarify, I also originally found this because my network's DNS wasn't working (as it goes through pihole) and a pihole debug revealed that there was a conflict with the ziti binary on port 53, so this tracks so far.

Yep, the same router PID that's listening on all interfaces for 8442/tcp is binding a socket on the loopback for 53/udp.
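
A narrower check than grepping for ziti is to filter on the DNS port itself, which also shows any other process holding it. This is only a sketch: dns_listeners is a made-up helper name, and it assumes the usual ss column layout where the fifth field is the local address and port.

```shell
# Hypothetical helper: keep only ss rows whose local port is 53.
# ss prints the port as "domain" unless -n is given, so match both forms.
dns_listeners() {
  awk '$5 ~ /:(53|domain)$/'
}

# Usage (not run here): -l listening, -n numeric, -t TCP, -u UDP, -p process
#   sudo ss -lntup | dns_listeners
```

If the router and pihole-FTL both appear in the output on overlapping addresses, that is the conflict.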

I checked my local Linux router's listeners in host tunnel mode and it's not binding a udp socket on port 53. I'm guessing a different router config.yml is in effect, not the one where you noted the "host" mode tunnel binding.

Let's find the path of the effective config file.

pgrep -f 'ziti router run' | xargs -r ps -fww

...should return output like this.

UID          PID    PPID  C STIME TTY      STAT   TIME CMD
ziti-ro+  103615       1  0 10:55 ?        SNsl   0:01 /opt/openziti/bin/ziti router run config.yml --extend

Then find the router's current working dir.

pgrep -f 'ziti router run' | xargs -rI PID sudo ls -l /proc/PID/cwd

...should look like this.

lrwxrwxrwx 1 ziti-router ziti-router 0 Oct 12 10:55 /proc/103615/cwd -> /var/lib/private/ziti-router

Putting these together, we know the effective config file is /var/lib/private/ziti-router/config.yml in my case.

Here's a shortcut one-liner to find the real path if you already know the config filename. Substitute your config filename for "config.yml".

pgrep -f 'ziti router run' | xargs -rI PID sudo realpath /proc/PID/cwd/config.yml
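
Once the effective path is known, the tunnel mode can be pulled straight out of that file. A rough sketch, assuming the listeners block is laid out like the one shown earlier in this thread; tunnel_mode is a hypothetical helper, not a ziti command, and it is a line matcher rather than a real YAML parser.

```shell
# Hypothetical helper: print the "mode" value under the tunnel binding.
tunnel_mode() {
  awk '
    # A new list item such as "- binding: tunnel" starts or ends the tunnel block.
    /^[ \t]*-[ \t]*binding:/ { in_tunnel = ($3 == "tunnel") }
    # Inside the tunnel block, the second field of "mode: host #..." is the mode.
    in_tunnel && /^[ \t]*mode:/ { print $2; exit }
  ' "$1"
}

# Usage (not run here), with the path found by the commands above:
#   tunnel_mode /var/lib/private/ziti-router/config.yml
```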

BTW, it is possible to run the router and pi hole at the same time even if they both bind 53/udp. The router only needs to bind the loopback and pi hole only needs to bind the main interface, so you could adjust the pi hole's bind configuration to use your local network's IP address instead of 0.0.0.0 (all interfaces, including loopback where the conflict occurred).

Hmmm, those show the same file I checked earlier, which is also, as far as I can tell, the one referenced in the systemd unit:

pgrep -f 'ziti router run' | xargs -r ps  -fww
UID          PID    PPID  C STIME TTY      STAT   TIME CMD
root       35528       1  0 Oct11 ?        Ssl    0:45 /usr/bin/ziti router run /home/amackenzie/.ziti/quickstart/cirno/cirno-edge-router.yaml
pgrep -f 'ziti router run' | xargs -rI PID sudo ls -l /proc/PID/cwd
lrwxrwxrwx 1 root root 0 Oct 12 10:12 /proc/35528/cwd -> /home/amackenzie/.ziti/quickstart/cirno

/etc/systemd/system/ziti-router.service (snipped out from the middle):

WorkingDirectory=/home/amackenzie/.ziti/quickstart/cirno
ExecStart="/usr/bin/ziti" router run "/home/amackenzie/.ziti/quickstart/cirno/cirno-edge-router.yaml"

Copying the path to that config from the pgrep output, the listeners block is:

listeners:
# bindings of edge and tunnel requires an "edge" section below
  - binding: edge
    address: tls:0.0.0.0:8442
    options:
      advertise: ziti.broken-mirror.net:8442
      connectTimeoutMs: 5000
      getSessionTimeout: 60
  - binding: tunnel
    options:
      mode: host #tproxy|host

BTW, it is possible to run the router and pi hole at the same time even if they both bind 53/udp. The router only needs to bind the loopback and pi hole only needs to bind the main interface, so you could adjust the pi hole's bind configuration to use your local network's IP address instead of 0.0.0.0 (all interfaces, including loopback where the conflict occurred).

That makes sense, I can try afterwards. I'm a bit worried that my config somewhere is borked since this issue wasn't present (or at least wasn't noticed... though losing DNS on the entire local network would normally be noticeable?) until recently.

I found evidence that the Ziti nameserver tries to start even when tunnel mode is "host." If the router process has permission to bind low ports like 53, it succeeds and unnecessarily binds the UDP socket; if it lacks permission, it fails with an error message.

This tells me two things:

  1. ziti router run has a defect that causes it to bind port 53 when tunneling mode isn't tproxy (link to GitHub issue).
  2. Your router is running with unnecessarily elevated permissions, e.g., as root. If you remove privileges and manage file owner and mode accordingly, then the router will be unable to bind 53/udp and create the conflict with the Pi Hole nameserver.
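
A quick way to check the second point is to look at the router's effective UID. This is a Linux-only sketch and proc_euid is a hypothetical helper name. It only covers the root case: a non-root process could still bind low ports if it has been granted CAP_NET_BIND_SERVICE or if net.ipv4.ip_unprivileged_port_start has been lowered.

```shell
# Hypothetical helper: print the effective UID of a process (Linux only).
# The "Uid:" line in /proc/PID/status lists the real, effective, saved, and
# filesystem UIDs, so the effective UID is the third field.
proc_euid() {
  awk '/^Uid:/ { print $3 }' "/proc/$1/status"
}

# Usage (not run here): 0 means the router is running as root.
#   proc_euid "$(pgrep -of 'ziti router run')"
```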

I see from the thread you used the expressInstall function to generate a temporary, local controller and router configuration.

You can inspect the router's system-wide service unit to see its path and the User directive, if any:

systemctl cat ziti-router.service

Since you're working with a temporary, local quickstart, you may wish to run the router as your login user instead of root. You can accomplish this by moving the service unit file from the system-wide location reported by the systemctl cat command to your user's systemd namespace.

For example,

Create the systemd user directory if necessary.

mkdir -pv ~/.config/systemd/user

Move the system-wide service unit.

sudo mv /etc/systemd/system/ziti-router.service ~/.config/systemd/user/ziti-router.service

Grant your login user permissions on the router's current working directory and systemd units directory.

sudo chown -Rc $(id -u) /home/amackenzie/.ziti/quickstart/cirno ~/.config/systemd/user/
sudo chmod -Rc u+rwX /home/amackenzie/.ziti/quickstart/cirno ~/.config/systemd/user/

Edit the moved file to delete the User=root directive.

vi ~/.config/systemd/user/ziti-router.service

Reload all systemd units from the disk.

sudo systemctl daemon-reload
systemctl --user daemon-reload

Once the router is started from the user unit, it runs as your login UID and does not have permission to bind 53/udp, which causes an "error" message like the one below that has no effect on tunnel mode "host."

{
  "error": "dns server failed to start: listen udp 127.0.0.1:53: bind: permission denied",
  "file": "github.com/openziti/ziti/router/xgress_edge_tunnel/tunneler.go:75",
  "func": "github.com/openziti/ziti/router/xgress_edge_tunnel.(*tunneler).Start",
  "level": "error",
  "msg": "failed to start DNS resolver. using dummy resolver",
  "time": "2024-10-14T09:05:16.665Z"
}

Monitor the relocated service unit's log.

journalctl --user -lfu ziti-router.service

Thanks for the advice! It seems to work well, though I had to make a few minor changes to the unit file to get it to work (mostly because I haven't used systemd much before). The main one was changing the target from multi-user to default, since multi-user is not a valid target for user services. I also had to enable lingering for the user. While I was at it, I moved the controller to the user service folder as well and made similar changes.
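
For anyone following along, the unit-file changes described above could be scripted roughly like this. It is a sketch only: adapt_unit_for_user is a hypothetical helper name, and it assumes the generated unit contains User= and WantedBy=multi-user.target lines, which may not match every quickstart version.

```shell
# Hypothetical helper: adapt a system-wide unit file for the systemd user manager.
adapt_unit_for_user() {
  unit=$1
  tmp=$(mktemp)
  # Drop any User= directive (the user manager already runs as the login user)
  # and swap the install target for one that exists in user sessions.
  sed -e '/^User=/d' \
      -e 's/^WantedBy=multi-user\.target$/WantedBy=default.target/' \
      "$unit" > "$tmp" && cat "$tmp" > "$unit"
  rm -f "$tmp"
}

# After patching (not run here):
#   adapt_unit_for_user ~/.config/systemd/user/ziti-router.service
#   systemctl --user daemon-reload
#   systemctl --user enable --now ziti-router.service
#   sudo loginctl enable-linger "$USER"   # keep user services running without a login session
```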

Restarted a few times and there don't appear to be any issues, so this looks great so far.

Should this be standard for an expressInstall setup? The tutorial I followed at Host OpenZiti Anywhere | OpenZiti still says to install it at /etc/systemd/system/, and the unit files are the generated ones, just with paths adjusted now. It seems like an excellent idea to have it user-based to start with.

In the months since you set up the Pi as a ziti server, we published Linux deployment guides: Deploying on Linux | OpenZiti

That's where I'd start if the goal is a permanent, upgradable installation. There are also guides for Docker and Kubernetes.

Oh, neat. I'll take a look at those and see how they work, then. I have vague memories of a bit of a hellish time trying to figure out whether I could stick it into a small k3s node way back when, and ran into a buncha stuff I didn't understand (due to lack of experience). Sounds like it's time to look again!

Thank you!

You're welcome. Here's a post I wrote about installing in K3D.