Router Startup Fails with Only Link.Listeners Configured

Hi Team,
I was working on a POC with two edge routers and some mesh routers in between. The mesh routers are configured with link listeners only (no edge or tunnel binding).

But the router fails on startup with this error:

 [   0.013]    INFO ziti/common/metrics.ConfigureGoroutinesPoolMetrics.GoroutinesPoolMetricsConfigF.func1.1: {idleTime=[30s] poolType=[pool.terminator_validation] minWorkers=[0] maxQueueSize=[1] maxWorkers=[50]} starting goroutine pool
[   0.013]    INFO ziti/router/forwarder.(*Scanner).run: started
[   0.013]    INFO ziti/router/internal/edgerouter.(*Config).LoadConfigFromMap: cached data model file set to: /var/ziti/etc/router.yaml.json.gzip
[   0.013] WARNING ziti/router/internal/edgerouter.(*Config).LoadConfigFromMap: Invalid heartbeat interval [0] (min: 60, max: 10), setting to default [60]
[   0.013]   PANIC ziti/ziti/router.run: {error=[section [listeners] is required to be an array]} error registering edge in framework
panic: (*logrus.Entry) 0x40004b42a0

goroutine 1 [running]:
github.com/sirupsen/logrus.(*Entry).log(0x40004b4230, 0x0, {0x4000b28030, 0x23})
	github.com/sirupsen/logrus@v1.9.3/entry.go:260 +0x494
github.com/sirupsen/logrus.(*Entry).Log(0x40004b4230, 0x0, {0x4000b87248?, 0x364bcc0?, 0x4000ca4108?})
	github.com/sirupsen/logrus@v1.9.3/entry.go:304 +0x60
github.com/sirupsen/logrus.(*Entry).Panic(...)
	github.com/sirupsen/logrus@v1.9.3/entry.go:342
github.com/openziti/ziti/ziti/router.run(0x4000acd808, {0x40002fae60, 0x1, 0x3a2c380?})
	github.com/openziti/ziti/ziti/router/run.go:78 +0x7d8
github.com/spf13/cobra.(*Command).execute(0x4000acd808, {0x40002fae40, 0x2, 0x2})
	github.com/spf13/cobra@v1.8.1/command.go:989 +0x828
github.com/spf13/cobra.(*Command).ExecuteC(0x5aa49a0)
	github.com/spf13/cobra@v1.8.1/command.go:1117 +0x344
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.8.1/command.go:1041

It looks like the listeners section is always required in the router configuration. What if I don't want any listeners and want this router to function only as a fabric mesh node that forwards data?

Here is the config file I'm using:

identity:
  cert: /var/ziti/pki/router.cert.pem
  key: /var/ziti/pki/router.key.pem
  ca: /var/ziti/pki/ca-chain.cert.pem
  server_cert: /var/ziti/pki/router.server.cert.pem
ctrl:
  endpoint: tls:CONTROLLER_IP:8440
healthChecks:
  ctrlPingCheck:
    interval: 30s
    timeout: 15s
    initialDelay: 15s
  linkCheck:
    intercal: 10s
metrics:
  reportInterval: 1m
  messageQueueSize: 10
edge:
  csr:
    country: US
    province: TX
    locality: Austin
    organization: ziti
    organizationalUnit: ziti
    sans:
      dns:
        - localhost
        - SOME_DNS
      ip:
        - 127.0.0.1
        - SOME_IP
        - 10.17.0.2
link:
  listeners:
    - binding: transport
      bind: tls:SOME_IP:10080
      advertise: tls:SOME_DNS:10080
  dialers:
    - binding: transport
web:
  - name: router-healthchecks
    bindPoints:
      - address: 10.17.1.02:8444
        identity:
          cert: /var/ziti/pki/router.cert.pem
          key: /var/ziti/pki/router.key.pem
          ca: /var/ziti/pki/ca-chain.cert.pem
          server_cert: /var/ziti/pki/router.server.cert.pem
        interface: 10.17.1.02:8444
    apis:
      - binding: health-checks
v: 3

Using ziti version: v1.1.9

Is the router perhaps tunneler enabled like mine is?

ziti edge list edge-routers 'name = "ip-172-31-47-200-edge-router"' -j |  jq . | grep Tunn
      "isTunnelerEnabled": true,

I suspect that's the problem. If that's the case, you will require an edge section as the tunneler feature is an edge feature.

Disable that and I expect things will work fine.

It's already disabled.

bash-5.1# /var/ziti/bin/ziti edge ls ers 'name="9a45df5b9e444ddc9d8a3232edb52cbc"' -j | jq . | grep Tunn
      "isTunnelerEnabled": false,
      "isTunnelerEnabled",
bash-5.1# 

Huh. OK :slight_smile: I'll test it and see if it works on my side. Thanks for confirming.

So it works in my test. Below is a set of steps you can use to test it on your own. I'm not exactly sure where the breakdown is. Do you have a similar set of steps I could try, to help determine where it went wrong? Maybe there's a bug where disabling the tunneler on the identity doesn't clear some internal state. If that's the case, you could try deleting the router from ziti and re-enrolling it.
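
If you want to try the delete-and-re-enroll route, here is a rough sketch. It assumes you are already logged in with ziti edge login and that the config's identity paths are writable. The function name reenroll_router and the example arguments are made up for illustration; the ziti subcommands are the same ones used in the steps in this post, plus ziti edge delete edge-router.

```shell
# Hypothetical helper: delete a router, recreate it, and enroll it again.
# Arguments: router name, path to the router config, path for the new JWT.
reenroll_router() {
  name="$1"
  cfg="$2"
  jwt="$3"

  # Remove the existing router record from the controller.
  ziti edge delete edge-router "$name" || return 1

  # Recreate it and write a fresh enrollment token.
  ziti edge create edge-router "$name" -o "$jwt" || return 1

  # Enroll the router using the existing config file and the new token.
  ziti router enroll "$cfg" --jwt "$jwt"
}

# Example (assumed name and paths):
#   reenroll_router fabric-router /tmp/fabric-router/config.yml /tmp/fabric-router/config.jwt
```

Each step stops the sequence if it fails, so a failed delete won't be followed by a create under the same name.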

Terminal 1 - Start OpenZiti

Just bring up a ziti instance locally to test with. It is ephemeral and goes away when you stop the process. The quickstart creates a local controller and a local router all in one:

ziti edge quickstart

Terminal 2 - Configure and run the fabric router

This will:

  • make a temp directory at /tmp/fabric-router
  • set some env vars so the ziti create config command produces a suitable router config file (I added some extra env vars so you can try this on your actual OpenZiti instance if you wish)
  • create a config file for the router to use
  • create the router in the locally running overlay
  • enroll the router
  • run the router

mkdir -vp /tmp/fabric-router
export ZITI_HOME="/tmp/fabric-router"
export ZITI_ROUTER_PORT=4022
export ZITI_ROUTER_LISTENER_BIND_PORT=4023
export ZITI_CTRL_ADVERTISED_ADDRESS="localhost"
export ZITI_CTRL_ADVERTISED_PORT="1280"
ziti create config router fabric --routerName fabric-router > "$ZITI_HOME/config.yml"
ziti edge create edge-router fabric-router -o "$ZITI_HOME/config.jwt"
ziti router enroll "$ZITI_HOME/config.yml" --jwt "$ZITI_HOME/config.jwt"
ziti router run "$ZITI_HOME/config.yml"

Terminal 3 - List links

In a third terminal, run ziti fabric list links and ziti edge list ers and see the link and edge routers:

$ ziti fabric list links
╭────────────────────────┬───────────────────┬───────────────┬─────────────┬─────────────┬─────────────┬───────────┬────────┬───────────╮
│ ID                     │ DIALER            │ ACCEPTOR      │ STATIC COST │ SRC LATENCY │ DST LATENCY │ STATE     │ STATUS │ FULL COST │
├────────────────────────┼───────────────────┼───────────────┼─────────────┼─────────────┼─────────────┼───────────┼────────┼───────────┤
│ 5nek04AvtbUoDbDvqtN7H2 │ quickstart-router │ fabric-router │           1 │       0.2ms │   65000.0ms │ Connected │     up │     65001 │
╰────────────────────────┴───────────────────┴───────────────┴─────────────┴─────────────┴─────────────┴───────────┴────────┴───────────╯
results: 1-1 of 1
cd@192.168.253.239:sg4: ~
$ ziti edge list ers
╭────────────┬───────────────────┬────────┬───────────────┬──────┬────────────╮
│ ID         │ NAME              │ ONLINE │ ALLOW TRANSIT │ COST │ ATTRIBUTES │
├────────────┼───────────────────┼────────┼───────────────┼──────┼────────────┤
│ 5zKhcxZiQs │ fabric-router     │ true   │ true          │    0 │            │
│ B.0-zjBWts │ quickstart-router │ true   │ true          │    0 │ public     │
╰────────────┴───────────────────┴────────┴───────────────┴──────┴────────────╯
results: 1-2 of 2

Oh, I forgot to add the config file that is generated. It'll look like this. Note that there are no edge listeners declared, only the link dialer and listener:

$ cat /tmp/fabric-router/config.yml
v: 3

identity:
  cert:             "/tmp/fabric-router/fabric-router.cert"
  server_cert:      "/tmp/fabric-router/fabric-router.server.chain.cert"
  key:              "/tmp/fabric-router/fabric-router.key"
  ca:               "/tmp/fabric-router/fabric-router.cas"
  #alt_server_certs:
  #  - server_cert:  ""
  #    server_key:   ""

ctrl:
  endpoint:             tls:localhost:1280

link:
  dialers:
    - binding: transport
  listeners:
    - binding:          transport
      bind:             tls:0.0.0.0:4023
      advertise:        tls:sg4:4023
      options:
        outQueueSize:   4

#listeners:
# bindings of edge and tunnel requires an "edge" section below
#  - binding: edge
#    address: tls:0.0.0.0:4022
#    options:
#      advertise: sg4:4022
#      connectTimeoutMs: 5000
#      getSessionTimeout: 60
#  - binding: tunnel
#    options:
#      mode: host #tproxy|host


csr:
  country: US
  province: NC
  locality: Charlotte
  organization: NetFoundry
  organizationalUnit: Ziti
  sans:
    dns:
      - localhost
      - sg4

    ip:
      - "127.0.0.1"
      - "::1"



#transport:
#  ws:
#    writeTimeout: 10
#    readTimeout: 5
#    idleTimeout: 120
#    pongTimeout: 60
#    pingInterval: 54
#    handshakeTimeout: 10
#    readBufferSize: 4096
#    writeBufferSize: 4096
#    enableCompression: true

forwarder:
  latencyProbeInterval: 0
  xgressDialQueueLength: 1000
  xgressDialWorkerCount: 128
  linkDialQueueLength: 1000
  linkDialWorkerCount: 32

OK, found it.

The problem was in the configuration.

This is what I was using:

edge:
  csr:
    country: US
    province: TX
    locality: Austin
    organization: ziti
    organizationalUnit: ziti
    sans:
      dns:
        - localhost
        - SOME_DNS
      ip:
        - 127.0.0.1
        - SOME_IP
        - 10.17.0.2

After I removed the enclosing edge: key, so that csr sits at the top level, it worked:

csr:
    country: US
    province: TX
    locality: Austin
    organization: ziti
    organizationalUnit: ziti
    sans:
      dns:
        - localhost
        - SOME_DNS
      ip:
        - 127.0.0.1
        - SOME_IP
        - 10.17.0.2
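
For anyone hitting the same panic, here is a quick sanity check for this pattern. It is only a sketch based on what happened in this thread on v1.1.9: a top-level edge: key with no top-level listeners: array. The function name check_router_config is made up (it is not part of the ziti CLI), and the check is a plain grep on column-0 keys, not a real YAML parse.

```shell
# Hypothetical helper: flag a router config that has a top-level "edge:"
# key but no top-level "listeners:" key.
check_router_config() {
  cfg="$1"

  # Top-level YAML keys start in column 0. Indented keys (such as
  # link.listeners) and commented-out lines ("#listeners:") do not match.
  if grep -Eq '^edge:' "$cfg" && ! grep -Eq '^listeners:' "$cfg"; then
    echo "edge section present without top-level listeners: $cfg"
    return 1
  fi

  echo "ok: $cfg"
  return 0
}

# Example (assumed path):
#   check_router_config /var/ziti/etc/router.yaml
```

It returns non-zero for the first config above and zero for the second, since link.listeners is indented and does not count as a top-level listeners section.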

OH. I missed that! Glad you got it sorted! :slight_smile:

Thanks for always helping :heart: