Does anyone else have an issue with the edge router in Docker Compose when the `restart` key is set?

I have a weird issue where the router keeps crash looping, but only after a reboot and only if the `restart` key is set. To get around it I have to add the following crontab entry as the local user. It won't work as root.

@reboot docker compose -f /opt/ziti/docker-compose.yml down && docker compose -f /opt/ziti/docker-compose.yml up -d

Is it the same sort of panic where the binary can't read the config file? Can you share the logs from two cycles of crashes? I expect there's a panic shown.

Here is my Docker Compose config for the router. Only the `restart: unless-stopped` key/value has been added so that the Compose project survives reboots.

Before reboot:

What is interesting is that there is NO problem if the unprivileged user (who is in the docker group) runs docker compose down and then up. That works, which is why I have the crontab entry to bring the project down and then up as that user and not as root.

It seems to me that the router is also writing an initialization file to the volume. Here are the contents of my Docker volume.

root@ziti:~# ll /var/lib/docker/volumes/ziti_ziti-fs/_data/
total 128K
drwxr-xr-x 5 2171 2171 4.0K Sep  3 01:40 .
drwx-----x 3 root root 4.0K Sep  2 16:39 ..
drwxr-xr-x 2 2171 2171 4.0K Sep  2 16:40 db
drwxr-xr-x 9 2171 2171 4.0K Sep  2 16:40 pki
drwxr-xr-x 2 2171 2171 4.0K Aug 22 20:17 scripts
-rw-r--r-- 1 2171 2171  11K Sep  3 01:48 ziti.env
-rw-r--r-- 1 2171 2171   19 Sep  2 16:42 ziti.mydomain.com-2dd3c13a0b0e.init
-rw-r--r-- 1 2171 2171   19 Sep  3 01:40 ziti.mydomain.com-52343e5c9bc2.init
-rw-r--r-- 1 2171 2171   19 Sep  2 16:40 ziti.mydomain.com-77871c0c70d7.init
-rw-r--r-- 1 2171 2171   19 Sep  2 16:40 ziti.mydomain.com-77da37d8f942.init
-rw-r--r-- 1 2171 2171   19 Sep  2 16:40 ziti.mydomain.com-8cd8ef81712d.init
-rw-r--r-- 1 2171 2171   19 Sep  2 16:40 ziti.mydomain.com-93df7f462166.init
-rw-r--r-- 1 2171 2171   19 Sep  2 16:40 ziti.mydomain.com-994a24a1020f.init
-rwxr-xr-x 1 2171 2171   19 Sep  2 16:40 ziti.mydomain.com-a7498a64955a.init
-rw-r--r-- 1 2171 2171   19 Sep  3 01:35 ziti.mydomain.com-abb3dd04cc60.init
-rw-r--r-- 1 2171 2171   19 Sep  2 16:46 ziti.mydomain.com-bf107867b2a0.init
-rw-r--r-- 1 2171 2171   19 Sep  2 16:55 ziti.mydomain.com-c1cffa79eb7b.init
-rwxr-xr-x 1 2171 2171   19 Sep  2 16:40 ziti.mydomain.com-da92aa6c1c29.init
-rw-r--r-- 1 2171 2171   19 Sep  2 16:40 ziti.mydomain.com-e0886e4afadc.init
-rwxr-xr-x 1 2171 2171   19 Sep  2 16:40 ziti.mydomain.com-fbf48b4995ad.init
-rwxr-xr-x 1 2171 2171  11K Sep  2 16:40 ziti.mydomain.com.cas
-rwxr-xr-x 1 2171 2171 2.1K Sep  2 16:40 ziti.mydomain.com.cert
-rwxr-xr-x 1 2171 2171 1007 Sep  2 16:40 ziti.mydomain.com.jwt
-rwxr-xr-x 1 2171 2171 3.2K Sep  2 16:40 ziti.mydomain.com.key
-rwxr-xr-x 1 2171 2171    0 Sep  3 01:51 ziti.mydomain.com.log
-rwxr-xr-x 1 2171 2171 2.1K Sep  2 16:40 ziti.mydomain.com.server.chain.cert
-rwxr-xr-x 1 2171 2171  11K Sep  3 01:48 ziti.mydomain.com.yaml

Here are the log files

adding /var/openziti/ziti-bin to the path
ZITI_ROUTER_NAME set to: ziti.mydomain.com
system has been initialized. starting the process.
[   0.015] WARNING fabric/router.LoadConfig: invalid [healthChecks] stanza
[   0.015]    INFO ziti/ziti/router.run: {build-date=[2023-08-22T20:01:59Z] version=[v0.30.1] go-version=[go1.20.7] configFile=[/persistent/ziti.mydomain.com.yaml] routerId=[ziti.mydomain.com] os=[linux] revision=[c74a60a04f1d] arch=[amd64]} starting ziti-router
[   0.016]    INFO fabric/metrics.GoroutinesPoolMetricsConfigF.func1.1: {maxQueueSize=[1000] poolType=[pool.link.dialer] minWorkers=[0] maxWorkers=[32] idleTime=[30s]} starting goroutine pool
[   0.016]    INFO fabric/metrics.GoroutinesPoolMetricsConfigF.func1.1: {idleTime=[30s] maxQueueSize=[1000] poolType=[pool.route.handler] minWorkers=[0] maxWorkers=[128]} starting goroutine pool
[   0.017] WARNING edge/router/internal/edgerouter.(*Config).LoadConfigFromMap: Invalid heartbeat interval [0] (min: 60, max: 10), setting to default [60]
[   0.017]   PANIC ziti/ziti/router.run: {error=[required section [edge.csr] not found]} error registering edge in framework
panic: (*logrus.Entry) 0xc0001c09a0

goroutine 1 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc0001c0930, 0x0, {0xc000b5a570, 0x23})
        github.com/sirupsen/logrus@v1.9.3/entry.go:260 +0x4d6
github.com/sirupsen/logrus.(*Entry).Log(0xc0001c0930, 0x0, {0xc0009b90e0?, 0x32097e0?, 0xc00038e080?})
        github.com/sirupsen/logrus@v1.9.3/entry.go:304 +0x4f
github.com/sirupsen/logrus.(*Entry).Panic(...)
        github.com/sirupsen/logrus@v1.9.3/entry.go:342
github.com/openziti/ziti/ziti/router.run(0xc0009c5800?, {0xc0001ae690, 0x1, 0x1?})
        github.com/openziti/ziti/ziti/router/run.go:81 +0xa94
github.com/spf13/cobra.(*Command).execute(0xc0009c5800, {0xc0001ae650, 0x1, 0x1})
        github.com/spf13/cobra@v1.7.0/command.go:944 +0x847
github.com/spf13/cobra.(*Command).ExecuteC(0x5281c60)
        github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/cobra@v1.7.0/command.go:992
github.com/openziti/ziti/ziti/cmd.Execute()
        github.com/openziti/ziti/ziti/cmd/cmd.go:79 +0x25
main.main()
        github.com/openziti/ziti/ziti/main.go:51 +0x17
_ZITI_ROUTER_NAME set to: ziti.mydomain.com
NOT OVERRIDING: env var ZITI_BIN_DIR already set. using existing value
NOT OVERRIDING: env var ZITI_BIN_ROOT already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_ADVERTISED_ADDRESS already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_ADVERTISED_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_EDGE_ADVERTISED_ADDRESS already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_EDGE_ADVERTISED_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_EDGE_IDENTITY_ENROLLMENT_DURATION already set. using existing value
NOT OVERRIDING: env var ZITI_ENV_FILE already set. using existing value
NOT OVERRIDING: env var ZITI_HOME already set. using existing value
NOT OVERRIDING: env var ZITI_IMAGE already set. using existing value
NOT OVERRIDING: env var ZITI_NETWORK already set. using existing value
NOT OVERRIDING: env var ZITI_PWD already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_ADVERTISED_ADDRESS already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_ADVERTISED_HOST already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_ENROLLMENT_DURATION already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_LISTENER_BIND_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_ROLES already set. using existing value
NOT OVERRIDING: env var ZITI_SCRIPTS already set. using existing value
NOT OVERRIDING: env var ZITI_SHARED already set. using existing value
NOT OVERRIDING: env var ZITI_USER already set. using existing value
NOT OVERRIDING: env var ZITI_VERSION already set. using existing value

adding /var/openziti/ziti-bin to the path
ZITI_ROUTER_NAME set to: ziti.mydomain.com
system has been initialized. starting the process.
[   0.015] WARNING fabric/router.LoadConfig: invalid [healthChecks] stanza
[   0.015]    INFO ziti/ziti/router.run: {arch=[amd64] build-date=[2023-08-22T20:01:59Z] revision=[c74a60a04f1d] version=[v0.30.1] go-version=[go1.20.7] configFile=[/persistent/ziti.mydomain.com.yaml] os=[linux] routerId=[ziti.mydomain.com]} starting ziti-router
[   0.016]    INFO fabric/metrics.GoroutinesPoolMetricsConfigF.func1.1: {maxWorkers=[32] idleTime=[30s] maxQueueSize=[1000] poolType=[pool.link.dialer] minWorkers=[0]} starting goroutine pool
[   0.016]    INFO fabric/metrics.GoroutinesPoolMetricsConfigF.func1.1: {poolType=[pool.route.handler] minWorkers=[0] maxWorkers=[128] idleTime=[30s] maxQueueSize=[1000]} starting goroutine pool
[   0.017] WARNING edge/router/internal/edgerouter.(*Config).LoadConfigFromMap: Invalid heartbeat interval [0] (min: 60, max: 10), setting to default [60]
[   0.017]   PANIC ziti/ziti/router.run: {error=[required section [edge.csr] not found]} error registering edge in framework
panic: (*logrus.Entry) 0xc0001b16c0

goroutine 1 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc0001b1650, 0x0, {0xc000b48540, 0x23})
        github.com/sirupsen/logrus@v1.9.3/entry.go:260 +0x4d6
github.com/sirupsen/logrus.(*Entry).Log(0xc0001b1650, 0x0, {0xc0008fd0e0?, 0x32097e0?, 0xc000aa1b30?})
        github.com/sirupsen/logrus@v1.9.3/entry.go:304 +0x4f
github.com/sirupsen/logrus.(*Entry).Panic(...)
        github.com/sirupsen/logrus@v1.9.3/entry.go:342
github.com/openziti/ziti/ziti/router.run(0xc000937800?, {0xc0009f9b20, 0x1, 0x1?})
        github.com/openziti/ziti/ziti/router/run.go:81 +0xa94
github.com/spf13/cobra.(*Command).execute(0xc000937800, {0xc0009f9af0, 0x1, 0x1})
        github.com/spf13/cobra@v1.7.0/command.go:944 +0x847
github.com/spf13/cobra.(*Command).ExecuteC(0x5281c60)
        github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/cobra@v1.7.0/command.go:992
github.com/openziti/ziti/ziti/cmd.Execute()
        github.com/openziti/ziti/ziti/cmd/cmd.go:79 +0x25
main.main()
        github.com/openziti/ziti/ziti/main.go:51 +0x17
_ZITI_ROUTER_NAME set to: ziti.mydomain.com
NOT OVERRIDING: env var ZITI_BIN_DIR already set. using existing value
NOT OVERRIDING: env var ZITI_BIN_ROOT already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_ADVERTISED_ADDRESS already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_ADVERTISED_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_EDGE_ADVERTISED_ADDRESS already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_EDGE_ADVERTISED_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_CTRL_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_EDGE_IDENTITY_ENROLLMENT_DURATION already set. using existing value
NOT OVERRIDING: env var ZITI_ENV_FILE already set. using existing value
NOT OVERRIDING: env var ZITI_HOME already set. using existing value
NOT OVERRIDING: env var ZITI_IMAGE already set. using existing value
NOT OVERRIDING: env var ZITI_NETWORK already set. using existing value
NOT OVERRIDING: env var ZITI_PWD already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_ADVERTISED_ADDRESS already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_ADVERTISED_HOST already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_ENROLLMENT_DURATION already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_LISTENER_BIND_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_NAME already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_PORT already set. using existing value
NOT OVERRIDING: env var ZITI_ROUTER_ROLES already set. using existing value
NOT OVERRIDING: env var ZITI_SCRIPTS already set. using existing value
NOT OVERRIDING: env var ZITI_SHARED already set. using existing value
NOT OVERRIDING: env var ZITI_USER already set. using existing value
NOT OVERRIDING: env var ZITI_VERSION already set. using existing value

adding /var/openziti/ziti-bin to the path
ZITI_ROUTER_NAME set to: ziti.mydomain.com
system has been initialized. starting the process.
[   0.015] WARNING fabric/router.LoadConfig: invalid [healthChecks] stanza
[   0.015]    INFO ziti/ziti/router.run: {revision=[c74a60a04f1d] configFile=[/persistent/ziti.mydomain.com.yaml] os=[linux] routerId=[ziti.mydomain.com] arch=[amd64] version=[v0.30.1] build-date=[2023-08-22T20:01:59Z] go-version=[go1.20.7]} starting ziti-router
[   0.016]    INFO fabric/metrics.GoroutinesPoolMetricsConfigF.func1.1: {poolType=[pool.link.dialer] minWorkers=[0] maxQueueSize=[1000] maxWorkers=[32] idleTime=[30s]} starting goroutine pool
[   0.016]    INFO fabric/metrics.GoroutinesPoolMetricsConfigF.func1.1: {idleTime=[30s] maxQueueSize=[1000] minWorkers=[0] poolType=[pool.route.handler] maxWorkers=[128]} starting goroutine pool
[   0.017] WARNING edge/router/internal/edgerouter.(*Config).LoadConfigFromMap: Invalid heartbeat interval [0] (min: 60, max: 10), setting to default [60]
[   0.017]   PANIC ziti/ziti/router.run: {error=[required section [edge.csr] not found]} error registering edge in framework
panic: (*logrus.Entry) 0xc0001b16c0

goroutine 1 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc0001b1650, 0x0, {0xc000b3c540, 0x23})
        github.com/sirupsen/logrus@v1.9.3/entry.go:260 +0x4d6
github.com/sirupsen/logrus.(*Entry).Log(0xc0001b1650, 0x0, {0xc0008c50e0?, 0x32097e0?, 0xc000a95b00?})
        github.com/sirupsen/logrus@v1.9.3/entry.go:304 +0x4f
github.com/sirupsen/logrus.(*Entry).Panic(...)
        github.com/sirupsen/logrus@v1.9.3/entry.go:342
github.com/openziti/ziti/ziti/router.run(0xc00092b800?, {0xc0009efaf0, 0x1, 0x1?})
        github.com/openziti/ziti/ziti/router/run.go:81 +0xa94
github.com/spf13/cobra.(*Command).execute(0xc00092b800, {0xc0009efac0, 0x1, 0x1})
        github.com/spf13/cobra@v1.7.0/command.go:944 +0x847
github.com/spf13/cobra.(*Command).ExecuteC(0x5281c60)
        github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/cobra@v1.7.0/command.go:992
github.com/openziti/ziti/ziti/cmd.Execute()
        github.com/openziti/ziti/ziti/cmd/cmd.go:79 +0x25
main.main()
        github.com/openziti/ziti/ziti/main.go:51 +0x17

Very interesting. This is exactly the same type of error you had/have with the controller, only it's a different section of the config that's not being found (since this is a router). I think the issue is our scripts trying to be "too friendly" and recreating the config file each time the container comes online, which leads to an incomplete read of the file. I'm leaning towards that.
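If you want to check that theory on your side, here is a rough sketch of a test you could run against the config file right after a crash. The function name `has_edge_csr` is made up for this example, and it is only a line-based check that assumes block-style YAML with `edge:` at column 0 and `csr:` indented beneath it. It is not a real YAML parser.

```shell
#!/bin/sh
# has_edge_csr FILE
# Succeeds (exit 0) if FILE has a top-level "edge:" key with an indented
# "csr:" key somewhere beneath it; fails (exit 1) otherwise, including
# when FILE is empty or only partially written.
# Hypothetical helper, not part of the quickstart scripts.
has_edge_csr() {
    awk '
        /^edge:[[:space:]]*$/          { in_edge = 1; next }  # entered the edge section
        /^[^[:space:]#]/               { in_edge = 0 }        # another top-level key ends it
        in_edge && /^[[:space:]]+csr:/ { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

For example, `has_edge_csr /var/lib/docker/volumes/ziti_ziti-fs/_data/ziti.mydomain.com.yaml && echo present || echo missing`, using the volume path from your listing. If it prints "missing" while the router is crash looping but "present" after a manual down/up, that would point at the file being rewritten during startup.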

I think we'll take a look at the script lifecycle around that file and, instead of using the .init file (as we chose), just use the presence of the config file...
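To illustrate the idea (this is not the actual quickstart code), a minimal sketch of "use the presence of the config file" could look like the following. The function name `ensure_config` and the temp-file naming are assumptions for this example. The generator is whatever command prints the config to stdout, passed in by the caller.

```shell
#!/bin/sh
# ensure_config CONFIG_PATH GENERATOR [ARGS...]
# Runs GENERATOR (which must print the config to stdout) only when
# CONFIG_PATH is missing or empty. The output goes to a temp file first
# and is then renamed into place, so a reader never sees a half-written
# config. Hypothetical helper, not taken from the quickstart scripts.
ensure_config() {
    config_path=$1
    shift
    if [ -s "$config_path" ]; then
        echo "config exists, leaving it untouched: $config_path"
        return 0
    fi
    tmp_path="${config_path}.tmp.$$"
    if "$@" > "$tmp_path"; then
        # rename is atomic when both paths are on the same filesystem
        mv "$tmp_path" "$config_path"
        echo "config created: $config_path"
    else
        rm -f "$tmp_path"
        echo "config generation failed: $config_path" >&2
        return 1
    fi
}
```

With this shape, a restart of an already-initialized container is a no-op for the config file, which removes the window where the router could start while the file is being regenerated.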

Based on how quickly you are coming "up to speed" on ziti, I think you are probably a good candidate to outgrow the quickstart container and move to the "no-frills" containers. I've spent a bit of today working up an answer for @Metz over on this topic, "Use ziti PKI as intermediate / subordanary ca", but it is outside the scope of the quickstarts. You might want to keep an eye on that topic if you're not already.
