Problem initializing HA cluster

Hello everyone,
I was just trying to set up an HA cluster following this guide.
The problem is with initializing the controllers.

I run this command:

ziti agent controller init admin REDACTED MK

And I receive this output:

Error: no processes found matching filter, use 'ziti agent list' to list candidates
Usage:
ziti agent controller init [flags]

Flags:
-a, --app-alias string Alias of host application to talk to (specified in host application)
-i, --app-id string Id of host application to talk to (like controller or router id)
-t, --app-type string Type of host application to talk to (like controller or router)
-h, --help help for init
-p, --pid uint32 Process ID of host application to talk to
-n, --process-name string Process name of host application to talk to
--tcp-addr string Type of host application to talk to (like controller or router)
--timeout duration Operation timeout (default 5s)

no processes found matching filter, use 'ziti agent list' to list candidates

I'm not sure what I messed up here. Any help is highly appreciated.

Are you running the ziti agent command as the same user the controller runs as? If not, you may need to run sudo -u <ziti user> ziti agent ....

Let me know if that helps.
Paul

Thanks for your quick reply!
Unfortunately, this yields the same output:

sudo -u ziti-controller ziti agent controller init admin REDACTED MK
Error: no processes found matching filter, use 'ziti agent list' to list candidates
Usage:
ziti agent controller init [flags]

Flags:
-a, --app-alias string Alias of host application to talk to (specified in host application)
-i, --app-id string Id of host application to talk to (like controller or router id)
-t, --app-type string Type of host application to talk to (like controller or router)
-h, --help help for init
-p, --pid uint32 Process ID of host application to talk to
-n, --process-name string Process name of host application to talk to
--tcp-addr string Type of host application to talk to (like controller or router)
--timeout duration Operation timeout (default 5s)

no processes found matching filter, use 'ziti agent list' to list candidates

Just to confirm, the controller is running? You should see something like this in the controller log:

[   0.351]    INFO ziti/controller/network.(*Network).Run: started
[   1.986] WARNING github.com/hashicorp/raft.(*Raft).runFollower: no known peers, aborting election
[   3.288] WARNING ziti/controller/server.(*Controller).checkEdgeInitialized: the Ziti Edge has not been initialized, no default admin exists. Please run 'ziti agent controller init' to configure the default admin'

The last message will be repeated on an interval until the controller is initialized.

Paul

I think so, yes. This is the latest line in the journal:

Oct 21 17:39:51 REDACTED ziti[14897]: {"file":"github.com/openziti/ziti/controller/server/controller.go:294","func":"github.com/openziti/ziti/controller/server.(*Controller).checkEdgeInitialized","level":"warning","msg":"the Ziti Edge has not been initialized, no default admin exists. Please run 'ziti agent controller init' to configure the default admin'","time":"2024-10-21T17:39:51.875Z"}

Do you see any of the unix pipes used for the ziti agent IPC? They're in /tmp.

$ sudo /home/plorenz/go/bin/ziti agent list
╭───────┬────────────┬────────┬────────────────────────────┬────────────┬─────────────┬───────────╮
│   PID │ EXECUTABLE │ APP ID │ UNIX SOCKET                │ APP TYPE   │ APP VERSION │ APP ALIAS │
├───────┼────────────┼────────┼────────────────────────────┼────────────┼─────────────┼───────────┤
│ 36637 │ ziti       │ ctrl1  │ /tmp/gops-agent.36637.sock │ controller │ v0.0.0      │           │
│  5740 │ ziti       │        │ /tmp/gops-agent.5740.sock  │            │             │           │
╰───────┴────────────┴────────┴────────────────────────────┴────────────┴─────────────┴───────────╯
$ ls -l /tmp/gops-agent.*
srwx------ 1 plorenz plorenz 0 Oct 21 11:37 /tmp/gops-agent.36637.sock=
srwx------ 1 root    root    0 Oct 21 09:22 /tmp/gops-agent.5740.sock=

Paul
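
A minimal sketch for repeating this check on another machine. It assumes the agent sockets follow the /tmp/gops-agent.<pid>.sock naming shown in the listing above; the helper name list_agent_sockets is made up for illustration and is not part of the ziti CLI.

```shell
# Hypothetical helper: list agent sockets in a directory (default /tmp).
# Returns 0 if at least one socket was found, 1 otherwise.
list_agent_sockets() {
    dir="${1:-/tmp}"
    found=1
    for sock in "$dir"/gops-agent.*.sock; do
        # -S is true only for an existing socket, so this also skips
        # the literal pattern when the glob matches nothing.
        if [ -S "$sock" ]; then
            ls -l "$sock"
            found=0
        fi
    done
    return "$found"
}

list_agent_sockets /tmp || echo "no agent sockets visible in /tmp"
```

An empty result only means that no sockets are visible from the shell the check ran in. It does not by itself prove that the controller failed to start its agent.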

There are no entries called /tmp/gops*

sudo -u ziti-controller ziti agent list
yields an empty list

Can you run sudo ps -Af | grep ziti just so we can verify that it's running, check the flags and see what user it's running as?

That's the output of the command:

[root@ctrl01 tmp]# sudo ps -Af | grep ziti
ziti-co+ 14897 1 1 17:35 ? 00:00:59 /opt/openziti/bin/ziti controller run config.yml --

Thank you. Do you see unable to start CLI agent in the log, by any chance? If it's there, it should include the error which prevented the CLI agent from starting. I'm wondering if maybe there's a permission issue?
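
One way to search for that message is sketched below. The filter function name agent_start_errors is hypothetical, and the unit name in the commented example is an assumption; use whatever unit your installation defines.

```shell
# Filter log lines on stdin for the CLI agent startup failure message.
# Exits non-zero when no matching line is found (standard grep behaviour).
agent_start_errors() {
    grep -i "unable to start CLI agent"
}

# Example (assumed unit name, adjust to your install):
# journalctl -u ziti-controller.service --no-pager | agent_start_errors
```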

I've not seen this issue before. We do have an option to place the named pipe somewhere else when starting the controller, but I'm noticing that we don't have a way to specify it when using the agent, which is a bug. I'll file an issue so I remember to fix that.

Paul

I don't see any errors in the logs. Maybe pasting my config can help here:

v: 3

db: "/var/lib/private/ziti-controller/bbolt.db"

raft:
  dataDir: "/var/lib/private/ziti-controller/raft"
  minClusterSize: 2
  bootstrapMembers:
    - tls:{Second Controllers DNS Name}:6262

identity:
  cert: "pki/intermediate/certs/ctrl01.cert"
  server_cert: "pki/intermediate/certs/ctrl01.chain.pem"
  key: "pki/intermediate/keys/ctrl01.key"
  ca: "pki/ca/certs/ca.cert"

trustDomain: {My Domain Name}

ctrl:
  options:
    advertiseAddress: tls:{My Internal DNS Name}:6262
  listener: tls:0.0.0.0:6262

healthChecks:
  boltCheck:
    interval: 30s
    timeout: 20s
    initialDelay: 30s

edge:
  api:
    sessionTimeout: 30m
  enrollment:
    signingCert:
      cert: pki/intermediate/certs/intermediate.cert
      key: pki/intermediate/keys/intermediate.key
    edgeIdentity:
      duration: 180m
    edgeRouter:
      duration: 180m

web:
  - name: client-management
    bindPoints:
      - interface: 0.0.0.0:1280
        address: {My Public DNS Name}:1280
    identity:
      ca: "pki/ca/certs/ca.cert"
      key: "pki/intermediate/keys/ctrl01.key"
      server_cert: "pki/intermediate/certs/ctrl01.chain.pem"
      cert: "pki/intermediate/certs/client.cert"
    options:
      idleTimeout: 5000ms
      readTimeout: 5000ms
      writeTimeout: 100000ms
      minTLSVersion: TLS1.2
      maxTLSVersion: TLS1.3
    apis:
      - binding: edge-management
        options: { }
      - binding: edge-client
        options: { }
      - binding: fabric
        options: { }

The config is basically a result of the bootstrapping process. The only thing I changed was the raft section.

Thanks in advance!

I just gave it a fresh start: This time I did not use the package repo and simply used the binary and my config files. It seems I tripped over the extra complexity of the systemd unit file, the entrypoint script and all that stuff.
It works now. Case closed.

Thank you so much for your input!

Thank you for the update! I'll let our package maintainer know about the issue.

Paul

Please let me know if this explanation is sufficient. It's about accessing the agent when running a sandboxed systemd service provided by the openziti-controller or openziti-router Linux packages.

e.g.,

systemctl show -p MainPID --value ziti-controller.service \
| xargs -rIPID sudo nsenter --target PID --mount -- \
    ziti agent stats
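
The same idea can be wrapped in a small function, sketched here under a few assumptions: the function names valid_main_pid and ziti_agent_in_unit are hypothetical, and the ziti binary is assumed to be on the PATH inside the service's mount namespace, as it is in the one-liner above. The extra check is there because systemctl reports a MainPID of 0 when the unit has no running main process.

```shell
# Hypothetical check: accept only a non-empty, non-zero, all-digit PID.
valid_main_pid() {
    case "$1" in
        ''|0|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical wrapper: run "ziti agent <args>" inside the mount
# namespace of a systemd unit's main process.
# Usage: ziti_agent_in_unit <unit> <ziti agent args...>
ziti_agent_in_unit() {
    unit="$1"; shift
    pid="$(systemctl show -p MainPID --value "$unit")"
    if ! valid_main_pid "$pid"; then
        echo "unit $unit has no running main process" >&2
        return 1
    fi
    # Enter only the mount namespace, as in the original command.
    sudo nsenter --target "$pid" --mount -- ziti agent "$@"
}

# Example (not run here):
# ziti_agent_in_unit ziti-controller.service stats
```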

Sorry for the late response! That has helped indeed!
