I am starting to play around with the Ziti HA controller. I have the RPMs installed on three Red Hat VMs and was wondering whether the install process is similar to that of a regular controller, i.e., do I run the bootstrap, or is there a different process for the HA controller?
There is "actual" documentation being worked on that you can preview here: Controller HA | OpenZiti. Just know that it is a WIP and very much subject to change.
I am getting an error when I try to start the controller. The YAML file looked correct for migrating a Ziti controller.
Starting OpenZiti Controller...
Feb 6 09:33:54 C3E-ZitiCtrl01 entrypoint.bash[2959]: WARN: set VERBOSE=1 or DEBUG=1 for more output
Feb 6 09:33:54 C3E-ZitiCtrl01 entrypoint.bash[2959]: WARN: see output in '/tmp/tmp.Cfgqa3w4BF'
Feb 6 09:33:54 C3E-ZitiCtrl01 systemd[1]: Started OpenZiti Controller.
Feb 6 09:33:54 C3E-ZitiCtrl01 ziti[2979]: {"arch":"amd64","build-date":"2025-01-27T19:25:51Z","error":"yaml: line 13: did not find expected key","file":"github.com/openziti/ziti/ziti/controller/run.go:56","func":"github.com/openziti/ziti/ziti/controller.run","go-version":"go1.23.4","level":"error","msg":"error starting ziti-controller","os":"linux","revision":"2a62cc577e45","time":"2025-02-06T09:33:54.543Z","version":"v1.3.3"}
Feb 6 09:33:54 C3E-ZitiCtrl01 ziti[2979]: panic: yaml: line 13: did not find expected key
Until Ziti clustered mode (HA) is released, it will be necessary to manually compose the config.yml, or edit the generated config.yml to suit the requirements of the HA beta.
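I think the error means your controller's config.yml isn't valid YAML yet: "did not find expected key" comes from the YAML parser and usually points to an indentation problem or a partially commented-out block near the reported line, rather than to a missing clustered-mode setting.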
This is an area of Ziti that is changing rapidly, but I'm sure we can figure out what's needed, and it's great that you're test-driving the clustered Ziti beta!
Do I understand correctly that the first controller starts and does not exit with an error, but eventually emits this error message while continuing to run?
#trace:
# path: "C3E-ZitiCtrl01.trace"
#profile:
# memory:
# path: ctrl.memprof
db: "/var/lib/private/ziti-controller/bbolt.db"
# uncomment and configure to enable HA
# raft:
  dataDir: "/var/lib/private/ziti-controller/raft"
  minClusterSize: 1
identity:
  cert: "pki/intermediate/certs/client.chain.pem"
  server_cert: "pki/intermediate/certs/server.chain.pem"
  key: "pki/intermediate/keys/server.key"
  ca: "pki/root/certs/root.cert"
  #alt_server_certs:
  # - server_cert: ""
  #   server_key: ""
# trust domains may be overridden by SPIFFE ID as URI SAN
#trustDomain: ziti.example.com
# additional trust domains allow for migrating to a new trust domain
#additionalTrustDomains: []
# Network Configuration
#
# Configure how the controller will establish and manage the overlay network, and routing operations on top of
# the network.
#
#network:
  # routeTimeoutSeconds controls the number of seconds the controller will wait for a route attempt to succeed.
  #routeTimeoutSeconds: 10
  # createCircuitRetries controls the number of retries that will be attempted to create a path (and terminate it)
  # for new circuits.
  #createCircuitRetries: 2
  # pendingLinkTimeoutSeconds controls how long we'll wait before creating a new link between routers where
  # there isn't an established link, but a link request has been sent
  #pendingLinkTimeoutSeconds: 10
  # Defines the period that the controller re-evaluates the performance of all of the circuits
  # running on the network.
  #
  #cycleSeconds: 15
  # Sets router minimum cost. Defaults to 10
  #minRouterCost: 10
  # Sets how often a new control channel connection can take over for a router with an existing control channel connection
  # Defaults to 1 minute
  #routerConnectChurnLimit: 1m
  # Sets the latency of link when it's first created. Will be overwritten as soon as latency from the link is actually
  # reported from the routers. Defaults to 65 seconds.
  #initialLinkLatency: 65s
  #smart:
    #
    # Defines the fractional upper limit of underperforming circuits that are candidates to be re-routed. If
    # smart routing detects 100 circuits that are underperforming, and `smart.rerouteFraction` is set to `0.02`,
    # then the upper limit of circuits that will be re-routed in this `cycleSeconds` period will be limited to
    # 2 (2% of 100).
    #
    #rerouteFraction: 0.02
    #
    # Defines the hard upper limit of underperforming circuits that are candidates to be re-routed. If smart
    # routing detects 100 circuits that are underperforming, and `smart.rerouteCap` is set to `1`, and
    # `smart.rerouteFraction` is set to `0.02`, then the upper limit of circuits that will be re-routed in this
    # `cycleSeconds` period will be limited to 1.
    #
    #rerouteCap: 4
# the endpoint that routers will connect to the controller over.
ctrl:
  options:
    advertiseAddress: tls:C3E-ZitiCtrl01:6262
    # (optional) settings
    # set the maximum number of connect requests that are buffered and waiting to be acknowledged (1 to 5000, default 1)
    #maxQueuedConnects: 1
    # the maximum number of connects that have begun hello synchronization (1 to 1000, default 16)
    #maxOutstandingConnects: 16
    # the number of milliseconds to wait before a hello synchronization fails and closes the connection (30ms to 60000ms, default: 5000ms)
    #connectTimeoutMs: 5000
  listener: tls:0.0.0.0:6262
#metrics:
#  influxdb:
#    url: http://localhost:8086
#    database: ziti
# xctrl_example
#
#example:
#  enabled: false
#  delay: 5s
healthChecks:
  boltCheck:
    # How often to try entering a bolt read tx. Defaults to 30 seconds
    interval: 30s
    # When to time out the check. Defaults to 20 seconds
    timeout: 20s
    # How long to wait before starting the check. Defaults to 30 seconds
    initialDelay: 30s
# By having an 'edge' section defined, the ziti-controller will attempt to parse the edge configuration. Removing this
# section, commenting out, or altering the name of the section will cause the edge to not run.
edge:
  # This section represents the configuration of the Edge API that is served over HTTPS
  api:
    #(optional, default 90s) Alters how frequently heartbeat and last activity values are persisted
    # activityUpdateInterval: 90s
    #(optional, default 250) The number of API Sessions updated for last activity per transaction
    # activityUpdateBatchSize: 250
    # sessionTimeout - optional, default 30m
    # The number of minutes before an Edge API session will time out. Timeouts are reset by
    # API requests and connections that are maintained to Edge Routers
    sessionTimeout: 30m
    # address - required
    # The default address (host:port) to use for enrollment for the Client API. This value must match one of the addresses
    # defined in this Controller.WebListener.'s bindPoints.
    address: C3E-ZitiCtrl01:1280
  # This section is used to define option that are used during enrollment of Edge Routers, Ziti Edge Identities.
  enrollment:
    # signingCert - required
    # A Ziti Identity configuration section that specifically makes use of the cert and key fields to define
    # a signing certificate from the PKI that the Ziti environment is using to sign certificates. The signingCert.cert
    # will be added to the /.well-known CA store that is used to bootstrap trust with the Ziti Controller.
    signingCert:
      cert: pki/intermediate/certs/intermediate.cert
      key: pki/intermediate/keys/intermediate.key
    # edgeIdentity - optional
    # A section for identity enrollment specific settings
    edgeIdentity:
      # duration - optional, default 180m
      # The length of time that a Ziti Edge Identity enrollment should remain valid. After
      # this duration, the enrollment will expire and no longer be usable.
      duration: 180m
    # edgeRouter - Optional
    # A section for edge router enrollment specific settings.
    edgeRouter:
      # duration - optional, default 180m
      # The length of time that a Ziti Edge Router enrollment should remain valid. After
      # this duration, the enrollment will expire and no longer be usable.
      duration: 180m
# web
# Defines webListeners that will be hosted by the controller. Each webListener can host many APIs and be bound to many
# bind points.
web:
  # name - required
  # Provides a name for this listener, used for logging output. Not required to be unique, but is highly suggested.
  - name: client-management
    # bindPoints - required
    # One or more bind points are required. A bind point specifies an interface (interface:port string) that defines
    # where on the host machine the webListener will listen and the address (host:port) that should be used to
    # publicly address the webListener(i.e. mydomain.com, localhost, 127.0.0.1). This public address may be used for
    # incoming address resolution as well as used in responses in the API.
    bindPoints:
      #interface - required
      # A host:port string on which network interface to listen on. 0.0.0.0 will listen on all interfaces
      - interface: 0.0.0.0:1280
        # address - required
        # The public address that external incoming requests will be able to resolve. Used in request processing and
        # response content that requires full host:port/path addresses.
        address: C3E-ZitiCtrl01:1280
    # identity - optional
    # Allows the webListener to have a specific identity instead of defaulting to the root 'identity' section.
    identity:
      ca: "pki/root/certs/root.cert"
      key: "pki/intermediate/keys/server.key"
      server_cert: "pki/intermediate/certs/server.chain.pem"
      cert: "pki/intermediate/certs/client.chain.pem"
      #alt_server_certs:
      #- server_cert: ""
      #  server_key: ""
    # options - optional
    # Allows the specification of webListener level options - mainly dealing with HTTP/TLS settings. These options are
    # used for all http servers started by the current webListener.
    options:
      # idleTimeoutMs - optional, default 5000ms
      # The maximum amount of idle time in milliseconds allowed for pipelined HTTP requests. Setting this too high
      # can cause resources on the host to be consumed as clients remain connected and idle. Lowering this value
      # will cause clients to reconnect on subsequent HTTPs requests.
      idleTimeout: 5000ms #http timeouts, new
      # readTimeoutMs - optional, default 5000ms
      # The maximum amount of time in milliseconds http servers will wait to read the first incoming requests. A higher
      # value risks consuming resources on the host with clients that are acting bad faith or suffering from high latency
      # or packet loss. A lower value can risk losing connections to high latency/packet loss clients.
      readTimeout: 5000ms
      # writeTimeoutMs - optional, default 100000ms
      # The total maximum time in milliseconds that the http server will wait for a single requests to be received and
      # responded too. A higher value can allow long-running requests to consume resources on the host. A lower value
      # can risk ending requests before the server has a chance to respond.
      writeTimeout: 100000ms
      # minTLSVersion - optional, default TLS1.2
      # The minimum version of TSL to support
      minTLSVersion: TLS1.2
      # maxTLSVersion - optional, default TLS1.3
      # The maximum version of TSL to support
      maxTLSVersion: TLS1.3
    # apis - required
    # Allows one or more APIs to be bound to this webListener
    apis:
      # binding - required
      # Specifies an API to bind to this webListener. Built-in APIs are
      # - edge-management
      # - edge-client
      # - fabric-management
      - binding: edge-management
        # options - arg optional/required
        # This section is used to define values that are specified by the API they are associated with.
        # These settings are per API. The example below is for the 'edge-api' and contains both optional values and
        # required values.
        options: { }
      - binding: edge-client
        options: { }
      - binding: fabric
        options: { }
      - binding: edge-oidc
        options: { }
      - binding: zac
        options:
          location: /opt/openziti/share/console
          indexFile: index.html
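One thing worth ruling out in a config like the one above: "did not find expected key" is emitted by the YAML parser, and a common way to trigger it is to leave a parent key commented out while its children are uncommented and still indented. Below is a rough shell sketch of a check for that pattern. The sample fragment is hypothetical: it is modeled on the raft lines in this thread, and the two-space indentation is an assumption because the paste does not preserve whitespace. The awk heuristic is only a quick scan, not a YAML validator.

```shell
#!/usr/bin/env bash
# Hypothetical fragment: indentation of the raft children is assumed
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
db: "/var/lib/private/ziti-controller/bbolt.db"
# raft:
  dataDir: "/var/lib/private/ziti-controller/raft"
  minClusterSize: 1
identity:
  cert: "pki/intermediate/certs/client.chain.pem"
EOF

# Flag indented, uncommented lines that directly follow a commented-out "key:" line
result="$(awk '
  /^[ \t]*#[ \t]*[A-Za-z_]+:[ \t]*$/ { parent = NR; next }
  parent && /^[ \t]+[^# \t]/          { print NR ": " $0; next }
  { parent = 0 }
' "$cfg")"

echo "$result"
rm -f "$cfg"
```

With this sample the scan flags lines 3 and 4 (dataDir and minClusterSize). If the real file looks the same, uncommenting raft: so that it sits at column 0 directly above those two lines would make the block consistent; whether those values suit the HA beta is a separate question.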
There is a known problem with ZITI_ARGS being unable to correctly parse multiple space-separated values. The workaround is to run sudo -E systemctl edit ziti-controller.service and add the desired arguments like this.
ExecStart=
ExecStart=/opt/openziti/bin/ziti controller run config.yml --verbose --log-formatter=text
The unit definition will be reloaded when you save the file and exit the editor.
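A likely explanation for the ZITI_ARGS limitation, assuming the packaged unit references the variable in the braced form ${ZITI_ARGS}: systemd documents that the braced form is replaced by the exact value as a single argument, while a bare $ZITI_ARGS written as a separate word is split at whitespace. The shell sketch below only imitates that difference with quoting. It is an analogy for illustration, not systemd's own parser, and count_args is a made-up helper.

```shell
#!/usr/bin/env bash
# Example value taken from the workaround above
ZITI_ARGS='--verbose --log-formatter=text'

# Hypothetical helper: print how many arguments it received
count_args() { echo "$#"; }

# Quoted expansion keeps the whole value as one argument,
# analogous to systemd's ${ZITI_ARGS} form
single="$(count_args "${ZITI_ARGS}")"

# Unquoted expansion splits at whitespace,
# analogous to systemd's bare $ZITI_ARGS form
split="$(count_args ${ZITI_ARGS})"

echo "quoted: $single argument(s); split: $split argument(s)"
```

Seen this way, the controller would receive "--verbose --log-formatter=text" as one unrecognized argument, which is why spelling the flags out in an ExecStart override sidesteps the problem.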
[Unit]
Description=OpenZiti Controller
After=network-online.target
[Service]
Type=simple
# manage the user and permissions for the service automatically
DynamicUser=yes
# this env file configures the service, including whether or not to perform bootstrapping
EnvironmentFile=/opt/openziti/etc/controller/service.env
# relative to /var/lib
StateDirectory=ziti-controller
WorkingDirectory=/var/lib/ziti-controller
ReadOnlyPaths=/opt/openziti/share/console
ExecStartPre=/opt/openziti/etc/controller/entrypoint.bash check config.yml
ExecStart=/opt/openziti/bin/ziti controller run config.yml ${ZITI_ARGS}
Restart=always
RestartSec=3
LimitNOFILE=65535
UMask=0007
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/ziti-controller.service.d/override.conf
[Service]
#
## Optional Permissions
#
# allow binding low ports, e.g., 443/tcp; NOTE: use TLS passthrough if fronting with a reverse proxy, i.e., "raw" TCP
# proxy
# AmbientCapabilities=CAP_NET_BIND_SERVICE
#
## Optional Parameters
#
# you must re-initialize with an empty ExecStartPre or ExecStart value before redefining
# ExecStartPre=
# ExecStartPre=/opt/openziti/etc/controller/entrypoint.bash check alt_config.yml
# ExecStart=
# ExecStart=/opt/openziti/bin/ziti controller run alt_config.yml ${ZITI_ARGS}
Contents of /opt/openziti/etc/controller/service.env:
ZITI_RUNTIME='systemd'
# set "false" to disable bootstrapping
ZITI_BOOTSTRAP='true'
# create a new PKI unless it exists
ZITI_BOOTSTRAP_PKI='true'
# create a config file unless it exists if 'true'; 'force' to re-create
# WARNING: changing the controller address will break most things
ZITI_BOOTSTRAP_CONFIG='true'
# create a new database unless it exists
ZITI_BOOTSTRAP_DATABASE='true'
# configure the web console if 'true'
ZITI_BOOTSTRAP_CONSOLE='true'
# configure controller to serve static HTML provided by openziti-console package
ZITI_CONSOLE_LOCATION='/opt/openziti/share/console'
# BASH script that defines function bootstrap()
ZITI_CTRL_BOOTSTRAP_BASH='/opt/openziti/etc/controller/bootstrap.bash'
# renew server and client certificates every startup
ZITI_AUTO_RENEW_CERTS='true'
# additional arguments to the ExecStart command must be a non-empty string
ZITI_ARGS='--'