Testing OpenZiti Throughput Under Heavy Load

Hi,

I modified Grafana k6 with the OpenZiti golang-sdk to run high-traffic performance tests.

When I increase the VUs (virtual users, i.e. the number of users sending HTTP requests at the same time) to about 200, I get error messages like:

failed to dial: no apiSession, authentication attempt failed: error for request kNSbWTCL8: SERVER_TOO_MANY_REQUESTS: Too many requests to alter state have been issued. Please slow your request rate or try again later."

I traced the code and I think the error comes from the rate limiter. Is there any way to configure it?

Do you have any advice on configuring the controller/router for this kind of test?
Thanks!


There are a couple of limiters. I think the one you're hitting is the auth limiter: Release v0.32.0 · openziti/ziti · GitHub

edge:
  # This section allows configuring the rate limiter for auth attempts
  authRateLimiter:
    # if disabled, no auth rate limiting will be enforced
    enabled: true
    # the smallest window size for auth attempts
    minSize: 5
    # the largest allowed window size for auth attempts
    maxSize: 250

(but there's another one that came with 1.0. see: Release v1.0.0 · openziti/ziti · GitHub)

As for advice, I'll get @plorenz to comment. I think he's done some of that sort of testing recently and probably has good insights overall anyway...


Yes, there are three rate limiters to consider:

Auth Rate Limiter

I agree with @TheLumberjack that this is likely what you are hitting.

This rate limiter exists because without it the system can enter a state where authentications get queued up and take a long time to complete. The SDK then times out waiting for the authentication, and retries. The controller completes the auth, but the SDK no longer cares and the work is wasted. The rate limiter tries to ensure that we don't accept more auths than we can process before the SDK gives up on them. It can generally tell when we've hit the limit because writing the response back will fail, as the SDK has closed the connection.

You can try disabling it or bumping the max size if you think it will be helpful.

Also, we'll probably be able to remove this limiter in the future. Once users are using JWT-based authentication, which doesn't require storing anything in the DB, that will remove the primary bottleneck on auth. At that point, the rate limiter shouldn't be necessary anymore.

Example config was posted above.
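
Building on that config, the two options look roughly like this (values are only illustrative, not recommendations):

edge:
  authRateLimiter:
    # option 1: turn it off entirely while load testing
    enabled: false
    # option 2: keep it enabled but raise the window bounds
    # enabled: true
    # minSize: 5
    # maxSize: 1000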

Command Rate Limiter

Since model changes are serialized, we want to be sure that we're not building up a large queue of operations. If you're load testing using SDK clients, you shouldn't be hitting this. It primarily comes into play when a lot of SDKs or routers are setting up terminators for hosting services at once. The routers handle this internally (including on behalf of SDKs setting up hosting) by retrying as appropriate when the controller is handling a flood of requests.

Example config

commandRateLimiter:
    enabled:   true
    maxQueued: 100

TLS Handshake Limiter

In testing we found that TLS handshakes could use all the CPU on a controller, leaving little for other work to get done. This limiter is disabled by default for now.

Example config

tls:
  handshakeTimeout: 30s
  rateLimiter:
    enabled: true
    minSize: 5
    maxSize: 250

Hope that's helpful, let me know if you've got any follow-up questions.
Paul


Thanks to @TheLumberjack & @plorenz for the instant responses!

Disabling the auth limiter did help me get rid of the errors I reported.
However, while testing with 200 VUs, another warning came up:

WARN[0032] failure creating Dial session to service web.endpoint  error="Post \"https://ctrl-perf.c2.test.synology.inc:1280/edge/client/v1/sessions\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" errorType="*url.Error"

The controller's log was flooded with:

[ 122.360]   ERROR ziti/controller/api.(*ResponderImpl).RespondWithProducer: {error=[http: Handler timeout] requestId=[hY9pSal1F] path=[/edge/client/v1/sessions]} could not respond, writing to response failed
[ 122.376]   ERROR ziti/controller/api.(*timeoutHandler).ServeHTTP: {url=[/edge/client/v1/sessions] method=[POST]} timeout for request hit, returning Service Unavailable 503
[ 122.384]   ERROR ziti/controller/api.(*timeoutHandler).ServeHTTP: {url=[/edge/client/v1/authenticate?method=cert] method=[POST]} timeout for request hit, returning Service Unavailable 503

Is there any timeout configuration I could try tweaking?
I'm not sure if the following timeouts in "options" could help:

    options:
      # idleTimeoutMs - optional, default 5000ms
      # The maximum amount of idle time in milliseconds allowed for pipelined HTTP requests. Setting this too high
      # can cause resources on the host to be consumed as clients remain connected and idle. Lowering this value
      # will cause clients to reconnect on subsequent HTTPs requests.
      idleTimeout: 5000ms  #http timeouts, new
      # readTimeoutMs - optional, default 5000ms
      # The maximum amount of time in milliseconds http servers will wait to read the first incoming request. A higher
      # value risks consuming resources on the host with clients that are acting in bad faith or suffering from high latency
      # or packet loss. A lower value can risk losing connections to high latency/packet loss clients.
      readTimeout: 5000ms
      # writeTimeoutMs - optional, default 100000ms
      # The total maximum time in milliseconds that the http server will wait for a single request to be received and
      # responded to. A higher value can allow long-running requests to consume resources on the host. A lower value
      # can risk ending requests before the server has a chance to respond.
      writeTimeout: 100000ms
      # minTLSVersion - optional, default TLS1.2
      # The minimum version of TLS to support
      minTLSVersion: TLS1.2
      # maxTLSVersion - optional, default TLS1.3
      # The maximum version of TLS to support
      maxTLSVersion: TLS1.3

Thanks!

Update:

Finally, I realized I should set Client.Timeout on k6's http.Client; setting it to 30s stopped the flood of logs in the VU=200 test case, but it still happened at VU=500.
I'm wondering if I should find a way to change the Client.Timeout used by the golang-sdk, since I think the /edge/client/v1/sessions request is sent by the SDK.

You can get to the HttpClient used to make those REST calls, but it's harder than it needs to be.

If you've got an SDK context, you can do the following:

var ctx ziti.Context = ...
client := ctx.(*ziti.ContextImpl).CtrlClt.HttpClient

We should probably add some way to tweak that explicitly.
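
In the meantime, here's a minimal end-to-end sketch of bumping that timeout (the identity path and the 30s value are placeholders):

package main

import (
	"time"

	"github.com/openziti/sdk-golang/ziti"
)

func main() {
	// load the identity and create a context as usual
	cfg, err := ziti.NewConfigFromFile("/path/to/identity.json") // placeholder path
	if err != nil {
		panic(err)
	}

	ctx, err := ziti.NewContext(cfg)
	if err != nil {
		panic(err)
	}

	// reach into the implementation to adjust the controller REST client timeout
	if impl, ok := ctx.(*ziti.ContextImpl); ok {
		impl.CtrlClt.HttpClient.Timeout = 30 * time.Second
	}
}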

Cheers,
Paul


@plorenz Thank you very much for your reply!
I applied your advice here.

Today I ran into a frustrating symptom.
I noticed that the VU=100 test result dropped from ~100 rps to ~20 rps, and I couldn't find the reason even after reverting all my parameter changes.

I also retested my control group (NGINX), which still produces the same result as before, so I don't think the problem comes from my network or hardware environment.

Do you have any idea whether the controller/edge could persist some 'state' after a long test that could cause the handling rate to drop?

I also noticed that after docker compose up of my ziti environment, the controller seems to be busy with something: it consumed about 67% of a core according to htop, while the log output wasn't printing anything (maybe I should set the log level to debug?).

Here is my configuration:
My Image version: openziti/quickstart:1.0.0-b0f91af0

Controller config:

v: 3

#trace:
#  path: "ctrl-perf.mydomain.inc.trace"

#profile:
#  memory:
#    path: ctrl.memprof

db:                     "/persistent/db/ctrl.db"

identity:
  cert:        "/persistent/pki/ziti-controller-intermediate/certs/ziti-controller-client.cert"
  server_cert: "/persistent/pki/ziti-controller-intermediate/certs/ziti-controller-server.chain.pem"
  key:         "/persistent/pki/ziti-controller-intermediate/keys/ziti-controller-server.key"
  ca:          "/persistent/pki/cas.pem"
  alt_server_certs:
    - server_cert:  "/etc/certs/bundle.crt"
      server_key:   "/etc/certs/mydomain.inc.key"

# Network Configuration
#
# Configure how the controller will establish and manage the overlay network, and routing operations on top of
# the network.
#
#network:

  # routeTimeoutSeconds controls the number of seconds the controller will wait for a route attempt to succeed.
  #routeTimeoutSeconds:  10

  # createCircuitRetries controls the number of retries that will be attempted to create a path (and terminate it)
  # for new circuits.
  #createCircuitRetries: 2  

  # pendingLinkTimeoutSeconds controls how long we'll wait before creating a new link between routers where
  # there isn't an established link, but a link request has been sent
  #pendingLinkTimeoutSeconds: 10

  # Defines the period that the controller re-evaluates the performance of all of the circuits
  # running on the network.
  #
  #cycleSeconds:         15
  
  # Sets router minimum cost. Defaults to 10
  #minRouterCost: 10

  # Sets how often a new control channel connection can take over for a router with an existing control channel connection
  # Defaults to 1 minute
  #routerConnectChurnLimit: 1m

  # Sets the latency of link when it's first created. Will be overwritten as soon as latency from the link is actually
  # reported from the routers. Defaults to 65 seconds.
  #initialLinkLatency: 65s
  
  #smart:
    #
    # Defines the fractional upper limit of underperforming circuits that are candidates to be re-routed. If 
    # smart routing detects 100 circuits that are underperforming, and `smart.rerouteFraction` is set to `0.02`,
    # then the upper limit of circuits that will be re-routed in this `cycleSeconds` period will be limited to 
    # 2 (2% of 100). 
    #
    #rerouteFraction:    0.02
    # 
    # Defines the hard upper limit of underperforming circuits that are candidates to be re-routed. If smart 
    # routing detects 100 circuits that are underperforming, and `smart.rerouteCap` is set to `1`, and 
    # `smart.rerouteFraction` is set to `0.02`, then the upper limit of circuits that will be re-routed in this 
    # `cycleSeconds` period will be limited to 1.
    #
    #rerouteCap:         4  

# the endpoint that routers will connect to the controller over.
ctrl:
  options:
  # (optional) settings
  # set the maximum number of connect requests that are buffered and waiting to be acknowledged (1 to 5000, default 1)
   outQueueSize: 5000
   maxQueuedConnects: 5000
  # the maximum number of connects that have  begun hello synchronization (1 to 1000, default 16)
   maxOutstandingConnects: 1000
  # the number of milliseconds to wait before a hello synchronization fails and closes the connection (30ms to 60000ms, default: 5000ms)
  #connectTimeoutMs:       300000
  #writeTimeout: 300000
  listener:             tls:0.0.0.0:6262

#metrics:
#  influxdb:
#    url:                http://localhost:8086
#    database:           ziti

# xctrl_example
#
#example:
#  enabled:              false
#  delay:                5s

healthChecks:
  boltCheck:
    # How often to try entering a bolt read tx. Defaults to 30 seconds
    interval: 30s
    # When to time out the check. Defaults to 20 seconds
    timeout: 20s
    # How long to wait before starting the check. Defaults to 30 seconds
    initialDelay: 30s

# By having an 'edge' section defined, the ziti-controller will attempt to parse the edge configuration. Removing this
# section, commenting out, or altering the name of the section will cause the edge to not run.
edge:
  # This section allows configuring the rate limiter for auth attempts
  authRateLimiter:
    # if disabled, no auth rate limiting will be enforced
    enabled: false
    # the smallest window size for auth attempts
    minSize: 1000
    # the largest allowed window size for auth attempts
    maxSize: 1000
  # This section represents the configuration of the Edge API that is served over HTTPS
  api:
    #(optional, default 90s) Alters how frequently heartbeat and last activity values are persisted
    # activityUpdateInterval: 90s
    #(optional, default 250) The number of API Sessions updated for last activity per transaction
    # activityUpdateBatchSize: 250
    # sessionTimeout - optional, default 30m
    # The number of minutes before an Edge API session will time out. Timeouts are reset by
    # API requests and connections that are maintained to Edge Routers
    sessionTimeout: 30m
    # address - required
    # The default address (host:port) to use for enrollment for the Client API. This value must match one of the addresses
    # defined in one of this controller's webListener bindPoints.
    address: ctrl-perf.mydomain.inc:1280
  # This section is used to define options that are used during enrollment of Edge Routers and Ziti Edge Identities.
  enrollment:
    # signingCert - required
    # A Ziti Identity configuration section that specifically makes use of the cert and key fields to define
    # a signing certificate from the PKI that the Ziti environment is using to sign certificates. The signingCert.cert
    # will be added to the /.well-known CA store that is used to bootstrap trust with the Ziti Controller.
    signingCert:
      cert: /persistent/pki/signing.pem
      key:  /persistent/pki/ziti-signing-intermediate/keys/ziti-signing-intermediate.key
    # edgeIdentity - optional
    # A section for identity enrollment specific settings
    edgeIdentity:
      # duration - optional, default 180m
      # The length of time that a Ziti Edge Identity enrollment should remain valid. After
      # this duration, the enrollment will expire and no longer be usable.
      duration: 180m
    # edgeRouter - Optional
    # A section for edge router enrollment specific settings.
    edgeRouter:
      # duration - optional, default 180m
      # The length of time that a Ziti Edge Router enrollment should remain valid. After
      # this duration, the enrollment will expire and no longer be usable.
      duration: 180m

# web
# Defines webListeners that will be hosted by the controller. Each webListener can host many APIs and be bound to many
# bind points.
web:
  # name - required
  # Provides a name for this listener, used for logging output. Not required to be unique, but is highly suggested.
  - name: client-management
    # bindPoints - required
    # One or more bind points are required. A bind point specifies an interface (interface:port string) that defines
    # where on the host machine the webListener will listen and the address (host:port) that should be used to
    # publicly address the webListener(i.e. mydomain.com, localhost, 127.0.0.1). This public address may be used for
    # incoming address resolution as well as used in responses in the API.
    bindPoints:
      #interface - required
      # A host:port string on which network interface to listen on. 0.0.0.0 will listen on all interfaces
      - interface: 0.0.0.0:1280
        # address - required
        # The public address that external incoming requests will be able to resolve. Used in request processing and
        # response content that requires full host:port/path addresses.
        address: ctrl-perf.mydomain.inc:1280
    # identity - optional
    # Allows the webListener to have a specific identity instead of defaulting to the root 'identity' section.
    identity:
      ca:          "/persistent/pki/ziti-edge-controller-root-ca/certs/ziti-edge-controller-root-ca.cert"
      key:         "/persistent/pki/ziti-edge-controller-intermediate/keys/ctrl-perf.mydomain.inc-server.key"
      server_cert: "/persistent/pki/ziti-edge-controller-intermediate/certs/ctrl-perf.mydomain.inc-server.chain.pem"
      cert:        "/persistent/pki/ziti-edge-controller-intermediate/certs/ctrl-perf.mydomain.inc-client.cert"
      alt_server_certs:
      - server_cert: "/etc/certs/bundle.crt"
        server_key:  "/etc/certs/mydomain.inc.key"
      
    # options - optional
    # Allows the specification of webListener level options - mainly dealing with HTTP/TLS settings. These options are
    # used for all http servers started by the current webListener.
    options:
      # idleTimeoutMs - optional, default 5000ms
      # The maximum amount of idle time in milliseconds allowed for pipelined HTTP requests. Setting this too high
      # can cause resources on the host to be consumed as clients remain connected and idle. Lowering this value
      # will cause clients to reconnect on subsequent HTTPs requests.
      idleTimeout: 5000ms  #http timeouts, new
      # readTimeoutMs - optional, default 5000ms
      # The maximum amount of time in milliseconds http servers will wait to read the first incoming request. A higher
      # value risks consuming resources on the host with clients that are acting in bad faith or suffering from high latency
      # or packet loss. A lower value can risk losing connections to high latency/packet loss clients.
      readTimeout: 5000ms
      # writeTimeoutMs - optional, default 100000ms
      # The total maximum time in milliseconds that the http server will wait for a single request to be received and
      # responded to. A higher value can allow long-running requests to consume resources on the host. A lower value
      # can risk ending requests before the server has a chance to respond.
      writeTimeout: 100000ms
      # minTLSVersion - optional, default TLS1.2
      # The minimum version of TLS to support
      minTLSVersion: TLS1.2
      # maxTLSVersion - optional, default TLS1.3
      # The maximum version of TLS to support
      maxTLSVersion: TLS1.3
    # apis - required
    # Allows one or more APIs to be bound to this webListener
    apis:
      # binding - required
      # Specifies an API to bind to this webListener. Built-in APIs are
      #   - edge-management
      #   - edge-client
      #   - fabric-management
      - binding: edge-management
        # options - arg optional/required
        # This section is used to define values that are specified by the API they are associated with.
        # These settings are per API. The example below is for the 'edge-api' and contains both optional values and
        # required values.
        options: { }
      - binding: edge-client
        options: { }
      - binding: fabric
        options: { }

Edge router config:

v: 3

identity:
  cert:             "/persistent/ziti-edge-router.cert"
  server_cert:      "/persistent/ziti-edge-router.server.chain.cert"
  key:              "/persistent/ziti-edge-router.key"
  ca:               "/persistent/ziti-edge-router.cas"
  alt_server_certs:
    - server_cert:  "/etc/certs/bundle.crt"
      server_key:   "/etc/certs/mydomain.inc.key"

ctrl:
  endpoint:             tls:ziti-controller:6262

link:
  dialers:
    - binding: transport
  listeners:
    - binding:          transport
      bind:             tls:0.0.0.0:10081
      advertise:        tls:ctrl-perf.mydomain.inc:10081
      options:
        outQueueSize:   4

listeners:
# bindings of edge and tunnel requires an "edge" section below
  - binding: edge
    address: tls:0.0.0.0:8442
    options:
      advertise: ctrl-perf.mydomain.inc:8442
      connectTimeoutMs: 5000
      getSessionTimeout: 60
  - binding: tunnel
    options:
      mode: host #tproxy|host



edge:
  csr:
    country: US
    province: NC
    locality: Charlotte
    organization: NetFoundry
    organizationalUnit: Ziti
    sans:
      dns:
        - localhost
        - ctrl-perf.mydomain.inc
        - bfde46659dc5
      ip:
        - "127.0.0.1"


#transport:
#  ws:
#    writeTimeout: 10
#    readTimeout: 5
#    idleTimeout: 120
#    pongTimeout: 60
#    pingInterval: 54
#    handshakeTimeout: 10
#    readBufferSize: 4096
#    writeBufferSize: 4096
#    enableCompression: true
#    server_cert: /persistent/ziti-edge-router.server.chain.cert
#    key: /persistent/ziti-edge-router.key
alt_server_certs:
  - server_cert:  "/etc/certs/bundle.crt"
    server_key:   "/etc/certs/mydomain.inc.key"
forwarder:
  latencyProbeInterval: 10
  xgressDialQueueLength: 10000
  xgressDialWorkerCount: 10000
  linkDialQueueLength: 10000
  linkDialWorkerCount: 32

Update:
I strongly suspect k6 is launching requests at a slow pace, so I enabled k6's profiling option and got the flame graph view below; it looks like connecting to the edge router took a lot of time:

Thanks for posting the pprof (side note: big fan of the new flame graph view in the pprof viewer).

The SDK <-> router connections should generally happen once a session that uses the edge router is established. Once they are established, they should be persistent, so it should be a one-time cost, unless you're starting up a new client for each request. Let me know how that lines up with how you're testing.

The high CPU load on the controller is interesting, would you be able to capture a pprof for that as well? The controller can be busy on restart as it's usually handling router connections, refreshing link state and establishing/validating terminators. The pprof should show us if there's something unexpected happening there.

How do things look on the 20 rps test vs the 100 rps test? Is the CPU or IO load any different on routers/controllers? How many routers do you have? Are there options for multiple paths? Are different paths being selected?

If you want to do a ziti fabric inspect stackdump -f '*' and send me the results, I can take a look and see if there's anything suspicious going on in the execution dumps. That will grab stackdumps for each controller/router with one file per process.

Cheers,
Paul


Hi @plorenz,
I really appreciate your reply, full of technical details and guidance! Orz

First, I would like to elaborate on my testing topology, hardware specs, and plan:

The blue arrow lines indicate the request flow.
The traffic between PC-2 & PC-3 is plain http.

I'm also planning to test another topology that moves the OpenZiti router to another PC, so that I can observe the CPU utilization of the router and controller separately and conclude which one is the bottleneck.

I only deployed one router and one controller on PC-2, and PC-3's IP is configured as a host.v1 service
set to 192.168.0.2, so I don't think there are multiple paths.

While testing VU=100 with a result of 20 rps, the CPU utilization of the controller & router (they are deployed together on PC-2) was about 30~40%.
I didn't record the CPU utilization precisely while achieving 100 rps, but I think it was nearly 90%.

PC-2's low CPU utilization is the reason I suspect k6's request rate is too low.

To confirm k6's behavior, I added a log line before this line, and even when I test with only VU=1, my log keeps printing. So I think k6 really is launching a new http.Client for each request.

I think this kind of test is like simulating a stream of new requests from different clients (browsers, desktop apps, ...) over a duration.

I found that the command ziti fabric inspect stackdump -f '*' causes an error:

Encountered errors: (1)
        ziti-controller: error parsing regexp: missing argument to repetition operator: `*`

So I used ziti fabric inspect stackdump -f ".*" to generate stackdumps from the controller and router docker containers individually:

Results: (2)
output result to: ziti-controller.stackdump
output result to: r2419PY9X.stackdump

Please download the zipped file via this link; the download link (expires in 7 days) is bound to your email (paul.lorenz@netfoundry.io), and you will need to enter the OTP sent to your email.

Thank you very much indeed for your help!

Update:
I moved the ziti ctx creation into an init function, so it should only create one ctx during the test (patch below).
As a result, the request rate increased immediately, so I guess re-establishing the ziti connection was the main cause of the low throughput.

However, after the test ran for about a minute, I hit this kind of error (192.168.0.2:7712 is the destination IP & port of PC-3):

WARN[0096] Request Failed                                error="Get \"http://web.endpoint/get\": unable to dial service 'web.endpoint': dial failed: exceeded maximum [2] retries creating circuit [c/wiLDcNIBh]: error creating route for [s/wiLDcNIBh] on [r/r2419PY9X] (error creating route for [c/wiLDcNIBh]: dial tcp 192.168.0.2:7712: connect: cannot assign requested address)"
diff --git a/lib/netext/httpext/ziti.go b/lib/netext/httpext/ziti.go
index 220d9c39..5a33bff5 100644
--- a/lib/netext/httpext/ziti.go
+++ b/lib/netext/httpext/ziti.go
@@ -1,51 +1,60 @@
 package httpext
 
 import (
 	"context"
 	"fmt"
 	"net"
 	"net/http"
 	"os"
 	"strings"
 	"time"
 
 	"github.com/openziti/sdk-golang/ziti"
 )
 
-type ZitiDialContext struct {
-	context ziti.Context
-}
-
-func (dc *ZitiDialContext) Dial(_ context.Context, _ string, addr string) (net.Conn, error) {
-	service := strings.Split(addr, ":")[0] // will always get passed host:port
-	return dc.context.Dial(service)
-}
+var zitiCfg *ziti.Config
+var zitiCtx ziti.Context
 
-// GetZitiTransport returns a http.Transport that uses Ziti to dial
-func GetZitiTransport(originTransport *http.Transport) *http.Transport {
+func init() {
 	filePath := os.Getenv("ZITI_IDENTITY_FILE") //nolint: forbidigo //for dev
 	if filePath == "" {
 		panic("ZITI_IDENTITY_FILE should be set")
 	}
 
 	cfg, err := ziti.NewConfigFromFile(filePath)
 	if err != nil {
 		panic(fmt.Sprintf("err reading ziti identity file: %v", err))
 	}
-	ctx, err := ziti.NewContext(cfg)
+	zitiCfg = cfg
+	ctx, err := ziti.NewContext(zitiCfg)
 	if err != nil {
 		panic(fmt.Sprintf("err creating ziti context: %v", err))
 	}
 
 	impl, ok := ctx.(*ziti.ContextImpl)
 	if !ok {
 		panic("failed to get *ziti.ContextImpl from ziti.Context")
 	}
 	impl.CtrlClt.HttpClient.Timeout = 30 * time.Second
 
-	zitiDialContext := ZitiDialContext{context: ctx}
+	zitiCtx = ctx
+	fmt.Println("zitiCfg inited")
+}
+
+type ZitiDialContext struct {
+	context ziti.Context
+}
+
+func (dc *ZitiDialContext) Dial(_ context.Context, _ string, addr string) (net.Conn, error) {
+	service := strings.Split(addr, ":")[0] // will always get passed host:port
+	return dc.context.Dial(service)
+}
+
+// GetZitiTransport returns a http.Transport that uses Ziti to dial
+func GetZitiTransport(originTransport *http.Transport) *http.Transport {
+	zitiDialContext := ZitiDialContext{context: zitiCtx}
 	zitiTransport := originTransport.Clone() // copy default transport
 	zitiTransport.DialContext = zitiDialContext.Dial
 
 	return zitiTransport
 }
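
For reference, the transport returned by GetZitiTransport ends up being used roughly like this (illustrative only; k6 does this wiring internally, and web.endpoint is my test service):

// exampleRequest shows how a caller would use the Ziti-backed transport
func exampleRequest() error {
	base, _ := http.DefaultTransport.(*http.Transport)
	client := &http.Client{
		Transport: GetZitiTransport(base),
		Timeout:   30 * time.Second,
	}
	resp, err := client.Get("http://web.endpoint/get")
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return nil
}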

Best Regards,
Rick

Hi Rick,
Glad to hear the performance is better when you're not initializing a new Context for each run. I looked at the stack dumps you shared (thank you), and the cost is not just from having to re-establish connections to the edge routers. The SDK also needs to create a new api session and a session for the service. I saw 40-50 goroutines queued up to write api sessions and a similar number queued to create sessions. Because db changes are single-writer, if you've got a lot of new sessions, the controller can bottleneck on this.

One of the changes coming with the HA code is that both kinds of sessions won't be persisted in the DB, but will instead be bearer tokens. That will remove DB writes from the critical path for these operations, as those are the only write operations involved in SDK calls.

I looked into this dial tcp 192.168.0.2:7712: connect: cannot assign requested address error a bit.

I found this: net/http: cannot assign requested address · Issue #16012 · golang/go · GitHub, which makes it sound like an OS limit that may need to be adjusted. Let me know if that's helpful.

Cheers,
Paul


Hi Paul,

Thank you for your reply!

Based on your previous reply, I will run another test that sends more requests during the lifetime of one persistent session; I think it will be a more valuable reference.

Is there a tutorial I could use to learn how to analyze these stackdumps? I googled a little bit and found goroutine-analyzer, written by you! This tool definitely makes the analysis far friendlier for reading these stackdumps. But I still have no clue how to find the queued-up session-writing goroutines. In the image below, am I highlighting the right one matching your description?

Could you use my stackdumps as an example to explain how to analyze and reach these conclusions?

Do I need to enable HA mode to use the bearer tokens? I upgraded my ziti binary to v1.1.3, but I still see many boltz updates during the test.

Thanks!

Best Regards,
Rick

Hi Rick,
You're using goroutine-analyzer the way I do, which is to load the stackdump and run 'Extract Duplicated Goroutines', then look for suspicious things. SemacquireMutex is generally a red flag, because it means there's contention. A lot of the analysis comes down to understanding the code and knowing what's expected and what's not. Looking at those mutex acquires, seeing the api session and session creates there, and knowing that bbolt is single-writer, it makes sense.

By comparison, looking at the router stacks, I know that each connected SDK client is going to have a pair of goroutines, a reader (rxer) and a writer (txer), for messages to/from the SDK. In addition, each active circuit is going to have 3 goroutines related to xgress: again a reader and a writer, plus one for managing retransmissions (LinkSendBuffer). So you can use those to measure the number of connected SDKs and the number of active circuits in the router.

You should be able to use bearer token sessions without running in full HA mode. You'd need to do at least the following (rough config sketch after the list):

  1. Add the oidc endpoint to the controller: ziti/doc/ha/overview.md at release-next · openziti/ziti · GitHub
  2. Set the HA flag in the router configuration
  3. Set the HA flag in the SDK ziti/doc/ha/dev-setup.md at release-next · openziti/ziti · GitHub
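
As a very rough sketch of the config side (the section and binding names here are from memory, so treat them as assumptions and verify against the linked docs):

# controller: add the OIDC API binding alongside the existing ones
web:
  - name: client-management
    apis:
      - binding: edge-oidc
        options: { }

# router: opt in to the HA behavior
ha:
  enabled: true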

Even with that, the code is still alpha and my coworker tells me there's a bug (since fixed) in v1.1.3, which means this particular setup won't work.

Note that using OIDC based auth requires more work and more requests than the legacy authentication mechanism, but it's work that won't be bottlenecked, so you'll be CPU and network constrained instead of DB constrained.

You can also run the controller in HA mode without setting up a full cluster.

If you can reasonably re-use Contexts, though, then that would be the most efficient thing to do, regardless of auth mechanism.
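
For example, a shared Context guarded by sync.Once accomplishes the same thing as your init()-based patch (just a sketch; it assumes the same imports as your patch plus "sync"):

var (
	sharedOnce sync.Once
	sharedCtx  ziti.Context
	sharedErr  error
)

// getSharedContext lazily creates one ziti.Context and reuses it for every caller
func getSharedContext(identityFile string) (ziti.Context, error) {
	sharedOnce.Do(func() {
		cfg, err := ziti.NewConfigFromFile(identityFile)
		if err != nil {
			sharedErr = err
			return
		}
		sharedCtx, sharedErr = ziti.NewContext(cfg)
	})
	return sharedCtx, sharedErr
}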

If you've got other questions, either about stack traces or OIDC, let me know.
Cheers,
Paul


Hi Paul,

Based on your description, I've decided to postpone the bearer token tests until the HA code reaches a mature version. For now I will focus on modifying k6 to reuse the ziti context; I need to move the ziti context creation to an earlier point.

Thank you very much for your detailed explanation of the stackdump analysis!

Best Regards,
Rick

Hi Paul,

Recently I've been running tests (VU=200, with 1000 subsequent requests for each VU after authentication) against a deployment that separates the router and controller onto different PCs.

I found that under high load, the router seems to be killed by something. This didn't happen while the router and controller were deployed on the same PC and docker network (from the quickstart guide).

Below are the last logs from the router before it was killed:

...
...
[4082.197] WARNING ziti/router/forwarder.(*Scanner).scan: {ctrlId=[ziti-controller] circuitCount=[1044]} sent confirmation for circuits
[4159.134]   ERROR channel/v2/latency.(*LatencyHandler).HandleReceive.func1 [ch{edge}->u{classic}->i{7kMr}]: {error=[channel closed]} error sending latency response
[4159.428]   ERROR channel/v2/latency.(*LatencyHandler).HandleReceive.func1 [ch{edge}->u{classic}->i{erNG}]: {error=[channel closed]} error sending latency response
[4159.792]   ERROR channel/v2/latency.(*LatencyHandler).HandleReceive.func1 [ch{edge}->u{classic}->i{rzlj}]: {error=[channel closed]} error sending latency response
[4160.076]   ERROR channel/v2/latency.(*LatencyHandler).HandleReceive.func1 [ch{edge}->u{classic}->i{Gkmy}]: {error=[channel closed]} error sending latency response
[4160.269]   ERROR channel/v2.(*heartbeater).sendHeartbeat: {error=[context deadline exceeded] channelId=[ch{ctrl}->u{reconnecting}->i{ZRXa}]} pulse failed to send heartbeat
[4160.598]   ERROR channel/v2/latency.(*LatencyHandler).HandleReceive.func1 [ch{edge}->u{classic}->i{l5gN}]: {error=[channel closed]} error sending latency response
[4160.625]   ERROR channel/v2.(*reconnectingImpl).Tx [u{reconnecting}->i{ZRXa}]: tx error (write tcp 172.20.0.2:42762->10.17.64.119:6262: write: broken pipe). starting reconnection process
[4160.633]    INFO channel/v2.(*reconnectingImpl).pingInstance [u{reconnecting}->i{ZRXa}]: starting
[4160.636]    INFO channel/v2.(*reconnectingImpl).pingInstance [u{reconnecting}->i{ZRXa}]: exiting
[4160.644]   ERROR channel/v2/latency.(*LatencyHandler).HandleReceive.func1 [ch{edge}->u{classic}->i{nkVm}]: {error=[channel closed]} error sending latency response
[4160.656]   ERROR ziti/router/xgress_edge.(*edgeClientConn).processConnect [ch{edge}->u{classic}->i{4ZyN}]: {token=[a0f414c1-7b71-4ffd-91eb-6614f38c985c] connId=[4] type=[EdgeConnectType] error=[mux is closed, can't add sink with id [4]] chSeq=[520] edgeSeq=[0]} error adding to msg mux
[4160.681]   ERROR channel/v2.(*reconnectingImpl).Rx [u{reconnecting}->i{ZRXa}]: {error=[tls: failed to send closeNotify alert (but connection was closed anyway): write tcp 172.20.0.2:42762->10.17.64.119:6262: write: broken pipe]} error closing peer after rx error
[4160.684]   ERROR channel/v2.(*reconnectingImpl).Rx [u{reconnecting}->i{ZRXa}]: {error=[EOF]} rx error. closed peer and starting reconnection process
[4160.791] WARNING ziti/router/forwarder.(*Scanner).scan: {ctrlId=[ziti-controller] idleTime=[2m18.16s] idleThreshold=[1m0s] circuitId=[lfMZ3-rPBy]} circuit exceeds idle threshold
[4160.825] WARNING ziti/router/forwarder.(*Scanner).scan: {circuitId=[Fi643-rPBX] ctrlId=[ziti-controller] idleTime=[1m16.122s] idleThreshold=[1m0s]} circuit exceeds idle threshold
[4160.826] WARNING ziti/router/forwarder.(*Scanner).scan: {circuitId=[OQf13P6PB] ctrlId=[ziti-controller] idleTime=[6m13.638s] idleThreshold=[1m0s]} circuit exceeds idle threshold
[4160.826] WARNING ziti/router/forwarder.(*Scanner).scan: {circuitId=[yfRZ3-rPgU] ctrlId=[ziti-controller] idleTime=[2m18.147s] idleThreshold=[1m0s]} circuit exceeds idle threshold
[4160.826] WARNING ziti/router/forwarder.(*Scanner).scan: {idleThreshold=[1m0s] circuitId=[y3MZ3-6PBI] ctrlId=[ziti-controller] idleTime=[2m18.151s]} circuit exceeds idle threshold
[4160.826] WARNING ziti/router/forwarder.(*Scanner).scan: {idleThreshold=[1m0s] ctrlId=[ziti-controller] circuitId=[1E113P6-gX] idleTime=[6m19.241s]} circuit exceeds idle threshold
[4160.826] WARNING ziti/router/forwarder.(*Scanner).scan: {circuitId=[XfRm3-r-Bx] ctrlId=[ziti-controller] idleTime=[2m18.185s] idleThreshold=[1m0s]} circuit exceeds idle threshold
[4160.827] WARNING ziti/router/forwarder.(*Scanner).scan: {ctrlId=[ziti-controller] circuitId=[rsLffPr-gW] idleTime=[3m19.208s] idleThreshold=[1m0s]} circuit exceeds idle threshold
[4160.830] WARNING ziti/router/forwarder.(*Scanner).scan: {idleThreshold=[1m0s] circuitId=[pfMmfP6PBt] ctrlId=[ziti-controller] idleTime=[2m18.091s]} circuit exceeds idle threshold
[4160.830] WARNING ziti/router/forwarder.(*Scanner).scan: {ctrlId=[ziti-controller] idleThreshold=[1m0s] idleTime=[1m16.109s] circuitId=[JI643-6PB]} circuit exceeds idle threshold
[4160.830] WARNING ziti/router/forwarder.(*Scanner).scan: {ctrlId=[ziti-controller] idleTime=[3m19.243s] idleThreshold=[1m0s] circuitId=[PPWffP6Pgy]} circuit exceeds idle threshold
[4160.830] WARNING ziti/router/forwarder.(*Scanner).scan: {circuitId=[1rt63-r-g] ctrlId=[ziti-controller] idleTime=[3m36.191s] idleThreshold=[1m0s]} circuit exceeds idle threshold
[4160.830] WARNING ziti/router/forwarder.(*Scanner).scan: {idleTime=[1m16.17s] idleThreshold=[1m0s] circuitId=[BFr4f-6Pgs] ctrlId=[ziti-controller]} circuit exceeds idle threshold
[4160.831] WARNING ziti/router/forwarder.(*Scanner).scan: {idleTime=[6m13.189s] idleThreshold=[1m0s] circuitId=[y6f13-r-B] ctrlId=[ziti-controller]} circuit exceeds idle threshold
[4160.831] WARNING ziti/router/forwarder.(*Scanner).scan: {ctrlId=[ziti-controller] idleTime=[1m16.15s] idleThreshold=[1m0s] circuitId=[2Q643P6PBX]} circuit exceeds idle threshold
[4160.831] WARNING ziti/router/forwarder.(*Scanner).scan: {ctrlId=[ziti-controller] idleTime=[6m19.299s] idleThreshold=[1m0s] circuitId=[kaTT3PrPBw]} circuit exceeds idle threshold
[4160.831] WARNING ziti/router/forwarder.(*Scanner).scan: {idleTime=[6m12.823s] idleThreshold=[1m0s] circuitId=[khZ1fP6-g] ctrlId=[ziti-controller]} circuit exceeds idle threshold
[4160.831] WARNING ziti/router/forwarder.(*Scanner).scan: {idleTime=[3m52.693s] idleThreshold=[1m0s] circuitId=[xqW63Pr-B] ctrlId=[ziti-controller]} circuit exceeds idle threshold
[4160.831] WARNING ziti/router/forwarder.(*Scanner).scan: {idleTime=[3m34.498s] circuitId=[pFc6f-6Pg] ctrlId=[ziti-controller] idleThreshold=[1m0s]} circuit exceeds idle threshold
[4160.831] WARNING ziti/router/forwarder.(*Scanner).scan: {ctrlId=[ziti-controller] idleTime=[3m44.93s] circuitId=[K-Nrf-6-B] idleThreshold=[1m0s]} circuit exceeds idle threshold
[4160.831] WARNING ziti/router/forwarder.(*Scanner).scan: {ctrlId=[ziti-controller] idleTime=[1m16.117s] circuitId=[Ti643P6-gl] idleThreshold=[1m0s]} circuit exceeds idle threshold
[4160.831] WARNING ziti/router/forwarder.(*Scanner).scan: {idleTime=[6m19.347s] idleThreshold=[1m0s] circuitId=[05113-rPBy] ctrlId=[ziti-controller]} circuit exceeds idle threshold
[4160.831] WARNING ziti/router/forwarder.(*Scanner).scan: {circuitId=[CErefPr-BX] ctrlId=[ziti-controller] idleTime=[1m16.14s] idleThreshold=[1m0s]} circuit exceeds idle threshold
[4160.937]   ERROR ziti/router/xgress_edge.(*edgeClientConn).sendStateClosedReply: {edgeSeq=[0] error=[channel closed] connId=[4] type=[EdgeStateClosedType] chSeq=[514]} failed to send state response
/var/openziti/scripts/run-router.sh: line 75:    15 Killed                  "${ZITI_BIN_DIR}/ziti" router run "${ZITI_HOME}/${ZITI_ROUTER_NAME}.yaml"

I saw there was a log line [4160.269] ERROR channel/v2.(*heartbeater).sendHeartbeat: {error=[context deadline exceeded] channelId=[ch{ctrl}->u{reconnecting}->i{ZRXa}]} pulse failed to send heartbeat. Is it closely related to the router being killed?

Here is my docker-compose.yml for my router:

services:
  ziti-edge-router:
    user: "1000"
    image: "${ZITI_IMAGE}:${ZITI_VERSION}"
    env_file:
      - ./.env
    ports:
      - ${ZITI_INTERFACE:-0.0.0.0}:${ZITI_ROUTER_PORT:-3022}:${ZITI_ROUTER_PORT:-3022}
      - ${ZITI_INTERFACE:-0.0.0.0}:${ZITI_ROUTER_WSS_PORT:-3023}:${ZITI_ROUTER_WSS_PORT:-3023}
      - ${ZITI_INTERFACE:-0.0.0.0}:${ZITI_ROUTER_LISTENER_BIND_PORT:-10080}:${ZITI_ROUTER_LISTENER_BIND_PORT:-10080}
    environment:
      - ZITI_CTRL_ADVERTISED_ADDRESS=${ZITI_CTRL_ADVERTISED_ADDRESS:-ziti-controller}
      - ZITI_CTRL_ADVERTISED_PORT=${ZITI_CTRL_ADVERTISED_PORT:-6262}
      - ZITI_CTRL_EDGE_ADVERTISED_ADDRESS=${ZITI_CTRL_EDGE_ADVERTISED_ADDRESS:-ziti-edge-controller}
      - ZITI_CTRL_EDGE_ADVERTISED_PORT=${ZITI_CTRL_EDGE_ADVERTISED_PORT:-1280}
      - ZITI_ROUTER_NAME=${ZITI_ROUTER_NAME:-ziti-edge-router}
      - ZITI_ROUTER_ADVERTISED_ADDRESS=${ZITI_ROUTER_ADVERTISED_ADDRESS:-ziti-edge-router}
      - ZITI_ROUTER_PORT=${ZITI_ROUTER_PORT:-3022}
      - ZITI_ROUTER_LISTENER_BIND_PORT=${ZITI_ROUTER_LISTENER_BIND_PORT:-10080}
      - ZITI_ROUTER_ROLES=public
      - ZITI_ROUTER_IP_OVERRIDE=${EXTERNAL_IP}
      - NODE_EXTRA_CA_CERTS=${NODE_EXTRA_CA_CERTS:-}
      - ZITI_PKI_ALT_SERVER_CERT=${ZITI_PKI_ALT_SERVER_CERT:-}
      - ZITI_PKI_ALT_SERVER_KEY=${ZITI_PKI_ALT_SERVER_KEY:-}
    networks:
      - ziti
    volumes:
      - ./ziti-fs/:/persistent/
      - ./dev_certs/:/etc/certs/
      - ./ziti-bins/:/openziti/ziti-bin
    entrypoint: /bin/bash
    command: "/var/openziti/scripts/run-router.sh edge"
networks:
  ziti:

#volumes:
#  ziti-fs:

I'm wondering if there is anything I can do to prevent this situation. Is it possible to configure the heartbeat timeout?

I saw these logs from the controller:

[89094.433]   ERROR ziti/controller/handler_ctrl.(*heartbeatCallback).CheckHeartBeat: {channelType=[router] channelId=[.5eXAZqVrh]} heartbeat not received in time, closing control channel connection
[89094.434]    INFO ziti/controller/handler_ctrl.(*xctrlCloseHandler).HandleClose [ch{.5eXAZqVrh}->u{classic}->i{aVNa}]: closing Xctrl instances
[89094.434] WARNING ziti/controller/handler_ctrl.(*closeHandler).HandleClose: {routerId=[.5eXAZqVrh]} disconnected
[89094.434]    INFO ziti/controller/network.(*routerChangedEvent).handle: {routerId=[.5eXAZqVrh] connected=[false]} calculating router updates for router
[89094.434]    INFO ziti/controller/env.(*Broker).RouterDisconnected.func1: {routerName=[router2] routerFingerprint=[0xc0065c6d60] routerId=[.5eXAZqVrh]} broker detected router with id .5eXAZqVrh disconnecting
[89094.434]    INFO ziti/controller/sync_strats.(*InstantStrategy).RouterDisconnected: {sync_strategy=[instant] routerId=[.5eXAZqVrh] routerName=[router2] routerFingerprint=[121b433e8af8b9f8c4e6f6a358ddb8ca85c3325f]} edge router [.5eXAZqVrh] disconnect event, router rtx removed

Thanks!

Regarding the performance results of separating the controller and router, I found an interesting symptom.

If I restart the router before each test round, I can easily get a relatively higher RPS (almost ten times higher) than without restarting the router.

Here are my result CSVs:

  1. Without restarting the router before each round (sleep 20s between rounds, Client.Timeout set to 30s)
subseqreqs,round,vus_max,success_rate,checks_pass,checks_fails,http_reqs,recv_avg,recv_med,recv_p90,router_cpu_usage%,router_mem_usage%,controller_cpu_usage%,controller_mem_usage%
1000,1,200,2912.1833333333334,524193,1107,525300,0.01997380120883363,0.017289,0.03476910000000004,81.73888888888743,28.278586175669517,23.633333333334832,3.285104235341707
1000,2,200,1264.388888888889,227590,1085,228675,0.023423838854268495,0.019292,0.033446,83.55555555555435,70.79598548208621,22.988888888888454,3.2806255281422336
1000,3,200,20631.722222222223,3713710,0,3713710,0.012002000709532605,0.007832,0.025244,85.28300628903382,6.009228731519823,27.26388888888728,4.338284759014388
1000,4,200,4269.322222222222,768478,986,769464,0.01907823072424435,0.016734,0.034548,81.69444444443798,21.68988339219917,3.3694444444457474,3.331525760509535
1000,5,200,1625.1833333333334,292533,1360,293893,0.022204870139812875,0.020581,0.034079,83.00517789285209,61.92495597068538,2.517278161730772,3.265047814629185
1000,6,200,1577,283860,1310,285170,0.023525204597258337,0.021283,0.035495,82.70000000000033,95.1559441852859,2.568831803731058,3.285443346186956
1000,9,200,5146.155555555555,926308,921,927229,0.017954461989432867,0.013958,0.03332320000000007,80.92820159552721,16.803491461231424,3.2805555555508903,3.371754693935658
1000,7,200,12357.33888888889,2224321,345,2224667,0.014231163109804488,0.009247,0.029168,83.95519900441974,7.335058466447453,26.263888888893746,4.372681596912686
1000,8,200,3518.822222222222,633388,1027,634415,0.02071548012263244,0.019113,0.035976,81.5444444444438,21.011809781539238,3.9472222222280635,3.394074908669198
1000,10,200,1625.3055555555557,292555,1360,293916,0.02244683817825517,0.0207485,0.034423,82.5781649296679,52.08347135465474,2.5533900753301424,3.3005444714850563

Set Client.Timeout to 180s (this helps reduce request errors and the router being killed):

subseqreqs,round,vus_max,success_rate,checks_pass,checks_fails,http_reqs,recv_avg,recv_med,recv_p90,router_cpu_usage%,router_mem_usage%,controller_cpu_usage%,controller_mem_usage%
1000,11,200,17521.03888888889,3153787,0,3153787,0.012693815892131963,0.007965,0.026576,89.144444444448,8.558243437840407,27.305555555548295,4.343673871906528
1000,12,200,3297.461111111111,593543,0,593543,0.018924504687950396,0.015005,0.03450280000000005,80.06622369385727,22.1718846319898,2.655555555563012,3.3605548708843136
1000,13,200,472.2888888888889,85012,0,85012,0.02125048080271018,0.019427,0.03313790000000001,77.44444444445739,35.85280429360627,2.258333333334633,3.3095049409366695
1000,14,200,372.22222222222223,67000,0,67000,0.019862711238806043,0.018182,0.031417,72.47777777777326,51.27760744544268,2.2833333333296224,3.3083776265051434
  2. With restarting the router before each round (Client.Timeout set to 180s)
subseqreqs,round,vus_max,success_rate,checks_pass,checks_fails,http_reqs,recv_avg,recv_med,recv_p90,router_cpu_usage%,router_mem_usage%,controller_cpu_usage%,controller_mem_usage%
1000,15,200,14860.594444444445,2674907,0,2674907,0.011936543716473885,0.007559,0.025392,81.7448501144419,9.398568551629493,26.1321970622854,4.391204381189917
1000,16,200,16427.62777777778,2956973,0,2956973,0.011946431934278499,0.007427,0.025549,87.1888888888889,9.348776205477193,27.551167803729598,4.410933911267834
1000,17,200,5425.9,976662,0,976662,0.015532077946106108,0.009869,0.030533,81.70514900331122,17.54594355233756,3.7916666666610155,3.4215642725033635
1000,18,200,20106.977777777778,3619256,0,3619256,0.011362545774048133,0.007388,0.024219,80.6277777777776,9.086195002762064,26.290526877184377,4.386090223037176
1000,19,200,9662.822222222223,1739308,0,1739308,0.013278769226037832,0.008183,0.027637,80.59401320029318,10.288889195374896,21.75381675148128,4.131818190150138
1000,20,200,16734.26111111111,3012167,0,3012167,0.011419567380893336,0.007247,0.024619,85.05522344940968,8.863530181907386,27.232172618381895,4.389075009485943

Also, whenever I got a higher RPS, the controller's CPU usage was higher too.
So I started to wonder: for the cases with lower RPS and lower controller CPU usage, does that mean the controller was waiting on something? Could it be the controller's db writes? But then how come restarting the router helps?

Best Regards,
Rick Wang

Hi Rick,
Interesting data!

The main thing I've seen kill routers during perf testing is the OOM (out-of-memory) killer on linux systems. If this is the culprit then there should be some messages in the system log.

If you've got time to grab some numbers, I'd be curious if we can find what's causing the performance drop after the first run. First thing to look into would be how the overall memory sits after a run. Does it keep going up (is something being leaked)? Does it eventually return to a stable point? How close are you getting to the capacity of the machine/VM? Is it maybe spending a lot of time doing GC?

Second thing would be to grab a few stackdumps during each run, to get an idea of what's running. Grabbing cpu and memory pprofs could also be very helpful.

With the heartbeats, I'm wondering if the control channel was backed up. It would also be interesting to get a stackdump of the controller while it was in that state, so we could see if it was not keeping up with traffic from the router for some reason.

Cheers,
Paul


Hi Paul,

Thank you for your advice!
It would take me some time to generate those stack dumps.

Moreover, you are right about the OOM!
On my router's PC, I found multiple OOM logs like this:
Jun 24 18:44:26 openziti-router kernel: [24729.387703] Out of memory: Killed process 4593 (ziti)

Jun 24 18:44:26 openziti-router kernel: [24729.387460] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Jun 24 18:44:26 openziti-router kernel: [24729.387666] [   4593]  1000  4593  4760029  3890982 34267136   337313             0 ziti

I have provided the entire OOM log at this link (bound to your email) if you are interested in seeing it.

For further investigation, I will need more time to gather data, because I need to do other POCs for my OpenZiti application first.

Thanks!

Best Regards,
Rick