Extremely high resource usage

I opened my Ziti Desktop Edge the other day and noticed that identities were missing from my list. Furthermore, none of my services tied to my main company’s controller were accessible.

I RDP’d into the servers that host my services and noticed that ziti-tunnel.exe was using an extremely high amount of memory and CPU. It was contending for system memory with a heavily used SQL Server instance. That made me look at the servers that host the controller and router, and those were also using an above-average amount of memory and CPU.

To figure out what was going on, I did the normal IT routine of restarting everything and looking at the logs. The tunnel and router logs had errors about certificates being expired or not yet valid.

I decided to make sure everything was up to date, so I updated all controllers, routers and tunnelers to the latest version. As I started to turn things back on, I got the following errors:

Controller:

wsarecv: An existing connection was forcibly closed by the remote host.

Router:

FATAL ziti/ziti-router/subcmd.run: {error=[error connecting ctrl (x509: certificate has expired or is not yet valid: current time 2022-06-07T08:47:30-04:00 is after 2022-06-03T16:37:16Z)]} error starting

Desktop Edge/Tunneler:

SDKe: uv-mbed:tls_link.c:161 TLS(0000020ca8134520) handshake error SSL - An invalid SSL record was received

All the cert issues drove me to go look at the certificates on the controller but those certs were all fine.

openssl x509 -enddate -noout -in ziti.actieve-root-ca.cert
notAfter=Jun 1 16:37:03 2031 GMT

So now I am in a locked-out state where none of my Ziti services are available, and I am not sure what to check next. Any suggestions would be greatly appreciated!

I am DM-ing the logs to @plorenz to see if he sees what is going on.

Things to check:

  • The controller’s server certificate
  • The edge router’s client certificate

If either of those is invalid (expired), that would cause problems. Looking at the error messages, I would check the controller’s server cert first.
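A minimal sketch of how those two checks could be done with the openssl binary. The function names cert_valid_for and show_remote_cert are made up for illustration, and the host:port and file path in the usage lines are placeholders, not values from this thread:

```shell
#!/bin/sh
# Hypothetical helpers for checking certificate expiry with openssl.

# Succeeds (exit 0) if the PEM certificate in $1 is still valid
# $2 seconds from now (default 0, i.e. "not expired right now").
cert_valid_for() {
    openssl x509 -checkend "${2:-0}" -noout -in "$1" >/dev/null
}

# Prints subject, issuer and validity dates of the certificate that the
# server at $1 (host:port) presents during the TLS handshake.
show_remote_cert() {
    openssl s_client -connect "$1" </dev/null 2>/dev/null \
        | openssl x509 -noout -subject -issuer -dates
}

# Example usage (placeholder values, adjust to your environment):
#   show_remote_cert "controller.example.com:1280"
#   cert_valid_for /path/to/router-client.cert || echo "router cert expired"
```

Note that the root CA can be valid for years while the server or client certificate issued from it has already expired, so each certificate file needs to be checked on its own.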

I ran openssl s_client -connect $controller, and it shows that my certificate is invalid. How can I renew the controller’s cert?

It depends on how you created the server certificate.

If you were using the Ziti CLI’s built-in PKI management, you should be able to use that again to generate a new server certificate via ziti pki create server.

For example, if you created the CA named myca that command would have been:

> ziti pki create ca myca

(defaulting it to ~/.ziti/environments/pki/myca/certs/myca.cert and …/keys/myca.key)

> ziti pki create server --ca-name myca --dns example.com

(outputting to ~/.ziti/environments/pki/myca/certs/server.cert and …/keys/server.key)

or if using an intermediate, you would have done:

> ziti pki create intermediate --ca-name myca --intermediate-file myintermediate

(outputting to ~/.ziti/environments/pki/myintermediate/certs/myintermediate.cert and …/keys/myintermediate.key)

then using the intermediate to create a server cert

> ziti pki create server --ca-name myintermediate --dns example.com

If you created it through another tool, or even with the openssl binary, I suggest looking for tutorials on those tools.

To finish this post off: @actieve and I, with some PKI help from @andrew.martinez, got things sorted. The problem was that @actieve’s install came from a very, very early quickstart, which had some bugs of its own. The first issue is that the PKI was generated a while back and then moved to a Windows machine. That’s totally fine; it just makes the setup “non-standard”. We powered through that by moving the PKI, along with the original .env file the quickstart produces, and making a new server certificate using the original key from the PKI.

We did that using a command similar to this (replacing the values as appropriate):

ziti pki create server --pki-root "/path/to/pki" \
      --ca-name "external-host-name-intermediate" \
      --server-file "external-host-name-server-${now}" \
      --dns "localhost,external-host-name" --ip "127.0.0.1,w.x.y.z" \
      --server-name "external-host-name server certificate - ${now}" \
      --key-file "external-host-name-server"

After that we were able to move this cert to the controller and the controller booted correctly.
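A small sketch of two things around that command. The timestamp format for now is an assumption (the thread does not say how it was set), and key_matches_cert is a made-up helper that checks whether a reissued certificate really belongs to the original key before it is copied to the controller:

```shell
#!/bin/sh
# Assumed definition of the ${now} suffix used in the command above;
# any unique, filename-safe string would do.
now=$(date +%Y%m%d-%H%M%S)

# Hypothetical helper: succeeds (exit 0) if the public key embedded in the
# certificate $1 equals the public key derived from the private key $2.
# Returns 2 if either file cannot be read by openssl.
key_matches_cert() {
    cert_pub=$(openssl x509 -noout -pubkey -in "$1" 2>/dev/null) || return 2
    key_pub=$(openssl pkey -pubout -in "$2" 2>/dev/null) || return 2
    [ "$cert_pub" = "$key_pub" ]
}

# Example usage (placeholder paths):
#   key_matches_cert /path/to/new-server.cert /path/to/server.key \
#       && echo "certificate and key belong together"
```

Comparing public keys works for both RSA and EC keys, which is why it is used here instead of comparing an RSA modulus.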

Since this config used the same certs for the controller and the edge controller, that was all we needed to do to fix the controller. However, the routers were still not connecting.

Routers were not connecting due to the age of the quickstart used. This was fixed by editing the controller config file and replacing the incorrect “CA” value in the top-level “identity” section.