Controller upgrade issue

I upgraded controller from v0.26.5 to v0.26.8 ( did export ZITI_VERSION=v0.26.8 and ran getZiti function ) and something went wrong. So I went back to v0.26.5 and I still got the same stack trace. Am I missing anything?

Here is the stack trace

$ "${ZITI_BIN_DIR-}/ziti-controller" run "${ZITI_HOME_OS_SPECIFIC}/${ZITI_EDGE_CONTROLLER_RAWNAME}.yaml"

*********************************************************************************

An update with v0.26.8 is available for ziti-controller v0.26.5 from
https://github.com/openziti/ziti/releases/latest/

*********************************************************************************
[   0.521]    INFO ziti/ziti-controller/subcmd.run: {arch=[amd64] build-date=[2022-08-17T14:01:21Z] revision=[e5d4b19e3568] version=[v0.26.5] go-version=[go1.18.5] os=[linux] nodeId=[athenz.experiments.openziti-controller]} starting ziti-controller
[   0.521]    INFO fabric/events.(*Dispatcher).eventLoop: event dispatcher: started
[   0.521]    INFO storage/boltz.(*migrationManager).Migrate.func1: fabric datastore is up to date at version 5
[   0.524]    INFO fabric/controller/network.(*Network).showOptions: network = {
  "CycleSeconds": 60,
  "Smart": {
    "RerouteFraction": 0.02,
    "RerouteCap": 4
  },
  "RouteTimeout": 10000000000,
  "CreateCircuitRetries": 2,
  "CtrlChanLatencyInterval": 10000000000,
  "PendingLinkTimeout": 10000000000,
  "MinRouterCost": 10,
  "RouterConnectChurnLimit": 60000000000,
  "InitialLinkLatency": 65000000000,
  "MetricsReportInterval": 60000000000
}
[   0.524]    INFO fabric/controller.(*Controller).showOptions: ctrl = {
  "OutQueueSize": 4,
  "MaxQueuedConnects": 1,
  "MaxOutstandingConnects": 16,
  "ConnectTimeout": 1000000000,
  "DelayRxStart": false,
  "WriteTimeout": 0,
  "NewListener": null
}
[   1.162]    INFO edge/controller/server.NewController: edge controller instance id: cl8mk84kr00006pjvktqp1aov
[   1.162]    INFO edge/controller/server.(*Controller).Initialize: initializing edge
[   1.166]    INFO storage/boltz.(*migrationManager).Migrate.func1: edge datastore is up to date at version 29
[   1.169]    INFO edge/controller/internal/policy.NewSessionEnforcer: {sessionTimeout=[30m0s] frequency=[5s]} session enforcer configured
[   1.169]    INFO edge/controller/server.(*Controller).Run: starting edge
[   1.173]    INFO fabric/metrics.GoroutinesPoolMetricsConfigF.func1.1: {maxWorkers=[16] idleTime=[10s] maxQueueSize=[1] poolType=[pool.listener.ctrl] minWorkers=[1]} starting goroutine pool
[   1.173]    INFO fabric/controller/handler_ctrl.(*CtrlAccepter).Run: started
[   1.174]    INFO fabric/controller/network.(*Network).Run: started
[   1.299]    INFO xweb/v2.(*Server).Start: starting ApiConfig to listen and serve tls on 0.0.0.0:8262 for server client-management with APIs: [edge-management edge-client fabric]
[   2.843]    INFO fabric/controller/handler_ctrl.(*bindHandler).BindChannel: {routerId=[HZSpDtw90] routerVersion=[v0.26.5]} router supports heartbeats
[   2.843]    INFO fabric/controller/handler_ctrl.(*CtrlAccepter).Bind: {routerId=[HZSpDtw90]} accepted new router connection [r/HZSpDtw90]
[   2.844] WARNING channel.(*channelImpl).rxer [ch{HZSpDtw90}->u{classic}->i{y5xY}]: dropped message. type [-33], sequence [-1], replyFor [-1]
[   2.852]    INFO edge/controller/env.(*Broker).RouterConnected.func1: {routerId=[HZSpDtw90] routerFingerprint=[0xc001d04480] routerName=[er1]} broker detected edge router with id HZSpDtw90 connecting
[   2.852]    INFO edge/controller/sync_strats.(*InstantStrategy).RouterConnected: {routerName=[er1] routerFingerprint=[0xc001d04480] sync_strategy=[instant] syncStatus=[SYNC_QUEUED] routerId=[HZSpDtw90]} edge router connected, adding to sync routerConnectedQueue
[   2.852]    INFO edge/controller/sync_strats.(*InstantStrategy).hello: {routerChannelIsOpen=[true] routerId=[HZSpDtw90] routerName=[er1] routerFingerprint=[0xc001d04480] routerTxId=[-ew0A777Y] strategy=[instant]} edge router sync starting
[   2.852]    INFO edge/controller/sync_strats.(*InstantStrategy).hello: {routerFingerprint=[0xc001d04480] routerTxId=[-ew0A777Y] strategy=[instant] routerChannelIsOpen=[true] routerId=[HZSpDtw90] routerName=[er1] syncStatus=[SYNC_HELLO]} sending edge router hello
[   2.853]    INFO edge/controller/sync_strats.(*InstantStrategy).ReceiveClientHello: {listeners=[[address:{value:"tls:0.0.0.0:8888"  protocol:"tls"  hostname:"0.0.0.0"  port:8888}  advertise:{value:"er1.example.com:8888"  protocol:"tls"  hostname:"er1.example.com"  port:8888}]] routerId=[HZSpDtw90] os=[linux] protocolPorts=[[8888]] buildDate=[2022-08-17T14:01:21Z] arch=[amd64] revision=[e5d4b19e3568] version=[v0.26.5] routerChannelIsOpen=[true] routerTxId=[-ew0A777Y] protocols=[[tls]] routerFingerprint=[0xc001d04480] strategy=[instant] routerName=[er1] data=[map[]]} edge router sent hello with version [v0.26.5] to controller with version [v0.26.5]
[   2.853]    INFO edge/controller/sync_strats.(*InstantStrategy).synchronize: {routerTxId=[-ew0A777Y] routerId=[HZSpDtw90] routerName=[er1] routerFingerprint=[0xc001d04480] routerChannelIsOpen=[true] strategy=[instant]} started synchronizing edge router
[   2.854]   ERROR fabric/controller/handler_ctrl.(*routerLinkHandler).HandleLinks [ch{HZSpDtw90}->u{classic}->i{y5xY}]: {routerId=[HZSpDtw90] linkId=[18QrDqqxAqYFkdPY6mJC2p] destRouterId=[lAJSZUj5SG] error=[destination router not connected]} unexpected error adding router reported link
[   2.854]   ERROR fabric/controller/handler_ctrl.(*routerLinkHandler).HandleLinks [ch{HZSpDtw90}->u{classic}->i{y5xY}]: {error=[destination router not connected] routerId=[HZSpDtw90] linkId=[2yXnn9clLvRIwoizTDYtNr] destRouterId=[Kp5Sm465SG]} unexpected error adding router reported link
[   2.870]    INFO fabric/controller/handler_ctrl.(*bindHandler).BindChannel: {routerId=[Kp5Sm465SG] routerVersion=[v0.26.5]} router supports heartbeats
[   2.870]    INFO fabric/controller/handler_ctrl.(*CtrlAccepter).Bind: {routerId=[Kp5Sm465SG]} accepted new router connection [r/Kp5Sm465SG]
[   2.871] WARNING channel.(*channelImpl).rxer [ch{Kp5Sm465SG}->u{classic}->i{vnpy}]: dropped message. type [-33], sequence [-1], replyFor [-1]
[   2.874]    INFO edge/controller/env.(*Broker).RouterConnected.func1: {routerFingerprint=[0xc00328d780] routerId=[Kp5Sm465SG] routerName=[er2]} broker detected edge router with id Kp5Sm465SG connecting
[   2.874]    INFO edge/controller/sync_strats.(*InstantStrategy).RouterConnected: {sync_strategy=[instant] syncStatus=[SYNC_QUEUED] routerId=[Kp5Sm465SG] routerName=[er2] routerFingerprint=[0xc00328d780]} edge router connected, adding to sync routerConnectedQueue
[   2.874]    INFO edge/controller/sync_strats.(*InstantStrategy).hello: {routerName=[er2] routerFingerprint=[0xc00328d780] routerTxId=[EeR00777p] routerChannelIsOpen=[true] strategy=[instant] routerId=[Kp5Sm465SG]} edge router sync starting
[   2.874]    INFO edge/controller/sync_strats.(*InstantStrategy).hello: {routerTxId=[EeR00777p] routerChannelIsOpen=[true] strategy=[instant] routerId=[Kp5Sm465SG] routerName=[er2] routerFingerprint=[0xc00328d780] syncStatus=[SYNC_HELLO]} sending edge router hello
[   2.883]    INFO edge/controller/sync_strats.(*InstantStrategy).ReceiveClientHello: {routerName=[er2] version=[v0.26.5] buildDate=[2022-08-17T14:01:21Z] arch=[amd64] revision=[e5d4b19e3568] routerTxId=[EeR00777p] protocols=[[tls]] routerChannelIsOpen=[true] data=[map[]] protocolPorts=[[8888]] listeners=[[address:{value:"tls:0.0.0.0:8888"  protocol:"tls"  hostname:"0.0.0.0"  port:8888}  advertise:{value:"er2.example.com:8888"  protocol:"tls"  hostname:"er2.example.com"  port:8888}]] strategy=[instant] routerId=[Kp5Sm465SG] routerFingerprint=[0xc00328d780] os=[linux]} edge router sent hello with version [v0.26.5] to controller with version [v0.26.5]
[   2.883]    INFO edge/controller/sync_strats.(*InstantStrategy).synchronize: {routerFingerprint=[0xc00328d780] routerName=[er2] strategy=[instant] routerTxId=[EeR00777p] routerId=[Kp5Sm465SG] routerChannelIsOpen=[true]} started synchronizing edge router
[   2.898]    INFO fabric/controller/handler_ctrl.(*routerLinkHandler).HandleLinks [ch{Kp5Sm465SG}->u{classic}->i{vnpy}]: {routerId=[Kp5Sm465SG] linkId=[2yXnn9clLvRIwoizTDYtNr] destRouterId=[HZSpDtw90]} router reported link added
[   2.898]   ERROR fabric/controller/handler_ctrl.(*routerLinkHandler).HandleLinks [ch{Kp5Sm465SG}->u{classic}->i{vnpy}]: {routerId=[Kp5Sm465SG] linkId=[swEnD8pHrXclpNdHNw5ZX] destRouterId=[lAJSZUj5SG] error=[destination router not connected]} unexpected error adding router reported link
[   2.920]    INFO edge/controller/sync_strats.(*InstantStrategy).synchronize.func1: {routerFingerprint=[0xc00328d780] routerTxId=[EeR00777p] routerId=[Kp5Sm465SG] routerName=[er2] routerChannelIsOpen=[true] strategy=[instant]} exiting synchronization, final status: SYNC_DONE
[   2.928]    INFO edge/controller/sync_strats.(*InstantStrategy).synchronize.func1: {routerId=[HZSpDtw90] routerTxId=[-ew0A777Y] strategy=[instant] routerName=[er1] routerFingerprint=[0xc001d04480] routerChannelIsOpen=[true]} exiting synchronization, final status: SYNC_DONE
[   2.940]   ERROR fabric/controller/api.(*timeoutHandler).ServeHTTP.func1.1: panic caught by timeout next: runtime error: invalid memory address or nil pointer dereference
goroutine 113 [running]:
github.com/openziti/foundation/v2/debugz.generateStack(0x2000, 0xed?)
	github.com/openziti/foundation/v2@v2.0.2/debugz/stack.go:38 +0x4a
github.com/openziti/foundation/v2/debugz.GenerateLocalStack(...)
	github.com/openziti/foundation/v2@v2.0.2/debugz/stack.go:33
github.com/openziti/fabric/controller/api.(*timeoutHandler).ServeHTTP.func1.1()
	github.com/openziti/fabric@v0.19.34/controller/api/timeouts.go:91 +0xfa
panic({0x2100160, 0x35af3d0})
	runtime/panic.go:838 +0x207
github.com/openziti/x509-claims/x509claims.(*ProviderBasic).Claims(...)
	github.com/openziti/x509-claims@v1.0.2/x509claims/cliam.go:71
github.com/openziti/edge/controller/model.(*Ca).GetExternalId(0xc0030d21a0, 0xc002aa3488?)
	github.com/openziti/edge@v0.22.54/controller/model/ca_model.go:246 +0x448
github.com/openziti/edge/controller/model.(*AuthModuleCert).Process(0xc001c68700, {0x281fdd8, 0xc00236cd00})
	github.com/openziti/edge@v0.22.54/controller/model/authenticator_mod_cert.go:152 +0x592
github.com/openziti/edge/controller/model.(*AuthenticatorManager).Authorize(0xc001afb800?, {0x281fdd8, 0xc00236cd00})
	github.com/openziti/edge@v0.22.54/controller/model/authenticator_manager.go:81 +0x82
github.com/openziti/edge/controller/internal/routes.(*AuthRouter).authHandler(0xc0004ae570, 0xc001c286c0, 0xc002b92540, 0x1bc0c74?, {0xc002968f69, 0x4}, 0xc002eea910)
	github.com/openziti/edge@v0.22.54/controller/internal/routes/authenticate_router.go:82 +0x17c
github.com/openziti/edge/controller/internal/routes.(*AuthRouter).Register.func1.1(0x98?, 0x280fd40?)
	github.com/openziti/edge@v0.22.54/controller/internal/routes/authenticate_router.go:58 +0x36
github.com/openziti/edge/controller/env.(*AppEnv).IsAllowed.func1({0x281d040, 0xc0020ccfc0}, {0x280fd40, 0x26395f8})
	github.com/openziti/edge@v0.22.54/controller/env/appenv.go:596 +0x38d
github.com/go-openapi/runtime/middleware.ResponderFunc.WriteResponse(0x20e5120?, {0x281d040?, 0xc0020ccfc0?}, {0x280fd40?, 0x26395f8?})
	github.com/go-openapi/runtime@v0.24.1/middleware/context.go:69 +0x3d
github.com/go-openapi/runtime/middleware.(*Context).Respond(0xc0017a93b0, {0x281d040?, 0xc0020ccfc0}, 0xc001afb800, {0xc0017356b0?, 0x1, 0x1}, 0xc001afb700, {0x211da60, 0xc0020cd140})
	github.com/go-openapi/runtime@v0.24.1/middleware/context.go:510 +0x59a
github.com/openziti/edge/rest_client_api_server/operations/authentication.(*Authenticate).ServeHTTP(0xc001381590, {0x281d040, 0xc0020ccfc0}, 0xc001afb800)
	github.com/openziti/edge@v0.22.54/rest_client_api_server/operations/authentication/authenticate.go:81 +0x1bf
github.com/go-openapi/runtime/middleware.NewOperationExecutor.func1({0x281d040, 0xc0020ccfc0}, 0xc001afb800)
	github.com/go-openapi/runtime@v0.24.1/middleware/operation.go:28 +0x59
net/http.HandlerFunc.ServeHTTP(0xc002d8fa48?, {0x281d040?, 0xc0020ccfc0?}, 0x0?)
	net/http/server.go:2084 +0x2f
github.com/go-openapi/runtime/middleware.NewRouter.func1({0x281d040, 0xc0020ccfc0}, 0xc001afb600)
	github.com/go-openapi/runtime@v0.24.1/middleware/router.go:78 +0x257
net/http.HandlerFunc.ServeHTTP(0xc002e8c468?, {0x281d040?, 0xc0020ccfc0?}, 0xc002d8fbc8?)
	net/http/server.go:2084 +0x2f
github.com/go-openapi/runtime/middleware.Redoc.func1({0x281d040, 0xc0020ccfc0}, 0xa50000c001c75f90?)
	github.com/go-openapi/runtime@v0.24.1/middleware/redoc.go:72 +0x242
net/http.HandlerFunc.ServeHTTP(0xc002d8fc90?, {0x281d040?, 0xc0020ccfc0?}, 0x201bc00?)
	net/http/server.go:2084 +0x2f
github.com/go-openapi/runtime/middleware.Spec.func1({0x281d040, 0xc0020ccfc0}, 0xc000a00c60?)
	github.com/go-openapi/runtime@v0.24.1/middleware/spec.go:46 +0x18c
net/http.HandlerFunc.ServeHTTP(0xc002b92540?, {0x281d040?, 0xc0020ccfc0?}, 0xc001afb600?)
	net/http/server.go:2084 +0x2f
github.com/openziti/edge/controller/server.ClientApiHandler.newHandler.func1({0x281d040, 0xc0020ccfc0}, 0xc001afb600)
	github.com/openziti/edge@v0.22.54/controller/server/client-api.go:193 +0x330
net/http.HandlerFunc.ServeHTTP(0xc000a00b70?, {0x281d040?, 0xc0020ccfc0?}, 0xb?)
	net/http/server.go:2084 +0x2f
github.com/gorilla/handlers.(*cors).ServeHTTP(0xc001db8bd0, {0x281d040, 0xc0020ccfc0}, 0xc001afb600)
	github.com/gorilla/handlers@v1.5.1/cors.go:54 +0x370
github.com/openziti/fabric/controller/api.(*timeoutHandler).ServeHTTP.func1()
	github.com/openziti/fabric@v0.19.34/controller/api/timeouts.go:95 +0x7c
created by github.com/openziti/fabric/controller/api.(*timeoutHandler).ServeHTTP
	github.com/openziti/fabric@v0.19.34/controller/api/timeouts.go:88 +0x430

[   3.093]    INFO fabric/controller/network.(*Network).Run: {routerId=[HZSpDtw90]} changed router
[   3.121]    INFO fabric/controller/network.(*Network).Run: {routerId=[Kp5Sm465SG]} changed router
[   3.782] WARNING fabric/controller/handler_ctrl.(*CtrlAccepter).Bind: {routerId=[lAJSZUj5SG]} no advertised listeners
[   3.782]    INFO fabric/controller/handler_ctrl.(*bindHandler).BindChannel: {routerVersion=[v0.26.5] routerId=[lAJSZUj5SG]} router supports heartbeats
[   3.782]    INFO fabric/controller/handler_ctrl.(*CtrlAccepter).Bind: {routerId=[lAJSZUj5SG]} accepted new router connection [r/lAJSZUj5SG]
[   3.783]    INFO edge/controller/env.(*Broker).RouterConnected.func1: {routerId=[lAJSZUj5SG] routerName=[corp-router1] routerFingerprint=[0xc0029f0d50]} broker detected edge router with id lAJSZUj5SG connecting
[   3.783]    INFO edge/controller/sync_strats.(*InstantStrategy).RouterConnected: {syncStatus=[SYNC_QUEUED] routerId=[lAJSZUj5SG] routerName=[corp-router1] routerFingerprint=[0xc0029f0d50] sync_strategy=[instant]} edge router connected, adding to sync routerConnectedQueue
[   3.783]    INFO edge/controller/sync_strats.(*InstantStrategy).hello: {routerChannelIsOpen=[true] routerTxId=[6EW0A77bY] routerId=[lAJSZUj5SG] routerName=[corp-router1] strategy=[instant] routerFingerprint=[0xc0029f0d50]} edge router sync starting
[   3.783]    INFO edge/controller/sync_strats.(*InstantStrategy).hello: {routerId=[lAJSZUj5SG] routerName=[corp-router1] syncStatus=[SYNC_HELLO] strategy=[instant] routerFingerprint=[0xc0029f0d50] routerChannelIsOpen=[true] routerTxId=[6EW0A77bY]} sending edge router hello
[   3.793] WARNING channel.(*channelImpl).rxer [ch{lAJSZUj5SG}->u{classic}->i{vExM}]: dropped message. type [-33], sequence [-1], replyFor [-1]
[   3.804]    INFO fabric/controller/handler_ctrl.(*routerLinkHandler).HandleLinks [ch{lAJSZUj5SG}->u{classic}->i{vExM}]: {destRouterId=[HZSpDtw90] routerId=[lAJSZUj5SG] linkId=[18QrDqqxAqYFkdPY6mJC2p]} router reported link added
[   3.804]    INFO fabric/controller/handler_ctrl.(*routerLinkHandler).HandleLinks [ch{lAJSZUj5SG}->u{classic}->i{vExM}]: {linkId=[swEnD8pHrXclpNdHNw5ZX] destRouterId=[Kp5Sm465SG] routerId=[lAJSZUj5SG]} router reported link added
[   3.804]    INFO edge/controller/sync_strats.(*InstantStrategy).ReceiveClientHello: {routerTxId=[6EW0A77bY] routerChannelIsOpen=[true] buildDate=[2022-08-17T14:01:21Z] version=[v0.26.5] routerName=[corp-router1] protocolPorts=[[443]] strategy=[instant] routerFingerprint=[0xc0029f0d50] os=[linux] arch=[amd64] protocols=[[tls]] listeners=[[address:{value:"tls:0.0.0.0:443"  protocol:"tls"  hostname:"0.0.0.0"  port:443}  advertise:{value:"er1:443"  protocol:"tls"  hostname:"er1"  port:443}]] revision=[e5d4b19e3568] routerId=[lAJSZUj5SG] data=[map[]]} edge router sent hello with version [v0.26.5] to controller with version [v0.26.5]
[   3.804]    INFO edge/controller/sync_strats.(*InstantStrategy).synchronize: {routerTxId=[6EW0A77bY] routerId=[lAJSZUj5SG] routerName=[corp-router1] routerFingerprint=[0xc0029f0d50] routerChannelIsOpen=[true] strategy=[instant]} started synchronizing edge router
[   3.820]    INFO edge/controller/sync_strats.(*InstantStrategy).synchronize.func1: {routerFingerprint=[0xc0029f0d50] routerId=[lAJSZUj5SG] routerChannelIsOpen=[true] strategy=[instant] routerName=[corp-router1] routerTxId=[6EW0A77bY]} exiting synchronization, final status: SYNC_DONE
[   4.033]    INFO fabric/controller/network.(*Network).Run: {routerId=[lAJSZUj5SG]} changed router
[   5.266]   ERROR fabric/controller/api.(*timeoutHandler).ServeHTTP.func1.1: panic caught by timeout next: runtime error: invalid memory address or nil pointer dereference

Up on closer look, upgrade might be a red herring and I am wondering if something went wrong with CA ( considering the error wrt externalId ) and went unnoticed until I tried the upgrade and watched the logs.

P.S couldn’t help but notice the typo in the name cliam.go :slight_smile:

Before upgrading, you probably should backup the database. If there’s anything that gets migrated, schema wise, you’ll want that undone when reverting back. That might be an issue?

I see the panic in your logs, but nothing clicks with me immediately.

In the past there have indeed been bugs around certificates expiring and a restart causing an issue but I haven’t seen one for a while. Could be related.

I’ll ping a couple others and see if this is a known issue or not.

I created this issue for you in the ziti repo for OpenZiti. I’ll be taking a look at it this morning.

1 Like

PRs for the issue are moving up to the ziti repo now with fixes and new tests.

Thanks for the report!

1 Like

Great. As always, appreciate the quick response!