Unused circuit timeouts / removals?

Hi, is there a way to remove old circuits automatically?

I have a keycloak server with ZET installed on the same host, and it talks via ziti routers to an internal AD/LDAP server.
The problem is that it keeps opening new circuits and never closes the old ones. If I restart ZET on the keycloak server, all circuits are closed/removed. But if I go to the keycloak admin page and press "Test connection" on LDAP 10 times, I get 10 more circuits.

So is there a way to remove unused circuits per service?

None of my other services generate circuits like this keycloak one does.

Topology below: a single controller, 2x public routers, and 2x internal routers, all on version 1.6.7. ZET is on version v1.7.11.

Hrmmm. I would think that's unexpected. You're certain nothing is holding it open, right?

Just tested with the following:

  1. turned off the keycloak service
  2. waited 30 min
  3. checked circuits: still a couple hundred circuits left open
  4. turned off ZET on the keycloak server
  5. checked circuits: all circuits were closed/disappeared from the list

So I am pretty sure that nothing is keeping them open.

Is there a way to (force) close a circuit after X minutes?

We'll probably want @plorenz to have a look at this. Is there any way you could provide the full set of steps to reproduce the issue? I know you've provided some of them here, but it would help to know what uses the keycloak service, how, etc. A minimal example to reproduce would be much appreciated.

If I run ziti-edge-tunnel dump (on the keycloak server), is idle_time in seconds, measured from the last traffic event?

And does state[CloseWrite] mean the connection is waiting to close?

.....
Connections:
conn[8163/vle-QedV]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[9] idle_time[46661] sent[0] recv[0] recv_buff[0]
conn[8086/02CUi9ta]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[140758] sent[0] recv[0] recv_buff[0]
conn[8079/gV7vbnZq]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[11] idle_time[149891] sent[0] recv[0] recv_buff[0]
conn[8054/_Ok2GudB]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[194349] sent[590] recv[44] recv_buff[0]
conn[8042/4m-dM9m1]: state[Connected] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[185110] sent[1110] recv[1567] recv_buff[0]
conn[7654/bm8Rigbk]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[794345] sent[590] recv[44] recv_buff[0]
conn[7243/wEoQr-_N]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[1394304] sent[590] recv[44] recv_buff[0]
conn[6846/XYXQ_Nf2]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[11] idle_time[1994345] sent[590] recv[44] recv_buff[0]
conn[6434/bp9_0w3H]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[2594345] sent[590] recv[44] recv_buff[0]
conn[6022/_4fYJBjK]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[38] idle_time[3194272] sent[590] recv[44] recv_buff[0]
conn[5625/0i0QxYWB]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[3794343] sent[590] recv[44] recv_buff[0]
conn[5226/47XMrt2U]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[11] idle_time[4394342] sent[590] recv[44] recv_buff[0]
conn[4815/kFPaLrjf]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[4994345] sent[590] recv[44] recv_buff[0]
conn[4007/R6OqL5xE]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[0/ozrb2]
    connect_time[41] idle_time[6194203] sent[590] recv[44] recv_buff[0]
conn[3595/BEGXcjYC]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[11] idle_time[6794320] sent[590] recv[44] recv_buff[0]
conn[3198/3jPkGX7I]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[7394346] sent[590] recv[44] recv_buff[0]
conn[2799/cLRN1jxO]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[7994348] sent[590] recv[44] recv_buff[0]
conn[2388/fx1cBx9T]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[8594316] sent[590] recv[44] recv_buff[0]
conn[1991/655tRh5T]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[9194347] sent[590] recv[44] recv_buff[0]
conn[1579/eN3JCk6V]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[11] idle_time[9794349] sent[590] recv[44] recv_buff[0]
conn[1167/8X6lg4oj]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[11] idle_time[10394352] sent[590] recv[44] recv_buff[0]
conn[1153/dQLzjnHh]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[11] idle_time[10414459] sent[0] recv[0] recv_buff[0]
conn[769/JJoYh9re]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[11] idle_time[10994345] sent[590] recv[44] recv_buff[0]
conn[720/4rGn6TwI]: state[Connected] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[11068093] sent[471] recv[553] recv_buff[0]
conn[42/3RS9Jh25]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[10] idle_time[12053294] sent[0] recv[0] recv_buff[0]
conn[41/9GcpQG0p]: state[CloseWrite] service[srv-id-keycoak-ad] using ch[1/ozrb3]
    connect_time[11] idle_time[12194350] sent[590] recv[44] recv_buff[0]
conn[40/fMLA0EIu]: state[Connected] service[wazuh-agents] using ch[1/ozrb3]
    connect_time[148] idle_time[1175] sent[3048202] recv[110285] recv_buff[0]
conn[1]: server service[zabbix-id-keycloak] terminators[2]
    binding[ozrb2]
    binding[ozrb3]

==================
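For what it's worth, the numbers in that dump look consistent with idle_time being milliseconds rather than seconds (an assumption on my part, not verified against the ZET source): read as seconds, the oldest entry would have been idle for ~141 days, far longer than the uptime, while read as milliseconds the CloseWrite entries sit roughly 10 minutes apart, which looks like a periodic keycloak health check leaving one stuck connection per run.

```python
# idle_time values copied from the dump above, interpreted as milliseconds
# (assumption -- if they were seconds, the oldest would be ~141 days idle).
idle_a = 6794320   # conn[3595/BEGXcjYC], older
idle_b = 6194203   # conn[4007/R6OqL5xE], next newer

gap_minutes = (idle_a - idle_b) / 1000 / 60
oldest_hours = 12194350 / 1000 / 3600   # conn[41/9GcpQG0p]

print(f"gap between consecutive conns: ~{gap_minutes:.1f} min")  # ~10.0 min
print(f"oldest idle entry: ~{oldest_hours:.1f} h")               # ~3.4 h
```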

I need to test on my dev environment to see if I can find steps to reproduce it…

I think state CloseWrite in the ZET means that the write side of the connection has been closed, but the read side is still open.
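In plain TCP terms this is a half-close, the same thing socket shutdown does. A minimal illustration with ordinary sockets (an analogy only, not ZET internals):

```python
import socket

# CloseWrite is analogous to a TCP half-close: one side shuts down its
# write half, the peer reads EOF, but the reverse direction stays open.
a, b = socket.socketpair()
a.shutdown(socket.SHUT_WR)    # 'a' is done writing...
assert b.recv(16) == b""      # ...so 'b' immediately reads EOF,
b.sendall(b"still open")      # but b -> a still flows
print(a.recv(16))             # b'still open'
a.close()
b.close()
```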

Do you see connections in the ZET dump for each circuit that you would expect to be closed?

Could you run ziti fabric inspect circuit <circuit id> for a circuit id that you think should be closed?

Thank you,
Paul

I cannot find matching data between the ZET dump and the circuit list :thinking:

I inspected one of the older circuits (below) from the circuits list, and one suspicious thing was that
the Initiator showed timeSinceLastRetx: 33h29m43.902s
while the Terminator showed timeSinceLastRetx: 488024h40m40.675s :face_with_monocle:
considering that all servers (including all ziti components) were rebooted two weeks ago…
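A quick back-of-the-envelope check on that terminator value. My assumption (not confirmed ziti behavior): a duration this size usually means the "last retransmit" timestamp was simply never set, so the duration is measured from a zero timestamp rather than from a real event.

```python
# timeSinceLastRetx on the terminator: 488024h40m40.675s
hours = 488024 + 40 / 60 + 40.675 / 3600
years = hours / (24 * 365.25)
print(f"~{years:.1f} years")
# ~55.7 years -- roughly the span from the Unix epoch (1970) to now,
# which fits a never-set timestamp far better than a retransmit
# that actually happened 55 years ago.
```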

The versions I use are:

  • keycloak server: ZET v1.7.11 (Ubuntu 22.04)
  • routers & controller: 1.6.7 (Ubuntu 24.04 & Debian 13)
  • AD server: ZDEW 2.7.1.8 (Windows Server 2025), this is now in testing…
timo@TIMO-P14s:~$ ziti fabric inspect circuit 11go7kprdMOUaKJYzWqfB
Results: (2)
YjU85FQMO.circuit:11go7kprdMOUaKJYzWqfB
circuitId: 11go7kprdMOUaKJYzWqfB
errors: null
forwards:
  3QGizW5vJYFnpFG6DqaTpS: 4KHHGvOIsUtldlVacy52jq
  4KHHGvOIsUtldlVacy52jq: 3QGizW5vJYFnpFG6DqaTpS
relatedEntities:
  link:
    4KHHGvOIsUtldlVacy52jq:
      connStateIteration: 4
      connections:
      - Dest: tcp:xxx:36508
        Source: tcp:xxx:80
        Type: link.default
      - Dest: tcp:xxx:36514
        Source: tcp:xxx:80
        Type: link.default
      - Dest: tcp:xxx:36518
        Source: tcp:xxx:80
        Type: link.ack
      - Dest: tcp:xxx:36530
        Source: tcp:xxx:80
        Type: link.default
      dest: ePZkDbQx74
      destVersion: v1.6.7
      dialAddress: tls:ozrb3.domain.com:80
      dialed: false
      id: 4KHHGvOIsUtldlVacy52jq
      iteration: 1
      key: default->tls:ePZkDbQx74->default
      protocol: tls
      split: false
      underlays:
        link.ack: 1
        link.default: 3
xgressDetails:
  3QGizW5vJYFnpFG6DqaTpS:
    address: 3QGizW5vJYFnpFG6DqaTpS
    flags: "10100"
    goroutines: null
    lastSizeSent: 0
    linkSendBufferPointer: "0xc000c90b00"
    originator: Initiator
    recvBufferDetail:
      acquiredSafely: true
      maxSequence: 3
      nextPayload: none
      payloadCount: 0
      sequence: 3
      size: 0
    sendBufferDetail:
      accumulator: 665
      acquiredSafely: true
      blockedByLocalWindow: false
      blockedByRemoteWindow: false
      closeWhenEmpty: false
      closed: false
      duplicateAcks: 0
      linkRecvBufferSize: 0
      linkSendBufferSize: 0
      queuedPayloadCount: 0
      retransmits: 0
      retxScale: 1.5
      retxThreshold: 10
      successfulAcks: 6
      timeSinceLastRetx: 33h29m43.902s
      windowSize: 4194304
    sequence: 6
    timeSinceLastLinkRx: 33h29m43.872s
    xgressPointer: "0xc002b982a0"

ePZkDbQx74.circuit:11go7kprdMOUaKJYzWqfB
circuitId: 11go7kprdMOUaKJYzWqfB
errors: null
forwards:
  4KHHGvOIsUtldlVacy52jq: LCcmjXEY3UBOlCuwDUl4k
  LCcmjXEY3UBOlCuwDUl4k: 4KHHGvOIsUtldlVacy52jq
relatedEntities:
  link:
    4KHHGvOIsUtldlVacy52jq:
      connStateIteration: 4
      connections:
      - Dest: tcp:xxx:80
        Source: tcp:192.168.110.2:36508
        Type: link.default
      - Dest: tcp:xxx:80
        Source: tcp:192.168.110.2:36514
        Type: link.default
      - Dest: tcp:xxx:80
        Source: tcp:192.168.110.2:36518
        Type: link.ack
      - Dest: tcp:xxx:80
        Source: tcp:192.168.110.2:36530
        Type: link.default
      dest: YjU85FQMO
      destVersion: v1.6.7
      dialAddress: tls:ozrb3.domain.com:80
      dialed: true
      id: 4KHHGvOIsUtldlVacy52jq
      iteration: 1
      key: default->tls:YjU85FQMO->default
      protocol: tls
      split: false
      underlays:
        link.ack: 1
        link.default: 3
xgressDetails:
  LCcmjXEY3UBOlCuwDUl4k:
    address: LCcmjXEY3UBOlCuwDUl4k
    flags: "10"
    goroutines: null
    lastSizeSent: 0
    linkSendBufferPointer: "0xc001dd7550"
    originator: Terminator
    recvBufferDetail:
      acquiredSafely: true
      maxSequence: 5
      nextPayload: none
      payloadCount: 0
      sequence: 5
      size: 0
    sendBufferDetail:
      accumulator: 102
      acquiredSafely: false
      blockedByLocalWindow: false
      blockedByRemoteWindow: false
      closeWhenEmpty: true
      closed: true
      duplicateAcks: 0
      linkRecvBufferSize: 0
      linkSendBufferSize: 0
      queuedPayloadCount: 0
      retransmits: 0
      retxScale: 1.5
      retxThreshold: 10
      successfulAcks: 4
      timeSinceLastRetx: 488024h40m40.675s
      windowSize: 4194304
    sequence: 4
    timeSinceLastLinkRx: 33h29m43.839s
    xgressPointer: "0xc0021e47e0"

For testing, I changed the service to use ZDEW on the terminating end (it was a router before), and in what I have tested so far I don't see old/hanging connections in the ZET dump. Also, if I run ziti fabric list circuits 'limit none' I see only one circuit.

So ZET→Router→AD server doesn't close circuits, but ZET→ZDEW (on the AD server) does :thinking:

I think I have an idea for what's going on here. The flow-control behavior changed in 1.6.x to allow one side of the connection to close without closing both sides. The issue is that we're not closing the connection on EOF on the read side, we're just closing half of it. If the other side doesn't close in response to the EOF, then we end up stuck. I'll have to do some testing to see under what circumstances this happens.
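The failure mode described above can be sketched with plain sockets standing in for the overlay (an analogy under my own assumptions, not the actual ziti code): the dialing side half-closes after sending its request, the accepting side reads the EOF but never closes its own side in response, so neither endpoint fully closes and the connection lingers.

```python
import socket
import threading

def accepting_side(listener, held):
    """Accept one connection, read the request, see the peer's EOF,
    and then -- crucially -- never close in response."""
    conn, _ = listener.accept()
    conn.recv(64)                      # read the request
    assert conn.recv(64) == b""        # EOF: the dialer half-closed...
    held.append(conn)                  # ...but we keep our side open

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)

held = []
t = threading.Thread(target=accepting_side, args=(listener, held))
t.start()

dialer = socket.socket()
dialer.connect(listener.getsockname())
dialer.sendall(b"bind request")
dialer.shutdown(socket.SHUT_WR)        # dialer half-closes: state ~ CloseWrite
t.join()

# Neither side has fully closed, so the connection (and, in the ziti
# analogy, the circuit carrying it) is stuck half-open indefinitely.
print("dialer still connected:", dialer.fileno() != -1)   # True
print("acceptor still connected:", held[0].fileno() != -1)  # True
```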

thank you for the debug info,
Paul


I was able to reproduce the issue with a test and have a fix up for review, so hopefully your issue will be resolved with the next release (likely 1.7.0)

Paul


FYI: The fix was released in 1.6.9.

Thanks again,
Paul