Controller cluster issues

Hi team, for the last few days I have been working on setting up controller clusters. I'd like to check whether others are facing similar issues. (I searched here but didn't find the answers I need.)

Scenario: Set up a 4-node controller cluster.
Steps: Created the root CA and the certs for nodes 1, 2, 3 and 4, with no major issues.

Problems: When I started controller 1 there were no issues, and the status showed OK:

│ ID    │ ADDRESS                   │ VOTER │ LEADER │ VERSION │ CONNECTED │
│ ctrl1 │ tls:ctrl1.myziti.com:1280 │ true  │ true   │ v1.5.4  │ true      │

I added the second node with ziti cluster agent add tls:ctl2.myziti.com:1280, which reported success, and the output was:

│ ID    │ ADDRESS                   │ VOTER │ LEADER │ VERSION │ CONNECTED │
│ ctrl1 │ tls:ctrl1.myziti.com:1280 │ true  │ true   │ v1.5.4  │ true      │
│ ctrl2 │ tls:ctrl1.myziti.com:1280 │ true  │ false  │ v1.5.4  │ true      │

With two nodes everything seems to work fine, although the cluster cannot fail over because a single remaining node has no quorum. The problems start after adding a third node.
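For context on the quorum remark: a Raft-style cluster needs a majority of its voting members to elect a leader. The snippet below is only a small arithmetic sketch of that rule. The function names are made up for illustration and are not part of the ziti CLI.

```python
def quorum(voters: int) -> int:
    """Smallest majority of `voters` voting members."""
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """How many voters can be lost while a majority still remains."""
    return voters - quorum(voters)


for n in (1, 2, 3, 4, 5):
    print(f"{n} voters: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

With 2 voters the quorum is 2, so losing either node blocks failover. Under the same arithmetic, 4 voters tolerate only one failure, the same as 3.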

node 1 outputs

│ ID    │ ADDRESS                   │ VOTER │ LEADER │ VERSION │ CONNECTED │
│ ctrl1 │ tls:ctrl1.myziti.com:1280 │ true  │ true   │ v1.5.4  │ true      │
│ ctrl2 │ tls:ctrl2.myziti.com:1280 │ true  │ false  │ v1.5.4  │ true      │
│ ctrl3 │ tls:ctrl3.myziti.com:1280 │ true  │ false  │ v1.5.4  │ true      │

node 2 outputs

│ ID    │ ADDRESS                   │ VOTER │ LEADER │ VERSION       │ CONNECTED │
│ ctrl1 │ tls:ctrl1.myziti.com:1280 │ true  │ true   │ v1.5.4        │ true      │
│ ctrl2 │ tls:ctrl2.myziti.com:1280 │ true  │ false  │ v1.5.4        │ true      │
│ ctrl3 │ tls:ctrl3.myziti.com:1280 │ true  │ false  │ not connected │ false     │

node 3 outputs

│ ID    │ ADDRESS                   │ VOTER │ LEADER │ VERSION       │ CONNECTED │
│ ctrl1 │ tls:ctrl1.myziti.com:1280 │ true  │ true   │ v1.5.4        │ true      │
│ ctrl2 │ tls:ctrl2.myziti.com:1280 │ true  │ false  │ not connected │ false     │
│ ctrl3 │ tls:ctrl3.myziti.com:1280 │ true  │ false  │ v1.5.4        │ true      │

Adding a 4th node gives mixed results: some nodes see all 4 nodes connected, while others cannot see one of the nodes.

In the logs I see a peer connect and then get disconnected (from ctl2, etc.):
"msg":"peer disconnected","peerAddr":"tls:ctrl3.myziti.com:1280","peerId":"ctrl3"

If I shut down one of the controllers, say controller 1, the other two connect to each other.
I am confused and hope to get some insight into how to move forward.

Best regards

The Raft model only requires that nodes be connected to the leader. So if you query the leader, you should see all the other nodes connected. If you query a follower, it should have a connection to the leader. It may also have connections to other nodes, depending on what happened during the election phase.
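As a rough illustration of that rule, here is a sketch that applies it to a member table pasted as text. It assumes the `│`-separated layout shown earlier in this thread (real CLI output may include extra border lines), and `parse_members` and `view_is_healthy` are hypothetical helper names, not part of the ziti tooling.

```python
def parse_members(table: str) -> list[dict[str, str]]:
    """Parse '│'-separated rows into dicts keyed by the lower-cased header names."""
    rows = []
    for line in table.strip().splitlines():
        if "│" not in line:
            continue  # skip anything that is not a table row
        rows.append([cell.strip() for cell in line.strip().strip("│").split("│")])
    header = [name.lower() for name in rows[0]]
    return [dict(zip(header, row)) for row in rows[1:]]


def view_is_healthy(queried_id: str, members: list[dict[str, str]]) -> bool:
    """The leader should see every node; a follower only needs the leader."""
    leader = next((m for m in members if m["leader"] == "true"), None)
    if leader is None:
        return False  # assumption: a view with no known leader is treated as unhealthy
    if leader["id"] == queried_id:
        return all(m["connected"] == "true" for m in members)
    return leader["connected"] == "true"


# The view reported by node 2 in the original post.
NODE2_VIEW = """
│ ID │ ADDRESS │ VOTER │ LEADER │ VERSION │ CONNECTED
ctrl1 │ tls:ctrl1.myziti.com:1280 │ true │ true │ v1.5.4 │ true
ctrl2 │ tls:ctrl2.myziti.com:1280 │ true │ false │ v1.5.4 │ true
ctrl3 │ tls:ctrl3.myziti.com:1280 │ true │ false │ not connected │ false
"""

print(view_is_healthy("ctrl2", parse_members(NODE2_VIEW)))  # True
```

Node 2 is a follower and is connected to the leader ctrl1, so its view passes even though it shows ctrl3 as not connected. The same table would fail the check only if it had come from the leader.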

Hope that's helpful,
Paul

Thanks! OK, that makes sense. Because it wasn't stated anywhere, I was expecting to see all nodes connected to each other. Now it looks great.