Unable to forward payload

With ziti-router 1.6.7 I am getting too many errors, and as a result the network is slow.

{
  "_context": "{c/4CpuIMbl75baXLw6iK4v3J|@/7ADl07XYWjSjImnvZaTi05}<Initiator>",
  "circuitId": "4CpuIMbl75baXLw6iK4v3J",
  "error": "cannot forward payload, no forward table for circuit=4CpuIMbl75baXLw6iK4v3J src=7ADl07XYWjSjImnvZaTi05",
  "file": "github.com/openziti/ziti/router/handler_xgress/data_plane.go:58",
  "func": "github.com/openziti/ziti/router/handler_xgress.(*dataPlaneAdapter).ForwardPayload",
  "level": "error",
  "msg": "unable to forward payload",
  "origin": 0,
  "seq": 12,
  "time": "2025-09-10T17:25:36.019Z"
}

{
  "_context": "{c/1rpw2FNucpisqlQkWXcql9|@/6Cey3O5tmGwnvK6eoV0llZ}<Initiator>",
  "circuitId": "1rpw2FNucpisqlQkWXcql9",
  "error": "cannot forward payload, no forward table for circuit=1rpw2FNucpisqlQkWXcql9 src=6Cey3O5tmGwnvK6eoV0llZ",
  "file": "github.com/openziti/ziti/router/handler_xgress/data_plane.go:58",
  "func": "github.com/openziti/ziti/router/handler_xgress.(*dataPlaneAdapter).ForwardPayload",
  "level": "error",
  "msg": "unable to forward payload",
  "origin": 0,
  "seq": 12,
  "time": "2025-09-10T17:26:36.048Z"
}

{
  "_context": "{c/3KUKd4O50bmdHB6omCfHmg|@/74Z9Ibtv32clR4D1DN1I2t}<Initiator>",
  "circuitId": "3KUKd4O50bmdHB6omCfHmg",
  "error": "cannot forward payload, no forward table for circuit=3KUKd4O50bmdHB6omCfHmg src=74Z9Ibtv32clR4D1DN1I2t",
  "file": "github.com/openziti/ziti/router/handler_xgress/data_plane.go:58",
  "func": "github.com/openziti/ziti/router/handler_xgress.(*dataPlaneAdapter).ForwardPayload",
  "level": "error",
  "msg": "unable to forward payload",
  "origin": 0,
  "seq": 12,
  "time": "2025-09-10T17:27:36.115Z"
}

Are the errors themselves the problem, or is the logging of errors slowing things down? If you set the log level to fatal, does that resolve the issue?

You can use the following to set the log level at runtime.

ziti agent set-log-level fatal -t router

Paul

The errors are a problem because they slow down the services.

As a result, the network is slow on 1.6.7 compared to 1.5.4.

Routers on 1.5.4 work fine with ziti-controller 1.6.7, so the problem is in the router's code or configuration.

The errors only happen after a circuit is complete and is being torn down. So they are not slowing down services, unless the logging is being overwhelmed by an excess of messages.

This is by design :slight_smile:

Unfortunately, ziti-router 1.6.7 generates many more errors, by a factor of several hundred. This is why the services are slow compared to 1.5.4.

Both router versions work with ziti-controller 1.6.7.

The only way to get back to normal is to downgrade ziti-router to 1.5.4.

There seems to be a misunderstanding: hiding the errors will not help.

I am trying to understand why router 1.5.4 runs smoothly but 1.6.7 has such difficulty handling the payload without disruption.

Clearly every host can handle these additional syslog messages without any noticeable impact on performance. But these errors lead to retransmission or permanent data loss. This is why the services are slower.

I can try and reproduce this, but I need to know:

  1. What ziti component is hosting the service (ER/T, ZET, SDK (which sdk))?
  2. What ziti component is on the client side (ER/T, ZET, SDK (which sdk))?
  3. What does the traffic look like? If you want to be specific about what software is going over Ziti, that's helpful, but I need to know protocol, traffic patterns, etc. Is it TCP/UDP? Is it HTTP/SSH, etc? Are you doing request/response/close or is it back and forth? Are you using TCP half-close?
  4. Can you quantify the issue? What throughput/latency are you seeing on 1.5.4 vs 1.6.7? Can you grab metrics and compare retransmission rates between the two? Are you seeing connections being unexpectedly terminated?
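
One rough way to put a number on the error rate for each router version is to count the errors per minute in the router log. The Python script below is only a minimal sketch, not an official tool: it assumes the log contains JSON entries shaped like the ones posted above (with `level`, `msg`, and `time` fields), and the function names and the log file path are made up for illustration.

```python
import json
import sys
from collections import Counter


def iter_json_objects(text):
    """Yield each JSON object found in text; objects may span several lines."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between entries.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        try:
            obj, pos = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Not JSON here (e.g. a plain-text line): skip to the next line.
            newline = text.find("\n", pos)
            pos = len(text) if newline == -1 else newline + 1
            continue
        if isinstance(obj, dict):
            yield obj


def count_errors_per_minute(text, msg="unable to forward payload"):
    """Count error entries with the given msg, keyed by minute (YYYY-MM-DDTHH:MM)."""
    counts = Counter()
    for entry in iter_json_objects(text):
        if entry.get("level") == "error" and entry.get("msg") == msg:
            # Assumes ISO-8601 timestamps; the first 16 characters give the minute.
            counts[str(entry.get("time", ""))[:16]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage: python count_errors.py /path/to/router.log
    with open(sys.argv[1], encoding="utf-8") as log_file:
        per_minute = count_errors_per_minute(log_file.read())
    for minute, count in sorted(per_minute.items()):
        print(minute, count)
```

Running this against a log captured on 1.5.4 and another captured on 1.6.7 under similar load would give comparable per-minute counts for the two versions.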

If you can provide specific instructions on how to reproduce the errors, that would be the most helpful, but if you can describe the data flows in detail, I may be able to reproduce the issue.

I did weeks of data flow testing before we released 1.6.7, and the test cases I have are working fine, so we need to figure out what's different about your network traffic.

The team is certainly doing a great job. Thank you!

The performance problem might have a different root cause:

As you explained above, the error "unable to forward payload" arises when the service closes the connection.

A struggling client opens a bunch of connections in the hope of getting a reply from the service. But zrok shares are mainly single-user services.

First, opening more connections does not help to get a response from the service.

Second, these connections create an excess of circuits.

Next, the service closes these connections very soon afterwards.

Finally, the router logs "unable to forward payload" to syslog.

As a solution, I am trying to clone the zrok reserved shares so that each user works with their own service.

Ok, we're making some progress on understanding the scenario.

So we've got zrok on the front-end and back-end.

  1. Are you running lots of connections over the same front-end and back-end or are you just loading down one or the other?
  2. You are self-hosting zrok, correct?
  3. Can you either tell me what software you're running over zrok, or describe the traffic patterns? How much traffic is getting sent in each direction, is it request/response or is it uncoupled, how large are payloads, etc.
  4. Do you have the 'superNetwork' setting set to true?

Yes, you are right, I run self-hosted zrok. There is a variety of applications: zrok http proxy, tcp tunneling, socks, vpn.

After your explanation I ran more tests. Sometimes there is a relatively large number of connections to the same service, so the zrok service cannot handle the load.

I don’t use superNetwork.

As I understand it, this is why the router logs a large number of "unable to forward payload" errors.

The router is simply saying that the service gave up and closed the connection.

Users see that the service is slow.