Dual nic, openziti creating static route to controller over wrong NIC

I’m cross-posting from a reddit thread as I was told this is where to officially go for support. The reddit thread is here: Reddit - The heart of the internet

My issue is this: I’ve got a Windows client with two NICs. One NIC is internet routed, one isn’t. The system is an Amazon WorkSpaces system where both NICs are required. One is for control, the other is for internet access.

The openziti client is creating a Windows static route to my controller IP, but it’s creating it on the wrong NIC with the wrong gateway. It breaks everything until I manually remove the route and re-create it on the correct NIC. Unfortunately, the client’s policy/watchdog keeps putting the bad route back. How do I fix this?

Hi @ed_schuy8723, welcome to this community and to OpenZiti! :slight_smile:

I missed that you said Windows before, that’s my bad. I don’t believe we have any way to control which interface routes are added to. @scareything, there’s no workaround for this right now, is there? Should we enter a feature request to support overriding which interface to use? It seems like the interface we use is whichever one the default route is set up for.

@ed_schuy8723 you can’t change the default route at all, right? You can see in the logs which NIC it uses for the route. My logs show:

[2025-08-05T17:45:23.610Z]    INFO ziti-edge-tunnel:tun.c:449 if_change_cb() default route is now via if_idx[5]

PowerShell seems to agree as well:

(Get-NetRoute -DestinationPrefix 0.0.0.0/0 | Sort-Object RouteMetric)[0].InterfaceIndex
5

Btw the route that we’re talking about here is created to prevent a crazy service configuration (e.g. intercepting 0.0.0.0/0) from breaking connections to the routers and/or controllers. So your workaround of moving the route could be simplified by removing the route without replacing it (assuming you aren’t trying to intercept CIDRs that match your router/ctrl IPs).

As @TheLumberjack points out the windows implementation currently puts these “exclusion” routes on the interface that carries the default route. I’m not that handy with the Windows networking APIs, but I’d think we could test the address that we’re about to create an exclusion route for and get that interface instead of assuming the default route’s interface. I’m hoping Windows has something similar to ip route get on Linux.
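To make the `ip route get` idea concrete, here is a minimal, portable sketch of what such a lookup does: pick the route with the longest prefix that matches the destination, and break ties on the lowest metric. The `rt_entry_t` type, `IP4` macro, and `route_get` function are hypothetical names for illustration, not part of ziti-edge-tunnel or the Windows API. On Windows, the IP Helper function `GetBestRoute2` looks like a candidate for the same job, but that is an assumption that has not been tested in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified route entry (not the Windows MIB_IPFORWARD_ROW2 layout). */
typedef struct {
    uint32_t prefix;      /* destination network, host byte order */
    uint8_t  prefix_len;  /* 0..32 */
    uint32_t if_index;    /* interface that carries the route */
    uint32_t metric;      /* lower is preferred */
} rt_entry_t;

/* Pack dotted-quad octets into a host-order 32-bit address. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Netmask for a prefix length; /0 is special-cased because shifting a
 * 32-bit value by 32 bits is undefined behavior in C. */
static uint32_t mask_for(uint8_t len) {
    return len == 0 ? 0u : (uint32_t)(0xFFFFFFFFu << (32 - len));
}

/* Return the route that would carry traffic to dst: longest matching prefix,
 * ties broken by lowest metric. Returns NULL when nothing matches. */
static const rt_entry_t *route_get(const rt_entry_t *table, size_t n, uint32_t dst) {
    const rt_entry_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        const rt_entry_t *r = &table[i];
        uint32_t mask = mask_for(r->prefix_len);
        if ((dst & mask) != (r->prefix & mask)) {
            continue; /* this route does not cover dst */
        }
        if (best == NULL ||
            r->prefix_len > best->prefix_len ||
            (r->prefix_len == best->prefix_len && r->metric < best->metric)) {
            best = r;
        }
    }
    return best;
}
```

With a table holding two default routes (metrics 2000 and 20) and no more specific entry, `route_get(table, n, IP4(5, 5, 5, 5))` returns the metric-20 entry no matter which default route is listed first.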


Ok - What you’re telling me clears things up and makes some sense. Things are working for me right now because I rebooted (as is usually the case). Things usually break after a suspend/resume.

So in my case, AWS seems to give each NIC a gateway and 0.0.0.0/0 route. My 10.x.x.x NIC is the internet connected one. The 198.x.x.x isn’t.

I have no idea why AWS does this but I don’t think it’s in my best interest to mess with it. The interface with the internet route has the lower metric (as would be expected). I’m wondering if how AWS handles its suspend/resume features breaks this somehow. Will try to catch what changes when it breaks again and follow up with logs.

So the problem is happening again for me and I think I can guess what’s going on. Here’s my ‘route print’ command for today:

Nothing changed except the order in which the first two routes are listed. That implies to me the client is blindly grabbing the first route in an enumeration statement. It’s assuming there could be only one gateway and not evaluating anything else. Would a simple fix be to grab all 0.0.0.0/0 routes, and select the one with the lowest metric? Then tie the controller route to that?

I just confirmed this is the issue. I wrote a powershell script that deleted and re-added the control network 0.0.0.0/0 route in order to force it down on the list. Then I restarted the openziti client service. Openziti picked the first route again, which was the right one this time and worked.
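A small portable sketch of why the enumeration order matters, assuming (as inferred above) that the client takes the first 0.0.0.0/0 entry it sees. The type and function names are hypothetical, and the interface indexes and metrics in the usage note are illustrative values rather than my real route table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of a routing table row. */
typedef struct {
    uint8_t  prefix_len;  /* 0 means a default (0.0.0.0/0) route */
    uint32_t if_index;    /* interface that carries the route */
    uint32_t metric;      /* lower is preferred */
} default_candidate_t;

/* Order-dependent: returns whichever default route is enumerated first. */
static const default_candidate_t *first_default(const default_candidate_t *t, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (t[i].prefix_len == 0) {
            return &t[i];
        }
    }
    return NULL;
}

/* Scans every default route and keeps the lowest metric, so the result
 * does not depend on table order unless two metrics tie. */
static const default_candidate_t *lowest_metric_default(const default_candidate_t *t, size_t n) {
    const default_candidate_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (t[i].prefix_len != 0) {
            continue; /* not a default route */
        }
        if (best == NULL || t[i].metric < best->metric) {
            best = &t[i];
        }
    }
    return best;
}
```

Given the entries `{0, 6, 20}` and `{0, 17, 2000}`, swapping their order changes what `first_default` returns, while `lowest_metric_default` returns the interface 6 entry either way.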

Thanks for the update. The client determines the default route in a callback that is called by Windows when the WinTun interface is created (or changed). I suspect we need to set up a different callback to notice the routing change that you’re seeing, since it does not involve the WinTun device. Let me read some win32 docs.

Can you send the openziti client logs that cover the time span of the most recent route change?

If you go to Main Menu → Feedback, a zip file with a bunch of logs and stuff will be generated. You can share that with us via DM here (don’t post it to this thread) or email it to me at clint at openziti.org and I’ll get it to Shawn (and look at it myself maybe). :slight_smile:

Maybe I can do one better than logs and get right to a proposed working solution:

I’d never claim to be a good programmer, but I’ve always had an aptitude for reverse engineering stuff. I can read and understand most code but can’t be bothered to write the stuff because it’s not what I do for a living. That said, I looked in the logs, took the info to the github repo, found the function in question and shamelessly used some AI to propose a solution.

In https://raw.githubusercontent.com/openziti/ziti-tunnel-sdk-c/refs/heads/main/programs/ziti-edge-tunnel/netif_driver/windows/tun.c

Compare the existing static void WINAPI if_change_cb function to what the AI gave me. It’s also likely a good solution for when users have wired and Wi-Fi going at the same time:

This is what the AI gave me when I asked for a modification that picks the lowest metric first when building excluded routes:

static void WINAPI if_change_cb(PVOID CallerContext, PMIB_IPINTERFACE_ROW Row, MIB_NOTIFICATION_TYPE NotificationType) {
    struct netif_handle_s *tun = CallerContext;

    ZITI_LOG(DEBUG, "interface change: if_idx = %d, change = %d", Row ? Row->InterfaceIndex : 0, NotificationType);

    PMIB_IPFORWARD_TABLE2 table;
    ULONG rc = GetIpForwardTable2(AF_INET, &table);

    if (rc == NO_ERROR) {
        MIB_IPFORWARD_ROW2 *best_rt = NULL;

        // Find 0.0.0.0/0 route with the lowest metric
        for (ULONG i = 0; i < table->NumEntries; i++) {
            MIB_IPFORWARD_ROW2 *row = &table->Table[i];

            if (row->DestinationPrefix.PrefixLength == 0 &&
                row->DestinationPrefix.Prefix.Ipv4.sin_family == AF_INET) {
                
                if (best_rt == NULL || row->Metric < best_rt->Metric) {
                    best_rt = row;
                }
            }
        }

        if (best_rt) {
            if (default_rt.InterfaceIndex != best_rt->InterfaceIndex ||
                default_rt.Metric != best_rt->Metric) {

                ZITI_LOG(INFO, "default route (lowest metric) is now via if_idx[%d], metric=%lu",
                         best_rt->InterfaceIndex, best_rt->Metric);

                // Update stored default route info
                default_rt = *best_rt;

                ZITI_LOG(INFO, "updating excluded routes");
                const char *dest;
                MIB_IPFORWARD_ROW2 *route;
                MODEL_MAP_FOREACH(dest, route, &tun->excluded_routes) {
                    route->NextHop = best_rt->NextHop;
                    route->InterfaceIndex = best_rt->InterfaceIndex;
                    route->InterfaceLuid = best_rt->InterfaceLuid;
                    // SetIpForwardEntry2 returns the error code directly rather than through GetLastError()
                    ULONG set_rc = SetIpForwardEntry2(route);
                    if (set_rc != NO_ERROR) {
                        char err[256];
                        FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM, NULL, set_rc,
                                      MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
                                      err, sizeof(err), NULL);
                        ZITI_LOG(WARN, "failed to update route[%s]: %lu(%s)", dest, set_rc, err);
                    }
                }
            }
        } else {
            ZITI_LOG(WARN, "no default route found");
        }

        FreeMibTable(table);
    } else {
        // GetIpForwardTable2 returns the error code in rc, so format rc instead of GetLastError()
        char err[256];
        FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM, NULL, rc,
                      MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
                      err, sizeof(err), NULL);
        ZITI_LOG(WARN, "failed to get forward table: %lu(%s)", rc, err);
    }
}

Is this helpful?

It may be somewhat helpful, but what I’d like to see from the logs is whether the if_change_cb callback was called at all on wake. If that isn’t happening then the logic in if_change_cb won’t matter much in this case.

I’ll start a branch that changes if_change_cb to iterate all of the routes in the table. It should be a minor change, but it won’t be easy for me to test it so I’m hoping you’ll be able to do that when a build is ready. In the meantime could you please send the logs so I can be sure that the issue isn’t something else?

Thanks!

I had to google how to send DMs on discourse and it looks like I might be too new or at too low of a trust level to send any.

So I’ll drop some relevant sanitized excerpts from my logs here: I swapped my controller IP with 5.5.5.5 and changed the DNS name for my controller in these:

Log for fresh startup after reboot:
[2025-08-07T15:00:15.481Z] INFO ziti-edge-tunnel:ziti-edge-tunnel.c:1504 run() ============================ service begins ================================
[2025-08-07T15:00:15.481Z] INFO ziti-edge-tunnel:ziti-edge-tunnel.c:1505 run() Logger initialization
[2025-08-07T15:00:15.481Z] INFO ziti-edge-tunnel:ziti-edge-tunnel.c:1507 run() - config file : c:\windows\system32\config\systemprofile\appdata\roaming\netfoundry\config.json
[2025-08-07T15:00:15.481Z] INFO ziti-edge-tunnel:ziti-edge-tunnel.c:1509 run() - initialized at : Thu Aug 07 2025, 11:00:15 AM (local time), 2025-08-07T15:00:15 (UTC)
[2025-08-07T15:00:15.481Z] INFO ziti-edge-tunnel:ziti-edge-tunnel.c:1510 run() - log file location: C:\Program Files (x86)\NetFoundry Inc\Ziti Desktop Edge\logs\service\ziti-tunneler.log.202508070000.log
[2025-08-07T15:00:15.481Z] INFO ziti-edge-tunnel:ziti-edge-tunnel.c:1512 run() - C SDK Version : 1.7.4:HEAD@g747d935
[2025-08-07T15:00:15.481Z] INFO ziti-edge-tunnel:ziti-edge-tunnel.c:1513 run() - Tunneler SDK : v1.7.3
[2025-08-07T15:00:15.481Z] INFO ziti-edge-tunnel:ziti-edge-tunnel.c:1517 run() ============================================================================
[2025-08-07T15:00:15.481Z] INFO ziti-sdk:utils.c:196 ziti_log_set_level() set log level: root=3/INFO
[2025-08-07T15:00:15.496Z] INFO ziti-edge-tunnel:tun.c:195 tun_open() Wintun v0.0 loaded
[2025-08-07T15:00:15.496Z] INFO ziti-edge-tunnel:tun.c:166 flush_dns() DnsFlushResolverCache succeeded
[2025-08-07T15:00:15.590Z] INFO ziti-edge-tunnel:tun.c:98 WintunLogger() Using existing driver 0.14
[2025-08-07T15:00:15.590Z] INFO ziti-edge-tunnel:tun.c:98 WintunLogger() Creating adapter
[2025-08-07T15:00:15.777Z] INFO ziti-edge-tunnel:tun.c:98 WintunLogger() Removed orphaned adapter "ziti-tun0 1"
[2025-08-07T15:00:15.809Z] INFO ziti-edge-tunnel:tun.c:449 if_change_cb() default route is now via if_idx[6]
[2025-08-07T15:00:15.809Z] INFO ziti-edge-tunnel:tun.c:455 if_change_cb() updating excluded routes
[2025-08-07T15:00:17.469Z] INFO ziti-edge-tunnel:windows-scripts.c:491 is_nrpt_policies_effective() NRPT policies are effective in this system

Log excerpt after power resume:
[2025-08-07T14:17:42.308Z] INFO ziti-edge-tunnel:ziti-edge-tunnel.c:2709 endpoint_status_change() Received power resume event
[2025-08-07T14:17:42.372Z] INFO ziti-sdk:posture.c:1059 ziti_endpoint_state_change() ztx[0] endpoint state change reported: woken[TRUE] unlocked[FALSE]
[2025-08-07T14:17:42.372Z] WARN ziti-sdk:ziti_ctrl.c:815 verify_api_session() ctrl[https://not.my.real.controller.address:1280] no API session
[2025-08-07T14:17:42.372Z] ERROR ziti-sdk:posture.c:1041 ziti_endpoint_state_pr_cb() ztx[0] error during endpoint state posture response submission: 0 - no api session token set for ziti_controller
[2025-08-07T14:17:42.478Z] INFO tunnel-cbs:ziti_dns.c:566 format_resp() found record[100.64.0.17] for query[1:clients4.google.com]
[2025-08-07T14:17:42.579Z] WARN ziti-sdk:ziti_ctrl.c:177 ctrl_resp_cb() ctrl[https://not.my.real.controller.address:1280] request failed: -3008(unknown node or service)
[2025-08-07T14:17:42.579Z] INFO ziti-sdk:ziti_ctrl.c:180 ctrl_resp_cb() ctrl[https://not.my.real.controller.address:1280] attempting to switch endpoint
[2025-08-07T14:17:42.579Z] WARN ziti-sdk:ziti_ctrl.c:602 ctrl_next_ep() ctrl[https://not.my.real.controller.address:1280] no controllers are online
[2025-08-07T14:17:42.579Z] ERROR ziti-sdk:ziti_ctrl.c:389 ctrl_login_cb() ctrl[https://not.my.real.controller.address:1280] CONTROLLER_UNAVAILABLE(unknown node or service)
[2025-08-07T14:17:42.579Z] WARN ziti-sdk:legacy_auth.c:183 login_cb() failed to login to ctrl[https://not.my.real.controller.address:1280] CONTROLLER_UNAVAILABLE[-16] unknown node or service
[2025-08-07T14:17:42.579Z] ERROR ziti-sdk:connect.c:504 process_connect() conn[0.233/3Y4dJX-x/Connecting](google services) ziti context is not authenticated, cannot connect to service[google services]
[2025-08-07T14:17:42.579Z] ERROR tunnel-cbs:ziti_tunnel_cbs.c:103 on_ziti_connect() ziti dial failed: invalid state
[2025-08-07T14:17:42.894Z] ERROR ziti-sdk:connect.c:504 process_connect() conn[0.234/hwz3Ctod/Connecting](google services) ziti context is not authenticated, cannot connect to service[google services]
[2025-08-07T14:17:42.894Z] ERROR tunnel-cbs:ziti_tunnel_cbs.c:103 on_ziti_connect() ziti dial failed: invalid state
[2025-08-07T14:17:43.122Z] ERROR ziti-sdk:connect.c:504 process_connect() conn[0.235/_CwmuM2C/Connecting](google services) ziti context is not authenticated, cannot connect to service[google services]
[2025-08-07T14:17:43.122Z] ERROR tunnel-cbs:ziti_tunnel_cbs.c:103 on_ziti_connect() ziti dial failed: invalid state
[2025-08-07T14:17:43.196Z] WARN ziti-edge-tunnel:tun.c:476 if_change_cb() failed to get default route: 1168(The operation completed successfully.

)
[2025-08-07T14:17:43.237Z] WARN ziti-edge-tunnel:tun.c:476 if_change_cb() failed to get default route: 1168(The operation completed successfully.

)
[2025-08-07T14:17:43.262Z] WARN ziti-edge-tunnel:tun.c:476 if_change_cb() failed to get default route: 1168(The operation completed successfully.

)
[2025-08-07T14:17:43.273Z] WARN ziti-edge-tunnel:tun.c:476 if_change_cb() failed to get default route: 1168(The operation completed successfully.

)
[2025-08-07T14:17:43.294Z] WARN ziti-edge-tunnel:tun.c:476 if_change_cb() failed to get default route: 1168(The operation completed successfully.

)
[2025-08-07T14:17:43.307Z] WARN ziti-edge-tunnel:tun.c:476 if_change_cb() failed to get default route: 1168(The operation completed successfully.

)
[2025-08-07T14:17:43.323Z] WARN ziti-edge-tunnel:tun.c:476 if_change_cb() failed to get default route: 1168(The operation completed successfully.

)
[2025-08-07T14:17:43.328Z] WARN ziti-edge-tunnel:tun.c:476 if_change_cb() failed to get default route: 1168(The operation completed successfully.

)
[2025-08-07T14:17:43.438Z] INFO ziti-edge-tunnel:ziti-edge-tunnel.c:2709 endpoint_status_change() Received power resume event
[2025-08-07T14:17:43.452Z] ERROR ziti-sdk:connect.c:504 process_connect() conn[0.236/f8oCZTKK/Connecting](google services) ziti context is not authenticated, cannot connect to service[google services]
[2025-08-07T14:17:43.452Z] ERROR tunnel-cbs:ziti_tunnel_cbs.c:103 on_ziti_connect() ziti dial failed: invalid state
[2025-08-07T14:17:43.453Z] INFO ziti-sdk:posture.c:1059 ziti_endpoint_state_change() ztx[0] endpoint state change reported: woken[TRUE] unlocked[FALSE]
[2025-08-07T14:17:43.453Z] WARN ziti-sdk:ziti_ctrl.c:815 verify_api_session() ctrl[https://not.my.real.controller.address:1280] no API session
[2025-08-07T14:17:43.453Z] ERROR ziti-sdk:posture.c:1041 ziti_endpoint_state_pr_cb() ztx[0] error during endpoint state posture response submission: 0 - no api session token set for ziti_controller
[2025-08-07T14:17:43.473Z] WARN ziti-edge-tunnel:tun.c:476 if_change_cb() failed to get default route: 1168(The operation completed successfully.

)
[2025-08-07T14:17:43.664Z] ERROR ziti-sdk:connect.c:504 process_connect() conn[0.237/BHoTOJ4R/Connecting](google services) ziti context is not authenticated, cannot connect to service[google services]
[2025-08-07T14:17:43.664Z] ERROR tunnel-cbs:ziti_tunnel_cbs.c:103 on_ziti_connect() ziti dial failed: invalid state
[2025-08-07T14:17:43.819Z] INFO tunnel-cbs:ziti_dns.c:566 format_resp() found record[100.64.0.6] for query[1:mtalk.google.com]
[2025-08-07T14:17:43.841Z] ERROR ziti-sdk:connect.c:504 process_connect() conn[0.238/QdBVzS1-/Connecting](google services) ziti context is not authenticated, cannot connect to service[google services]
[2025-08-07T14:17:43.841Z] ERROR tunnel-cbs:ziti_tunnel_cbs.c:103 on_ziti_connect() ziti dial failed: invalid state
[2025-08-07T14:17:43.962Z] ERROR ziti-sdk:connect.c:504 process_connect() conn[0.239/rFyRH8UI/Connecting](google services) ziti context is not authenticated, cannot connect to service[google services]
[2025-08-07T14:17:43.962Z] ERROR tunnel-cbs:ziti_tunnel_cbs.c:103 on_ziti_connect() ziti dial failed: invalid state
[2025-08-07T14:17:44.170Z] ERROR ziti-sdk:connect.c:504 process_connect() conn[0.240/vKYcP3w9/Connecting](google services) ziti context is not authenticated, cannot connect to service[google services]
[2025-08-07T14:17:44.170Z] ERROR tunnel-cbs:ziti_tunnel_cbs.c:103 on_ziti_connect() ziti dial failed: invalid state
[2025-08-07T14:17:44.346Z] INFO ziti-edge-tunnel:tun.c:449 if_change_cb() default route is now via if_idx[17]
[2025-08-07T14:17:44.346Z] INFO ziti-edge-tunnel:tun.c:455 if_change_cb() updating excluded routes
[2025-08-07T14:17:44.346Z] WARN ziti-edge-tunnel:tun.c:467 if_change_cb() failed to update route[5.5.5.5]: 0(The operation completed successfully.

I had a thought - instead of creating the exclusion route for the 0.0.0.0/0 route with the lowest metric as I previously proposed, a safe and compliant way would be to add the exclusion route for EACH 0.0.0.0/0 and copy the metric provided from each table entry into the new route.

Windows prioritizes lowest metric. That would fix the issue without risking regression.

I think we’re on the right track already.

FYI the exclusion route is actually for the IP of the controller or router that’s being protected from intercept - so e.g. “5.5.5.5”. The process looks for the default route (0.0.0.0/0) to determine the network interface that the exclusion route should be assigned to.

I’ve made the change to ziti-edge-tunnel in Pull Request #1195, “use default route with smallest metric” by scareything, in openziti/ziti-tunnel-sdk-c on GitHub. Now it’s just a matter of packaging it all together into an installer. Stay tuned.

The reason I updated my request to suggest adding an exclusion route for every interface with a 0.0.0.0/0 route, with the metric copied over in each case (instead of just using the lowest metric), is that I’m imagining a scenario where somebody’s on wired and wireless and the interface with the lower metric goes down.

I think this is more likely to happen than my individual scenario. My scenario would just happen to be fixed by addressing it this way.

Since Windows prioritizes traffic to go through the route with the lowest metric, this will still work and additionally catch the scenario I brought up. I suppose this will also keep things working in networking scenarios where two NICs are configured in active/active, where one is the failover NIC as defined by said metric.

Thoughts?

The current implementation should handle the preferred route going down, because if_change_cb will be called when that happens, which will cause the tunneler to re-evaluate the (best) default route and update the existing exclusion routes accordingly.

What about upstream failures that don’t trigger an if_change_cb event?

My revised proposal in the case of two routes like so:

Active Routes:
dst 0.0.0.0/0 gw 10.4.4.1 if 10.4.4.20 metric 20
dst 0.0.0.0/0 gw 10.2.2.1 if 10.2.2.128 metric 2000

You’d get two exclusion routes like so (assuming controller ip is 5.5.5.5):

dst 5.5.5.5/32 gw 10.4.4.1 if 10.4.4.20 metric 20
dst 5.5.5.5/32 gw 10.2.2.1 if 10.2.2.128 metric 2000

Windows allows for this as long as the metrics are different.
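A rough portable sketch of the revised proposal, in case it helps the discussion: for every 0.0.0.0/0 entry, emit a /32 route to the protected address that reuses that entry’s gateway, interface and metric. The `excl_route_t` type, `IP4` macro, and `build_exclusions` function are made-up names for illustration; this is not how ziti-edge-tunnel builds its routes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route entry for illustration only. */
typedef struct {
    uint32_t dest;        /* destination network, host byte order */
    uint8_t  prefix_len;  /* 0..32 */
    uint32_t gateway;     /* next hop */
    uint32_t if_addr;     /* address of the interface carrying the route */
    uint32_t metric;      /* lower is preferred */
} excl_route_t;

/* Pack dotted-quad octets into a host-order 32-bit address. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* For each default route in table, write one /32 route to target into out,
 * copying gateway, interface and metric. Returns the number of routes
 * written, which never exceeds cap. */
static size_t build_exclusions(const excl_route_t *table, size_t n, uint32_t target,
                               excl_route_t *out, size_t cap) {
    size_t written = 0;
    for (size_t i = 0; i < n && written < cap; i++) {
        if (table[i].prefix_len != 0) {
            continue; /* only default routes produce an exclusion route */
        }
        out[written] = table[i];       /* keep gateway, interface, metric */
        out[written].dest = target;    /* point the copy at the protected IP */
        out[written].prefix_len = 32;  /* host route */
        written++;
    }
    return written;
}
```

Fed the two active routes above and a target of `IP4(5, 5, 5, 5)`, this yields the two 5.5.5.5/32 entries listed, with metrics 20 and 2000.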

If an upstream router goes down in the 10.4.4.x network, packet timeouts should trigger Windows to resend over the route with the next-lowest metric.

Scratch that – just checked my answer. Windows doesn’t route to the next lowest metric until an interface status change is triggered, which would fire off the event anyway, rendering this not entirely necessary.

I still feel like my revised approach is likely to catch some fringe scenarios in windows networking land, but I honestly can’t picture any with my limited knowledge. So I’ll be happy to test the results of whatever you decide to push through as a result of all this back and forth.

Thanks for the effort so far. I really appreciate the help. I’m discussing this internally with my employer as a way to bring our zscaler license costs down. This goes a long way to speak for the support we can expect. :slight_smile:

I just saw this:

{
"name": "2.7.1.8",
"tag_name": "2.7.1.8",
"published_at": "2025-08-07T20:11:24Z",
"installation_critical": false,
"assets": [
{
"name": "Ziti.Desktop.Edge.Client-2.7.1.8.exe",
"browser_download_url": "https://github.com/openziti/desktop-edge-win/releases/download/2.7.1.8/Ziti.Desktop.Edge.Client-2.7.1.8.exe"
}
]
}

at https://get.openziti.io/zdew/beta.json

That was published today - any chance it’s got the changes that you pushed into the code?

Edit: Never mind, I see that it does. I see the update_default_route() function referenced in the logs. I’ve installed it and am testing. I’ll report back when I have a power event that would normally swap my interface order.

Yes that’s the build with the change. Awesome, thanks!

Not sure if there’s an issue. I have this in my log:

[2025-08-07T20:51:24.064Z] INFO ziti-edge-tunnel:tun.c:469 update_default_route() default route is now via if_idx[6], metric=0

However, the exclusion route placed in my route table shows a metric of 20. 20 happens to be correct. So the output is good, but the log doesn’t reflect it accurately.
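One possible explanation, not confirmed in this thread: Windows documents the Metric member of MIB_IPFORWARD_ROW2 as a route metric offset, and the value used for IPv4 route preference is that offset plus the metric of the interface. If the log prints only the offset while the route table view shows the sum, a logged 0 next to a displayed 20 would be consistent. A minimal sketch of that arithmetic, using hypothetical names:

```c
#include <stdint.h>

/* Hypothetical pairing of the two values that make up the effective metric. */
typedef struct {
    uint32_t route_metric;      /* per-route offset, e.g. the value the log printed */
    uint32_t interface_metric;  /* metric of the interface that carries the route */
} metric_parts_t;

/* Effective metric used for route preference: offset plus interface metric. */
static uint32_t effective_metric(metric_parts_t m) {
    return m.route_metric + m.interface_metric;
}
```

Under this assumption, `effective_metric((metric_parts_t){0, 20})` is 20, which would match what the route table shows. It would also mean that two default routes should be ranked by the sum rather than by the offset alone.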