Ziti Tunnel & GW Aggregation of Interfaces - Details Needed

Hi,
I understand that the NF GW and the Ziti tunnel at the client end support aggregating multiple interfaces from different Internet providers and LTE into a single logical circuit to improve network performance.

Can I get more details and references on the following:

  1. How do I enable this on the client side and the server side?
  2. Is there a way to assign costs or weights for traffic routing when multiple network interfaces are online?
  3. Can this be assigned per application? For example, App1 maps to interfaces 1 and 2, while App2 maps to interface 1 only.

Looking forward to your insights and guidance on these.
Thanks
Anantha

Hi Anantha,

Great questions. Here’s my current understanding of the situation. I’ll point your question out to others and if there’s anything I get wrong they’ll correct me.

  1. I don’t think there is any way that I know of to aggregate traffic. Each overlay service will use the same interface once a connection is established. It’s certainly been discussed but I don’t know of any way to accomplish this. I’ll ask around and correct this if I’m wrong.
  2. The interface that will be used is the one which responded fastest when connecting to an edge router. There’s no SDK option yet to add costs, but I believe I’ve heard conversations about having it on the longer-term roadmap.
  3. Not at this time, for the reasons stated above.

Cheers

Hi Anantha,
Client side bandwidth aggregation isn’t supported at this time. Server side is, but with some caveats.

Some terminology:

  • service - network resource addressable by name
  • terminator - defines a way to get a network connection from a Ziti router to the application hosting the service. May use existing connections or create a new connection for each client session

  1. A service can have multiple terminators. You could have multiple routers talking to a single hosting application or a single router talking to multiple hosting applications or some combination.
  2. Traffic will be load balanced across the terminators for a service. Load balancing is done per-session. A given session will always use the same terminator. So you can support more sessions than a single terminator could support, but a single session is still limited to what a single terminator can support.
  3. Each terminator can have a static cost assigned to it, to influence routing decisions.
  4. We don’t have explicit support for binding terminators to interfaces. If you have multiple interfaces on a single box, you can run multiple routers and tie each one to a different interface. It’s not elegant, but it’s a workaround until we add more explicit support.
  5. Terminators are defined on services, so costing can be different for each service.

Here’s some additional information on how routing handles cost: Ziti Services | Ziti

Here’s a quick example of how you might configure multiple terminators. It sets up a service test and then adds 4 terminators using 2 routers and 2 application servers. Each router has a terminator to each application server. We give the terminators to the second application server a higher cost.

$ ziti edge create service test
IzHX.7NYBC

$ ziti edge create terminator test edge-router-1 'tcp:host1:80' --cost 100
3dKl
$ ziti edge create terminator test edge-router-1 'tcp:host2:80' --cost 200
aAYa
$ ziti edge create terminator test edge-router-2 'tcp:host1:80' --cost 100
3bz3
$ ziti edge create terminator test edge-router-2 'tcp:host2:80' --cost 200
Zvoa
$ ziti edge list terminators 'service="IzHX.7NYBC"'
id: 3bz3    service: test    router: JAoyjafljO    binding: edge_transport    address: tcp:host1:80    identity:     cost: 100    precedence: default    dynamic-cost: 0
id: 3dKl    service: test    router: e1lyjaflmO    binding: edge_transport    address: tcp:host1:80    identity:     cost: 100    precedence: default    dynamic-cost: 0
id: Zvoa    service: test    router: JAoyjafljO    binding: edge_transport    address: tcp:host2:80    identity:     cost: 200    precedence: default    dynamic-cost: 0
id: aAYa    service: test    router: e1lyjaflmO    binding: edge_transport    address: tcp:host2:80    identity:     cost: 200    precedence: default    dynamic-cost: 0
results: 1-4 of 4
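
To make the per-session behaviour concrete, here is a minimal Python sketch of how sessions might be pinned to the four terminators above. It is a simplified model, not Ziti's actual selection code: it assumes the terminator with the lowest static cost plus dynamic cost wins, and that each open session adds a hypothetical fixed amount (`SESSION_COST`) to the chosen terminator's dynamic cost. The class and function names are illustrative only.

```python
from dataclasses import dataclass

# Hypothetical amount added to a terminator's dynamic cost per open session.
SESSION_COST = 60


@dataclass
class Terminator:
    id: str
    router: str
    address: str
    cost: int              # static cost, as set with --cost
    dynamic_cost: int = 0  # grows with open sessions in this model

    @property
    def total_cost(self) -> int:
        return self.cost + self.dynamic_cost


def open_session(terminators):
    """Pick the cheapest terminator; the session keeps it for its lifetime."""
    chosen = min(terminators, key=lambda t: (t.total_cost, t.id))
    chosen.dynamic_cost += SESSION_COST
    return chosen


def close_session(terminator):
    """Release the load that the session added when it was opened."""
    terminator.dynamic_cost -= SESSION_COST


terminators = [
    Terminator("3dKl", "edge-router-1", "tcp:host1:80", 100),
    Terminator("aAYa", "edge-router-1", "tcp:host2:80", 200),
    Terminator("3bz3", "edge-router-2", "tcp:host1:80", 100),
    Terminator("Zvoa", "edge-router-2", "tcp:host2:80", 200),
]

# Open six sessions and record which application server each one is pinned to.
sessions = [open_session(terminators) for _ in range(6)]
picked = [t.address for t in sessions]
print(picked)
```

With these assumed numbers, the first four sessions land on host1 and the next two spill over to host2 once host1's terminators have accumulated enough dynamic cost. Each session stays on the terminator it was given, which is why a single session cannot exceed what one terminator can carry.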

Hope that’s helpful.

Cool!! Thanks for sharing the beautiful concept on terminator usage.
Kind of interesting - “can run multiple routers and tie each one to a different interface”

One connected query:

  1. Can edge-router-1 and edge-router-2 be in different cloud providers for the same service “test”, i.e. one app server hosted in AWS and the other in Azure?

  2. Can edge-router-1 and edge-router-2 be in different regions/sites for the same service “test”, but on the same cloud provider, say AWS?

From my reading of the documents, I guess the answer to the 2nd query is yes. I was not sure about the 1st query, though. Could you please clarify both? :slight_smile:

Thanks
Anantha

Hi Anantha,
Yes, edge routers can be hosted anywhere. You can have a mesh which spans AZs, regions and providers. You don’t need full visibility in both directions. The mesh can be established as long as you can connect in one direction. So you could have routers behind a firewall which join the mesh by connecting out to routers that are reachable on the public internet.

One other thing I forgot to point out is that you have the option of embedding the Ziti SDK in your hosting applications. If you do this, you don’t have to create terminators manually. The SDK will connect to one or more edge routers, which will create terminators on the fly. You can have the same kind of setup with M routers to N hosting applications using the SDK, only managed dynamically from the application.

Cheers

Hi Team,
I have this setup now.

  1. The Laptop running a client app.
  2. Server App is running on a VM at AWS site 1 and on a VM at AWS site 2.
  3. Hence I configured two NetFoundry GWs:
    • NFGW-1 connecting to the Server App VM on Site 1.
    • NFGW-2 connecting to the Server App VM on Site 2.
  4. Created an AppWAN containing all three of these:
    • So the client can now connect to the Server App on either the Site-1 VM or the Site-2 VM.
      Things are working fine.

My queries are:

  1. Is there a way to associate a cost with each path to Site-1 and Site-2?
  2. Is there a way to set up an active-standby option?
  3. How many clients can one NFGW handle? Any guidance on this?
  4. Is there any reference data available on NFGW: supported number of connections vs. throughput performance vs. CPU/network resource requirements?

Thanks
Anantha

Hi Anantha,

Please find the responses here:

  1. Is there a way to associate a cost for each path to Site-1 & Site-2?
    Cost is calculated (not manually assigned) as follows:
    • Cost is proportional to the number of open sessions
    • Dial failures drive the cost up
    • Dial successes drive the cost down, but only by as much as it was previously driven up by failures
    • The session can react to terminator changes, such as when a terminator is added to or removed from a service. The service is also notified via a Notify event whenever a session dial succeeds or fails and when a session for the service is ended.

  2. Is there a way for Active-Standby Option?
    It is active/active from NF at the network layer.

  3. How many clients can one NFGW handle? Any guidance on this?
    ERs do not decide the number of clients; that is a factor of the network. A network can have many endpoints/ERs. ERs do have throughput and concurrent-session guidelines, covered in point 4.

  4. Also, is any reference data available on NFGW: supported # of connections vs throughput performance vs CPU/network resource requirements?
    Please find it here - Edge Router VM Sizing Guide – NetFoundry
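
As a rough illustration of the cost rules listed in the first answer above, here is a small Python model. It is only a sketch of the stated rules, not the fabric's real bookkeeping: the class name `DynamicCost` and the three constants (cost per session, failure penalty, success credit) are hypothetical values picked for the example.

```python
class DynamicCost:
    """Toy model of the calculated-cost rules described above."""

    def __init__(self, per_session=1, failure_penalty=20, success_credit=5):
        # All three constants are hypothetical values chosen for illustration.
        self.per_session = per_session
        self.failure_penalty = failure_penalty
        self.success_credit = success_credit
        self.open_sessions = 0
        self.failure_cost = 0  # the part of the cost caused by dial failures

    def dial_failed(self):
        # Dial failures drive the cost up.
        self.failure_cost += self.failure_penalty

    def dial_succeeded(self):
        # A success only repays earlier failure cost; it never goes below zero.
        self.failure_cost = max(0, self.failure_cost - self.success_credit)
        self.open_sessions += 1

    def session_ended(self):
        self.open_sessions = max(0, self.open_sessions - 1)

    @property
    def cost(self):
        # Cost is proportional to open sessions, plus any unpaid failure cost.
        return self.open_sessions * self.per_session + self.failure_cost


dc = DynamicCost()
dc.dial_failed()
dc.dial_failed()
after_failures = dc.cost

for _ in range(10):
    dc.dial_succeeded()
after_successes = dc.cost

for _ in range(10):
    dc.session_ended()
idle = dc.cost

print(after_failures, after_successes, idle)
```

The point of the `max(0, ...)` clamp is the third rule: successes can cancel out earlier failures, but they cannot push a terminator's cost below what its open sessions alone would give.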

With regard to link and terminator costs, and “active/standby” operation:

It isn’t exposed through the NetFoundry MOP interface, but static costs can be assigned to data plane links by using the ziti-fabric command-line utility:

$ ziti-fabric set link-cost --help
Set link cost

Usage:
  ziti-fabric set link-cost <linkId> <cost> [flags]

Flags:
  -h, --help                  help for link-cost
  -i, --identityName string   dotzeet identity name (default "default")
  -e, --mgmtEndpoint string   fabric management endpoint address

Global Flags:
  -v, --verbose   Enable verbose logging

If you give “standby” paths high cost values, Ziti will direct traffic over the lower-cost links.

Terminator costs can be adjusted using the ziti command line utility:

$ ziti edge update terminator --help
updates a service terminator

Usage:
  ziti edge update terminator <id> [flags]

Flags:
      --address string        Set the terminator address
      --binding string        Set the terminator binding
  -c, --cost int32            Set the terminator cost
  -h, --help                  help for terminator
  -j, --output-json           Output the full JSON response from the Ziti Edge Controller
      --output-request-json   Output the full JSON request to the Ziti Edge Controller
  -p, --precedence string     Set the terminator precedence ('default', 'required' or 'failed')
      --router string         Set the terminator router
      --timeout int           Timeout for REST operations (specified in seconds) (default 5)
      --verbose               Enable verbose logging

Setting a high base cost on a terminator will make Ziti prefer the lower-cost terminator. Setting a high enough cost on the standby terminator will produce an “active/standby” arrangement.

Open-source Ziti also includes an xt_ha terminator strategy, which natively does the active/standby failover arrangement. The default terminator strategy provisioned by the NetFoundry MOP is the xt_smartrouting strategy, which works as described above.
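
The active/standby effect of cost and precedence can be sketched in a few lines of Python. This is an assumption-laden illustration, not the xt_smartrouting or xt_ha code: the precedence names come from the help output above, but the ordering used here (required before default before failed, then lowest cost) and the `select` helper are my own simplification.

```python
# Assumed ordering: lower rank is preferred. The names match the CLI help text;
# the ranking itself is an assumption made for this sketch.
PRECEDENCE_RANK = {"required": 0, "default": 1, "failed": 2}


def select(terminators):
    """Return the id of the preferred terminator: best precedence, then lowest cost."""
    best = min(
        terminators,
        key=lambda t: (PRECEDENCE_RANK[t["precedence"]], t["cost"]),
    )
    return best["id"]


terminators = [
    {"id": "active", "cost": 100, "precedence": "default"},
    {"id": "standby", "cost": 10000, "precedence": "default"},
]

# Normal operation: the low-cost terminator takes the traffic.
normal = select(terminators)

# The active terminator is marked failed, so the standby takes over.
terminators[0]["precedence"] = "failed"
failover = select(terminators)

# Once the active terminator recovers, traffic returns to it.
terminators[0]["precedence"] = "default"
recovered = select(terminators)

print(normal, failover, recovered)
```

In this model the standby's cost only needs to stay above whatever the active terminator's cost can reach under load; if dynamic cost can close the gap, some sessions would start landing on the standby.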

Hi,
Thank you for these details.
I have created the EP, Services, Router GW and AppWAN, all using the NF console web UI. In the web UI I don’t see options to configure the link cost, etc.

I then logged in to the AWS NF Gateway VM. I see ziti running.
I don’t see ziti-fabric running on the NF Edge Router AWS VM.
So do I need to do something more or download any other tools? Or am I looking in the wrong place? Please advise.

Also, this CLI link only covers creating a service - Creating a Service | Ziti

But I have already created the service and things are working fine. I want to modify the link cost of the existing links so that I can check the HA behavior.

Can I get reference docs and help with issuing commands on the NF-GW VM running ziti to list the services, view the current link costs/configuration, and modify link costs?

Thanks
Anantha

You won't find ziti-fabric running. The ziti-router executables, along with a controller, are what make up a running Ziti Fabric; ziti-fabric is a command-line tool for interacting with that fabric. The best approach is to determine your ziti version and then download the corresponding release from Releases · openziti/ziti · GitHub. The easiest way to determine your version is probably to run ziti-router version. For example, my instance has v0.20.2:

ziti-router version
v0.20.2

Once you have ziti-fabric, you need to create an identities file at ~/.ziti/identities.yml with contents similar to mine here:

---
default:
  caCert:   "/home/ubuntu/.ziti/quickstart/ip-172-31-27-154/pki/ip-172-31-27-154-intermediate/certs/ip-172-31-27-154-intermediate.cert"
  cert:     "/home/ubuntu/.ziti/quickstart/ip-172-31-27-154/pki/ip-172-31-27-154-intermediate/certs/ip-172-31-27-154-edge-router-client.cert"
  key:      "/home/ubuntu/.ziti/quickstart/ip-172-31-27-154/pki/ip-172-31-27-154-intermediate/keys/ip-172-31-27-154-edge-router-server.key"
  endpoint: tls:127.0.0.1:10000

You can find these paths by inspecting your router's config file. One small difference is that identities.yml wants the field named "caCert", whereas the router config calls it just "ca".
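
If it helps, the field renaming can be sketched as a small Python helper that renders an identities.yml entry from the identity values copied out of a router config. This is only an illustration: the function name is made up, the paths are placeholders, and it assumes the router config's identity section uses the keys cert, key and ca (only "ca" is confirmed above).

```python
def identities_yml(router_identity, endpoint, name="default"):
    """Render an identities.yml entry from a router config's identity fields."""
    # The router config names the CA bundle "ca"; identities.yml expects "caCert".
    fields = {
        "caCert": router_identity["ca"],
        "cert": router_identity["cert"],
        "key": router_identity["key"],
    }
    lines = ["---", f"{name}:"]
    for key, value in fields.items():
        lines.append(f'  {key}: "{value}"')
    lines.append(f"  endpoint: {endpoint}")
    return "\n".join(lines) + "\n"


# Placeholder paths; substitute the values from your own router config.
router_identity = {
    "cert": "/path/to/edge-router-client.cert",
    "key": "/path/to/edge-router-server.key",
    "ca": "/path/to/intermediate.cert",
}

text = identities_yml(router_identity, "tls:127.0.0.1:10000")
print(text)
```

The printed text can be saved as ~/.ziti/identities.yml. The endpoint value is the one from my example above and will differ if your management endpoint listens elsewhere.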

You'll know you've created the file properly when you execute ziti-fabric list routers and see results:

ziti-fabric list routers

Routers: (1)

Id           | Name                           | Fingerprint                              | Status    | Version
sl9g31WqZ    | ip-172-31-27-154-edge-router   | 0591329b8d62b960b5152c6708f451e3b06b416c | Connected (tls:ec2-18-216-136-201.us-east-2.compute.amazonaws.com:10080) | v0.20.2

We don't have end-user facing docs for this yet but if you ask questions - we'll keep answering them. :slight_smile:

Hopefully this helps

One more thing has just occurred to me. Port 10000 is not exposed at this time via the NetFoundry console. Are you using a controller from the openziti.org distribution or are you using a NetFoundry console provided solution? If the latter - I fear this won’t work for you at this time.

Hi Anantha,
For an Active-Standby configuration, the best option is likely to use either the tunneler or the router running in tunneler mode.

Each service being hosted by a tunneler can be configured with one or more terminators using the host.v2 config type.

Each terminator is configured with:

  1. The address to dial when the terminator is selected for a session
  2. Zero or more health checks. We currently support port checks and http checks.
  3. Each health check can have a threshold and an action.
  4. Health check rule thresholds include number of consecutive pass/fail checks and duration of passing or failed state.
  5. Health check rule actions include raising/lowering cost, marking terminators failing/healthy.

Here’s a sample configuration with one terminator.

{
    "terminators" : [ 
        {
            "protocol" : "tcp",
            "hostname" : "localhost",
            "port" : 8171,
            "portChecks" : [
                {
                    "interval" : "5s",
                    "timeout" : "500ms",
                    "address" : "localhost:8171",
                    "actions": [
                        {
                            "action": "mark unhealthy",
                            "consecutiveEvents": 10,
                            "trigger": "fail"
                        },
                        {
                            "action": "increase cost 100",
                            "trigger": "fail"
                        },
                        {
                            "action": "mark healthy",
                            "duration": "10s",
                            "trigger": "pass"
                        },
                        {
                            "action": "decrease cost 25",
                            "trigger": "pass"
                        }
                    ]
                }
            ]
        }
    ]
}
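
To show how the four actions in that sample might play out over time, here is a toy Python replay of the check results. It reflects my reading of the config rather than the tunneler's actual code: the class name is hypothetical, the cost is assumed to start at 0 and never drop below 0, and the "duration" rule is assumed to count time since the first pass of the current passing streak.

```python
class PortCheckModel:
    """Toy replay of the sample portChecks actions; not the tunneler's code."""

    def __init__(self, interval_s=5):
        self.interval_s = interval_s  # matches "interval": "5s"
        self.cost = 0                 # assumed starting cost
        self.healthy = True
        self.fail_streak = 0
        self.pass_streak = 0

    def record(self, passed):
        """Apply the configured actions for one check result."""
        if passed:
            self.fail_streak = 0
            self.pass_streak += 1
            self.cost = max(0, self.cost - 25)      # "decrease cost 25"
            passing_for_s = (self.pass_streak - 1) * self.interval_s
            if passing_for_s >= 10:                 # "mark healthy", duration 10s
                self.healthy = True
        else:
            self.pass_streak = 0
            self.fail_streak += 1
            self.cost += 100                        # "increase cost 100"
            if self.fail_streak >= 10:              # "mark unhealthy", 10 in a row
                self.healthy = False


m = PortCheckModel()

for _ in range(9):
    m.record(False)
after_9_fails = (m.healthy, m.cost)

m.record(False)
after_10_fails = (m.healthy, m.cost)

m.record(True)
m.record(True)
after_2_passes = (m.healthy, m.cost)

m.record(True)
after_3_passes = (m.healthy, m.cost)

print(after_9_fails, after_10_fails, after_2_passes, after_3_passes)
```

Under these assumptions, the cost rises quickly on failure and falls slowly on recovery, so a flapping terminator stays expensive for a while even after it is marked healthy again.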

I’m hoping to put together some better documentation for this in the near future. Please let me know if you have follow up questions.

Cheers,
Paul