What would be the structure of the fabric if I put a local router in each site?

I have been reading up on the threads about adding additional edge routers. This thread seemed particularly helpful, but I want to make sure it is still current and that I have the right idea.

  • My JP network will have access to EVERYTHING, the whole of the fabric… only the applied role attributes affect the reach.
  • Each client will have their own local router, which would intercept and deliver all traffic for their local environment. Each will have an attribute like #bob, #sally, etc.
  • Each client will have mobile devices that will need to route through the public router, through the local router, and on to the resources within. This would be enabled by role attributes (#bob.mobile/#sally.mobile in this case).

How many policies do I need?
What roughly does each policy allow?

"need" is a funny word. I assume you mean that bob should only have access to bob's services, sally only to sally's services... But is there only bob and sally? or "bob developers", "bob hr people", "bob qa people", "bob administrators" etc.? There's no one answer here. You need as many policies as you... well.... "need".

Since I'll assume you are referring specifically to "service policies" (as there is more than one type of policy), you'd want to have at least one policy per tenant. The "bob.dial.policy" could grant "#bob.identities" access to "#bob.services". Then you would mark all identities for the "bob tenant" with the "bob.identities" attribute, and all the services associated with the "bob tenant" with "bob.services".

Then you would have "bob.host.policies" (it might be one policy, it might be many) granting identities the ability to host/bind services in the "bob network". You might choose to have "bob.hr.services" or "bob.developer.services", etc. Then you rinse and repeat for sally's identities/services. So it's not a fast/easy thing to give "one" answer to.
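For illustration, that could look something like this with the ziti CLI (the policy and attribute names mirror the examples above; "#bob.hosts" and the identity/service names are made-up examples, not anything you have to use):

    # grant bob's identities permission to dial bob's services
    ziti edge create service-policy bob.dial.policy Dial \
      --identity-roles '#bob.identities' --service-roles '#bob.services'

    # grant bob's hosting identities permission to bind bob's services
    ziti edge create service-policy bob.host.policy Bind \
      --identity-roles '#bob.hosts' --service-roles '#bob.services'

    # tag identities and services with the attributes the policies reference
    ziti edge update identity desktop.bob --role-attributes 'bob.identities'
    ziti edge update service bob.rdp.svc --role-attributes 'bob.services'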

In general, having at least one "public" router exposed to the internet is really important. The quickstart will make an "edge router policy" that grants "#all" identities the permission to use any edge router with a "#public" attribute on it. So if you use the quickstart, that'll be done for you. I'd make sure the "private" routers (bob/sally) don't get the "#public" attribute. You could then also make an edge router policy granting "#bob.identities" access to "#bob.routers" (etc).
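If you're not using the quickstart, the policy it creates is equivalent to something like this (the policy name here is arbitrary):

    # grant every identity access to any router tagged #public
    ziti edge create edge-router-policy public-routers \
      --edge-router-roles '#public' --identity-roles '#all'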

That help or muddy the waters?

You know how something makes perfect sense in your head… you ask a question based on that assumption then someone disabuses you of that notion? That happens to me a lot. LOL

So, I made a new diagram that I tried to simplify. This shows a single client site, one RDP service, and its dial/bind/host/intercept policies/configs. I have now become practiced at producing this basic svc/dial/bind/config, so my next PoC assumes this simple but repeatable setup.

However, I now want this to work under my shared controller and one on-prem edge router, so let's abstract that to a single location. I added my jp.localedgerouter location just to imply another service hosted elsewhere; I expect to set that up at a later time, once I have a repeatable client on-prem setup nailed down.

  • The Bob location has a server.bob, a desktop.bob and a laptop.bob.
  • The server.bob will host RDP, both of the computers can dial the bob.rdp.svc, and the traffic goes over the fabric on-prem.
  • The laptop.bob is able to dial the bob.rdp.svc from the public web (my public edgerouter).
  • To complicate things, there will be services I host. Let's say I am going to collect logs via a siem.jp, so I will need to apply a #siem.jp role attribute (for simplicity I will use a simple ssh service in place of the siem to stand up the PoC; see the sketch after this list).
  • I would want all local traffic for bob.rdp.svc to stay within the local lan and not traverse the wan, but still be over the fabric.
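For reference, the repeatable svc/config setup I have in mind for that ssh stand-in looks roughly like this (the config/service names and the intercept address are placeholders I made up):

    # host.v1: where the traffic lands (sshd on the hosting machine)
    ziti edge create config jp.siem.host.v1 host.v1 \
      '{"protocol":"tcp","address":"127.0.0.1","port":22}'

    # intercept.v1: what clients dial
    ziti edge create config jp.siem.intercept.v1 intercept.v1 \
      '{"protocols":["tcp"],"addresses":["siem.jp.ziti"],"portRanges":[{"low":22,"high":22}]}'

    # the service itself, tagged with the role attribute mentioned above
    ziti edge create service jp.siem.svc \
      --configs jp.siem.host.v1,jp.siem.intercept.v1 --role-attributes 'siem.jp'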

What I don't understand at this phase is what policies I need to make the EdgeRouters work. I see from the docs there are 3 running modes for the EdgeRouters: (1) Private Router w/ Edge, (2) Private Router w/ Edge & Tunneler, (3) Public Router w/ Edge. But none of them seems particularly desirable (I don't want to open ports).

So, here are my questions so far.

  1. Which of the three router config types do I need for bob.localedgerouter and jp.localedgerouter?
  2. What edge router policies do I need for bob.localedgerouter and jp.localedgerouter based on the router config type?
  3. How do I ensure the on-prem devices don’t have to go out over the wan and back again?
  4. Is my assumption correct that once I get the edge routers configured with the right policies, I just assign #role_attributes as necessary to accomplish what I want regarding the services and dial/bind/config?

At the bob location you want a 'private' edge router config, using something like: ziti create config router edge --private --routerName bobrouter. That means it's a router that does not advertise link listeners (the mesh doesn't try to connect TO this router), but the router can still accept edge connections (tunnelers/clients/sdks). You want the devices/identities on the "bob LAN" at the "bob location" to be able to connect to the "bob edge router" so that traffic stays on the LAN and doesn't need to traverse the WAN. (If you run that command once WITH --private and once without, you'll be able to see the only difference between the two invocations.)
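For example, a quick way to see that difference (assuming the default behavior of writing the rendered config to stdout):

    # render both variants and compare them
    ziti create config router edge --routerName bobrouter --private > bobrouter-private.yaml
    ziti create config router edge --routerName bobrouter > bobrouter-public.yaml
    diff bobrouter-private.yaml bobrouter-public.yaml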

Then you want to create an edge router policy that grants "#bob.identities" the rights to access the routers with a "#bob.routers" attribute (of which there may be only 1; it's pluralized just to illustrate you COULD have more than 1 at the bob location), like in that other thread you were referencing...
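Something along these lines (a sketch; the router name and attributes mirror the examples above):

    # tag the router, then grant bob's identities access to routers so tagged
    ziti edge update edge-router bob.localedgerouter --role-attributes 'bob.routers'
    ziti edge create edge-router-policy bob.routers.policy \
      --edge-router-roles '#bob.routers' --identity-roles '#bob.identities'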

You MUST have open ports on that edge router, but they only need to be "locally available" to the machines on that LAN, NOT open through the firewall to the internet. A big difference, but still a necessity if you want to keep the traffic on the LAN.
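For instance, with ufw (just one way to do it, and I'm assuming the edge listener is on port 3022; check your router config for the actual port):

    # allow the edge listener from the local subnet only (subnet/port are examples)
    sudo ufw allow from 192.168.1.0/24 to any port 3022 proto tcp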

Answered above. You want a "private" router: one that links outbound to other routers and accepts incoming connections from local clients, but does not advertise/expect other routers to link to it.

Also answered above, but reiterating it. You need an edge router policy that grants bob identities the rights to connect to the bob edge router. I don't understand what jp.localedgerouter is, but maybe by now it makes sense?

Answered above I hope

Yep. Personally, I would go the other way. Start with a public edge router, get all the services working the way you like, then add in the "local" edge routers, add the policies, and verify "local" identities are connecting to the "local" routers (probably using tcpdump/wireshark/etc).
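For example, something like this on the bob router's host (the interface, subnet and port are hypothetical; use your actual LAN values):

    # watch for edge connections arriving from the local subnet
    sudo tcpdump -ni eth0 'src net 192.168.1.0/24 and tcp port 3022'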

Hope that all helps?

That makes a TON of sense. My MVP for this PoC was an island with only inbound travel.


I am very much enjoying learning this project. It’s really forcing me out of some well-worn thought paths.


I see the benefit of the local routers to keep the traffic local and the distances short.
Does this also provide additional security? As long as there is only one controller and this controller is also publicly reachable, the controller is the point where an attacker could get control of the whole setup. It wouldn't even have to be an OpenZiti issue: maybe I forgot to close ssh, the cloud provider got hacked, …

Actually I'm thinking of doing 3 different OpenZiti setups: one controller/router reachable publicly, one controller/router reachable internally, and one controller/router for high security. The disadvantage of this setup is that the traffic between the different instances is not end-to-end encrypted.

Do you have a best-practice guide for adding additional security between security zones? Or am I thinking in totally the wrong direction? Maybe I'm still in the "old" security world.

There are a lot of reasons to run multiple networks, but I'd have a hard time thinking of it as a "normal" architecture. Usually it is done by an MSP or similar to manage multiple customers that have their own sovereignty concerns. You could have a situation where you have very highly classified data resources and want to provide a separate network to minimize the administrative access, etc. It increases the operations load, costs, etc., but certainly could be justified. However, if the same people and systems are running it, I would be hard pressed to see it as a gain. More touches means more chances for mistakes, more auditing to perform, etc. If you do have a sufficient reason, endpoints can use more than one identity, so they can be members of multiple networks simultaneously, removing any need for unencrypted inter-network traffic. Of course, this means there is a potential crossover point at the endpoint, so that must be taken into consideration.

Personally, if I were running all the networks myself, for myself (So I don’t have customers to worry about dedicated resources) I would focus my time on protecting the controller(s, once HA is available) with appropriate security controls. There are the normal OS logs, audit logs, and any protections you can apply at the OS or CSP level. We use a combination of CSP security groups to limit ssh access, ssh key authentication only, OS level audit logs to a SIEM, change audit logs from Ziti, CIS benchmark hardened instances, unattended security updates, and other controls to protect the security of our Network Controllers and Edge Routers.

More routers doesn't really mean more security, but for exactly the reason you state, they are beneficial. If you are using the network for East-West traffic, you would want the data plane to stay local. That reduces internet traffic and any associated costs, as well as latency, which improves performance, while still maintaining all the control and monitoring of the OpenZiti network.

I love to have security architecture discussions around OpenZiti, so if there are finer points you are thinking about, fire away.

The biggest threat is surely compromising the controller/controllers. Right now you can separate the two APIs controllers present: the edge API and the management API. The edge API must always remain public. The management API you can split off and ensure it's only accessible from "localhost" or a "trusted network". Doing that would prevent the controller's management API from being attacked at all.
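A sketch of what that split could look like in the controller's config file (the ports and public address are made up; edge-client/edge-management are the stock API bindings):

    web:
      - name: client-api
        bindPoints:
          - interface: 0.0.0.0:1280          # publicly reachable
            address: ctrl.example.com:1280   # hypothetical public address
        apis:
          - binding: edge-client
      - name: management-api
        bindPoints:
          - interface: 127.0.0.1:1281        # localhost only
            address: localhost:1281
        apis:
          - binding: edge-management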

Other than that, I don't find any value in making more than one network; use the policies available to control security instead. Also, OpenZiti isn't necessarily the authentication tier for the applications themselves. It's the authentication/authorization of the network, not of the app that runs over it. So, for example, if you're using zssh, you're still using sshd and openssh keys (or user/pwd, I guess) to actually auth to sshd. That basically means when you're using OpenZiti you're already using at least 2FA (the first factor being OpenZiti, the second being the ssh creds). You can then add more factors into OpenZiti itself as well, through the common 2FA, to authenticate even to the overlay if you wish.

Those are the ways I would probably go. The security zones are not controlled by the underlay, they're controlled by the overlay. Just some extra thoughts building on what @mike.gorman wrote.

Thank you for your answers. Good to hear that you are confident with a single (HA-redundant) controller/network setup.

@mike.gorman:
You are more experienced than me. For today I will start with internal servers/services only and decide afterwards how to go forward. But it could end up being a multiple-networks solution. I have Qubes OS running on the client side and also dedicated PVHs running, so each PVH only needs to be a member of one network, and OpenZiti will remove the need for WireGuard tunnels.

Whether a multi-network setup increases the operations load, costs, etc. depends on where you're coming from. In the traditional world you have several firewalls, zones, microsegmentation, VPN gateways, … . From that point of view I believe it lowers the operational load and costs.

Also, I have a fully automated server setup; I did not include the firewall rule setup in it, but I plan to include the OpenZiti service and policy setup.

I invested a lot into hardening and automation. Also, I found a good Ansible collection for that, in case someone is interested: GitHub - dev-sec/ansible-collection-hardening: This Ansible collection provides battle tested hardening for Linux, SSH, nginx, MySQL

Yesterday I took a short look at the Wazuh SIEM solution, but first I'll go forward with OpenZiti. How do you grep the change audit logs?

@TheLumberjack:
Yes, you are right. The overlay/app is additionally protected with TLS and certificates. Also, your tip from yesterday to bind only the loopback interface helps. Let's see where I end up.

The entity change information is in the release notes for 28.0. That will give you any configuration changes.

Perfect. Thank you.
Started with writing events to a file.

me too... doing that presently in fact.
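For anyone following along, the controller config for that looks something like this (the logger name and file path are just examples):

    events:
      jsonLogger:
        subscriptions:
          - type: entityChange
        handler:
          type: file
          format: json
          path: /var/log/ziti/entity-changes.log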

I have been vacillating about this one. The thing I keep coming back to is that I already have my hooks into each network, and that already exposes each network to a degree that each MSP has to deal with. Also, most of the multi-tenant solutions an MSP uses have some potential for cross-contamination. Right now I am looking at solving some specific challenges with Ziti, and as I gain familiarity and trust I will build more thorough SOPs and structures.

This is the approach I am taking at present. As for the edge API needing to be public, I tend to trust that enough to just accept it; after all, the CloudZiti commercial product has to do the same thing, AFAIK.

@Metz and @jptechnical - Did you know you could use OpenZiti to "lock down" Wazuh? You'll find the instructions from the lab setup, hosted on CloudZiti - https://support.netfoundry.io/hc/en-us/articles/14588893503373-NetFoundry-s-CloudZiti-Zero-Trust-overlay-for-secure-log-collection-and-management-of-SIEM-SOAR-platforms


I did not, but that was going to be my next thing after I got it installed and started pulling some logs. I had a helluva time with their docker compose deployment and found it exceedingly difficult to reset all the plaintext passwords in the repo... it shouldn't be that hard. So I am going for a local install next. It feels like such an antipattern to run something as a local install.
