AWS VPC to VPC via Ziti

Hi folks, I had a question. If we have a data engine, say, in AWS VPC1 and the data source itself in another AWS VPC2, is there a way to use Ziti so that all flows of actual data stay within the region (assume both VPCs are in the same region)?

Is there a way to use attribute tags and policies to define the data flows (if we want to keep all data flows within a region to avoid data transfer charges)? Can you suggest a way to address this? I'm trying to make sure data transfer stays within an AZ (as it's free) but is secure and scalable. If you can help with a setup approach, that would be most appreciated.

I have a suggestion for that.
Assume AWS VPC1 hosts the backend and AWS VPC2 contains the database. Instead of using a NAT Gateway, you can route database requests from AWS VPC1 through a Ziti router. In the Ziti Controller or Console, configure the intercept and host mappings to securely manage this traffic. By routing requests through the Ziti router, you establish Zero Trust control, ensuring secure, direct communication. This approach not only enhances security but also helps avoid data transfer costs by keeping data flows within the same Availability Zone (AZ), provided the backend and database are carefully placed in the same AZ, thereby eliminating the need for a NAT Gateway.
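To make the intercept and host mappings concrete, here is a minimal CLI sketch of the setup described above. All names, the database address (10.0.2.50:5432), and the role attributes (#backend, #db-hosts) are illustrative assumptions, not taken from this thread:

```
# host.v1 config: the Ziti router/tunneler in VPC2 forwards to the real database
ziti edge create config db-host.v1 host.v1 \
  '{"protocol":"tcp","address":"10.0.2.50","port":5432}'

# intercept.v1 config: the backend in VPC1 dials a Ziti-only address
ziti edge create config db-intercept.v1 intercept.v1 \
  '{"protocols":["tcp"],"addresses":["db.ziti"],"portRanges":[{"low":5432,"high":5432}]}'

# service tying the two configs together
ziti edge create service db-service --configs db-host.v1,db-intercept.v1

# backend identities may dial; the hosting identity in VPC2 may bind
ziti edge create service-policy db-dial Dial --service-roles '@db-service' --identity-roles '#backend'
ziti edge create service-policy db-bind Bind --service-roles '@db-service' --identity-roles '#db-hosts'
```

With this in place, the backend would connect to db.ziti:5432 instead of the database's real address, and the traffic flows over Ziti rather than a NAT Gateway.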

I am also a user, and maintainers are welcome to provide additional comments or suggestions on this

The suggestion from @ss_vinoth22 is exactly what I would recommend. I'm not a pro when it comes to inter-VPC communication, but I would hope/assume that one VPC can peer to another? Then, as @ss_vinoth22 mentions, you'd deploy one or more routers in one VPC and one or more in the other.

Right now, the only way to be entirely sure that traffic travels exclusively through the VPC peering arrangement would be to define a service-edge-router-policy and create two routers. Give dial (access) privileges to the service from one private router and bind (offload) privileges to the other private router via the service edge router policy. Then I would define a link group and have the two private routers link to one another, and only to each other. If you then need to give users access to the data engine or to the database, you would need to define two other private routers, plus a second service that's basically the same as the first one but only usable by users (the non-private identities).

This has the challenge of dealing with service edge router policies, which can get tricky... But this would ensure the traffic can never exit the VPC.
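As a sketch of the policy piece, assuming the two private routers are named private-rtr-vpc1 and private-rtr-vpc2 and the service is db-service (all hypothetical names), the service-edge-router-policy could look like:

```
# restrict this service to the two private routers only
ziti edge create service-edge-router-policy db-serp \
  --service-roles '@db-service' \
  --edge-router-roles '@private-rtr-vpc1,@private-rtr-vpc2'
```

With the service pinned to those two routers, the fabric has no other routers available when building a path for it.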

Hopefully that makes sense? If not let us know.


Thanks @TheLumberjack and @ss_vinoth22, much appreciated. For the user... why 2 private routers? (The user in this case can/should only access the data engine and not the DB.) Also, if the traffic is always one-way, Engine to DB (and get results) and User to Engine, is there a simplified approach feasible?

I did something similar to this; you can refer to it. You can ignore the Kubernetes parts, just look at the diagram for the flow.

Thinking about it more, you could probably use one router in each VPC if the costs associated with the public routers are extraordinarily high. It might be worthwhile to get @plorenz to comment as well, or possibly @mike.gorman. I've not spent a ton of time with router costing, but my still-naive understanding of controller path selection is that it's possible for a router's cost to go so high that a different path is selected. I've just not done it enough to know if there are any gotchas to be had. I was being extra cautious with this approach because I know that it would prevent the private routers from being able to link to the public router. I just want to be certain that the data would never go private router VPC1 -- public router -- private router VPC2.

At some point in the future, we'll have explicit control over pathing and more robust path logic, but with what we have right now I wanted to provide a mechanism that I was sure would never end up costing you egress traffic.

I'm interested in anything @plorenz or @mike.gorman add! :slight_smile:

Thanks @TheLumberjack and @ss_vinoth22 again. Egress costs are such a rip-off :frowning: ! @plorenz / @mike.gorman any thoughts are most appreciated

Link groups are available now, so you can force routers to only form links with other routers. You can also apply manual cost to nodes, so that you could increase the cost of the public routers by a large margin. This would make them much more "expensive" in the path calculation algorithm, and very unlikely to be used except in the case of failure, etc. We have certainly seen cases where routers in the same geographic area, like Northern Virginia, where there is a large concentration of data centers, have been involved in paths one didn't intend. The base path costing is latency, and when they are all so small, the variations can be enough to make the costs lower on the other paths.

Nodes have a default value, 10 I believe, so it is already less desirable to take a hop through two routers than one, even if the paths have roughly the same latency, but as I said, it can be increased manually.

So, in the end, there are a few paths you can take, depending on exactly what you need to achieve.

It appears link groups are the most deterministic approach of the above? We thought we could have some sort of attribute-based policy, but I guess that's not among the options above!

The link groups are created by applying tags to the configuration on the dialers and listeners, so they are attribute based. The issue there is that you probably want links to the public routers, for reachability, so you can't cut them off. We utilize it to implement data sovereignty, on networks that stretch into multiple countries or continents, and need to regionalize the connectivity.
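For illustration, link groups are set in each router's configuration file, under the link listeners and dialers. A minimal sketch, assuming a shared group name vpc-pair and illustrative bind/advertise addresses (check the OpenZiti router configuration reference for the exact schema in your version):

```yaml
link:
  listeners:
    - binding: transport
      bind: tls:0.0.0.0:6004
      advertise: tls:private-rtr-vpc2.internal:6004
      groups:
        - vpc-pair   # only routers dialing this group will form links here
  dialers:
    - binding: transport
      groups:
        - vpc-pair   # this router only dials links within the same group
```

If both private routers carry only the vpc-pair group, they will link to each other and to nothing else, which is the isolation described above.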

I think you are referring to a situation where you might have 2 VPCs in a region, the public router may be nearby, and you want to avoid exiting the region to avoid egress charges. In that case, I would go with costing up the publicly available routers. If there is no other path, for example for remote access, then traffic has to go through there regardless, but the local data transfer would avoid it. The cost is technically unitless, but adding a value like 200 is directly comparable to a 200ms latency on a path (there are other things in the real cost that aren't ms). You could add 1000 or so and really make it unattractive.

I could be wrong about what you're trying to achieve, of course.

To go the costing route, the command path is below.

```
Usage:
  ziti fabric update router <idOrName> [flags]

Flags:
  -i, --cli-identity string   Specify the saved identity you want the CLI to use when connect to the controller with
      --cost uint16           Specifies the router cost. Default 0.
      --disabled              Disabled routers can't connect to controllers
      --fingerprint string    Sets the router fingerprint
  -h, --help                  help for router
  -n, --name string           Set the router name
      --no-traversal          Disallow traversal for this edge router. Default to allowed(false).
  -j, --output-json           Output the full JSON response from the Ziti Edge Controller
      --output-request-json   Output the full JSON request to the Ziti Edge Controller
      --tags stringToString   Custom management tags (default [])
      --timeout int           Timeout for REST operations (specified in seconds) (default 5)
      --verbose               Enable verbose logging
```
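For example, to apply the cost of roughly 1000 suggested above (the router name is illustrative):

```
ziti fabric update router public-router-1 --cost 1000
```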

thanks all. Will check these and get back with results

One other thing you can do to control routing is to mark a router as 'no traversal', if you only want it used for ingress or egress to the fabric. That way, small latency variations won't lead to undesired traffic bouncing back through what are supposed to be only edge nodes.
Giving those nodes high costs is similar, but allows them to be used if there's no other path available otherwise.
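Using the same command shown earlier, marking a public router as ingress/egress-only would look like this (router name illustrative):

```
ziti fabric update router public-router-1 --no-traversal
```

Unlike a high cost, this removes the router from mid-path consideration entirely; it can still originate or terminate circuits.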

Cheers,
Paul

Hi, considering the scenario: the engine/database pairs can be many, as there are many users in our OpenZiti network, each containing these pairs.

We would like to deploy our own public routers in each region and want to make sure the path calculation for every user does not involve reaching the routers of other users.

As every user has a userid attribute in their router and service naming, it would be great to control the communication path in isolation for each userid. That means user1's engine could only use our public routers and its own routers deployed in the database VPC to form a route/communication path.
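If the routers and services already carry a per-user role attribute, one service-edge-router-policy per user could pin that user's services to their own routers plus the shared public routers. A sketch under that assumption (the attribute names #user1 and #public are hypothetical):

```
# user1's services may only use user1's routers and the shared public routers
ziti edge create service-edge-router-policy user1-serp \
  --service-roles '#user1' \
  --edge-router-roles '#user1,#public'
```

Repeating this per userid keeps each user's path calculations from ever considering another user's routers.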

Also, I was wondering: during this route path, when communication happens, is the actual data payload hopped from router to router? Or does the controller first determine the path (without passing the payload), obtain the final destination address, and only then pass the data payload?

I was also assuming that, say we have routers named router1 to router100: even if router100's policy does not allow router99 traffic, it is still not guaranteed that router99 won't be indirectly involved in the route/communication path.

So I wanted to make sure we have control over the route path itself.

@ss_vinoth22, a related question I have on this: do OpenZiti logs store the database passwords used to connect (talking about the k8s example here)? Can we make sure Ziti never stores sensitive info automatically?