Logging and Security Review

Hello,

When using OpenZiti to make a service 'Dark' to the internet, but accessible via a Ziti Service, I notice the logs for my service show "127.0.0.1" as the requesting IP that is connecting. I understand this is from the tunneller, but what I'm struggling to understand is how do we know the true source IP of the client for logging/security/auditing purposes?

Typically reverse proxies, for example, will add an X-Forwarded-For header that many services support reading and use to log requests. If a user takes actions that could be considered malicious (within the context of the hosted app), you'd know right away what IP did it.

It's a fair question. I'm surprised I've never seen this particular question yet. I don't know if we have an answer to that. Historically, OpenZiti has refrained from any L7 type of activity. I do think it would be interesting to know the OpenZiti identity making the request; however, I don't think we have anything that adds this information.

We have a zitified nginx module and a zitified Caddy, among other things, but the tunnelers themselves don't do this. (Unless I'm forgetting something.)

I'll mention it to the team and see what our collective agreement is on this matter. I don't know where I land on it just yet myself!

Thanks for bringing it up. I suspect someone will follow up soon.

ziti-edge-tunnel logs a message like this when it receives a connection on the overlay:

INFO tunnel-cbs:ziti_hosting.c:642 on_hosted_client_connect() hosted_service[ssh] client[zet.fedora-41-vm] client_src_addr[tcp:100.64.1.1:45974] dst_addr[tcp:127.0.0.1:22]: incoming connection
  • client is the name of the ziti identity that initiated the connection. I think this is the most (possibly the only) meaningful piece of information that you have at the hosting tunneler.
  • client_src_addr is the source ip:port of the intercepted connection, as observed by the intercepting tunneler.
  • dst_addr indicates the underlay address that the tunneler will connect to in order to complete the connection. If address and/or port forwarding is enabled in the host.v1 configuration, dst_addr will reflect the IP/hostname/port that was intercepted by the sending tunneler. If forwarding is not enabled then you'll see the literal host/port from the host.v1 configuration. This might be better named something like calculated_dst_addr.
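
If you want to pull those three fields out of the hosting tunneler's log for auditing, something like the following might work. This is only a sketch: it assumes the log line keeps the exact `name[value]` layout of the sample above, which may differ between ziti-edge-tunnel versions, and `parse_hosted_connect` is a made-up helper name, not part of any OpenZiti tooling.

```python
import re

# Assumption: the field layout matches the sample log line quoted above.
LINE_RE = re.compile(
    r"hosted_service\[(?P<service>[^\]]+)\]\s+"
    r"client\[(?P<client>[^\]]+)\]\s+"
    r"client_src_addr\[(?P<client_src_addr>[^\]]+)\]\s+"
    r"dst_addr\[(?P<dst_addr>[^\]]+)\]"
)


def parse_hosted_connect(line):
    """Return the connection fields as a dict, or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return match.groupdict()


sample = (
    "INFO tunnel-cbs:ziti_hosting.c:642 on_hosted_client_connect() "
    "hosted_service[ssh] client[zet.fedora-41-vm] "
    "client_src_addr[tcp:100.64.1.1:45974] dst_addr[tcp:127.0.0.1:22]: "
    "incoming connection"
)

fields = parse_hosted_connect(sample)
print(fields["client"], fields["client_src_addr"], fields["dst_addr"])
```

The `client` value is the one worth keeping, since it is the ziti identity rather than an address that only makes sense on the intercepting host.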

If you're seeing client_src_addr values of "127.0.0.1" then I'm guessing the connection was intercepted by an edge router with tunneling enabled. The edge router uses TPROXY iptables rules to intercept connections (overview), but the TPROXY rule is only valid in the PREROUTING chain of the MANGLE table (doc, flowchart). So, in order to intercept connections from local processes, the edge router/tunneler needs to route connections to intercepted IPs/subnets through a local interface so they traverse the mangle/PREROUTING chain.

The "100.64.0.1" ip that you see on connections that were intercepted by ziti-edge-tunnel is there for a similar reason. ziti-edge-tunnel uses its own network interface (the so-called tun interface) to read packets, but we need operating system routes to bring the packets to the tun interface. So the IP you see here is whatever IP is assigned to the tun (100.64.0.1 by default).

In both cases (zet or er/t interception) the client_src_addr IP that you see in the hosting tunneler's log is technically correct. It's just that when a connection is intercepted from a local process the source address is especially meaningless unless you're on the host that's running the intercepting tunneler.

It is possible to intercept connections from other hosts. Of course you need to set up routing on your client hosts for that to work. For these connections, you'll see the lan ip:port of the intercepted client in client_src_addr, but even then, I think the likelihood of those addresses being meaningful at the hosting tunneler is low.

This depends on whether the hosting device is a tunneler or an Edge Router. In either case, the controller can emit events, depending on its configuration, that report the time and the identity of each service access. In many cases this is enough to track usage, but it can fail as the application scales up, because the gap between dials shrinks to within the timing variance you can rely on. If there is one connection every 10 or 15 seconds, it's pretty easy to tell which is which, provided everything is synced to a common clock. As the rate increases, it becomes more difficult to determine with precision.
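
To make the timing problem concrete, here is a rough sketch of time-only correlation. Everything in it is made up for illustration: the identity names, the timestamps, and the 5 second tolerance are assumptions, and the dial list stands in for whatever you extract from the controller events.

```python
from datetime import datetime, timedelta


def candidate_identities(app_time, dials, tolerance_seconds=5.0):
    """Return identities whose dial time is within the tolerance of app_time.

    dials is a list of (timestamp, identity) pairs taken from event records.
    More than one result means time alone cannot attribute the request.
    """
    tolerance = timedelta(seconds=tolerance_seconds)
    return sorted({ident for ts, ident in dials if abs(app_time - ts) <= tolerance})


base = datetime(2025, 1, 1, 12, 0, 0)

# Sparse traffic: dials are 20 seconds apart, so the match is unambiguous.
sparse = [(base, "alice"), (base + timedelta(seconds=20), "bob")]
print(candidate_identities(base + timedelta(seconds=1), sparse))

# Dense traffic: dials are 2 seconds apart, so both fall inside the tolerance.
dense = [(base, "alice"), (base + timedelta(seconds=2), "bob")]
print(candidate_identities(base + timedelta(seconds=3), dense))
```

With the sparse list only one identity comes back; with the dense list both do, which is the overlap described above.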

If the hosting device is an Edge Router, the translated socket information is in the event records. The IP:PORT that the Edge Router used is reported in the fabric.circuit created event record along with the identity, service, path, and a lot more information. This information can be further enriched, and combined with other information, such as the recorded IP of the API session the identity has used to authorize the dial, to get a complete picture of the identity and the details of geoIP data and human readable names. Similarly, if the initiation of the connection is done via a gateway model ER, the initiator details are reported. A lot of this enrichment is done in the NetFoundry service, or you can enrich it yourself by combining the records as necessary to meet your particular needs.

While I completely understand this, there are quite a few use cases for at least adding header information (just as reverse proxies do) to requests sent to services through tunnellers. This could be an optional component of a service, e.g. in the advanced options, specify adding headers and be able to use variables such as "source ip" (the ACTUAL IP of the client, not the ziti IP) or the identity information (which could be used by the application to authorize actions/verify login via JWTs or auth headers). Another option would be to provide built-in Let's Encrypt services for internal services, to avoid certificate errors without adding a complicated reverse proxy.

That's great news, and will give some information. However, the actual application may not have a way to correlate this with the tunneler logs (it would just say '127.0.0.1' tried to log in 50 times with the wrong password, or tried to access a resource it was not supposed to, etc.).

As mentioned above, this doesn't provide great insight into the actions within the application's context - you are trying to line up some times in 2 different logs hoping they are correlated, but for true auditing you'd want a concrete way to identify the identity/ip/person within the app.

Thanks all for your replies :slight_smile:

The logs, either tunneler or events, will contain not just the IP, but the port. That complete socket should match up with access and application logs to provide concrete evidence of the identity associated with that endpoint. If the application logs do not provide L4 information, you are already at a significant disadvantage in a NATed source environment.
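
As a rough illustration of matching on the complete socket, the sketch below joins application log entries to overlay records by ip:port. The record shapes and field names (`ip`, `port`, `identity`, `remote_ip`, `remote_port`, `action`) are hypothetical; they assume you have already extracted the socket and identity from the tunneler logs or event records, and that your application logs the remote port.

```python
def index_by_socket(overlay_records):
    """Map (ip, port) to the identity reported for that socket."""
    return {(r["ip"], r["port"]): r["identity"] for r in overlay_records}


def attribute(app_entries, overlay_records):
    """Attach an identity to each app log entry, or None when no socket matches."""
    by_socket = index_by_socket(overlay_records)
    return [
        {**entry, "identity": by_socket.get((entry["remote_ip"], entry["remote_port"]))}
        for entry in app_entries
    ]


# Hypothetical data: both connections arrive from 127.0.0.1, only the port differs.
overlay_records = [
    {"ip": "127.0.0.1", "port": 51844, "identity": "alice-laptop"},
    {"ip": "127.0.0.1", "port": 51850, "identity": "bob-phone"},
]
app_entries = [
    {"remote_ip": "127.0.0.1", "remote_port": 51850, "action": "failed login"},
    {"remote_ip": "127.0.0.1", "remote_port": 40000, "action": "page view"},
]

for row in attribute(app_entries, overlay_records):
    print(row["action"], "->", row["identity"])
```

Ports get reused over time, so in practice you would probably also bound the match by timestamp rather than join on the socket alone.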