Flow session view

Is there currently any way to view the sessions that are established between identities in OpenZiti (i.e. which identity dialed which bind)? In the Ziti GUI there seems to be a window to view “Sessions” and “API Sessions”, but it wasn’t clear to us which of these gave us the information we wanted, or whether they map to each other at all. Would there also be logs that could provide this information, and would they be hosted on the controller side? The logs in the $ZITI_HOME directory don’t seem to update as frequently as we’d expect given the connections we have going on, and we were not sure if there were additional logs somewhere.

We are primarily interested in being able to audit the flows in the ZT network. Thanks!

Hi @lzt,
If you haven’t yet, check out the controller events section of the Controller Configuration Reference in the OpenZiti docs.

Circuit events have most of what you need. You can tie them together with session events to get the identity for the dialing side and the terminator host id for the hosting side. There are also usage events, which will give you information on how much data is flowing across each circuit, bucketed by minute, for each router. The usage information is already tagged with the client and host identities. I’m going to try to do the same for circuits so you don’t have to correlate anything. I’ll post a follow-up with the results of that experiment.

Below is the events configuration I’m using to generate the events, including the commented out bits to show what other things are available, as well as some example events. Released versions of ziti can emit events to a file, the rolling of which is configurable. The next release of ziti will also support pushing to an AMQP broker. While it’s not directly related to your questions, we’re also adding entity change events in the next release, both for auditing and for external systems integrations.

If you have any questions, let me know,
Paul

events:
  jsonLogger:
    subscriptions:
#      - type: entityChange
#        include:
#          - services
#          - identities
      - type: fabric.circuits
#      - type: fabric.links
#      - type: fabric.routers
#      - type: fabric.terminators
#      - type: metrics
#        sourceFilter: .*
#        metricFilter: .*
      - type: edge.sessions
      - type: edge.apiSessions
      - type: fabric.usage
#      - type: services
#      - type: edge.entityCounts
#        interval: 5s
    handler:
      type: file
      format: json
      path: /tmp/ziti-events.log

Here are some sample events:

ApiSession Create

{
  "namespace": "edge.apiSessions",
  "event_type": "created",
  "id": "clgx1gwl3001897gdu4h0vzl6",
  "timestamp": "2023-04-25T21:50:15.544107122-04:00",
  "token": "3d3d0c13-b30b-48a2-95f9-677e740ee162",
  "identity_id": "fms1RWfjA4",
  "ip_address": "127.0.0.1"
}

Session Create

{
  "namespace": "edge.sessions",
  "event_type": "created",
  "session_type": "Dial",
  "id": "clgx1gwl8001a97gd0i8xylui",
  "timestamp": "2023-04-25T21:50:15.549266686-04:00",
  "token": "93c23214-26e5-4297-be7a-3022f6e9ef58",
  "api_session_id": "clgx1gwl3001897gdu4h0vzl6",
  "identity_id": "fms1RWfjA4",
  "service_id": "5Vidy7TwqzDfpuWgcDLZVQ"
}

Circuit Create

{
  "namespace": "fabric.circuits",
  "version": 2,
  "event_type": "created",
  "circuit_id": "2vC5NSD1S",
  "timestamp": "2023-04-25T21:50:15.58601573-04:00",
  "client_id": "clgx1gwl8001a97gd0i8xylui",
  "service_id": "5Vidy7TwqzDfpuWgcDLZVQ",
  "terminator_id": "27UOJs90gCjDDrYK3I1gk",
  "instance_id": "",
  "creation_timespan": 955013,
  "path": {
    "nodes": [
      "U7OwPtfjg"
    ],
    "links": null,
    "ingress_id": "Oev2",
    "egress_id": "216R"
  },
  "link_count": 0,
  "path_cost": 262140
}
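
The dialing-side correlation can be sketched from the Session Create and Circuit Create samples above, where the circuit's client_id matches the session's id. This is only a rough illustration, not an official tool: the helper names are made up, and it assumes the events file holds one JSON object per line (the samples here are pretty-printed for readability).

```python
import json


def index_sessions(events):
    """Map session id -> identity id using edge.sessions 'created' events."""
    return {
        e["id"]: e["identity_id"]
        for e in events
        if e.get("namespace") == "edge.sessions" and e.get("event_type") == "created"
    }


def dialing_identities(events):
    """Map circuit id -> dialing identity id by joining circuits to sessions.

    A circuit event's client_id is looked up as a session id; circuits whose
    session was not seen in the input map to None.
    """
    events = list(events)
    sessions = index_sessions(events)
    result = {}
    for e in events:
        if e.get("namespace") == "fabric.circuits" and e.get("event_type") == "created":
            result[e["circuit_id"]] = sessions.get(e["client_id"])
    return result


# Trimmed copies of the sample events, one JSON object per line (assumed layout).
sample_lines = [
    '{"namespace": "edge.sessions", "event_type": "created", '
    '"id": "clgx1gwl8001a97gd0i8xylui", "identity_id": "fms1RWfjA4", '
    '"service_id": "5Vidy7TwqzDfpuWgcDLZVQ"}',
    '{"namespace": "fabric.circuits", "event_type": "created", '
    '"circuit_id": "2vC5NSD1S", "client_id": "clgx1gwl8001a97gd0i8xylui", '
    '"service_id": "5Vidy7TwqzDfpuWgcDLZVQ"}',
]
events = [json.loads(line) for line in sample_lines]
print(dialing_identities(events))
```

Against a real file you would build the event list from the path configured in the handler (/tmp/ziti-events.log in the config above) instead of sample_lines.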

Usage

{
  "namespace": "fabric.usage",
  "version": 2,
  "event_type": "usage.ingress.tx",
  "source_id": "U7OwPtfjg",
  "circuit_id": "2vC5NSD1S",
  "usage": 47,
  "interval_start_utc": 1682473800,
  "interval_length": 60,
  "tags": {
    "clientId": "fms1RWfjA4",
    "hostId": "m7bVDWI5A4",
    "serviceId": "5Vidy7TwqzDfpuWgcDLZVQ"
  }
}
{
  "namespace": "fabric.usage",
  "version": 2,
  "event_type": "usage.ingress.rx",
  "source_id": "U7OwPtfjg",
  "circuit_id": "2vC5NSD1S",
  "usage": 47,
  "interval_start_utc": 1682473800,
  "interval_length": 60,
  "tags": {
    "clientId": "fms1RWfjA4",
    "hostId": "m7bVDWI5A4",
    "serviceId": "5Vidy7TwqzDfpuWgcDLZVQ"
  }
}
{
  "namespace": "fabric.usage",
  "version": 2,
  "event_type": "usage.egress.tx",
  "source_id": "U7OwPtfjg",
  "circuit_id": "2vC5NSD1S",
  "usage": 47,
  "interval_start_utc": 1682473800,
  "interval_length": 60,
  "tags": {
    "clientId": "fms1RWfjA4",
    "hostId": "m7bVDWI5A4",
    "serviceId": "5Vidy7TwqzDfpuWgcDLZVQ"
  }
}
{
  "namespace": "fabric.usage",
  "version": 2,
  "event_type": "usage.egress.rx",
  "source_id": "U7OwPtfjg",
  "circuit_id": "2vC5NSD1S",
  "usage": 47,
  "interval_start_utc": 1682473800,
  "interval_length": 60,
  "tags": {
    "clientId": "fms1RWfjA4",
    "hostId": "m7bVDWI5A4",
    "serviceId": "5Vidy7TwqzDfpuWgcDLZVQ"
  }
}
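
Since the usage events already carry the client and host identities in their tags, per-flow totals need no correlation at all. A minimal sketch, assuming the events have already been parsed into dicts; total_usage is a hypothetical helper, and the second event below is invented (a later interval) just to show the summing.

```python
from collections import defaultdict


def total_usage(events):
    """Sum the usage field per (clientId, hostId, serviceId, event_type).

    Only fabric.usage events are counted; anything else is skipped.
    """
    totals = defaultdict(int)
    for e in events:
        if e.get("namespace") != "fabric.usage":
            continue
        tags = e.get("tags") or {}
        key = (
            tags.get("clientId"),
            tags.get("hostId"),
            tags.get("serviceId"),
            e["event_type"],
        )
        totals[key] += e["usage"]
    return dict(totals)


tags = {
    "clientId": "fms1RWfjA4",
    "hostId": "m7bVDWI5A4",
    "serviceId": "5Vidy7TwqzDfpuWgcDLZVQ",
}
usage_events = [
    # Taken from the first sample usage event above.
    {"namespace": "fabric.usage", "event_type": "usage.ingress.tx",
     "circuit_id": "2vC5NSD1S", "usage": 47,
     "interval_start_utc": 1682473800, "interval_length": 60, "tags": tags},
    # Invented follow-on interval, for illustration only.
    {"namespace": "fabric.usage", "event_type": "usage.ingress.tx",
     "circuit_id": "2vC5NSD1S", "usage": 47,
     "interval_start_utc": 1682473860, "interval_length": 60, "tags": tags},
]
for key, total in total_usage(usage_events).items():
    print(key, total)
```

Keying on the event type keeps the four ingress/egress tx/rx counters separate rather than adding them into one number.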

Oh, the circuit events also include failure events, emitted when an attempt to set up a circuit fails. These can be very helpful for monitoring service health as well as debugging problems.

For context, here is the set of circuit dial failure causes:

	"INVALID_SERVICE"
	"ID_GENERATION_ERR"
	"NO_TERMINATORS"
	"NO_ONLINE_TERMINATORS"
	"NO_PATH"
	"PATH_MISSING_LINK"
	"INVALID_STRATEGY"
	"STRATEGY_ERR"
	"ROUTER_ERR_GENERIC"
	"ROUTER_ERR_INVALID_TERMINATOR"
	"ROUTER_ERR_MISCONFIGURED_TERMINATOR"
	"ROUTER_ERR_DIAL_TIMED_OUT"
	"ROUTER_ERR_CONN_REFUSED"

Cheers,
Paul

I was able to get the circuit events annotated, as the usage events are.

Example:

{
  "namespace": "fabric.circuits",
  "version": 2,
  "event_type": "created",
  "circuit_id": "NrnynTZdT",
  "timestamp": "2023-04-25T22:52:01.674155279-04:00",
  "client_id": "clgx3oc850010xwgdtfmx7kdm",
  "service_id": "5Vidy7TwqzDfpuWgcDLZVQ",
  "terminator_id": "15H4RlbJ8UA9bDTuZZTkVt",
  "instance_id": "",
  "creation_timespan": 916386,
  "path": {
    "nodes": [
      "U7OwPtfjg"
    ],
    "links": null,
    "ingress_id": "VQ06",
    "egress_id": "VXEj"
  },
  "link_count": 0,
  "path_cost": 262140,
  "tags": {
    "clientId": "fms1RWfjA4",
    "hostId": "m7bVDWI5A4",
    "serviceId": "5Vidy7TwqzDfpuWgcDLZVQ"
  }
}
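
With the tags in place, an audit record of who dialed whom can be read straight off a single circuit event. A small sketch based only on the fields in the example above; flow_record and its output key names are hypothetical, and events without tags (for instance from a release that predates this change) are simply skipped.

```python
def flow_record(event):
    """Return a flat audit record for a tagged fabric.circuits event.

    Returns None for other namespaces or for circuit events without tags.
    """
    if event.get("namespace") != "fabric.circuits" or "tags" not in event:
        return None
    tags = event["tags"]
    return {
        "circuit_id": event["circuit_id"],
        "event_type": event["event_type"],
        "timestamp": event.get("timestamp"),
        "dialer": tags.get("clientId"),
        "host": tags.get("hostId"),
        "service": tags.get("serviceId"),
    }


# Trimmed copy of the annotated circuit event shown above.
annotated = {
    "namespace": "fabric.circuits",
    "event_type": "created",
    "circuit_id": "NrnynTZdT",
    "timestamp": "2023-04-25T22:52:01.674155279-04:00",
    "client_id": "clgx3oc850010xwgdtfmx7kdm",
    "tags": {
        "clientId": "fms1RWfjA4",
        "hostId": "m7bVDWI5A4",
        "serviceId": "5Vidy7TwqzDfpuWgcDLZVQ",
    },
}
print(flow_record(annotated))
```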

There’s an issue with a PR to track this, and this should also be in the next release.

Cheers,
Paul