Prometheus metrics

Hi,
I am currently looking at the Prometheus metrics exposed by the ziti-controller /metrics endpoint, and couldn't find any metric that reports the availability of an identity. Is that possible at the moment? Maybe I am missing some extra configuration.

At the same time, I was looking for a general OpenZiti dashboard for Grafana and couldn't find one. Does something like that already exist somewhere?

Thank you!

Hi @mjtrangoni, welcome to the community and to OpenZiti!

When you say "the availability of an identity", what does that mean? Are you looking for the current status of the identity, whether it is 'online' or 'offline'? Or something else? Generally, I think the "online-ness" of identities is not going to be captured in a metric; you'll need to use the API to discover this information. I don't know if we have metrics around individual identities like that. You could probably "back into it" by looking at which identities sent data, but I'm not exactly sure what you're after. Hopefully this helps? Perhaps another community member has some other information.

I think @qrkourier might have been working on a nascent attempt at a dashboard, but to my knowledge we definitely don't have anything refined in the community yet.

Hi @TheLumberjack,

Thank you for your prompt response.
Exactly, I started by looking for the "online-ness" of an identity. After going through all the metrics exposed by default, I concluded that nothing related to identities is exposed.
Then I read the controller code and found a "template" where we could "activate" some extra metrics via filters. I would love to better understand what can be enabled on the controller side to get useful metrics. Is it documented somewhere in the code?

Back to the identity "online-ness": do you mean I should write an extra exporter for the identities, or could we expose the identity status on the openziti-controller side as well? Which would be the better approach?

I've not looked at that code myself and I'm not the most familiar with those bits. The person closest to it is on summer leave and might check in here/there. I'll check to see if another dev is more familiar and can comment. The doc we have is over at https://openziti.io/docs/learn/core-concepts/metrics/

As far as "online-ness" goes, it's a hard thing to be precise about. If it were me, yes, I would collect the status of identities separately. Everyone has a different definition of what it means to be 'online'. Is "connected to a router" online? Is actively passing data online? Is an HTTP client that connects and then disconnects "online"? It's not a "one size fits all" type of metric, I'm afraid.

The "online-ness" of an identity is generally denoted by the presence of an API session. I've been working on Grafana for OpenZiti just this week, just getting started. Are you looking for the point-in-time status, or a time series? There is a way to configure the Infinity Data Source for Grafana to access the management API to retrieve status, etc. That would give a point in time.
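As a rough illustration of the point-in-time approach, the sketch below lists identities from the management API and treats an identity as online when it has an API session. The paths under `/edge/management/v1`, the `zt-session` header, and the `hasApiSession` field are assumptions about the Edge Management API and should be checked against your controller's API reference; the function names are made up for this example.

```python
import json
import urllib.request

API = "/edge/management/v1"  # assumed base path of the Edge Management API


def _call(url, token=None, body=None, context=None):
    """Send a JSON request (POST when a body is given) and decode the reply."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["zt-session"] = token  # assumed session header name
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(req, context=context) as resp:
        return json.load(resp)


def fetch_identities(base_url, username, password, context=None):
    """Log in with username/password and return one page of identities."""
    login = _call(f"{base_url}{API}/authenticate?method=password",
                  body={"username": username, "password": password},
                  context=context)
    token = login["data"]["token"]
    return _call(f"{base_url}{API}/identities", token=token,
                 context=context)["data"]


def online_status(identities):
    """Map identity name -> True when an API session is present."""
    return {i["name"]: bool(i.get("hasApiSession", False)) for i in identities}
```

Calling `online_status(fetch_identities(...))` with your controller address and credentials gives a snapshot only. The list endpoint is paginated, so a real tool would need to page through the results, and it would have to poll on a schedule to build a time series.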

Thanks for the info! I agree this is more of an exporter job, so I will go that way.

Hi @mike.gorman, I am looking for time-series, as I want to track connectivity of certain identities.

BTW, I'm looking forward to seeing your work! Are you working on Grafana dashboards based on Prometheus endpoint data, or on something else, like a dedicated data source for the OpenZiti API?

Sort of both. I have a Prometheus server for metrics, and I am using Grafana to visualize. Having only the IDs makes it difficult to read, of course, so I've used the API to pull up reference tables showing the name as well as the ID; not perfect, but code-free, so it is an easy way to start. Of course, Prometheus doesn't do events, so that will be another project.
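For anyone who prefers to do the ID-to-name join in code rather than with reference tables, here is a minimal sketch. It assumes the identity list has already been fetched from the API as dictionaries with `id` and `name` keys, and that each metric sample carries the identity ID under a hypothetical `identity_id` label; that label name and the sample shape are illustrative, not Ziti's actual metric output.

```python
def build_name_lookup(identities):
    """Build an ID -> name table from a list of identity dictionaries."""
    return {i["id"]: i["name"] for i in identities}


def add_names(samples, lookup, id_label="identity_id"):
    """Return copies of samples with an 'identity_name' label added.

    IDs missing from the lookup fall back to the raw ID so that no
    sample is dropped.
    """
    out = []
    for sample in samples:
        labels = dict(sample["labels"])
        ident = labels.get(id_label)
        if ident is not None:
            labels["identity_name"] = lookup.get(ident, ident)
        out.append({**sample, "labels": labels})
    return out
```

The input samples are left unmodified, so the same list can be relabeled again after the lookup table is refreshed.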

There is an option to stream events to a websocket, so it sort of has a built-in shipper, but I haven't played with that yet; it's just on my list.

Yes, there's a nascent Grafana dashboard that selects some byte-counting metrics from Ziti's Prometheus scrape target.

This is part of a Terraform module that deploys kube-prometheus-stack with a ServiceMonitor resource describing Ziti's scrape target.

My goal was to solve for orchestrating Prometheus and the Ziti controller configuration more than to build a rich Grafana dashboard. If we settle on a good home for Grafana dashboards, then we'll be able to collect the best ideas in one place, i.e., collectively develop them. For now, I'm happy to have PRs that improve this dashboard or select different event subscriptions in the controller configuration!

Hi all,

I took the opportunity to write an extra exporter for the OpenZiti Edge API.
Currently it exposes only identity and router status information, so that we can monitor them centrally, but it could be extended to other endpoints if desired. See enthus-it/openziti_exporter on GitHub (OpenZiti API exporter).
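To give readers a feel for what such an exporter serves to Prometheus, here is a minimal sketch of rendering a status gauge in the Prometheus text exposition format. It is not taken from openziti_exporter; the metric name `ziti_identity_online` and the helper names are hypothetical and chosen only for illustration.

```python
def escape_label(value):
    """Escape backslash, double quote and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name, help_text, samples):
    """Render one gauge family; samples is a list of (labels, value) pairs."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label(str(val))}"'
            for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric: 1 when the identity has an API session, else 0.
    print(render_gauge(
        "ziti_identity_online",
        "1 if the identity has an API session.",
        [({"name": "laptop"}, 1), ({"name": "server"}, 0)],
    ), end="")
```

Because Prometheus scrapes this output on an interval, a 0/1 gauge like this turns the point-in-time status into the time series asked about earlier in the thread.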

Would be more than happy to receive your feedback and suggestions on it.

Best regards,

Mario

That's very cool! I'll be checking it out. Are you cool if we end up listing/linking to it from the docs somewhere?

Happy to hear it! Yes, no problem at all, comments, bugs and fixes are welcome as well.

Would you be interested in contributing the Helm chart to Ziti's repo, @mjtrangoni?