However, the dashboards are not working as expected. In one dashboard, the Prometheus metric should be named mcac_client_request_latency_total, but I didn't see it in Prometheus.
Hi, we're moving away from MCAC, so I'd recommend switching to the new metrics endpoint (unless you can't).
In order to set up the dashboards properly, follow this guide and pay attention to the "When using MCAC" or "When using the new metrics endpoint" parts, depending on the metrics endpoint you've configured.
You shouldn't need the Prometheus config you're referring to if you're using the new metrics endpoint (in which case the metric names shouldn't start with mcac_…). Those metrics would work with the new dashboards straight away.
Note that the new metrics endpoint exposes its metrics on port 9000, while MCAC exposes them on port 9103.
Which version of k8ssandra-operator and Cassandra are you using?
That may explain why you’re not seeing the new endpoint.
Enabling telemetry.prometheus indeed requires having the Prometheus operator installed. When it's enabled, k8ssandra-operator will create ServiceMonitor objects that allow Prometheus to discover and scrape the endpoints automatically.
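For reference, enabling it on the K8ssandraCluster looks roughly like this (a minimal sketch using your cluster name from the error above; adapt the rest to your setup):

```yaml
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: cassandra
spec:
  cassandra:
    telemetry:
      prometheus:
        enabled: true   # k8ssandra-operator will create the ServiceMonitor objects
```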
I must admit I've never set up a custom Prometheus to scrape the Cassandra pods, and I'm not entirely sure how to discover the new endpoints dynamically as the cluster expands.
One alternative would be to rely on Vector instead, using its prometheus remote write sink component. But if you're using MCAC, you'll be missing a lot of relabellings, which would make our dashboards unfit for the metrics you'll be writing.
So, the options you have:
Integrated way: Use the prometheus operator and rely on telemetry.prometheus.enabled to set up the scraping automatically, and use our “old” dashboards if you scrape MCAC, or the new ones if you have access to the new metrics endpoint
Integrated way: Use Vector in conjunction with the new metrics endpoint (more on setting up Vector here; see the sketch after this list), and use our new dashboards as is. This lets you keep your current Prometheus installation without requiring the prometheus-operator.
Custom way: Move forward with what you have right now, and adjust the dashboards to the metrics you're getting. I would discourage this, as we're going to fully remove MCAC in the near future. The exact relabellings the dashboards expect can be found here. I'm also unsure how you'd be able to dynamically scrape newly added nodes.
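To give an idea of the second option, here's a minimal Vector config sketch that scrapes the new metrics endpoint locally and pushes to Prometheus via remote write. The metrics path and the remote write URL are assumptions to adapt to your environment:

```yaml
# vector.yaml -- minimal sketch, not a drop-in config
sources:
  cassandra_metrics:
    type: prometheus_scrape
    # the new metrics endpoint, scraped locally on each Cassandra pod
    endpoints:
      - http://localhost:9000/metrics
sinks:
  prometheus:
    type: prometheus_remote_write
    inputs:
      - cassandra_metrics
    # assumption: your Prometheus accepts remote write at this URL
    endpoint: http://prometheus.monitoring.svc:9090/api/v1/write
```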
It seems that the second option is the best, so I tried to do some tests on port 9000.
If I do port forwarding to port 9000 on the cassandra service, it works fine: I can see the metrics in my web browser (it also works on port 9103).
But if I connect from my Prometheus pod, port 9103 works while port 9000 doesn't (connection refused).
Oooh, I know what's wrong here. It's not related to Vector; it's rather a "bug" we have where the new metrics endpoint listens on localhost only, preventing access from outside the pod.
You’ll find the solution here.
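In short, the workaround is to make the new metrics endpoint bind to a non-localhost address via the telemetry spec. A rough sketch of the change on the K8ssandraCluster (the address/port field names and values are assumptions on my side; the linked discussion has the exact form):

```yaml
spec:
  cassandra:
    telemetry:
      cassandra:
        endpoint:
          address: "0.0.0.0"   # assumption: bind to all interfaces instead of localhost
          port: "9000"         # assumption: keep the default metrics port
```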
I was working on the same topic.
I'm trying to change the config live with OpenLens, but I get:
Failed to save resource: K8ssandraCluster.k8ssandra.io "cassandra" is invalid: spec.cassandra.telemetry.cassandra.endpoint: Invalid value: "string": spec.cassandra.telemetry.cassandra.endpoint in body must be of type object: "string"
But I added exactly the same thing as in the GitHub discussion:
I tweaked the dashboard variables a little bit and it seems to work. Now I need to set up Vector.
I use the cassandra service name in the Prometheus config to scrape the metrics, but in the dashboards I see only one pod. Do you know if the Prometheus operator uses the cassandra service?
That's the thing: going through the service won't allow Prometheus to scrape all the endpoints consistently.
The Prometheus operator discovers all the pods that have to be scraped thanks to the ServiceMonitor, which selects the target service by label. Prometheus then scrapes each pod endpoint behind that service individually, without going through the service's cluster IP.
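For illustration, the generated ServiceMonitor looks roughly like this (the name and label values here are placeholders; check the object the operator actually creates in your namespace):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cassandra-dc1-metrics   # hypothetical name; the operator picks its own
spec:
  selector:
    matchLabels:
      cassandra.datastax.com/cluster: cassandra   # placeholder label; compare with the generated object
  endpoints:
    - port: metrics   # named port on the Cassandra service exposing the metrics endpoint
      interval: 30s
```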
Vector uses a push model, so you don't need to worry about scraping each pod separately or updating the list of targets when the cluster scales.