K8ssandra cluster horizontal autoscaling

I have tried 3-4 different ways to use HPA on my cluster deployment. It shows up in Kubernetes as a StatefulSet rather than a Deployment, so I have been going that route. I have several metrics servers and they all respond fine to kubectl top commands and raw requests, but no matter what I try it always comes back with the same error:

k8ssandra 12s Warning FailedComputeMetricsReplicas horizontalpodautoscaler/keda-hpa-k8ssandra invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: missing request for cpu
k8ssandra 12s Warning FailedGetResourceMetric horizontalpodautoscaler/keda-hpa-k8ssandra failed to get cpu utilization: missing request for cpu
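(For context, this error generally means at least one container in the targeted pods has no resources.requests.cpu set. The HPA averages CPU utilization over every container in the pod, so a single sidecar without a request breaks the metric even when the main container sets requests. A hedged illustration of a pod template that triggers it; container names are examples, though cass-operator does inject a server-system-logger sidecar into its pods:)

```yaml
# Sketch of a pod template where the HPA metric fails. The main
# container has a CPU request, but the sidecar does not -- the HPA
# then reports "missing request for cpu" for the whole pod.
containers:
  - name: cassandra
    resources:
      requests:
        cpu: 1000m
  - name: server-system-logger   # no resources.requests.cpu set
```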

I’ve tried with plain Kubernetes and with KEDA, using both Helm and standalone YAMLs.

Is there a guide somewhere that is specific to this deployment? Anyone with good experience getting this set up?

helm install -f k8ssandra-value.yaml k8ssandra k8ssandra/k8ssandra --namespace k8ssandra
Here is my Helm values file:
cassandra:
  version: "4.0.3"
  cassandraLibDirVolume:
    storageClass: kops-ssd-1-17
    size: 5Gi
  allowMultipleNodesPerWorker: true
  heap:
    size: 1G
    newGenSize: 1G
  resources:
    requests:
      cpu: 1000m
      memory: 1Gi
    limits:
      cpu: 2000m
      memory: 5Gi
  datacenters:
    - name: dc1
      size: 3
      averageCpuUtilization: 50
      racks:
        - name: default
medusa:
  enabled: true
  bucketName: k8ssandra-bucket-dev
  storageSecret: medusa-bucket-key
  storage: s3
  storage_properties:
    region: ca-central-1
ingress:
  enabled: true
kube-prometheus-stack:
  grafana:
    adminUser: admin
    adminPassword: CrInkleB0rk!
stargate:
  enabled: true
  replicas: 1
  heapMB: 500
  cpuReqMillicores: 100
  cpuLimMillicores: 1000

Added the namespace to the hpa.yaml and it took off like a rabbit. I added a stress test and it spun up the maximum number of pods; then Cassandra stopped taking connections as the endpoint disappeared, and my Reaper pod went into backoff state and eventually stopped. So I will assume that cass-operator had no idea what was going on and stopped working as well, hence no endpoint.

I attempted a simple scale command to add one pod and had the same results. Which begs the question: do I have to use helm upgrade after adding nodes in the values file when I get CPU alerts, or is there a way to automate this so I can sleep at night?
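(For reference, the scaling path the chart does support is exactly that: bump the datacenter size in the values file and let cass-operator bootstrap and stream data to the new node. A sketch, showing only the field that changes:)

```yaml
# k8ssandra-value.yaml -- only the changed field; keep the rest as before
cassandra:
  datacenters:
    - name: dc1
      size: 4   # was 3; cass-operator adds one node and streams data to it
```

followed by the same command shape used for the install: `helm upgrade -f k8ssandra-value.yaml k8ssandra k8ssandra/k8ssandra --namespace k8ssandra`.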


By the way, here is my hpa.yaml file:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
  namespace: k8ssandra
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: k8ssandra-dc1-default-sts
  minReplicas: 3
  maxReplicas: 12
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 50
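(For context, the scaling math the HPA applies to a spec like the one above can be sketched as follows; the current utilization figure is hypothetical:)

```shell
# HPA control-loop core rule:
#   desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)
current_replicas=3        # minReplicas above
current_utilization=120   # hypothetical: pods averaging 120% of requested CPU
target_utilization=50     # targetAverageUtilization above
# Integer ceiling division: (a + b - 1) / b
desired=$(( (current_replicas * current_utilization + target_utilization - 1) / target_utilization ))
echo "$desired"   # 8
```

So a sustained spike well above target can jump several replicas at once (capped by maxReplicas), which is exactly what makes this dangerous for Cassandra, where each new node must stream data before it is useful.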

I am also looking for the answer!

@waterskibum Have you managed to make it work?

We do not support horizontal autoscaling for Cassandra pods. Scaling a Cassandra cluster can take a lot of time per node, as data needs to stream to the new ones (depending on the data size, we’re talking many hours to start a single new node). Adding/removing nodes too fast can have implications for data availability, since token movements are involved, and shouldn’t be handled automatically based on CPU metrics IMO.