tajul
February 21, 2025, 4:00am
1
Hi everyone,
I am happy to join here, and sorry for asking for help in my very first post.
I have been trying to deploy a simple k8ssandra cluster together with Medusa in my EKS cluster. The k8ssandra-operator and the cluster were up and running just fine, but when I try to deploy the k8ssandra cluster and Medusa together, the running k8ssandra-operator crashes with the following error:
```
2025-02-21T02:59:53.650Z INFO Initial token computation could not be performed or is not required in this cluster {"controller": "k8ssandracluster", "controllerGroup": "k8ssandra.io", "controllerKind": "K8ssandraCluster", "K8ssandraCluster": {"name":"xxx-db","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "xxx-db", "reconcileID": "430398ac-d5b8-4c1b-b64e-2f3d8447361b", "K8ssandraCluster": {"xxx":"xxx-xx","namespace":"k8ssandra"}, "error": "cannot compute initial tokens: at least one DC has num_tokens >= 16"}
```
and
```
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x173e4e7]
```
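As a side note, I understand the first INFO line only means the operator skips initial token computation because Cassandra 4.x defaults num_tokens to 16. If initial token allocation were wanted, I believe the value could be lowered through the cassandra.yaml overrides, roughly like this (a sketch only, assuming the standard cassandraYaml config block):
```
spec:
  cassandra:
    config:
      cassandraYaml:
        num_tokens: 8   # any value below 16 lets the operator compute initial tokens
```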
I tried k8ssandra-operator versions 1.21.1 and 1.20.2, with Cassandra version 4.0.1.
Kubernetes info:
```
Client Version: v1.29.3-eks-ae9a62a
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.12-eks-2d5f260
```
k8ssandra-cluster manifest:
```
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: xxx-xx
spec:
  cassandra:
    serverVersion: "4.0.1"
    superuserSecretRef:
      name: xxx-xxx-superuser
    datacenters:
      - metadata:
          name: datacenter1
        size: 1
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 1000m
            memory: 2Gi
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: efs-sc
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi
        config:
          jvmOptions:
            heapSize: 1Gi
        stargate:
          size: 1
          heapSize: 2Gi
          resources:
            requests:
              cpu: 100m
              memory: 1Gi
            limits:
              cpu: 500m
              memory: 4G
  medusa:
    containerImage:
      registry: docker.io
      repository: k8ssandra/medusa
      name: medusa
      tag: 0.23.0
      pullPolicy: IfNotPresent
    cassandraUserSecretRef:
      name: xxx-xxx-superuser
    storageProperties:
      storageProvider: s3
      # apiProfile: default
      # host: "s3://arn:aws:s3:ap-northeast-x:xxxxxx:accesspoint/xxxx-xxxx-backup-s3-accesspoint/"
      concurrentTransfers: 1
      maxBackupAge: 1
      maxBackupCount: 2
      multiPartUploadThreshold: 10000000000000000
      bucketName: xxxx-xxxx-backup
      prefix: xxxx
      region: ap-northeast-x
      secure: true
      storageSecretRef:
        name: medusa-bucket-secret
```
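For context, the medusa-bucket-secret referenced by storageSecretRef is expected to hold the S3 credentials Medusa uses. A sketch of such a secret, following the format the Medusa docs describe (key name and layout from the docs; the values here are placeholders):
```
apiVersion: v1
kind: Secret
metadata:
  name: medusa-bucket-secret
type: Opaque
stringData:
  # AWS-style credentials file that Medusa reads for S3 access
  credentials: |-
    [default]
    aws_access_key_id = <placeholder>
    aws_secret_access_key = <placeholder>
```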
I would appreciate any help you can offer.
Thank you so much.
This was solved with a clean re-install of the operator.
tajul
April 9, 2025, 1:48pm
3
@alexander
Thank you so much, and sorry for the delayed update.
To solve the above problem, I had to set Medusa's purgeBackups to false:
```
medusa:
  …
  …
  purgeBackups: false
```
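For completeness, a sketch of where that field sits in the medusa block (the surrounding values are just the placeholders from my manifest above):
```
medusa:
  cassandraUserSecretRef:
    name: xxx-xxx-superuser
  storageProperties:
    storageProvider: s3
    bucketName: xxxx-xxxx-backup
    prefix: xxxx
    region: ap-northeast-x
    storageSecretRef:
      name: medusa-bucket-secret
  purgeBackups: false   # disabling the automatic backup purge avoided the operator panic
```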
The detailed solution can be found below:
GitHub issue (opened 04:13 AM, 25 Feb 25 UTC; closed 06:24 AM, 26 Feb 25 UTC; label: bug):
I was trying to deploy a single-DC cluster in my existing EKS cluster. Following the link (<https://docs.k8ssandra.io/install/eks/>) I installed the k8ssandra operator in control plane mode.
`helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --set installCRDs=true`
`helm install k8ssandra-operator k8ssandra/k8ssandra-operator -n k8ssandra`
After executing the above commands, the k8ssandra operator was running perfectly.
```
k8ssandra-operator-6644f46cff-m6p27 1/1 Running 0
k8ssandra-operator-cass-operator-594f6c5874-njtnc 1/1 Running 0
```
Then I deployed a single-cluster setup with a single DC and Medusa, as defined in the following manifest:
```
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: test-db
spec:
  cassandra:
    serverVersion: "4.1.8"
    superuserSecretRef:
      name: test-db-superuser
    datacenters:
      - metadata:
          name: datacenter1
        size: 1
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 1000m
            memory: 2Gi
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: efs-sc
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi
        config:
          jvmOptions:
            heapSize: 512M
        stargate:
          size: 1
          heapSize: 256M
          resources:
            requests:
              cpu: 100m
              memory: 256M
            limits:
              cpu: 500m
              memory: 2G
  medusa:
    cassandraUserSecretRef:
      name: test-db-superuser
    storageProperties:
      storageProvider: s3
      storageSecretRef:
        name: medusa-bucket-secret
      bucketName: test-db-backup
      prefix: testdb
      region: ap-northeast-1
```
When I applied the above manifest, the k8ssandra-operator crashed.
```
NAME READY STATUS RESTARTS AGE
k8ssandra-operator-6644f46cff-6z4t7 0/1 CrashLoopBackOff 14 (47s ago)
k8ssandra-operator-cass-operator-594f6c5874-njtnc 1/1 Running 0
```
The log shows the following error message:
```
2025-02-25T03:48:40.027Z INFO Couldn't find medusa-restore init container {"controller": "k8ssandracluster", "controllerGroup": "k8ssandra.io", "controllerKind": "K8ssandraCluster", "K8ssandraCluster": {"name":"test-db","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "test-db", "reconcileID": "4e4ccef4-49a3-4d7b-82c1-d57b8fa6bd3f", "K8ssandraCluster": {"name":"test-db","namespace":"k8ssandra"}, "CassandraDatacenter": {"name":"datacenter1","namespace":"k8ssandra"}, "K8SContext": ""}
2025-02-25T03:48:40.027Z INFO Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference {"controller": "k8ssandracluster", "controllerGroup": "k8ssandra.io", "controllerKind": "K8ssandraCluster", "K8ssandraCluster": {"name":"lss-db","namespace":"k8ssandra"}, "namespace": "k8ssandra", "name": "test-db", "reconcileID": "4e4ccef4-49a3-4d7b-82c1-d57b8fa6bd3f"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x173e4e7]
```
If I comment out the medusa block the cluster runs without error.
`k8ssandra-operator version is 1.21.1`
`kubectl version` command output:
```
Client Version: v1.29.3-eks-ae9a62a
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.12-eks-2d5f260
```
I don't understand what is missing or causing the crash. I would greatly appreciate any help.