How to upgrade a k8ssandra cluster from 3.11.x to 4.x?

I have an existing k8ssandra cluster (recently upgraded to latest operator version 1.7.0) running a k8ssandra.io/v1alpha1 K8ssandraCluster on spec.cassandra.serverVersion: 3.11.11. I wish to upgrade to cassandra 4.0 (or 4.1). How do I do this?

What I read so far:
I saw the forum post "K8ssandra 1.3.0 Upgrade and Cassandra 4.0.0", searched the GitHub issues, and found, for example, the open issue "K8SSAND-1416 ⁃ e2e test to upgrade from 3.11 -> 4.0" (Issue #223 in k8ssandra/cass-operator). I’m aware the num_tokens default is 256 in cassandra 3.11 and 16 in cassandra 4.x, but this isn’t the problem: I’m happy to keep num_tokens at 256 when running cassandra 4.x. But it seems I can’t ask k8ssandra to do a version upgrade.

What I tried so far:

  • changing serverVersion to e.g. 4.0.6 does not work - the reconciliation loops show this error:
2023-06-08T13:33:16.180Z    DEBUG    events    invalid Cassandra config: tried to change num_tokens in an existing datacenter    {"type": "Warning", "object": {"kind":"K8ssandraCluster","namespace":"databases","name":"default","uid":"18f94f5e-8fc5-48ac-a45a-cc96782d446e","apiVersion":"k8ssandra.io/v1alpha1","resourceVersion":"160480530"}, "reason": "Reconcile Error"}
2023-06-08T13:33:16.195Z    INFO    updated k8ssandracluster status    {"controller": "k8ssandracluster", "controllerGroup": "k8ssandra.io", "controllerKind": "K8ssandraCluster", "K8ssandraCluster": {"name":"default","namespace":"databases"}, "namespace": "databases", "name": "default", "reconcileID": "5e90c520-fe82-4fad-9ed6-ba324eded9c6", "K8ssandraCluster": "databases/default"}
2023-06-08T13:33:16.195Z    ERROR    Reconciler error    {"controller": "k8ssandracluster", "controllerGroup": "k8ssandra.io", "controllerKind": "K8ssandraCluster", "K8ssandraCluster": {"name":"default","namespace":"databases"}, "namespace": "databases", "name": "default", "reconcileID": "5e90c520-fe82-4fad-9ed6-ba324eded9c6", "error": "invalid Cassandra config: tried to change num_tokens in an existing datacenter"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:326
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.13.0/pkg/internal/controller/controller.go:234
  • changing serverVersion to 4.0.6 and changing num_tokens (previously omitted) to 256 (which should be the value the actual cassandra pods use) does not work: the validation webhook then complains about a change in num_tokens which isn’t allowed.
  • deleting the validating webhook and re-trying the above (on a test dummy cluster just to see what happens): the first cassandra pod restarts but doesn’t come up / get into a ready state.

I wasn’t able to find any instructions on how to upgrade cassandra versions from 3.11 to 4.x using k8ssandra.

Previously, before the era of k8ssandra, cassandra version upgrades were not so hard to do (even from 2.x to 3.x): we had an ansible script that upgraded the nodes one by one, followed by an sstable upgrade.

Has anyone done this? Is this supported? Will it be in the future?

Thanks!

Hi,

could you share with us the K8ssandraCluster manifest you’ve used for 3.11 and the one you’re trying to apply for the 4.0 upgrade?

Thanks!

Hi @alexander, I had a similar issue during the upgrade from 3.11.x to 4.x.
This is the manifest I was using and I can confirm that the issue still happens with the latest version of k8ssandra.

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
spec:
  cassandra:
    serverVersion: "3.11.10"
    datacenters:
      - metadata:
          name: dc1
        size: 1
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: hostpath
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 2Gi
        config:
          jvmOptions:
            heapSize: 512M
        stargate:
          size: 1
          heapSize: 256M

The question remains: how do I perform a k8ssandra cluster version upgrade?
Applying the changed manifest

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
spec:
  cassandra:
    serverVersion: "4.1.3"
    datacenters:
      - metadata:
          name: dc1
        size: 1
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: hostpath
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 2Gi
        config:
          jvmOptions:
            heapSize: 512M
        stargate:
          size: 1
          heapSize: 256M

results in an error related to the changes between 3.11.x and 4.x:

ERROR	Reconciler error	{"controller": "k8ssandracluster", "controllerGroup": "k8ssandra.io", "controllerKind": "K8ssandraCluster", "K8ssandraCluster": {"name":"demo","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "demo", "reconcileID": "454e7078-e7b2-4397-9f6d-0fe0962604d7", "error": "invalid Cassandra config: tried to change num_tokens in an existing datacenter"}
k8ssandra-operator-587c474d8f-mngqm k8ssandra-operator sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
k8ssandra-operator-587c474d8f-mngqm k8ssandra-operator 	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:329
k8ssandra-operator-587c474d8f-mngqm k8ssandra-operator sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
k8ssandra-operator-587c474d8f-mngqm k8ssandra-operator 	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274
k8ssandra-operator-587c474d8f-mngqm k8ssandra-operator sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
k8ssandra-operator-587c474d8f-mngqm k8ssandra-operator 	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235

When I try to change the value of num_tokens to 16, the whole process is blocked by the validating webhook, which disallows any change to the num_tokens configuration.

admission webhook "vk8ssandracluster.kb.io" denied the request: num_tokens value can't be changed
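
For what it's worth, here is a minimal Python sketch of how the two checks seem to interact, judging only from the errors quoted in this thread. It is not the operator's code: the function names (effective_num_tokens, reconcile_check, webhook_allows, try_upgrade) are hypothetical, and the assumption is that the reconciler compares the effective num_tokens (falling back to the version default of 256 for 3.11 and 16 for 4.x) while the webhook compares only the values declared in the manifest.

```python
from typing import Optional

# Default num_tokens per Cassandra major version, as stated in this thread.
DEFAULT_NUM_TOKENS = {3: 256, 4: 16}


def effective_num_tokens(server_version: str, explicit: Optional[int] = None) -> int:
    """Return the declared num_tokens, or the default for the major version."""
    if explicit is not None:
        return explicit
    major = int(server_version.split(".")[0])
    return DEFAULT_NUM_TOKENS[major]


def reconcile_check(old_version: str, new_version: str,
                    old_explicit: Optional[int], new_explicit: Optional[int]) -> int:
    """Assumed reconciler behaviour: reject a change in the effective value."""
    old = effective_num_tokens(old_version, old_explicit)
    new = effective_num_tokens(new_version, new_explicit)
    if old != new:
        raise ValueError("tried to change num_tokens in an existing datacenter")
    return new


def webhook_allows(old_explicit: Optional[int], new_explicit: Optional[int]) -> bool:
    """Assumed webhook behaviour: reject any change in the declared value."""
    return old_explicit == new_explicit


def try_upgrade(old_version: str, new_version: str,
                old_explicit: Optional[int] = None,
                new_explicit: Optional[int] = None) -> str:
    """Run both hypothetical checks and report the first failure, or "ok"."""
    if not webhook_allows(old_explicit, new_explicit):
        return "num_tokens value can't be changed"
    try:
        reconcile_check(old_version, new_version, old_explicit, new_explicit)
    except ValueError as err:
        return str(err)
    return "ok"


if __name__ == "__main__":
    # num_tokens omitted in both manifests: the implicit default changes.
    print(try_upgrade("3.11.10", "4.1.3"))
    # num_tokens added during the upgrade: the declared value changes.
    print(try_upgrade("3.11.10", "4.1.3", None, 256))
```

If this model is right, a manifest that never declared num_tokens has no way through: leaving it out trips the reconciler, and adding it (256 or 16) trips the webhook. Only a cluster that had declared num_tokens from the start would pass both checks in this sketch, and I have not verified that against the real operator.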