How to Migrate an Existing Cluster to K8ssandra-operator Without Any Downtime

Originally published at: How to Migrate an Existing Cluster to K8ssandra-operator Without Any Downtime - K8ssandra, Apache Cassandra® on Kubernetes

With our recent release of K8ssandra-operator v1.0.0, our brand new Kubernetes operator for K8ssandra, the next question is how to upgrade from K8ssandra v1.x or from Apache Cassandra to K8ssandra-operator.

There’s obviously a lot of overlap between both K8ssandra v1.x and Cassandra, but there are also some incompatibilities which currently prevent you from performing an in-place upgrade.

So, we’ll use a datacenter (DC) switch, instead, creating a new DC with K8ssandra-operator that will expand the cluster. Then we’ll decommission the 1.x DC. In this article, we’ll cover the whole procedure so you can migrate an existing cluster to K8ssandra-operator without any downtime.

Install k8ssandra-operator

First, install cert-manager, which will be used by cass-operator:

kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.7.1/cert-manager.yaml

Using Kustomize or Helm, install k8ssandra-operator and cass-operator in a dedicated namespace (we’ll use k8ssandra-operator here):

Using Helm

helm repo add k8ssandra https://helm.k8ssandra.io/stable
helm install k8ssandra-operator k8ssandra/k8ssandra-operator -n k8ssandra-operator --create-namespace

Using Kustomize

kustomize build github.com/k8ssandra/k8ssandra-operator/config/deployments/control-plane\?ref\=v1.0.0 | kubectl apply --server-side --force-conflicts -f -

Copy secrets from k8ssandra 1.x

Secrets can only be referenced in pods if they exist in the same namespace. Therefore, the secrets that were created in the original k8ssandra namespace need to be copied over to the k8ssandra-operator namespace.

That includes the following secrets:

  • Superuser
  • Reaper (CQL)
  • Reaper JMX
  • Medusa
  • Medusa bucket key
  • Stargate

Kubernetes has no built-in way to copy secrets across namespaces, so they need to be fully recreated, either manually or through CD automation.
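As one possible way to script that copy, the sketch below reads a secret from the source namespace, strips the metadata tied to the original object, and applies the result in the target namespace. It assumes that kubectl and jq are installed; the function name `copy_secret` and the secret and namespace names in the example comment are placeholders, not values taken from a real deployment.

```shell
# Hypothetical helper: copy one secret from a source to a target namespace.
# Assumes kubectl and jq are available on the workstation.
copy_secret() {
  local name="$1" src_ns="$2" dst_ns="$3"
  kubectl get secret "$name" -n "$src_ns" -o json \
    | jq 'del(.metadata.namespace, .metadata.uid, .metadata.resourceVersion,
              .metadata.creationTimestamp, .metadata.ownerReferences,
              .metadata.managedFields)' \
    | kubectl apply -n "$dst_ns" -f -
}

# Example (placeholder names):
# copy_secret <superuser-secret-name> k8ssandra k8ssandra-operator
```

The copied secret keeps its original name, so the secret references in the K8ssandraCluster object have to use that same name.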

Alter your keyspaces’ replication strategy

Before spinning up the new datacenter, make sure all user-created keyspaces are using NetworkTopologyStrategy (NTS). If not, alter them to use NTS and set the same number of replicas as they currently have on the existing DC:

ALTER KEYSPACE <my-keyspace> WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1':3};

Alter the following keyspaces in the same way:

  • system_auth
  • system_distributed
  • system_traces
  • data_endpoint_auth (Stargate)
  • reaper_db (Reaper)
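To avoid typing the same statement for every keyspace, the statements can be generated. This is a minimal sketch: `alter_statements` is a hypothetical helper, and the replication map passed to it mirrors the example above, so adjust both the datacenter name and the replica count to your cluster.

```shell
# Print one ALTER KEYSPACE statement per keyspace name given as argument.
# $1 is the datacenter part of the replication map, e.g. "'dc1': 3".
alter_statements() {
  local replication="$1"; shift
  local ks
  for ks in "$@"; do
    printf "ALTER KEYSPACE %s WITH replication = {'class': 'NetworkTopologyStrategy', %s};\n" \
      "$ks" "$replication"
  done
}

alter_statements "'dc1': 3" system_auth system_distributed system_traces data_endpoint_auth reaper_db
```

The output can be saved to a file and executed with cqlsh -f, after checking that every listed keyspace actually exists in your cluster.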

Restrict traffic to dc1

Client traffic should be restricted to dc1 by selecting it as the local datacenter and using only “LOCAL_” consistency levels (such as LOCAL_QUORUM), not QUORUM, which involves replicas in every datacenter. This will ensure consistency during the expansion to a new datacenter.

Create a K8ssandraCluster resource

The cluster is now ready for the expansion to a new datacenter created by k8ssandra-operator.

We’ll need to create a K8ssandraCluster (k8c) object with the right settings and secret references. Here’s an example k8c object with Reaper, Medusa and Stargate enabled:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: my-cluster
spec:
  cassandra:
    serverVersion: "4.0.1"
    resources:
      requests:
        memory: 16Gi
    additionalSeeds:
      - "10.10.1.1"
      - "10.10.1.2"
    datacenters:
      - metadata:
          name: dc2
        size: 3
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: standard
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 4096Gi
        config:
          jvmOptions:
            heapSize: 8Gi
            heapNewGenSize: 3Gi
    mgmtAPIHeap: 64Mi
    superuserSecretRef:
      name: superuser-secret
  externalDatacenters:
    - "dc1"
  reaper:
    autoScheduling:
      enabled: true
    cassandraUserSecretRef:
      name: reaper-secret
    jmxUserSecretRef:
      name: reaper-jmx-secret
    uiUserSecretRef:
      name: reaper-ui-secret
  medusa:
    storageProperties:
      storageProvider: google_storage
      bucketName: cluster-backups
      storageSecretRef: medusa-bucket-key
      prefix: my-cluster
      maxBackupCount: 10
    cassandraUserSecretRef:
      name: medusa-secret
  stargate:
    size: 1
    heapSize: 4Gi

We’ve used dc2 as the datacenter name, with the existing datacenter having a different name (dc1 for example). metadata.name needs to match the existing Cassandra cluster name so that the datacenters can connect together.

The additionalSeeds section requires the IPs of a couple of Cassandra nodes (or pods) from the existing datacenter. Those IPs must be reachable on port 7000, which is Cassandra’s default storage port, used by nodes to communicate with each other.
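Before creating the object, it can be worth confirming that the seed IPs are reachable from where the new datacenter will run. The sketch below assumes that nc (netcat) is installed and that the check is run from inside the target Kubernetes cluster, for instance from a throwaway pod; `check_seeds` and the `STORAGE_PORT` variable are hypothetical names.

```shell
# Report whether each given IP accepts TCP connections on the given port.
# Assumes an nc (netcat) build that supports the -z and -w flags.
check_seeds() {
  local port="$1"; shift
  local ip
  for ip in "$@"; do
    if nc -z -w 3 "$ip" "$port"; then
      echo "$ip:$port reachable"
    else
      echo "$ip:$port NOT reachable"
    fi
  done
}

# Example with the seed IPs from the manifest above; set STORAGE_PORT
# to the storage port your existing cluster uses:
# check_seeds "$STORAGE_PORT" 10.10.1.1 10.10.1.2
```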

k8ssandra-operator automatically manages the replication strategy for several keyspaces, such as system_auth, data_endpoint_auth or reaper_db, based on the list of datacenters in a k8c object. It allows safe automation for operations such as expansions to new datacenters. A migration involves one or more datacenters which are not referenced in the k8c object, requiring the addition of extra datacenters in the replication settings. This is what the .spec.externalDatacenters entry stands for, allowing us to keep replicas on the datacenters we are migrating from.

Create the object in the same namespace as the operator:

kubectl apply -f k8ssandra-cluster.yaml -n k8ssandra-operator

The K8ssandraCluster objects can be listed using the long CRD name, K8ssandraCluster, but also their short name, k8c:

$ kubectl get k8c -n k8ssandra-operator
NAME         AGE
my-cluster   1m

Wait for the operator to proceed and create the requested resources.

Check the logs of the server-system-logger container in the -sts pods to follow the bootstrapping process and see whether they successfully join the cluster.
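For example, the logs of one pod could be streamed with a small wrapper like the one below. `follow_bootstrap_logs` is a hypothetical helper, the pod name is a placeholder to take from kubectl get pods, and the namespace defaults to the one used in this article.

```shell
# Stream the system logger output of one Cassandra pod.
# $1: pod name, $2: namespace (defaults to k8ssandra-operator).
follow_bootstrap_logs() {
  local pod="$1" ns="${2:-k8ssandra-operator}"
  kubectl logs -f "$pod" -c server-system-logger -n "$ns"
}

# Example (placeholder pod name):
# follow_bootstrap_logs <pod-name>
```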

Connect to a pod’s Cassandra container and run the following command to check the topology, which at the end should show two distinct datacenters (dc1 and dc2 in our case):

$ nodetool -u <superuser name> -pw <superuser password> status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load      Tokens  Owns (effective)  Host ID                               Rack   
UN  10.xx.xx.57  7.58 MiB  16      100.0%            9299e226-d2fa-4c2e-9fba-3b344d5d47b7  default
UN  10.xx.xx.7   7.54 MiB  16      100.0%            51c2fb0e-7fad-4b6e-8e4d-2108dc204cd5  default
UN  10.xx.xx.4   7.89 MiB  16      100.0%            6a92531f-a086-4c40-9067-50899b8974a0  default

Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load      Tokens  Owns (effective)  Host ID                               Rack   
UN  10.xx.xx.36  6.97 MiB  16      100.0%            eba0964a-835f-4c76-978e-f25eda0b49ad  default
UN  10.xx.xx.20  6.81 MiB  16      100.0%            d15f93f4-248d-478f-8832-b87b4e3981d2  default
UN  10.xx.xx.80  6.71 MiB  16      100.0%            6bca9cdf-03f8-4658-84b9-28c5395235ba  default

Alter the keyspaces and rebuild the nodes

k8ssandra-operator will not be able to complete the expansion at this point, because dc2 doesn’t yet hold any replica for the system_auth keyspace. Cassandra will start, but Reaper and Stargate won’t until the replication has been altered and the rebuild has been run.

Once all nodes in the new DC have joined the cluster, alter the user-created keyspaces to use NTS with the same number of replicas on both DCs:

ALTER KEYSPACE <keyspace-name> WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1':3, 'dc2': 3};

Run the above statement for the following keyspaces as well:

  • system_auth
  • system_distributed
  • system_traces
  • data_endpoint_auth (Stargate)
  • reaper_db (Reaper)

Run the rebuild command in all the Cassandra pods (from the Cassandra container) in the new datacenter:

nodetool -u <superuser-name> -pw <superuser-password> rebuild dc1

Replace the <...> placeholders with the appropriate values in the above command.
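The rebuild can also be looped over the pods of the new datacenter. The sketch below runs them one at a time, which is gentler on the source datacenter than starting them all together. `rebuild_pods` is a hypothetical helper, and the label selector in the example comment is an assumption about the labels cass-operator puts on its pods, so check it against your own pods first.

```shell
# Run `nodetool rebuild <source-dc>` in the cassandra container of each pod,
# one pod at a time, stopping at the first failure.
rebuild_pods() {
  local ns="$1" source_dc="$2" user="$3" password="$4"; shift 4
  local pod
  for pod in "$@"; do
    echo "Rebuilding $pod from $source_dc"
    kubectl exec "$pod" -n "$ns" -c cassandra -- \
      nodetool -u "$user" -pw "$password" rebuild "$source_dc" || return 1
  done
}

# Example (the label selector is an assumption, verify it before use):
# pods=$(kubectl get pods -n k8ssandra-operator \
#   -l cassandra.datastax.com/datacenter=dc2 -o name)
# rebuild_pods k8ssandra-operator dc1 <superuser-name> <superuser-password> $pods
```

A rebuild can take a long time on large datasets, and a kubectl exec session may drop before it finishes, so keep an eye on the streaming progress from the pods themselves.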

Add dc2’s cassandra service to Reaper

Reaper already has the cluster registered with dc1’s Cassandra pods service as seed host.

The operator will not, in its current form, add dc2’s service to the seed hosts. This makes it impossible for Reaper in dc2 to discover the cluster (see this ticket for more information).

Connect to Reaper’s UI in dc1, and re-register the cluster, specifying both datacenters’ Cassandra services as seed nodes.

Use the values that match your deployment settings, knowing that K8ssandra names the Cassandra service as follows: <cluster name>-<dc name>-service.
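As an illustration of that naming pattern, the sketch below builds the in-cluster DNS name of each datacenter’s service. It assumes both datacenters run in the same Kubernetes cluster with the default cluster.local domain, and that dc1 lives in a namespace called k8ssandra; `cassandra_service_host` is a hypothetical helper.

```shell
# Build the DNS name of a datacenter's Cassandra service, following the
# <cluster name>-<dc name>-service pattern.
# Assumes the default Kubernetes cluster domain (cluster.local).
cassandra_service_host() {
  local cluster="$1" dc="$2" ns="$3"
  echo "${cluster}-${dc}-service.${ns}.svc.cluster.local"
}

cassandra_service_host my-cluster dc1 k8ssandra
# prints: my-cluster-dc1-service.k8ssandra.svc.cluster.local
cassandra_service_host my-cluster dc2 k8ssandra-operator
# prints: my-cluster-dc2-service.k8ssandra-operator.svc.cluster.local
```

If dc1 runs outside of the Kubernetes cluster that hosts dc2, these names won’t resolve and node IPs have to be used for that datacenter instead.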

Restrict traffic to dc2

Client traffic should be restricted to dc2 by selecting it as the local datacenter and using only “LOCAL_” consistency levels (such as LOCAL_QUORUM), not QUORUM, which would still require replicas in dc1. This will prevent nodes from connecting to dc1 as we decommission it.

Decommission dc1

Update the K8ssandraCluster object definition by removing the externalDatacenters and additionalSeeds sections. Only after that, alter all keyspaces to remove replicas from dc1 (if you do it before, the operator will re-alter the replication of some keyspaces):

ALTER KEYSPACE <my-keyspace> WITH replication = {'class': 'NetworkTopologyStrategy', 'dc2': 3};
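The first half of this step, removing the two sections from the K8ssandraCluster object, can be done by editing and re-applying the manifest, or with a JSON patch as sketched below. The paths assume the layout of the example manifest shown earlier, and `remove_migration_settings` is a hypothetical helper.

```shell
# Remove the migration-only fields from a K8ssandraCluster object.
# Paths follow the example manifest in this article; a JSON patch "remove"
# fails if a path does not exist, so adapt the list to your own object.
remove_migration_settings() {
  local cluster="$1" ns="$2"
  kubectl patch k8c "$cluster" -n "$ns" --type=json -p '[
    {"op": "remove", "path": "/spec/externalDatacenters"},
    {"op": "remove", "path": "/spec/cassandra/additionalSeeds"}
  ]'
}

# Example:
# remove_migration_settings my-cluster k8ssandra-operator
```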

Stop all nodes from dc1 by deleting its Helm release (in the case of a K8ssandra 1.x cluster) or stopping all Cassandra processes (in the case of a non-Kubernetes Cassandra cluster).

Connect to the Cassandra container of one of the remaining Cassandra pods, and remove the dc1 nodes one by one using their host IDs, which show up in the nodetool status output.

In our example, we will run the following commands:

$ nodetool -u <superuser-name> -pw <superuser-password> removenode 9299e226-d2fa-4c2e-9fba-3b344d5d47b7

$ nodetool -u <superuser-name> -pw <superuser-password> removenode 51c2fb0e-7fad-4b6e-8e4d-2108dc204cd5

$ nodetool -u <superuser-name> -pw <superuser-password> removenode 6a92531f-a086-4c40-9067-50899b8974a0
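On larger clusters, the host IDs can be extracted from the nodetool status output rather than copied by hand. The sketch below is a hypothetical awk-based helper, `host_ids_for_dc`, which reads that output on standard input and assumes the column layout shown in this article, with the host ID in the second to last column.

```shell
# Print the host IDs listed under one datacenter in `nodetool status` output.
# Reads the output from stdin; $1 is the datacenter name.
host_ids_for_dc() {
  awk -v dc="$1" '
    /^Datacenter: / { current = $2; next }                  # track the current DC section
    current == dc && $1 ~ /^[UD][NLJM]$/ { print $(NF-1) }  # node rows start with a status code
  '
}

# Example: save the status output to a file, then list the dc1 host IDs.
# host_ids_for_dc dc1 < status.txt
```

Each ID printed this way can then be passed to the removenode command, still one node at a time.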

The nodetool status output should then be:

$ nodetool -u <superuser name> -pw <superuser password> status
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load      Tokens  Owns (effective)  Host ID                               Rack   
UN  10.xx.xx.36  6.97 MiB  16      100.0%            eba0964a-835f-4c76-978e-f25eda0b49ad  default
UN  10.xx.xx.20  6.81 MiB  16      100.0%            d15f93f4-248d-478f-8832-b87b4e3981d2  default
UN  10.xx.xx.80  6.71 MiB  16      100.0%            6bca9cdf-03f8-4658-84b9-28c5395235ba  default

Upgrade Now

We’ve developed this guide so you can prepare and execute your migration procedure now without any downtime. 

Moving forward, the K8ssandra team will focus on expanding the capabilities of the new operator, which already provides full support and orchestration for multi cluster/multi region deployments.

Let us know what you think of K8ssandra-operator by joining us on the K8ssandra Discord or K8ssandra Forum today. For exclusive posts on all things data, follow DataStax on Medium.
