K8ssandra-operator v1.10.0 is available

Originally published at: https://k8ssandra.io/blog/announcements/release/k8ssandra-operator-v1-10-0-is-available/

The K8ssandra team has just released k8ssandra-operator v1.10.0.  

In this blog post, we will highlight some of the changes we brought in since v1.6.0, including significant backup/restore improvements.

Backup/Restore improvements

One of the biggest changes in v1.9.0 is the upgrade to the recently released Medusa v0.16. This version features a major rewrite of the storage backend, making it much more efficient and resilient when communicating with object storage systems.

You can read more about this in the Medusa release blog post.

We also improved a few things in k8ssandra-operator in recent versions, based on community feedback:

Restore mapping computation by the operator

Before v1.8.0, the restore mapping was computed by Medusa itself, once for each pod. This created issues with hostname resolution and proved both hard to debug and unreliable in some environments.

To make the mapping generation safer, we implemented its computation in the operator itself. In preparation for an upcoming feature where a cluster could be created directly from a backup, we introduced a separate Medusa pod used for communication between the operator and Medusa’s storage backend without requiring the cluster to be up. This will also make it possible to recover from a failed restore by pointing to a different backup, without having to fully recreate the cluster.

Status fields and printed columns

MedusaBackup objects materialize Medusa backups as local objects in the Kubernetes cluster. Until now, they didn’t contain much valuable information about the backup itself, which would have been useful to inspect backup metadata without having to communicate with Medusa directly.

In v1.9.0, we added the following fields in the MedusaBackup CRD status:

  • .status.totalNodes: the number of nodes in the datacenter at the time of the backup
  • .status.finishedNodes: the number of nodes that successfully completed the backup
  • .status.status: the overall status of the backup (success, error, etc…)
  • .status.nodes: an array of the nodes existing in the datacenter at the time of the backup, along with the tokens they own and their dc/rack placement
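
As an illustration, here’s a sketch of what the status of a successful backup could look like (the per-node field names shown here are assumptions and may differ slightly from the actual CRD):

apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaBackup
metadata:
  name: medusa-backup-1
status:
  totalNodes: 3
  finishedNodes: 3
  status: SUCCESS
  nodes:
    # one entry per node, with its tokens and dc/rack placement
    - host: test-dc1-default-sts-0
      datacenter: dc1
      rack: default
      tokens:
        - -1234567890123456789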

Some of these fields are now printed by default when listing the objects with kubectl or displaying them with Open Lens:

% kubectl get MedusaBackup -A
NAMESPACE          NAME                                STARTED   FINISHED   NODES   COMPLETED   STATUS
dogfood-63gzhx28   medusa-backup-1                     27h       27h        3       3           SUCCESS
dogfood-63gzhx28   medusa-backup-schedule-1695346200   10h       10h        3       3           SUCCESS

The MedusaBackupJob object now has its start and finish time status fields printed in the output as well, making it easier to track backup progress:

% kubectl get MedusaBackupJob -A
NAMESPACE          NAME                                STARTED   FINISHED
dogfood-63gzhx28   medusa-backup-1                     27h       27h
dogfood-63gzhx28   medusa-backup-schedule-1695346200   10h       10h

Restore safety

We’ve had reports that k8ssandra-operator would allow triggering the restore of incomplete or incompatible backups, which would result in broken, crash-looping clusters.

Leveraging the newly added status fields from the MedusaBackup object, we’ve added new safety checks before triggering the restore operation.

Any backup that is incomplete, was taken on a datacenter with a different name, or has a different rack layout will now fail the MedusaRestoreJob early, without risking cluster breakage.

We recommend performing a sync MedusaTask to update your existing MedusaBackup objects with these new fields.
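
For example, a sync task targeting a single datacenter could be declared as follows (a minimal sketch; adjust the task name, datacenter, and namespace to your deployment):

apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaTask
metadata:
  name: sync-backups
  namespace: k8ssandra-operator
spec:
  operation: sync
  cassandraDatacenter: dc1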

Bug fixes

When running a sync task, the operator was missing proper namespace filters, which could result in the deletion of MedusaBackup objects from datacenters that were not targeted by the MedusaTask.

This was fixed in v1.9.0, and MedusaTask boundaries are now properly respected.

We also fixed a bug that would prevent scaling up a Cassandra cluster after a restore was done.

Further enhancements to the backup/restore experience will come in the next release.

Reaper communications through management-api

Reaper communicates with Cassandra through JMX for all interactions, including discovering the cluster topology, accessing metrics, and triggering repairs.

JMX is often considered an insecure protocol, and some security policies restrict its usage. In K8ssandra, we use the management-api for operator-to-node communications. On top of the additional security it provides, we can augment its capabilities by accessing Cassandra internals through its loaded agent.

Starting with v1.10.0 and the recently released Reaper v3.4.0, you can enable HTTP communications with the management-api using the following CRD settings:

spec:
  reaper:
    httpManagement:
      enabled: true

arm64 support

As arm64 developer laptops (such as Apple Silicon machines) and cloud VMs (such as EC2 Graviton instances) grow in popularity, it became evident that we needed to support the arm64 architecture across our Docker images.

With this new release, all K8ssandra images are published for both the amd64 and arm64 architectures, allowing seamless installation on multiple platforms.

Support for DC name overrides

k8ssandra-operator already supported cluster name overrides, allowing the Cassandra cluster name to be set to a different value than the K8ssandraCluster object name, which is constrained in the characters it can contain (no spaces, uppercase letters, or special characters other than dashes, for example).

Because the CassandraDatacenter object is a Kubernetes object, we could not have two clusters using the same DC name (dc1, for example) in the same namespace. Nor could we allow a datacenter name that doesn’t match the Kubernetes object naming constraints.

k8ssandra-operator now has proper support for datacenter name overrides through the .spec.cassandra.datacenters[].datacenterName field.

This way, you can create a cluster with a dc1 datacenter whose CassandraDatacenter object is named cluster1-dc1 using the following manifest:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: cluster1
spec:
  cassandra:
    serverVersion: "4.0.1"
    datacenters:
      - metadata:
          name: cluster1-dc1
        datacenterName: dc1
        k8sContext: kind-k8ssandra-0
        size: 1
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: standard
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi

This also allows using spaces and uppercase letters in your DC names (not that we recommend doing so, but it’s a possibility now).

Annotation for secrets injection

We’ve had an annotation available to inject secrets at runtime for a few versions now, and we recently evolved it so that you can easily attach specific secrets to specific pods in the StatefulSet.

StatefulSets don’t make that easy (or even possible), since all pods are created from the exact same pod template spec, just with different names. What we added is a ${POD_NAME} placeholder that our webhook substitutes with the actual pod name in the secret name. This can be very handy for managing encryption stores, for example, where each node has its own keystore/truststore pair.
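
Here is a minimal (partial) sketch of what this could look like; the annotation key, its value format, and the pod metadata path are shown as assumptions based on how the secrets injection feature is exposed, so refer to the documentation for the exact syntax:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: test
spec:
  cassandra:
    metadata:
      pods:
        annotations:
          # assumed annotation key and value format; ${POD_NAME} is replaced
          # with the actual pod name by the injection webhook
          k8ssandra.io/inject-secret: |
            [{"name": "${POD_NAME}-inode-cert-keystore", "path": "/etc/encryption"}]
    datacenters:
      - metadata:
          name: dc1
        size: 3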

The above (partial) manifest would generate the following pods, each with a different secret mounted:

  • test-dc1-default-sts-0 would get the test-dc1-default-sts-0-inode-cert-keystore
  • test-dc1-default-sts-1 would get the test-dc1-default-sts-1-inode-cert-keystore
  • test-dc1-default-sts-2 would get the test-dc1-default-sts-2-inode-cert-keystore

Metrics improvements

The Cassandra metrics exposed by Vector in the cassandra_metrics component have been augmented with a namespace label. This is necessary to distinguish metrics from clusters sharing the same name but running in separate namespaces.

We also added new metrics that expose streaming operations and SSTable operations (limited to compactions at the moment):

org_apache_cassandra_metrics_extended_compaction_stats_completed 
org_apache_cassandra_metrics_extended_compaction_stats_total 
org_apache_cassandra_metrics_extended_streaming_total_files_to_receive
org_apache_cassandra_metrics_extended_streaming_total_files_received
org_apache_cassandra_metrics_extended_streaming_total_size_to_receive
org_apache_cassandra_metrics_extended_streaming_total_size_received
org_apache_cassandra_metrics_extended_streaming_total_files_to_send
org_apache_cassandra_metrics_extended_streaming_total_files_sent
org_apache_cassandra_metrics_extended_streaming_total_size_to_send
org_apache_cassandra_metrics_extended_streaming_total_size_sent
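
For instance, assuming these metrics are scraped into Prometheus with their labels preserved, the new namespace label lets you scope a query to a single cluster; here is an illustrative compaction progress ratio (the exact semantics of the compaction counters may differ):

# completed vs. total compaction work for the cluster in the dogfood-63gzhx28 namespace
sum(org_apache_cassandra_metrics_extended_compaction_stats_completed{namespace="dogfood-63gzhx28"})
  /
sum(org_apache_cassandra_metrics_extended_compaction_stats_total{namespace="dogfood-63gzhx28"})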

New CassandraTasks

Four new CassandraTasks are now available as part of the upgrade to cass-operator v1.18:

  • compact
  • scrub
  • flush
  • garbagecollect

They are equivalent to the corresponding nodetool operations.

Scrub brings the following new job arguments:

  • no_validate
  • no_snapshot
  • skip_corrupted

Compact brings the following new job arguments:

  • split_output
  • start_token
  • end_token

These allow orchestrating common maintenance operations that previously had to be executed through nodetool. They are also available as K8ssandraTasks for multi-DC orchestration.
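
As a sketch, a compact task could be declared like this (the datacenter reference, namespace, and argument values are placeholders; double-check the exact argument spelling against the cass-operator CRD for your version):

apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: compact-dc1
  namespace: k8ssandra-operator
spec:
  datacenter:
    name: dc1
    namespace: k8ssandra-operator
  jobs:
    - name: compact-run
      command: compact
      args:
        split_output: true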

Update now

As usual, we encourage all K8ssandra users to upgrade to v1.10.0 in order to get the latest features and improvements.

Let us know what you think of K8ssandra-operator by joining us on the K8ssandra Discord or K8ssandra Forum today. For exclusive posts on all things data and GenAI, follow DataStax on Medium.