A Recap of Cass-Operator in 2021 and What's Coming Next

Originally published at: A Recap of Cass-Operator in 2021 and What's Coming Next - K8ssandra, Apache Cassandra® on Kubernetes

While much of our content here focuses on K8ssandra Helm charts and the upcoming K8ssandra-operator, it should be noted that both use cass-operator to deploy the Cassandra parts. Since we haven’t published a lot of information specifically targeting the cass-operator, I thought it was a good time to take a retrospective look at how we improved it in 2021.

If you’re only using the downstream projects, you might still find this post interesting, as these projects provide a lot of extra tools and can simplify certain deployment models. Most of the features and changes mentioned here are available now, or will be in the future, through K8ssandra or k8ssandra-operator.

Even if you don’t need any of the additional features provided by the other projects, you can keep using the cass-operator in a standalone mode to deploy and manage your Cassandra installations in Kubernetes. Also, using cass-operator directly will give you access to certain features that might be hidden by the other projects (such as certain advanced configuration options of pods).

During the past year, some of the features have come from requirements in other projects or from weak points we’ve noticed while building them. But it has also been nice to see a large number of feature requests and bug reports from the community (and PRs fixing them!). We truly appreciate all the input, and it genuinely helps us make the product better for everyone.

So, let’s take stock of the results from all that input.

Upgrade path from older versions to current ones

Part of evolving the product means that users might want to upgrade the version of cass-operator they use. Due to the numerous changes in the underlying Kubernetes CRD handling, as well as the frameworks we use to build the cass-operator, there are some hoops to go through if you want to upgrade to a newer version of cass-operator.

If you’re still running versions from the datastax/cass-operator repository, the only supported upgrade path is updating first to k8ssandra/cass-operator version 1.7.1.

1.7.1 is also the last version that can be updated using the Helm chart from the cass-operator repository. To keep deploying with Helm, the supported method moving forward is to use the k8ssandra/k8ssandra repository. Our README has instructions on how to install only the cass-operator. These Helm charts might not include the latest cass-operator version, since they follow the K8ssandra release process, but they should eventually get every version.

From 1.8.0, we support deploying using Kustomize, which is also bundled inside kubectl. This requires changing kubectl’s -f parameter to -k, but otherwise behaves the same as before if you don’t want to change any defaults. It also allows customizing your deployment more than our previous deployment models did. We provide a few example deployments, such as a cluster-scoped installation and a namespace-scoped one.
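As a rough sketch of what such a customization can look like, the following writes a small kustomization.yaml that uses the default deployment as a remote base and overrides the target namespace. The directory name cass-operator-custom and the namespace my-cass-operator are placeholders invented for this example, and whether every resource in the base tolerates a namespace change is an assumption worth checking by rendering the output before applying it.

```shell
set -eu

# Hypothetical working directory for the overlay (not from the post).
OVERLAY_DIR="${OVERLAY_DIR:-cass-operator-custom}"
mkdir -p "$OVERLAY_DIR"

# The remote base is the default deployment of cass-operator v1.9.0;
# the namespace value is a placeholder chosen for illustration.
cat > "$OVERLAY_DIR/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: my-cass-operator

resources:
  - github.com/k8ssandra/cass-operator/config/deployments/default?ref=v1.9.0
EOF

echo "Wrote $OVERLAY_DIR/kustomization.yaml"

# Render the result first to review it, then apply it:
#   kubectl kustomize "$OVERLAY_DIR"
#   kubectl apply -k "$OVERLAY_DIR"
```

Keeping the overlay in its own directory means the upstream templates stay untouched, and a later upgrade should mostly be a matter of changing the ref value.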

Version 1.9.0 is also available from OperatorHub if you’re using OLM. However, if upgrading from an older OLM release (1.5.1), please see the instructions below for one manual step you need to take. We’ll continue releasing updates to OLM in the future, making it one of the supported methods of installing cass-operator. This means both the certified operators as well as the community one, which is appropriately called cass-operator-community.

From version 1.7.1 to newer

Due to the modifications to cass-operator’s underlying controller-runtime and updated Kubernetes versions, you’ll need to do a couple of manual steps before updating to a newer version of cass-operator. 

Newer Kubernetes versions require stricter validation, so we have to set the global preserveUnknownFields property in the CRD to false before we can update to a newer CRD. The newer controller-runtime, on the other hand, modifies the liveness, readiness, and configuration options, which requires us to delete the older deployment. Note that these commands do not delete your running Cassandra instances.

You can run the following commands, assuming cass-operator is installed in the cass-operator namespace (change the -n parameter if it’s installed in another namespace):

kubectl -n cass-operator delete deployment.apps/cass-operator
kubectl -n cass-operator delete service/cassandradatacenter-webhook-service

kubectl patch crd cassandradatacenters.cassandra.datastax.com -p '{"spec":{"preserveUnknownFields":false}}'

You can now install a new version of cass-operator using your preferred method. The simplest way is to use our prebuilt Kustomize templates:

kubectl apply -k 'github.com/k8ssandra/cass-operator/config/deployments/default?ref=v1.9.0'
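After the new version is installed, a few read-only checks can confirm that the upgrade went as intended. The function below is only a sketch: its name is made up for this example, and it assumes the operator lives in the cass-operator namespace unless another one is passed as the first argument.

```shell
# Read-only checks after an upgrade. The function name is hypothetical;
# the namespace defaults to cass-operator.
verify_cass_operator_upgrade() {
  ns="${1:-cass-operator}"

  # After the CRD patch this should not print "true"
  # (an empty value is possible if the field is no longer stored).
  kubectl get crd cassandradatacenters.cassandra.datastax.com \
    -o jsonpath='{.spec.preserveUnknownFields}'
  echo

  # The newly installed operator deployment and its pods should be listed here.
  kubectl -n "$ns" get deployments,pods

  # Existing Cassandra datacenters should still be present.
  kubectl get cassandradatacenters --all-namespaces
}
```

Call it as verify_cass_operator_upgrade, or pass another namespace as the argument if the operator is installed elsewhere.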

Releases in 2021

I won’t go through all the changes we made in each version, such as bug fixes, but here’s a small breakdown of the important user-facing ones in each version that we released this year.

1.9.0 – November 2021

We did most of the work for the 1.9.0 release while we waited to release 1.8.0, which led to a quicker release time. Some of the bug fixes done in the 1.9.0 cycle were backported to 1.8.0. This version is also the first one to be released with OLM support. In this release:

  • The management-api client allows fetching the available features in the running container, which is also used internally in the cass-operator
  • Supports Cassandra 4.0 features, including FullQueryLogging
  • Adds all recommended app.kubernetes.io labels to every resource cass-operator creates
  • Allows overriding repository and version tags for DSE and Cassandra images

1.8.0 – October 2021

1.8.0 marked a significant refactoring by moving from an older kubebuilder (operator-framework) to a new structure. A lot of it should be invisible to the user, but it lets us ensure the product works in newer versions of Kubernetes. We also removed Helm support in this release (since k8ssandra/k8ssandra already provides that distribution model) and replaced it with Kustomize.

For build processes, we removed mage and switched to plain Makefiles to provide consistency across K8ssandra projects. Also, both the cass-operator and system-logger images are now built on top of UBI images (RHEL base).

Webhooks are now using cert-manager to manage certificates, requiring cert-manager to be installed in the Kubernetes cluster if you want to use validating webhooks.
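One quick way to see whether a cluster is ready for the webhooks is to look for cert-manager’s CRDs before installing. This is a heuristic sketch only: the function name is invented for this example, and the presence of the certificates.cert-manager.io CRD shows that cert-manager has been installed, not that its pods are healthy.

```shell
# Returns 0 if the cert-manager Certificate CRD exists in the current cluster.
# The function name is hypothetical.
has_cert_manager() {
  kubectl get crd certificates.cert-manager.io >/dev/null 2>&1
}

if has_cert_manager; then
  echo "cert-manager CRDs found"
else
  echo "cert-manager CRDs not found; install cert-manager before enabling webhooks" >&2
fi
```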

Lastly, the release date was delayed since we wanted to align it with the k8ssandra/k8ssandra release. We originally released RC1 in August, but kept the release open for fixes until October.

A heads up if you use cass-operator packages in your Go project: some package names have changed, although adapting to them should mostly be a quick search-and-replace exercise.

1.7.0 (1.7.1) – May 2021

1.7.0 (and the bugfix 1.7.1 for upgrade scenarios) was the first release under the K8ssandra umbrella and with a new repository name. From that point on, all releases are published under k8ssandra/cass-operator, with development happening in the K8ssandra GitHub repository. The main highlights were:

  • Ability to modify PodSecurityContext and tolerations
  • Configure Cassandra using a secret key
  • Changed layout of the system-logger to improve node replacement / scale-down performance
  • Custom labels and annotations for services
  • Allow hostnames as additional seeds

1.6.0 – February 2021

The 1.6.0 release was the last to be released under the name datastax/cass-operator. As such, if you’re using the datastax repository, you’re most likely at this version or lower. For upgrade instructions and path to newer versions, see above.

In 1.6.0, we introduced two new features:

  • Specifying additional PersistentVolumeClaims
  • Ability to specify rack labels

What’s coming in 2022 

As of writing this blog post, our feature set for 1.10.0 is already fixed and in use in the alpha versions of k8ssandra-operator. If we find no new bugs, we’ll release the current master as 1.10 in February to align with the release of k8ssandra-operator’s 1.0 version. Most of the new features are related to multi-datacenter cluster scenarios.

  • Task scheduler to execute Cassandra nodetool tasks on every pod (cass-operator manages the lifecycle of these tasks and executes them serially on every datacenter pod, waiting for the previous pod to complete the task before moving to the next one). It currently supports cleanup and rebuild.
  • Ability to decommission the entire datacenter. Add the annotation cassandra.datastax.com/decommission-on-delete to the CassandraDatacenter object before deleting it, and cass-operator will do a full decommission instead of simply deleting all the resources, provided the nodes are running in a multi-datacenter cluster.
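Once that feature is released, the decommission flow described above could be scripted roughly as follows. Treat this as a sketch under assumptions: the function name is invented, the namespace and datacenter name are supplied by the caller, and the annotation value true is a guess, since only the annotation key is given here.

```shell
# Annotate a CassandraDatacenter and then delete it so that it is
# decommissioned rather than simply removed. The function name is hypothetical.
# Arguments: <namespace> <datacenter-name>
decommission_datacenter() {
  ns="$1"
  dc="$2"

  # The annotation key is the one named above; the value "true" is an assumption.
  # The delete only runs if the annotation was applied successfully.
  kubectl -n "$ns" annotate cassandradatacenter "$dc" \
    cassandra.datastax.com/decommission-on-delete=true &&
    kubectl -n "$ns" delete cassandradatacenter "$dc"
}
```

For example, decommission_datacenter cass-operator dc2 would target a datacenter called dc2 (a placeholder name) in the cass-operator namespace. Chaining the two commands with && keeps the datacenter from being deleted without the annotation in place.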

We have plans to keep the releases flowing in 2022. Like before, some of those will align with the roadmap of other K8ssandra projects, while some features will be more interesting for those who use cass-operator in a standalone mode.

Plus, don’t miss the upcoming content we’ll have on k8ssandra-operator 1.0, or the mind-boggling scaling tests we did with K8ssandra and cass-operator on AWS!

Let us know what you think about this post in our buzzing K8ssandra Discord, or ask us anything in the K8ssandra Forum.
