Hello! This is my first post, so please bear with me…
I’ve built out a multi-region Cassandra cluster with K8ssandra, closely following the guide Jeff wrote in the blog: Deploy a multi-datacenter Cassandra cluster in Kubernetes - Part 2 - K8ssandra
The build-out went really smoothly, and I have to say I was quite pleased with how well the process went. The cluster is up and running in three regions, and all nodes are visible (up, normal) when inspected with nodetool status, as I’d expect. Things seem to be happy and healthy.
I’m running into some issues when triggering backups with the k8ssandra/backup chart. The backups seem to run as expected in the first cluster I stood up (central), but in the other two (east and west) the backups consistently fail, and in the medusa-operator logs I keep seeing this:
2021-09-15T07:51:46.472Z ERROR controllers.CassandraBackup backup failed {"CassandraPod": "int-multi-region-int-west1-dc-rack1-sts-0", "error": "rpc error: code = Internal desc = failed to create backups: ('Unable to connect to any servers', {'10.247.0.40:9042': AuthenticationFailed('Failed to authenticate to 10.247.0.40:9042: Error from server: code=0100 [Bad credentials] message=\"Provided username medusa and/or password are incorrect\"',)})"}
Naturally, I assumed that the data in Medusa’s secret didn’t match what was configured in the cluster. I tried resetting the password for the medusa user in Cassandra with ALTER USER medusa WITH PASSWORD ... and verified that I can connect with the updated password via kubectl exec -n int-us-west1 int-multi-region-int-west1-dc-rack1-sts-0 -c cassandra -it -- cqlsh -u medusa -p ... (this works: cqlsh connects and I get a prompt). After doing so, I updated Medusa’s secret in the cluster, but the backups are still failing with the same error. As an extra step, I bounced the medusa-operator pod in that cluster, with the same result.
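For anyone wanting to double-check the same thing, here is a minimal sketch of comparing the value stored in the secret with the password that works in cqlsh. It assumes the secret stores the password base64-encoded under a key named password, and check_secret_value is a hypothetical helper, not part of K8ssandra or Medusa. A stray trailing newline (for example, from encoding the value with echo rather than printf) would show up as a mismatch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare a base64-encoded secret value with the
# plaintext password that is known to work in cqlsh.
check_secret_value() {
  local encoded="$1"
  local expected="$2"
  local decoded
  # Append a sentinel character so command substitution does not strip a
  # trailing newline from the decoded value, then remove the sentinel.
  decoded="$(printf '%s' "$encoded" | base64 --decode; printf 'x')"
  decoded="${decoded%x}"
  if [ "$decoded" = "$expected" ]; then
    echo "match"
  else
    echo "mismatch"
  fi
}

# Example usage (the secret name and the "password" key are assumptions):
#   encoded="$(kubectl get secret <medusa-secret-name> -n int-us-west1 \
#     -o jsonpath='{.data.password}')"
#   check_secret_value "$encoded" '<password set with ALTER USER>'
```

If this prints mismatch, the secret itself is suspect. If it prints match, the secret is probably fine and the question becomes whether the running Medusa container has actually picked up the new value.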
Any ideas/suggestions as to what might be wrong? Is there any more information I can provide here to shed additional light on the configuration? I’m pretty stumped at this point. I can see the failed connection attempts in the Cassandra pod logs, so I know they’re talking…but I’m not sure why medusa doesn’t seem to pick up the new/updated secret value (or, if it is, why Cassandra is denying it). I hope I didn’t break something by changing medusa’s secret – updating that seemed to be the first logical step to try to fix it.
I’d greatly appreciate any tips/help/suggestions! Did I mention that I’m new here? Can you tell?