How to migrate data from external cassandra to k8ssandra with medusa

I successfully made a cassandra-medusa backup of a Docker-based (non-k8ssandra) one-node cluster into Azure Blob Storage. I also configured k8ssandra (v1.3.0) for Azure Blob Storage and can successfully do a test backup and restore as described in the docs.
But when I try to restore the real, “alien” backup, the restore object is created but the restore does not start. In the logs of the …sts-0 pod, it complains that the indicated backup is not there.

I am unsure how to specify the name of the backup.
The folder structure of the test backup/restore done in k8ssandra is:
<bucket_name>/prod-k8ssandra-dc1-rack1-sts-0/backup1
<bucket_name>/prod-k8ssandra-dc1-rack1-sts-0/data

I take “backup1” as the backup name and everything is fine.

The folder structure of the “alien” backup is:
<bucket_name>/customer/a2e71349da8f/202107051529
<bucket_name>/customer/a2e71349da8f/data

When I try
helm install restore-customer k8ssandra/restore --set name=restore-customer,backup.name=a2e71349da8f,cassandraDatacenter.name=dc1

it says that there is no backup a2e71349da8f.

When I try
helm install restore-customer k8ssandra/restore --set name=restore-customer,backup.name=202107051529,cassandraDatacenter.name=dc1

it says
"spec.backup: Invalid value: "integer": spec.backup in body must be of type string: "integer""
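
A note on the type error above: Helm's --set flag parses an all-digit value such as 202107051529 as an integer, which is why the validation rejects it. Helm's --set-string flag forces the value to stay a string. The sketch below only prints the resulting command instead of running it; restore_cmd is a hypothetical helper name, and the chart values are the ones from the commands above. This is assumed to clear only the type error. It does not by itself make a backup from another cluster visible to the restore.

```shell
# Hypothetical helper: prints the helm command for a given backup name.
# It is a dry run; nothing is installed until the printed command is executed.
restore_cmd() {
  # --set-string keeps an all-digit name such as 202107051529 a string;
  # with plain --set, Helm parses it as an integer.
  echo "helm install restore-customer k8ssandra/restore" \
       "--set name=restore-customer,cassandraDatacenter.name=dc1" \
       "--set-string backup.name=$1"
}

restore_cmd 202107051529
```

Review the printed command, then run it directly against the cluster.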

So I have these questions:

  1. Can the restore of a backup that was not made on the same cluster work at all?
  2. If yes, how do I correctly specify the backup name?
  3. Is there a way to execute plain medusa CLI commands, e.g. to list the existing backups? Something like kubectl exec podname -c containername -- medusa list

Regards
Peter

Hi @c031917

Can the restore of a backup that was not made on the same cluster work at all?

Not currently. For reference check out the following:

Is there a way to execute plain medusa CLI commands, e.g. to list the existing backups? Something like kubectl exec podname -c containername -- medusa list

I assume you want to query the remote storage directly. It might be possible, but I am not certain. Medusa is not installed as a package in the image; the source tree is just copied into it, and the entry point script runs python3 -m medusa.service.grpc.server server.py.

If you read through the remote restore design doc, you will see that it talks about syncing the remote storage with the k8s CassandraBackup objects. When that is implemented you should just be able to run kubectl get cassandrabackups to get the listing of backups.

Cheers

John

Hi @jsanda,
I checked the first link and see that the improvement for this issue has been bumped from release 1.3.0, so there seem to be more critical items to develop. What timeframe should I plan for? Not having a method to get the data over is a deal breaker for our migration to k8ssandra. Classical methods look like a nightmare to use…
Regards
Peter