Medusa restore problem with K8ssandra operator

Scenario

I have a K8ssandra cluster with one DC named argo-we-dc1, which has 6 nodes spread across 3 racks (r1, r2, r3). The cluster also runs 3 Stargate pods (1 per rack), and Medusa is deployed in each Cassandra pod.

Cluster pods (kubectl get pods -n infrastructure-k8ssandra -o wide):

NAME                                                       READY   STATUS    RESTARTS     AGE     IP              NODE                              NOMINATED NODE   READINESS GATES
argo-argo-we-dc1-r1-stargate-deployment-5fc988476-pctt2    1/1     Running   0            4h29m   10.129.35.138   aks-dbaspot-73504503-vmss0000sm   <none>           <none>
argo-argo-we-dc1-r1-sts-0                                  3/3     Running   0            4h35m   10.129.33.8     aks-dbaspot-73504503-vmss0000sf   <none>           <none>
argo-argo-we-dc1-r1-sts-1                                  3/3     Running   0            4h35m   10.129.33.241   aks-dbaspot-73504503-vmss0000sh   <none>           <none>
argo-argo-we-dc1-r2-stargate-deployment-fb757c568-6ndtm    1/1     Running   0            4h29m   10.129.34.129   aks-dbaspot-73504503-vmss0000sk   <none>           <none>
argo-argo-we-dc1-r2-sts-0                                  3/3     Running   0            4h35m   10.129.35.149   aks-dbaspot-73504503-vmss0000sm   <none>           <none>
argo-argo-we-dc1-r2-sts-1                                  3/3     Running   0            4h35m   10.129.34.24    aks-dbaspot-73504503-vmss0000sj   <none>           <none>
argo-argo-we-dc1-r3-stargate-deployment-69bfbb49d4-dgg4t   1/1     Running   0            4h29m   10.129.35.144   aks-dbaspot-73504503-vmss0000sm   <none>           <none>
argo-argo-we-dc1-r3-sts-0                                  3/3     Running   0            4h35m   10.129.32.179   aks-dbaspot-73504503-vmss0000si   <none>           <none>
argo-argo-we-dc1-r3-sts-1                                  3/3     Running   0            4h19m   10.129.34.134   aks-dbaspot-73504503-vmss0000sk   <none>           <none>
k8ssandra-operator-b7c97b465-dnj8k                         1/1     Running   0            4h20m   10.129.34.28    aks-dbaspot-73504503-vmss0000sj   <none>           <none>
k8ssandra-operator-b7c97b465-f9nlp                         1/1     Running   7 (8h ago)   20h     10.129.33.249   aks-dbaspot-73504503-vmss0000sh   <none>           <none>
k8ssandra-operator-b7c97b465-fzf5r                         1/1     Running   5 (8h ago)   8h      10.129.32.171   aks-dbaspot-73504503-vmss0000si   <none>           <none>
k8ssandra-operator-cass-operator-b666974f6-2r7hk           1/1     Running   7 (8h ago)   8h      10.129.33.240   aks-dbaspot-73504503-vmss0000sh   <none>           <none>
k8ssandra-operator-cass-operator-b666974f6-7bhww           1/1     Running   0            7h52m   10.129.34.14    aks-dbaspot-73504503-vmss0000sj   <none>           <none>
k8ssandra-operator-cass-operator-b666974f6-fxqmp           1/1     Running   0            4h20m   10.129.34.128   aks-dbaspot-73504503-vmss0000sk   <none>           <none>
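
For reference, here is a minimal sketch of the K8ssandraCluster spec that produces this topology. The cluster and rack names match the pods above, but sizes and storage are abbreviated and the exact layout of our real spec may differ; the Medusa block is the one shown in full further down:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: argo
  namespace: infrastructure-k8ssandra
spec:
  cassandra:
    serverVersion: "4.0.4"
    datacenters:
      - metadata:
          name: argo-we-dc1
        size: 6                  # 2 Cassandra nodes per rack
        racks:
          - name: r1
          - name: r2
          - name: r3
        stargate:
          size: 3                # 1 Stargate pod per rack
  medusa:
    # full Medusa configuration shown below
    storageProperties:
      storageProvider: azure_blobs
      storageSecretRef:
        name: medusa-bucket-key
      bucketName: k8ssandra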

AKS version: 1.25.5

K8ssandra-operator version: v1.5.2

Medusa version: 0.14.0-azure

Cassandra version: 4.0.4

Problem description: Backup/Restore problem with K8ssandra

Backups work, but restores only succeed when the DC's racks each contain a single Cassandra node. When a rack has more than one Cassandra node, the restore fails.

Medusa configuration:

medusa:
    containerResources:
      requests:
        cpu: "1000m"
        memory: "2Gi"
      limits:
        cpu: "1000m"
        memory: "2Gi"
    initContainerResources:
      requests:
        cpu: "1000m"
        memory: "1Gi"
      limits:
        cpu: "1000m"
        memory: "2Gi"  
    containerImage:
      registry: internal_registry
      repository: internal_repo
      name: medusa
      tag: 0.14.0-azure
      pullPolicy: IfNotPresent
      pullSecretRef:
        name: internal_secret
    storageProperties:
      storageProvider: azure_blobs
      storageSecretRef:
        name: medusa-bucket-key
      bucketName: k8ssandra
      secure: false
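
The medusa-bucket-key secret referenced by storageSecretRef holds the Azure storage credentials. As a hedged sketch of its shape, assuming the credentials key and JSON layout documented for Medusa's azure_blobs backend (account name and key are placeholders):

apiVersion: v1
kind: Secret
metadata:
  name: medusa-bucket-key
  namespace: infrastructure-k8ssandra
type: Opaque
stringData:
  # credentials file consumed by Medusa's azure_blobs storage backend
  credentials: |-
    {
      "storage_account": "<storage-account-name>",
      "key": "<storage-account-key>"
    }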

Medusa backup configuration:

apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaBackupJob
metadata:
  name: demo-backup
spec:
  cassandraDatacenter: argo-we-dc1
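
We actually sync this manifest through Argo CD (hence the labels in the output below), but applying it directly and waiting for completion would look like this (the file name is ours):

kubectl apply -f demo-backup.yaml -n infrastructure-k8ssandra
# the job is done once status.finishTime is set and status.finished lists all 6 pods
kubectl get medusabackupjob/demo-backup -n infrastructure-k8ssandra \
  -o jsonpath='{.status.finishTime}{"\n"}'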

When we run the get command on the backup object after the backup is completed:

kubectl get medusabackupjob/demo-backup -n infrastructure-k8ssandra -o yaml

The output is the following:

apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaBackupJob
metadata:
  creationTimestamp: "2023-06-21T08:54:22Z"
  generation: 1
  labels:
    argocd.argoproj.io/instance: infrastructure-k8ssandra-operator-argocd
  managedFields:
  - apiVersion: medusa.k8ssandra.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:argocd.argoproj.io/instance: {}
      f:spec:
        .: {}
        f:backupType: {}
        f:cassandraDatacenter: {}
    manager: argocd-application-controller
    operation: Update
    time: "2023-06-21T08:54:22Z"
  - apiVersion: medusa.k8ssandra.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences:
          .: {}
          k:{"uid":"8e4f78c2-7890-4b34-8739-3f09116f9634"}: {}
    manager: manager
    operation: Update
    time: "2023-06-21T08:54:22Z"
  - apiVersion: medusa.k8ssandra.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:finishTime: {}
        f:finished: {}
        f:startTime: {}
    manager: manager
    operation: Update
    subresource: status
    time: "2023-06-21T08:55:38Z"
  name: demo-backup
  namespace: infrastructure-k8ssandra
  ownerReferences:
  - apiVersion: cassandra.datastax.com/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: CassandraDatacenter
    name: argo-we-dc1
    uid: 8e4f78c2-7890-4b34-8739-3f09116f9634
  resourceVersion: "33751280"
  uid: 213c5df9-ba8c-497b-a99c-6d83ebda6b57
spec:
  backupType: differential
  cassandraDatacenter: argo-we-dc1
status:
  finishTime: "2023-06-21T08:55:38Z"
  finished:
  - argo-argo-we-dc1-r3-sts-0
  - argo-argo-we-dc1-r1-sts-1
  - argo-argo-we-dc1-r3-sts-1
  - argo-argo-we-dc1-r2-sts-1
  - argo-argo-we-dc1-r1-sts-0
  - argo-argo-we-dc1-r2-sts-0
  startTime: "2023-06-21T08:54:22Z"
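
To double-check that the backup files actually landed in the container, we can list the blobs in the storage account that shows up in the Medusa logs (dbarepository). The prefix is an assumption based on Medusa laying out backups per node under <node-fqdn>/<backup-name>/:

# authenticate with --account-key or --auth-mode login as appropriate
az storage blob list \
  --account-name dbarepository \
  --container-name k8ssandra \
  --prefix argo-argo-we-dc1-r1-sts-0/demo-backup/ \
  --output table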

Final logs of the medusa container after the backup is completed:

INFO:root:Backup done
[2023-06-21 08:55:19,925] INFO: Backup done
INFO:root:- Started: 2023-06-21 08:54:22
                        - Started extracting data: 2023-06-21 08:54:24
                        - Finished: 2023-06-21 08:55:19
[2023-06-21 08:55:19,925] INFO: - Started: 2023-06-21 08:54:22
                        - Started extracting data: 2023-06-21 08:54:24
                        - Finished: 2023-06-21 08:55:19
INFO:root:- Real duration: 0:00:55.759084 (excludes time waiting for other nodes)
[2023-06-21 08:55:19,926] INFO: - Real duration: 0:00:55.759084 (excludes time waiting for other nodes)
INFO:root:- 432 files, 1.81 GB
[2023-06-21 08:55:19,926] INFO: - 432 files, 1.81 GB
INFO:root:- 432 files copied from host
[2023-06-21 08:55:19,926] INFO: - 432 files copied from host
DEBUG:root:Emitting metrics
[2023-06-21 08:55:19,926] DEBUG: Emitting metrics
DEBUG:root:Done emitting metrics
[2023-06-21 08:55:19,927] DEBUG: Done emitting metrics
INFO:root:Updated from existing status: 0 to new status: 1 for backup id: demo-backup
[2023-06-21 08:55:19,927] INFO: Updated from existing status: 0 to new status: 1 for backup id: demo-backup
DEBUG:root:Done with backup, returning backup result information
[2023-06-21 08:55:19,927] DEBUG: Done with backup, returning backup result information
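
The CASS_USERNAME / CASS_PASSWORD variables used below come from the cluster's superuser secret; assuming the operator's default naming (<clusterName>-superuser, so argo-superuser here), they can be exported with:

# read the Cassandra superuser credentials created by k8ssandra-operator
CASS_USERNAME=$(kubectl get secret argo-superuser -n infrastructure-k8ssandra \
  -o jsonpath='{.data.username}' | base64 -d)
CASS_PASSWORD=$(kubectl get secret argo-superuser -n infrastructure-k8ssandra \
  -o jsonpath='{.data.password}' | base64 -d)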

When we run nodetool status:

kubectl exec -it argo-argo-we-dc1-r1-sts-0 -n infrastructure-k8ssandra -c cassandra -- nodetool -u $CASS_USERNAME -pw $CASS_PASSWORD status

The output is the following:

Datacenter: argo-we-dc1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.129.35.142  1.78 GiB  16      51.5%             39b3ca3e-8129-45b9-b5cf-6f9e43214e93  r2
UN  10.129.33.161  1.67 GiB  16      51.5%             71c4c25d-71b5-49fe-89f2-6b2ca2ba54d0  r3
UN  10.129.33.241  1.81 GiB  16      51.5%             eec3e68c-ba4a-4531-b75c-e81cd66c67d8  r1
UN  10.129.32.244  1.66 GiB  16      48.5%             c4b0a92b-651b-42b3-b606-58c48ac9f5a9  r1
UN  10.129.34.6    1.63 GiB  16      48.5%             46088aa0-099c-4e27-a0c2-bbfdfa383de7  r2
UN  10.129.32.182  1.67 GiB  16      48.5%             c07b5905-6d71-40c5-a26a-ef02c12eb3cd  r3

Medusa restore configuration:

apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaRestoreJob
metadata:
  name: argo-restore
  namespace: infrastructure-k8ssandra
spec:
  cassandraDatacenter: argo-we-dc1
  backup: demo-backup
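
As with the backup, we sync this through Argo CD; applying it directly and monitoring the restore would look roughly like this (the file name is ours, and the medusa-restore logs shown further down come from the init container that runs during the restore):

kubectl apply -f argo-restore.yaml -n infrastructure-k8ssandra
# watch the status fields (startTime, datacenterStopped, restorePrepared, finishTime)
kubectl get medusarestorejob/argo-restore -n infrastructure-k8ssandra -w
# once a pod is back up, inspect its restore logs
kubectl logs argo-argo-we-dc1-r1-sts-0 -n infrastructure-k8ssandra -c medusa-restore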

When we run the get command on the restore object after the restore job has run:

kubectl get medusarestorejob/argo-restore -o yaml -n infrastructure-k8ssandra

The output is the following:

apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaRestoreJob
metadata:
  creationTimestamp: "2023-06-21T10:00:11Z"
  generation: 1
  labels:
    argocd.argoproj.io/instance: infrastructure-k8ssandra-operator-argocd
  managedFields:
  - apiVersion: medusa.k8ssandra.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:argocd.argoproj.io/instance: {}
      f:spec:
        .: {}
        f:backup: {}
        f:cassandraDatacenter: {}
    manager: argocd-application-controller
    operation: Update
    time: "2023-06-21T10:00:11Z"
  - apiVersion: medusa.k8ssandra.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences:
          .: {}
          k:{"uid":"8e4f78c2-7890-4b34-8739-3f09116f9634"}: {}
    manager: manager
    operation: Update
    time: "2023-06-21T10:00:11Z"
  - apiVersion: medusa.k8ssandra.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:datacenterStopped: {}
        f:restoreKey: {}
        f:restorePrepared: {}
        f:startTime: {}
    manager: manager
    operation: Update
    subresource: status
    time: "2023-06-21T10:00:26Z"
  name: argo-restore
  namespace: infrastructure-k8ssandra
  ownerReferences:
  - apiVersion: cassandra.datastax.com/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: CassandraDatacenter
    name: argo-we-dc1
    uid: 8e4f78c2-7890-4b34-8739-3f09116f9634
  resourceVersion: "33787691"
  uid: e8ce7b35-486f-4cc2-aaf1-ef4e1f584549
spec:
  backup: demo-backup
  cassandraDatacenter: argo-we-dc1
status:
  datacenterStopped: "2023-06-21T10:00:26Z"
  restoreKey: a65fae69-9c33-4588-95b6-182323f24e0e
  restorePrepared: true
  startTime: "2023-06-21T10:00:11Z"

Medusa container logs:

MEDUSA_MODE = GRPC
sleeping for 0 sec
Starting Medusa gRPC service
WARNING:root:The CQL_USERNAME environment variable is deprecated and has been replaced by the MEDUSA_CQL_USERNAME variable
WARNING:root:The CQL_PASSWORD environment variable is deprecated and has been replaced by the MEDUSA_CQL_PASSWORD variable
WARNING:root:The CQL_USERNAME environment variable is deprecated and has been replaced by the MEDUSA_CQL_USERNAME variable
WARNING:root:The CQL_PASSWORD environment variable is deprecated and has been replaced by the MEDUSA_CQL_PASSWORD variable
INFO:root:Init service
[2023-06-21 10:03:23,995] INFO: Init service
DEBUG:root:Loading storage_provider: azure_blobs
[2023-06-21 10:03:23,995] DEBUG: Loading storage_provider: azure_blobs
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): dbarepository.blob.core.windows.net:443
[2023-06-21 10:03:23,998] DEBUG: Starting new HTTPS connection (1): dbarepository.blob.core.windows.net:443
DEBUG:urllib3.connectionpool:https://dbarepository.blob.core.windows.net:443 "HEAD /k8ssandra?restype=container HTTP/1.1" 200 0
[2023-06-21 10:03:24,023] DEBUG: https://dbarepository.blob.core.windows.net:443 "HEAD /k8ssandra?restype=container HTTP/1.1" 200 0
INFO:root:Starting server. Listening on port 50051.
[2023-06-21 10:03:24,026] INFO: Starting server. Listening on port 50051.

Medusa-restore final logs in argo-argo-we-dc1-r1-sts-0 for example:

INFO:root:Finished restore of backup demo-backup
[2023-06-21 10:03:22,847] INFO: Finished restore of backup demo-backup
Mapping: {'in_place': True, 'host_map': {'argo-argo-we-dc1-r1-sts-1': {'source': ['argo-argo-we-dc1-r1-sts-1'], 'seed': False}, 'localhost': {'source': ['argo-argo-we-dc1-r1-sts-0'], 'seed': False}, 'argo-argo-we-dc1-r2-sts-1': {'source': ['argo-argo-we-dc1-r2-sts-1'], 'seed': False}, 'argo-argo-we-dc1-r2-sts-0': {'source': ['argo-argo-we-dc1-r2-sts-0'], 'seed': False}, 'argo-argo-we-dc1-r3-sts-1': {'source': ['argo-argo-we-dc1-r3-sts-1'], 'seed': False}, 'argo-argo-we-dc1-r3-sts-0': {'source': ['argo-argo-we-dc1-r3-sts-0'], 'seed': False}}}
Downloading backup demo-backup to /var/lib/cassandra

Medusa-restore final logs in argo-argo-we-dc1-r1-sts-1, which is in the same rack as the pod above:

INFO:root:Finished restore of backup demo-backup
[2023-06-21 10:03:22,958] INFO: Finished restore of backup demo-backup
Mapping: {'in_place': True, 'host_map': {'argo-argo-we-dc1-r1-sts-0': {'source': ['argo-argo-we-dc1-r1-sts-1'], 'seed': False}, 'localhost': {'source': ['argo-argo-we-dc1-r1-sts-0'], 'seed': False}, 'argo-argo-we-dc1-r2-sts-1': {'source': ['argo-argo-we-dc1-r2-sts-1'], 'seed': False}, 'argo-argo-we-dc1-r2-sts-0': {'source': ['argo-argo-we-dc1-r2-sts-0'], 'seed': False}, 'argo-argo-we-dc1-r3-sts-1': {'source': ['argo-argo-we-dc1-r3-sts-1'], 'seed': False}, 'argo-argo-we-dc1-r3-sts-0': {'source': ['argo-argo-we-dc1-r3-sts-0'], 'seed': False}}}
Downloading backup demo-backup to /var/lib/cassandra

When we run nodetool status on argo-argo-we-dc1-r1-sts-0:

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
DN  10.129.32.244  ?         16      0.0%              c4b0a92b-651b-42b3-b606-58c48ac9f5a9  r1
DN  10.129.34.6    ?         16      0.0%              46088aa0-099c-4e27-a0c2-bbfdfa383de7  r1
DN  10.129.32.182  ?         16      0.0%              c07b5905-6d71-40c5-a26a-ef02c12eb3cd  r1
Datacenter: argo-we-dc1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.129.33.149  1.67 GiB  16      100.0%            71c4c25d-71b5-49fe-89f2-6b2ca2ba54d0  r3
UN  10.129.34.24   1.78 GiB  16      100.0%            39b3ca3e-8129-45b9-b5cf-6f9e43214e93  r2

When we run nodetool status on argo-argo-we-dc1-r1-sts-1:

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
DN  10.129.32.244  ?         16      0.0%              c4b0a92b-651b-42b3-b606-58c48ac9f5a9  r1
DN  10.129.34.6    ?         16      0.0%              46088aa0-099c-4e27-a0c2-bbfdfa383de7  r1
DN  10.129.32.182  ?         16      0.0%              c07b5905-6d71-40c5-a26a-ef02c12eb3cd  r1

Datacenter: argo-we-dc1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.129.33.241  1.81 GiB  16      100.0%            eec3e68c-ba4a-4531-b75c-e81cd66c67d8  r1
UN  10.129.33.149  1.67 GiB  16      100.0%            71c4c25d-71b5-49fe-89f2-6b2ca2ba54d0  r3
UN  10.129.34.24   1.78 GiB  16      100.0%            39b3ca3e-8129-45b9-b5cf-6f9e43214e93  r2

When we run nodetool status on argo-argo-we-dc1-r2-sts-0:

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
DN  10.129.32.244  ?         16      0.0%              c4b0a92b-651b-42b3-b606-58c48ac9f5a9  r1
DN  10.129.34.6    ?         16      0.0%              46088aa0-099c-4e27-a0c2-bbfdfa383de7  r1
DN  10.129.32.182  ?         16      0.0%              c07b5905-6d71-40c5-a26a-ef02c12eb3cd  r1

Datacenter: argo-we-dc1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.129.33.241  1.81 GiB  16      100.0%            eec3e68c-ba4a-4531-b75c-e81cd66c67d8  r1
UN  10.129.33.149  1.67 GiB  16      100.0%            71c4c25d-71b5-49fe-89f2-6b2ca2ba54d0  r3

When we run nodetool status on argo-argo-we-dc1-r2-sts-1:

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
DN  10.129.32.244  ?         16      0.0%              c4b0a92b-651b-42b3-b606-58c48ac9f5a9  r1
DN  10.129.34.6    ?         16      0.0%              46088aa0-099c-4e27-a0c2-bbfdfa383de7  r1
DN  10.129.32.182  ?         16      0.0%              c07b5905-6d71-40c5-a26a-ef02c12eb3cd  r1

Datacenter: argo-we-dc1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.129.33.241  1.81 GiB  16      100.0%            eec3e68c-ba4a-4531-b75c-e81cd66c67d8  r1
UN  10.129.33.149  1.67 GiB  16      100.0%            71c4c25d-71b5-49fe-89f2-6b2ca2ba54d0  r3
UN  10.129.34.24   1.78 GiB  16      100.0%            39b3ca3e-8129-45b9-b5cf-6f9e43214e93  r2

When we run nodetool status on argo-argo-we-dc1-r3-sts-0:

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
DN  10.129.32.244  ?         16      0.0%              c4b0a92b-651b-42b3-b606-58c48ac9f5a9  r1
DN  10.129.34.6    ?         16      0.0%              46088aa0-099c-4e27-a0c2-bbfdfa383de7  r1
DN  10.129.32.182  ?         16      0.0%              c07b5905-6d71-40c5-a26a-ef02c12eb3cd  r1

Datacenter: argo-we-dc1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.129.33.241  1.81 GiB  16      100.0%            eec3e68c-ba4a-4531-b75c-e81cd66c67d8  r1
UN  10.129.34.24   1.78 GiB  16      100.0%            39b3ca3e-8129-45b9-b5cf-6f9e43214e93  r2

When we run nodetool status on argo-argo-we-dc1-r3-sts-1:

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
DN  10.129.32.244  ?         16      0.0%              c4b0a92b-651b-42b3-b606-58c48ac9f5a9  r1
DN  10.129.34.6    ?         16      0.0%              46088aa0-099c-4e27-a0c2-bbfdfa383de7  r1
DN  10.129.32.182  ?         16      0.0%              c07b5905-6d71-40c5-a26a-ef02c12eb3cd  r1

Datacenter: argo-we-dc1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.129.34.134  1.67 GiB  16      100.0%            71c4c25d-71b5-49fe-89f2-6b2ca2ba54d0  r3
UN  10.129.33.241  1.81 GiB  16      100.0%            eec3e68c-ba4a-4531-b75c-e81cd66c67d8  r1
UN  10.129.34.24   1.78 GiB  16      100.0%            39b3ca3e-8129-45b9-b5cf-6f9e43214e93  r2

Steps to reproduce:

  • Create a K8ssandra cluster with one DC named argo-we-dc1 that has 6 nodes spread across 3 racks (r1, r2, r3), plus 3 Stargate pods (1 per rack) and Medusa deployed in each Cassandra pod

  • Add data to the cluster

  • Create a medusa backup job

  • Create a medusa restore job

  • See the result of nodetool status

Expected result when we run the nodetool status command after the restore is completed:

Datacenter: argo-we-dc1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.129.35.142  1.78 GiB  16      51.5%             39b3ca3e-8129-45b9-b5cf-6f9e43214e93  r2
UN  10.129.33.161  1.67 GiB  16      51.5%             71c4c25d-71b5-49fe-89f2-6b2ca2ba54d0  r3
UN  10.129.33.241  1.81 GiB  16      51.5%             eec3e68c-ba4a-4531-b75c-e81cd66c67d8  r1
UN  10.129.32.244  1.66 GiB  16      48.5%             c4b0a92b-651b-42b3-b606-58c48ac9f5a9  r1
UN  10.129.34.6    1.63 GiB  16      48.5%             46088aa0-099c-4e27-a0c2-bbfdfa383de7  r2
UN  10.129.32.182  1.67 GiB  16      48.5%             c07b5905-6d71-40c5-a26a-ef02c12eb3cd  r3

Important note:
When a K8ssandra cluster has DCs whose racks each contain just one Cassandra node, backup and restore both work. The restore fails only when a rack contains more than one Cassandra node.

Hi, we’ve identified an issue with Medusa v0.14.0 and k8ssandra-operator, which will be fixed in the upcoming v1.8.0 release.
In the meantime, it is recommended to stick with Medusa v0.13.2.
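
For anyone hitting this in the meantime, pinning Medusa back should only require changing the containerImage tag in the K8ssandraCluster medusa block (same structure as the configuration posted above; keep whatever registry/repository you already use):

medusa:
  containerImage:
    registry: internal_registry
    repository: internal_repo
    name: medusa
    tag: 0.13.2          # or your internal 0.13.x build, until the v1.8.0 fix ships
    pullPolicy: IfNotPresent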

Hi @alexander, quick question: do you have any estimate for the v1.8.0 release date?

Thanks for the quick response on the topic.

We hope to release it around July 10th :crossed_fingers:
