K8ssandra Forum

Medusa restore systematically corrupts a table

Hi everybody,
I have an issue with Medusa on K8ssandra: restoring the first backup on the same environment works, but I think there is a problem with the subsequent backups, because restoring them often leaves a couple of tables corrupted.
Has anyone already seen this kind of issue? I can reproduce it reliably from scratch.

Here are the details:

  1. Configure Cassandra with Medusa by adding this section to cassandra.yaml (the cluster has only a single node):
medusa:
  enabled: true
  #storage: local
  storage: s3_compatible
  storage_properties:
    host: $S3_ENDPOINT
    port: 80
    secure: "false"
  bucketName: $MED_S3_BUCKET
  storageSecret: medusa-bucket-key
  podStorage:
    storageClass: local-storage
    size: 25Gi
  2. Install Cassandra:
helm upgrade -f cassandra.yaml cassandra /hyperfile/cluster/kubernetes/k8ssandra-1.2.0.tgz -n cass-operator
  3. The database contains this problematic table (the one that cannot be restored), which holds only a single row:
cassandra@cqlsh> describe storage.group ;

CREATE TABLE storage.group (
    group_id int PRIMARY KEY,
    ad_state text,
    ad_state_description text,
    ad_state_time bigint,
    cache_mb bigint,
    hard_quota_mb bigint,
    name text,
    revision int,
    soft_quota_mb bigint,
    state text,
    state_description text,
    state_history text,
    state_time bigint,
    update_time timeuuid
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
CREATE INDEX group_ad_state_index ON storage.group (ad_state);
CREATE INDEX group_state_index ON storage.group (state);
  4. Invoke a backup with this command:
/usr/local/bin/helm install $BACKUP_NAME k8ssandra/backup -n cass-operator --set name=$BACKUP_NAME,cassandraDatacenter.name=dc1

The first backup is then available in the S3 bucket.
  5. Run the backup again (possibly several times).
  6. Restore on the same cluster/datacenter from one of the later backups (not the first one, because that one always works):

helm install restore1 k8ssandra/restore -n cass-operator --set name=restore1,backup.name=backup-202107211213,cassandraDatacenter.name=dc1

The datacenter is restarted, so I follow the progress of the ongoing restore operation with this command:

kubectl logs cassandra-dc1-rack1-sts-0  -n cass-operator -c medusa-restore -f
  7. From cqlsh I see:
cassandra@cqlsh> select * from storage.group;
ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] 
message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
  8. Here is the Cassandra log:
kubectl logs cassandra-dc1-rack1-sts-0   -n cass-operator -c server-system-logger
ERROR [ReadStage-2] 2021-09-16 10:52:17,683 AbstractLocalAwareExecutorService.java:166 - Uncaught exception on thread Thread[ReadStage-2,5,main]
java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /var/lib/cassandra/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Data.db
	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2775) ~[apache-cassandra-3.11.10.jar:3.11.10]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_282]
	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) ~[apache-cassandra-3.11.10.jar:3.11.10]
	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) [apache-cassandra-3.11.10.jar:3.11.10]
	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113) [apache-cassandra-3.11.10.jar:3.11.10]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_282]
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /var/lib/cassandra/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Data.db

Many thanks in advance,
Diego

Hi Diego,

If you run nodetool scrub in the Cassandra container and then try accessing the table again, do you get a better result?
Could you compare the manifests of a working backup (the first one) and a failing one?
Also, what’s the output of ls -lrt /var/lib/cassandra/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/ in the Cassandra container, before and after the restore?

Thanks!

Hi Alexander,
I ran nodetool scrub; it produced no output and the group table is still broken.
I then restored the first (working) backup, followed by the second backup, so that I could compare the manifests (downloaded from the S3 bucket) and the contents of the folder you pointed out.
Here are the results:

  1. Contents of the group folder after restoring the first (working) backup:
ls -lrt
total 40
-rw-r--r-- 1 cassandra cassandra   10 Sep 16 14:51 md-1-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra 5318 Sep 16 14:51 md-1-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra   16 Sep 16 14:51 md-1-big-Filter.db
-rw-r--r-- 1 cassandra cassandra   43 Sep 16 14:51 md-1-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra    8 Sep 16 14:51 md-1-big-Index.db
-rw-r--r-- 1 cassandra cassandra   56 Sep 16 14:51 md-1-big-Summary.db
-rw-r--r-- 1 cassandra cassandra  240 Sep 16 14:51 md-1-big-Data.db
-rw-r--r-- 1 cassandra cassandra   92 Sep 16 14:51 md-1-big-TOC.txt
drwxr-sr-x 4 cassandra cassandra 4096 Sep 16 14:52 backups
  2. Contents of the group folder after restoring the second backup:
ls -lrt
total 28
-rw-r--r-- 1 cassandra cassandra 5318 Sep 16 14:58 md-1-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra    8 Sep 16 14:58 md-1-big-Index.db
-rw-r--r-- 1 cassandra cassandra  240 Sep 16 14:58 md-1-big-Data.db
-rw-r--r-- 1 cassandra cassandra   56 Sep 16 14:58 md-1-big-Summary.db
-rw-r--r-- 1 cassandra cassandra   16 Sep 16 14:58 md-1-big-Filter.db
drwxr-sr-x 4 cassandra cassandra 4096 Sep 16 14:58 backups
  3. Manifest from the S3 object diegomedusa/cassandra-dc1-rack1-sts-0/backup-202109152031/meta/manifest.json (the working backup).
    The full manifest.json is too big for this message and I don’t know how to post it in its entirety, so here is only the part of the JSON related to the group table.
[
  {
    "keyspace": "storage",
    "columnfamily": "group-aaf22b70161311ec86eff3cf7293f36d",
    "objects": [
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Digest.crc32",
        "MD5": "0201fac48016752f1d0c24739110f905",
        "size": 10
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Statistics.db",
        "MD5": "b094f49b2c486ac7084203acd28a961d",
        "size": 5318
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Filter.db",
        "MD5": "88ddab33e4e473f24e75a466c5e82564",
        "size": 16
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-CompressionInfo.db",
        "MD5": "462ec6ed0d9364decf4e6893cebae161",
        "size": 43
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Index.db",
        "MD5": "3ce75d2f4e72ac8b61a5ae10c568a657",
        "size": 8
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Summary.db",
        "MD5": "a2df192a99f110cafacb1a020a01220b",
        "size": 56
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Data.db",
        "MD5": "3b34d1d75c2b29e4609ea0a8221274d7",
        "size": 240
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-TOC.txt",
        "MD5": "8387905cab87e3641f12e96d67945e2d",
        "size": 92
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-Digest.crc32",
        "MD5": "bdb44b461031b9d363856d2281f8cf9d",
        "size": 10
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-Statistics.db",
        "MD5": "dfd581431eff2e5d3b68cb0500c6b220",
        "size": 4728
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-Filter.db",
        "MD5": "60e9a50415b0b2519d22470d8a9d93a6",
        "size": 16
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-CompressionInfo.db",
        "MD5": "39ccf7fae9675d58d3955ba47ed74431",
        "size": 43
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-Index.db",
        "MD5": "6be11a6364c986aeb79504c82ec64ea8",
        "size": 14
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-Summary.db",
        "MD5": "aaea18f0c62ba87912186b930ab05b1c",
        "size": 74
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-Data.db",
        "MD5": "f160602440901e0ba14f335db514806a",
        "size": 49
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-TOC.txt",
        "MD5": "8387905cab87e3641f12e96d67945e2d",
        "size": 92
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Digest.crc32",
        "MD5": "636fd7fa87dc4bcdf76d98f265e7906a",
        "size": 10
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Statistics.db",
        "MD5": "6588d7c439f0379ec7f99e9ffe3db283",
        "size": 4748
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Filter.db",
        "MD5": "9f5e8f93a80eb8b0e716683eeb1e9469",
        "size": 16
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-CompressionInfo.db",
        "MD5": "229f64a90e3c8ba137db3b8bcdffb873",
        "size": 43
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Index.db",
        "MD5": "4e54dd65f91541a9a0db946df3486ec7",
        "size": 23
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Summary.db",
        "MD5": "cef0cd584866763706e1973773e88b89",
        "size": 66
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Data.db",
        "MD5": "1530e3c1669b0560e79a299b769a406f",
        "size": 75
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-TOC.txt",
        "MD5": "8387905cab87e3641f12e96d67945e2d",
        "size": 92
      }
    ]
  },

  4. Manifest from the S3 object diegomedusa/cassandra-dc1-rack1-sts-0/backup-202109152034/meta/manifest.json (the failing backup):
[
  {
    "keyspace": "storage",
    "columnfamily": "group-aaf22b70161311ec86eff3cf7293f36d",
    "objects": [
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Statistics.db",
        "MD5": "b094f49b2c486ac7084203acd28a961d",
        "size": 5318
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Index.db",
        "MD5": "3ce75d2f4e72ac8b61a5ae10c568a657",
        "size": 8
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Summary.db",
        "MD5": "a2df192a99f110cafacb1a020a01220b",
        "size": 56
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/md-1-big-Data.db",
        "MD5": "3b34d1d75c2b29e4609ea0a8221274d7",
        "size": 240
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-Statistics.db",
        "MD5": "dfd581431eff2e5d3b68cb0500c6b220",
        "size": 4728
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-Index.db",
        "MD5": "6be11a6364c986aeb79504c82ec64ea8",
        "size": 14
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-Summary.db",
        "MD5": "aaea18f0c62ba87912186b930ab05b1c",
        "size": 74
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_ad_state_index/md-1-big-Data.db",
        "MD5": "f160602440901e0ba14f335db514806a",
        "size": 49
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Digest.crc32",
        "MD5": "636fd7fa87dc4bcdf76d98f265e7906a",
        "size": 10
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Filter.db",
        "MD5": "9f5e8f93a80eb8b0e716683eeb1e9469",
        "size": 16
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-CompressionInfo.db",
        "MD5": "229f64a90e3c8ba137db3b8bcdffb873",
        "size": 43
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-TOC.txt",
        "MD5": "8387905cab87e3641f12e96d67945e2d",
        "size": 92
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Digest.crc32",
        "MD5": "636fd7fa87dc4bcdf76d98f265e7906a",
        "size": 10
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Filter.db",
        "MD5": "9f5e8f93a80eb8b0e716683eeb1e9469",
        "size": 16
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-CompressionInfo.db",
        "MD5": "229f64a90e3c8ba137db3b8bcdffb873",
        "size": 43
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-TOC.txt",
        "MD5": "8387905cab87e3641f12e96d67945e2d",
        "size": 92
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Digest.crc32",
        "MD5": "636fd7fa87dc4bcdf76d98f265e7906a",
        "size": 10
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Statistics.db",
        "MD5": "6588d7c439f0379ec7f99e9ffe3db283",
        "size": 4748
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Filter.db",
        "MD5": "9f5e8f93a80eb8b0e716683eeb1e9469",
        "size": 16
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-CompressionInfo.db",
        "MD5": "229f64a90e3c8ba137db3b8bcdffb873",
        "size": 43
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Index.db",
        "MD5": "4e54dd65f91541a9a0db946df3486ec7",
        "size": 23
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Summary.db",
        "MD5": "cef0cd584866763706e1973773e88b89",
        "size": 66
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-Data.db",
        "MD5": "1530e3c1669b0560e79a299b769a406f",
        "size": 75
      },
      {
        "path": "cassandra-dc1-rack1-sts-0/data/storage/group-aaf22b70161311ec86eff3cf7293f36d/.group_state_index/md-1-big-TOC.txt",
        "MD5": "8387905cab87e3641f12e96d67945e2d",
        "size": 92
      }
    ]
  },

Thanks for these.
Files are clearly missing from the manifests of the subsequent backups:

  • md-1-big-Digest.crc32
  • md-1-big-Filter.db
  • md-1-big-CompressionInfo.db
  • md-1-big-TOC.txt
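
The comparison that surfaces these missing files can be automated. What follows is only a minimal sketch, assuming the manifest layout shown in the excerpts above (a JSON list of sections with "keyspace", "columnfamily" and "objects" keys, each object carrying a "path"). The function names are hypothetical and are not part of Medusa.

```python
import json
from collections import Counter


def load_manifest(path):
    """Load a manifest.json that was downloaded from the bucket."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_section(manifest, keyspace, table):
    """Return the manifest section for one table.

    Assumes "columnfamily" has the form '<table>-<table id>', as in the
    excerpts posted in this thread.
    """
    for section in manifest:
        name = section["columnfamily"].rsplit("-", 1)[0]
        if section["keyspace"] == keyspace and name == table:
            return section
    raise KeyError(f"{keyspace}.{table} not found in manifest")


def diff_sections(working, failing):
    """Return (paths missing from failing, paths listed twice or more in failing)."""
    working_paths = {obj["path"] for obj in working["objects"]}
    failing_counts = Counter(obj["path"] for obj in failing["objects"])
    missing = sorted(working_paths - set(failing_counts))
    duplicated = sorted(p for p, n in failing_counts.items() if n > 1)
    return missing, duplicated
```

Applied to the two excerpts posted above, a check like this would report the four base-table files listed here, the same four components under .group_ad_state_index, and several .group_state_index entries that appear more than once in the failing manifest.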

I’m going to investigate what’s causing this, as it looks like a pretty critical issue.

It turns out to be a bug in Medusa when used against tables that have secondary indexes. More info here: https://github.com/thelastpickle/cassandra-medusa/pull/407

I’ll push a patched release once the fix is merged, and we’ll have the fix available in K8ssandra 1.4.

Hi Alexander,
Many thanks for your time. I’m going to test your patch, because I have another table with only one index that seems to have the same problem of missing entries in the JSON manifest.
Do you think this issue only affects the generation of the JSON manifest? I can see the 8 files correctly stored in the S3 bucket, so I am tempted to edit the manifest manually and redo the restore.

br

It’s not just an issue with manifest generation: some files may not be uploaded at all, because of the mixup between the base table’s SSTable files and those of the index tables.
Let me know if you get good results with the patch. I can push a Docker image with the patch to Docker Hub if you need one while waiting for an actual release.
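
Because files may be absent from the bucket and not only from the manifest, it can help to check restored files against the manifest entries rather than trusting either side. The sketch below is illustrative only. It assumes that the part of each manifest "path" after "/data/" mirrors the layout under the local Cassandra data directory, and that the "MD5" field is the hex MD5 of the file contents, which matches the small files in this thread but may not hold for large multipart uploads. The helper name check_entry is hypothetical.

```python
import hashlib
from pathlib import Path


def check_entry(entry, data_root):
    """Compare one manifest object against the corresponding local file.

    Returns "ok", "missing", "size mismatch" or "md5 mismatch".
    Assumption: 'node/data/<keyspace>/<table>/<file>' in the manifest maps
    to '<data_root>/<keyspace>/<table>/<file>' on disk.
    """
    relative = entry["path"].split("/data/", 1)[1]
    local = Path(data_root) / relative
    if not local.is_file():
        return "missing"
    if local.stat().st_size != entry["size"]:
        return "size mismatch"
    digest = hashlib.md5(local.read_bytes()).hexdigest()
    return "ok" if digest == entry["MD5"] else "md5 mismatch"
```

In the Cassandra container from this thread, data_root would be /var/lib/cassandra/data, and the function would be called once per object in the table's manifest section.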

Hi Alexander,
Many thanks for your help. I built the container from your PR, tested backup and restore many times, and checked the manifest manually; everything seems OK.
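
A manual manifest check of this kind can also be scripted. This is a rough sketch and not how Medusa validates backups: it groups the objects of one manifest section by directory and SSTable prefix (for example md-1-big) and reports any SSTable that lacks one of the expected components. The expected set is an assumption taken from the eight files listed for the working backup of this Cassandra 3.11 table; other versions or uncompressed tables would have a different set.

```python
import posixpath
from collections import defaultdict

# Assumption: the eight components seen in the working backup's listing.
EXPECTED_COMPONENTS = {
    "Data.db", "Index.db", "Filter.db", "Summary.db", "Statistics.db",
    "CompressionInfo.db", "Digest.crc32", "TOC.txt",
}


def incomplete_sstables(section):
    """Map (directory, sstable prefix) to the sorted list of missing components."""
    seen = defaultdict(set)
    for obj in section["objects"]:
        directory, filename = posixpath.split(obj["path"])
        # 'md-1-big-Data.db' -> prefix 'md-1-big', component 'Data.db'
        prefix, _, component = filename.rpartition("-")
        seen[(directory, prefix)].add(component)
    return {
        key: sorted(EXPECTED_COMPONENTS - components)
        for key, components in seen.items()
        if EXPECTED_COMPONENTS - components
    }
```

An empty result means every SSTable in the section, including those in the index subdirectories, lists all expected components.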