Hi everyone,
We’re currently facing an issue with one of our Cassandra pods in a K8ssandra cluster. One of the pods in the StatefulSet fails to start because of what appears to be schema corruption. Here’s the error message we’re seeing in the logs:
No partition columns found for table ks_sv.test_test in system_schema.columns. This may be due to corruption or concurrent dropping and altering of a table. If this table is supposed to be dropped, restart cassandra with -Dcassandra.ignore_corrupted_schema_tables=true and run the following query to cleanup:
"DELETE FROM system_schema.tables WHERE keyspace_name = 'ks_sv' AND table_name = 'test_test';
DELETE FROM system_schema.columns WHERE keyspace_name = 'ks_sv' AND table_name = 'test_test';"
If the table is not supposed to be dropped, restore system_schema.columns sstables from backups.
We’re unsure how to proceed safely. Specifically, we’d like guidance on:
How to safely restart the affected pod with the -Dcassandra.ignore_corrupted_schema_tables=true flag in a K8ssandra-managed environment and then run the cleanup query suggested in the log message.
Whether we should instead restore the entire cluster from a backup.
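For context, here is a read-only check that could be run from a node that still starts, before deleting anything. It is only a sketch: it assumes cqlsh access to a healthy node, and the keyspace and table names are taken from the error message above.

```cql
-- Read-only: is the table definition still present on this (healthy) node?
SELECT keyspace_name, table_name, id
FROM system_schema.tables
WHERE keyspace_name = 'ks_sv' AND table_name = 'test_test';

-- Read-only: which columns are recorded for the table?
-- A row with kind = 'partition_key' is what the failing node reports as missing.
SELECT column_name, kind, position, type
FROM system_schema.columns
WHERE keyspace_name = 'ks_sv' AND table_name = 'test_test';
```

If a healthy node still returns partition key rows for this table, the table presumably still exists in the cluster schema, and the DELETE-based cleanup may not be the right path.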
Any advice or steps from others who’ve encountered similar issues would be greatly appreciated. We’re trying to avoid data loss and ensure cluster stability.