Hi Jeff,
Below are the detailed pod description and logs.
I am using the AWS instance type c5.9xlarge (36 vCPUs, 72 GiB memory).
My values.yaml:
cat k8ssandra/prod_values.yaml
cassandra:
  version: "3.11.10"
  cassandraLibDirVolume:
    storageClass: k8ssandra-storage
    size: 10Gi
  heap:
    size: 31G
    newGenSize: 31G
  resources:
    requests:
      cpu: 7000m
      memory: 30Gi
    limits:
      cpu: 7000m
      memory: 30Gi
  datacenters:
  - name: demo1
    size: 2
    racks:
    - name: us-west-1a
      affinityLabels:
        topology.kubernetes.io/zone: us-west-1a
    - name: us-west-1b
      affinityLabels:
        topology.kubernetes.io/zone: us-west-1b
stargate:
  enabled: true
  replicas: 2
  heapMB: 1024
  cpuReqMillicores: 3000
  cpuLimMillicores: 3000
medusa:
  enabled: true
  storage: s3
  bucketName: cdstorage-k8ssandra-us
  storageSecret: medusa-s3-credentials
  storage_properties:
    region: us-west-1
kube-prometheus-stack:
  enabled: false
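
As a sanity check on these values, here is a minimal Python sketch that compares the configured heap with the container memory limit. It assumes the heap value reaches the JVM as an `-Xmx`-style setting, where `G` means 2^30 bytes, and the helper names `jvm_size_to_bytes` and `k8s_quantity_to_bytes` are hypothetical, written only for this post.

```python
# Values copied from prod_values.yaml above; helper names are hypothetical.
GIB = 1024 ** 3


def jvm_size_to_bytes(size: str) -> int:
    """Convert a JVM-style size such as '31G' or '512M' to bytes (binary units)."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    return int(size[:-1]) * units[size[-1].upper()]


def k8s_quantity_to_bytes(quantity: str) -> int:
    """Convert a Kubernetes memory quantity with a binary suffix (Ki, Mi, Gi) to bytes."""
    units = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}
    return int(quantity[:-2]) * units[quantity[-2:]]


heap_bytes = jvm_size_to_bytes("31G")        # cassandra.heap.size
limit_bytes = k8s_quantity_to_bytes("30Gi")  # cassandra.resources.limits.memory

heap_exceeds_limit = heap_bytes > limit_bytes
print(f"heap = {heap_bytes / GIB:.0f} GiB, limit = {limit_bytes / GIB:.0f} GiB, "
      f"heap exceeds limit: {heap_exceeds_limit}")
```

With these values the heap alone (31 GiB) is larger than the 30Gi limit, and the JVM also needs off-heap memory on top of the heap, so this might be related to the OOMKilled restart in the pod description below.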
My pod description:
➜ **kubectl describe pod demo1-demo1-us-west-1b-sts-0 -n k8ssandra**
Name:         demo1-demo1-us-west-1b-sts-0
Namespace:    k8ssandra
Priority:     0
Node:         ip-10-94-61-100.us-west-1.compute.internal/10.94.61.100
Start Time:   Thu, 24 Jun 2021 13:33:15 +0530
Labels:       app.kubernetes.io/managed-by=cass-operator
              cassandra.datastax.com/cluster=demo1
              cassandra.datastax.com/datacenter=demo1
              cassandra.datastax.com/node-state=Starting
              cassandra.datastax.com/rack=us-west-1b
              cassandra.datastax.com/seed-node=true
              controller-revision-hash=demo1-demo1-us-west-1b-sts-59647dfd7d
              statefulset.kubernetes.io/pod-name=demo1-demo1-us-west-1b-sts-0
Annotations:  cni.projectcalico.org/podIP: 100.96.4.50/32
              cni.projectcalico.org/podIPs: 100.96.4.50/32
Status:       Running
IP:           100.96.4.50
IPs:
  IP:           100.96.4.50
Controlled By:  StatefulSet/demo1-demo1-us-west-1b-sts
Init Containers:
  base-config-init:
    Container ID:  docker://5afe0dc5772b27498ff900768ee975d93b3c5b4e2682975417afc5889e58bc4a
    Image:         k8ssandra/cass-management-api:3.11.10-v0.1.25
    Image ID:      docker-pullable://k8ssandra/cass-management-api@sha256:ef5e007d37b57d905c706c1221c96228c4387abb8a96f994af8aae3423dc9f2a
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      cp -r /etc/cassandra/* /cassandra-base-config/
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 24 Jun 2021 13:33:20 +0530
      Finished:     Thu, 24 Jun 2021 13:33:20 +0530
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /cassandra-base-config/ from cassandra-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
  server-config-init:
    Container ID:   docker://d967421eba795134893fa91efbf65fad2666c341d2140e186cae5e6d1cba09b8
    Image:          docker.io/datastax/cass-config-builder:1.0.4
    Image ID:       docker-pullable://datastax/cass-config-builder@sha256:0cfa1f1270f1c211ae4ac8eb690dd9e909cf690126e5ed5ddb08bba78902d1a1
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 24 Jun 2021 13:33:21 +0530
      Finished:     Thu, 24 Jun 2021 13:33:24 +0530
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     1
      memory:  256M
    Requests:
      cpu:     1
      memory:  256M
    Environment:
      POD_IP:                     (v1:status.podIP)
      HOST_IP:                    (v1:status.hostIP)
      USE_HOST_IP_FOR_BROADCAST:  false
      RACK_NAME:                  us-west-1b
      PRODUCT_VERSION:            3.11.10
      PRODUCT_NAME:               cassandra
      DSE_VERSION:                3.11.10
      CONFIG_FILE_DATA:           {"cassandra-yaml":{"authenticator":"PasswordAuthenticator","authorizer":"CassandraAuthorizer","credentials_update_interval_in_ms":3600000,"credentials_validity_in_ms":3600000,"num_tokens":256,"permissions_update_interval_in_ms":3600000,"permissions_validity_in_ms":3600000,"role_manager":"CassandraRoleManager","roles_update_interval_in_ms":3600000,"roles_validity_in_ms":3600000},"cluster-info":{"name":"demo1","seeds":"demo1-seed-service"},"datacenter-info":{"graph-enabled":0,"name":"demo1","solr-enabled":0,"spark-enabled":0},"jvm-options":{"additional-jvm-opts":["-Dcassandra.system_distributed_replication_dc_names=demo1","-Dcassandra.system_distributed_replication_per_dc=2"],"heap_size_young_generation":"31G","initial_heap_size":"31G","max_heap_size":"31G"}}
    Mounts:
      /config from server-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
  jmx-credentials:
    Container ID:  docker://1ca5c1470bb05258b324f63469de3e89812f6a54b2cb111f7dbb106e70549401
    Image:         busybox
    Image ID:      docker-pullable://busybox@sha256:930490f97e5b921535c153e0e7110d251134cc4b72bbb8133c6a5065cc68580d
    Port:          <none>
    Host Port:     <none>
    Args:
      /bin/sh
      -c
      echo "$REAPER_JMX_USERNAME $REAPER_JMX_PASSWORD" > /config/jmxremote.password && echo "$SUPERUSER_JMX_USERNAME $SUPERUSER_JMX_PASSWORD" >> /config/jmxremote.password
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 24 Jun 2021 13:33:24 +0530
      Finished:     Thu, 24 Jun 2021 13:33:24 +0530
    Ready:          True
    Restart Count:  0
    Environment:
      REAPER_JMX_USERNAME:     <set to the key 'username' in secret 'demo1-reaper-jmx'>  Optional: false
      REAPER_JMX_PASSWORD:     <set to the key 'password' in secret 'demo1-reaper-jmx'>  Optional: false
      SUPERUSER_JMX_USERNAME:  <set to the key 'username' in secret 'demo1-superuser'>   Optional: false
      SUPERUSER_JMX_PASSWORD:  <set to the key 'password' in secret 'demo1-superuser'>   Optional: false
    Mounts:
      /config from server-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
  medusa-restore:
    Container ID:   docker://1c850b6656fbd3b86b453ba820a8a7745c62a0e28200d5e43c756613a17f6faa
    Image:          docker.io/k8ssandra/medusa:0.10.1
    Image ID:       docker-pullable://k8ssandra/medusa@sha256:c7f721dcd3529ee9769393e8c2ab1c4fcfbe5e9480a8a0fca146dff7e495db5b
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 24 Jun 2021 13:33:25 +0530
      Finished:     Thu, 24 Jun 2021 13:33:25 +0530
    Ready:          True
    Restart Count:  0
    Environment:
      MEDUSA_MODE:   RESTORE
      CQL_USERNAME:  <set to the key 'username' in secret 'demo1-medusa'>  Optional: false
      CQL_PASSWORD:  <set to the key 'password' in secret 'demo1-medusa'>  Optional: false
    Mounts:
      /etc/cassandra from server-config (rw)
      /etc/medusa from k8ssandra-medusa (rw)
      /etc/medusa-secrets from medusa-s3-credentials (rw)
      /etc/podinfo from podinfo (rw)
      /var/lib/cassandra from server-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
Containers:
  cassandra:
    Container ID:   docker://8f584f327cc7636c8d8b22f78615ba4e0faa642f86e2fcf4e78f4e68f2f2ee48
    Image:          k8ssandra/cass-management-api:3.11.10-v0.1.25
    Image ID:       docker-pullable://k8ssandra/cass-management-api@sha256:ef5e007d37b57d905c706c1221c96228c4387abb8a96f994af8aae3423dc9f2a
    Ports:          9042/TCP, 9142/TCP, 7000/TCP, 7001/TCP, 7199/TCP, 8080/TCP, 9103/TCP, 9160/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
    State:          Running
      Started:      Thu, 24 Jun 2021 13:36:21 +0530
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Thu, 24 Jun 2021 13:33:26 +0530
      Finished:     Thu, 24 Jun 2021 13:36:20 +0530
    Ready:          False
    Restart Count:  1
    Limits:
      cpu:     7
      memory:  30Gi
    Requests:
      cpu:      7
      memory:   30Gi
    Liveness:   http-get http://:8080/api/v0/probes/liveness delay=15s timeout=1s period=15s #success=1 #failure=3
    Readiness:  http-get http://:8080/api/v0/probes/readiness delay=20s timeout=1s period=10s #success=1 #failure=3
    Environment:
      LOCAL_JMX:                no
      DS_LICENSE:               accept
      DSE_AUTO_CONF_OFF:        all
      USE_MGMT_API:             true
      MGMT_API_EXPLICIT_START:  true
      DSE_MGMT_EXPLICIT_START:  true
    Mounts:
      /config from server-config (rw)
      /etc/cassandra from cassandra-config (rw)
      /etc/encryption/ from encryption-cred-storage (rw)
      /etc/podinfo from podinfo (rw)
      /var/lib/cassandra from server-data (rw)
      /var/log/cassandra from server-logs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
  medusa:
    Container ID:   docker://f8bda03c26dfdd065aaacc343ead5dc924bf7f95f6ffb2a878db2951a16a0a35
    Image:          docker.io/k8ssandra/medusa:0.10.1
    Image ID:       docker-pullable://k8ssandra/medusa@sha256:c7f721dcd3529ee9769393e8c2ab1c4fcfbe5e9480a8a0fca146dff7e495db5b
    Port:           50051/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Thu, 24 Jun 2021 13:33:26 +0530
    Ready:          True
    Restart Count:  0
    Liveness:       exec [/bin/grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:      exec [/bin/grpc_health_probe -addr=:50051] delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:
      MEDUSA_MODE:   GRPC
      CQL_USERNAME:  <set to the key 'username' in secret 'demo1-medusa'>  Optional: false
      CQL_PASSWORD:  <set to the key 'password' in secret 'demo1-medusa'>  Optional: false
    Mounts:
      /etc/cassandra from cassandra-config (rw)
      /etc/medusa from k8ssandra-medusa (rw)
      /etc/medusa-secrets from medusa-s3-credentials (rw)
      /var/lib/cassandra from server-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
  server-system-logger:
    Container ID:   docker://9c06fcb1316d0244ba85e787721f32d8594551ff612177e7a16d4c1200ac6adb
    Image:          k8ssandra/system-logger:9c4c3692
    Image ID:       docker-pullable://k8ssandra/system-logger@sha256:6208a1e3d710d022c9e922c8466fe7d76ca206f97bf92902ff5327114696f8b1
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Thu, 24 Jun 2021 13:33:27 +0530
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  64M
    Requests:
      cpu:        100m
      memory:     64M
    Environment:  <none>
    Mounts:
      /var/log/cassandra from server-logs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  server-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  server-data-demo1-demo1-us-west-1b-sts-0
    ReadOnly:   false
  cassandra-config:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  podinfo:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels -> labels
  k8ssandra-medusa:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      k8ssandra-medusa
    Optional:  false
  medusa-s3-credentials:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  medusa-s3-credentials
    Optional:    false
  server-config:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  server-logs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  encryption-cred-storage:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  demo1-keystore
    Optional:    false
  default-token-qphm4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-qphm4
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               4m33s                 default-scheduler        Successfully assigned k8ssandra/demo1-demo1-us-west-1b-sts-0 to ip-10-94-61-100.us-west-1.compute.internal
  Normal   SuccessfulAttachVolume  4m30s                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-d5eb60d4-2d0c-478d-95f2-3b0379fc59bb"
  Normal   Pulled                  4m27s                 kubelet                  Container image "k8ssandra/cass-management-api:3.11.10-v0.1.25" already present on machine
  Normal   Created                 4m27s                 kubelet                  Created container base-config-init
  Normal   Started                 4m27s                 kubelet                  Started container base-config-init
  Normal   Pulled                  4m26s                 kubelet                  Container image "docker.io/datastax/cass-config-builder:1.0.4" already present on machine
  Normal   Started                 4m26s                 kubelet                  Started container server-config-init
  Normal   Created                 4m26s                 kubelet                  Created container server-config-init
  Normal   Pulled                  4m23s                 kubelet                  Container image "busybox" already present on machine
  Normal   Created                 4m23s                 kubelet                  Created container jmx-credentials
  Normal   Started                 4m23s                 kubelet                  Started container jmx-credentials
  Normal   Pulled                  4m22s                 kubelet                  Container image "docker.io/k8ssandra/medusa:0.10.1" already present on machine
  Normal   Started                 4m22s                 kubelet                  Started container medusa-restore
  Normal   Created                 4m22s                 kubelet                  Created container medusa-restore
  Normal   Pulled                  4m21s                 kubelet                  Container image "k8ssandra/cass-management-api:3.11.10-v0.1.25" already present on machine
  Normal   Created                 4m21s                 kubelet                  Created container cassandra
  Normal   Started                 4m21s                 kubelet                  Started container cassandra
  Normal   Pulled                  4m21s                 kubelet                  Container image "docker.io/k8ssandra/medusa:0.10.1" already present on machine
  Normal   Created                 4m21s                 kubelet                  Created container medusa
  Normal   Started                 4m21s                 kubelet                  Started container medusa
  Normal   Pulled                  4m21s                 kubelet                  Container image "k8ssandra/system-logger:9c4c3692" already present on machine
  Normal   Created                 4m21s                 kubelet                  Created container server-system-logger
  Normal   Started                 4m20s                 kubelet                  Started container server-system-logger
  Warning  Unhealthy               3m31s (x4 over 4m1s)  kubelet                  Readiness probe failed: HTTP probe failed with statuscode: 500
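
For reference, the `Exit Code: 137` in the cassandra container's last state can be decoded with a small Python sketch. It relies on the common convention that an exit code above 128 means the process was terminated by signal number (code - 128); the function name `describe_exit_code` is hypothetical, and the signal lookup assumes a Linux or macOS Python.

```python
import signal


def describe_exit_code(exit_code: int) -> str:
    """Explain a container exit code, decoding signal exits (128 + signal number)."""
    if exit_code > 128:
        sig = signal.Signals(exit_code - 128)  # e.g. 137 - 128 = 9
        return f"exit code {exit_code} = 128 + {sig.value} ({sig.name})"
    return f"exit code {exit_code} (not a signal exit)"


# "Exit Code" from the cassandra container's Last State above.
print(describe_exit_code(137))
```

On Linux this prints `exit code 137 = 128 + 9 (SIGKILL)`, which is consistent with the `OOMKilled` reason reported for the container.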
Storage class YAML:
cat storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: k8ssandra-storage
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
kubectl get sc
NAME                PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
default             kubernetes.io/aws-ebs   Delete          Immediate              false                  57d
gp2 (default)       kubernetes.io/aws-ebs   Delete          Immediate              false                  57d
k8ssandra-storage   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   false                  2d20h
kops-ssd-1-17       kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   true                   57d
Storage PV:
**kubectl get pv | grep k8ssandra**
pvc-6a2d5fab-f9cb-4548-a327-ea9d8c95cabd 10Gi RWO Delete Bound k8ssandra/server-data-demo1-demo1-us-west-1a-sts-0 k8ssandra-storage 25m
pvc-d5eb60d4-2d0c-478d-95f2-3b0379fc59bb 10Gi RWO Delete Bound k8ssandra/server-data-demo1-demo1-us-west-1b-sts-0 k8ssandra-storage 25m
Storage PVC:
**kubectl get pvc -n k8ssandra**
NAME                                       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
server-data-demo1-demo1-us-west-1a-sts-0   Bound    pvc-6a2d5fab-f9cb-4548-a327-ea9d8c95cabd   10Gi       RWO            k8ssandra-storage   26m
server-data-demo1-demo1-us-west-1b-sts-0   Bound    pvc-d5eb60d4-2d0c-478d-95f2-3b0379fc59bb   10Gi       RWO            k8ssandra-storage   26m
Logs from the cassandra container:
INFO [main] 2021-06-24 08:36:07,115 ResteasyDeploymentImpl.java:704 - RESTEASY002210: Adding provider singleton io.swagger.v3.jaxrs2.SwaggerSerializers from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2021-06-24 08:36:07,129 IPCController.java:105 - Starting Server
INFO [main] 2021-06-24 08:36:07,131 IPCController.java:114 - Started Server
Started service on file:///tmp/oss-mgmt.sock
INFO [nioEventLoopGroup-2-2] 2021-06-24 08:36:27,323 Cli.java:617 - address=/10.94.61.100:55776 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-3-2] 2021-06-24 08:36:31,543 DefaultMavenCoordinates.java:37 - DataStax Java driver for Apache Cassandra(R) (com.datastax.oss:java-driver-core) version 4.11.1
INFO [epollEventLoopGroup-5-1] 2021-06-24 08:36:31,777 Clock.java:47 - Using native clock for microsecond precision
WARN [epollEventLoopGroup-5-2] 2021-06-24 08:36:31,798 AbstractBootstrap.java:452 - Unknown channel option 'TCP_NODELAY' for channel '[id: 0x056e1301]'
WARN [epollEventLoopGroup-5-2] 2021-06-24 08:36:31,831 Loggers.java:39 - [s0] Error connecting to Node(endPoint=/tmp/cassandra.sock, hostId=null, hashCode=29f8d9b7), trying next node (FileNotFoundException: null)
INFO [nioEventLoopGroup-2-1] 2021-06-24 08:36:31,838 Cli.java:617 - address=/10.94.61.100:55802 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [epollEventLoopGroup-6-1] 2021-06-24 08:36:41,401 Clock.java:47 - Using native clock for microsecond precision
WARN [epollEventLoopGroup-6-2] 2021-06-24 08:36:41,402 AbstractBootstrap.java:452 - Unknown channel option 'TCP_NODELAY' for channel '[id: 0x19007cce]'
WARN [epollEventLoopGroup-6-2] 2021-06-24 08:36:41,404 Loggers.java:39 - [s1] Error connecting to Node(endPoint=/tmp/cassandra.sock, hostId=null, hashCode=469e9be6), trying next node (FileNotFoundException: null)
INFO [nioEventLoopGroup-2-2] 2021-06-24 08:36:41,406 Cli.java:617 - address=/10.94.61.100:55842 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-1] 2021-06-24 08:36:42,221 Cli.java:617 - address=/10.94.61.100:55844 url=/api/v0/probes/liveness status=200 OK
Logs from the medusa container:
MEDUSA_MODE = GRPC
sleeping for 0 sec
Starting Medusa gRPC service
/home/cassandra/.local/lib/python3.6/site-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.4) or chardet (3.0.4) doesn't match a supported version!
RequestsDependencyWarning)
INFO:root:Init service
[2021-06-24 08:36:06,429] INFO: Init service
DEBUG:root:Loading storage_provider: s3
[2021-06-24 08:36:06,429] DEBUG: Loading storage_provider: s3
DEBUG:root:Reading AWS credentials from /etc/medusa-secrets/medusa_s3_credentials
[2021-06-24 08:36:06,430] DEBUG: Reading AWS credentials from /etc/medusa-secrets/medusa_s3_credentials
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): s3-us-west-1.amazonaws.com:443
[2021-06-24 08:36:06,763] DEBUG: Starting new HTTPS connection (1): s3-us-west-1.amazonaws.com:443
DEBUG:urllib3.connectionpool:https://s3-us-west-1.amazonaws.com:443 "HEAD /cdstorage-k8ssandra-us HTTP/1.1" 200 0
[2021-06-24 08:36:06,812] DEBUG: https://s3-us-west-1.amazonaws.com:443 "HEAD /cdstorage-k8ssandra-us HTTP/1.1" 200 0
INFO:root:Starting server. Listening on port 50051.
[2021-06-24 08:36:07,143] INFO: Starting server. Listening on port 50051.