K8ssandra Forum

Error calling include: template: k8ssandra/charts/k8ssandra-common/templates/_helpers.tpl:14:3:

I tried again after deleting the CRD files, but I still face the same issue:

helm install k8ssandra k8ssandra/ -f k8ssandra/prod_values.yaml -n k8ssandra --dry-run

Error: template: k8ssandra/charts/reaper-operator/templates/service_account.yaml:1:3: executing “k8ssandra/charts/reaper-operator/templates/service_account.yaml” at <include “k8ssandra-common.serviceAccount” .>: error calling include: template: k8ssandra/charts/k8ssandra-common/templates/_helpers.tpl:52:11: executing “k8ssandra-common.serviceAccount” at <include “k8ssandra-common.serviceAccountName” .>: error calling include: template: k8ssandra/charts/k8ssandra-common/templates/_helpers.tpl:42:13: executing “k8ssandra-common.serviceAccountName” at <include “k8ssandra-common.fullname” .>: error calling include: template: k8ssandra/charts/k8ssandra-common/templates/_helpers.tpl:14:3: executing “k8ssandra-common.fullname” at <include “common.names.fullname” .>: error calling include: template: no template “common.names.fullname” associated with template “gotpl”

It doesn’t look like you’ve specified the chart correctly. You need to specify the chart as k8ssandra/k8ssandra:

$ helm install -f k8ssandra.yaml k8ssandra k8ssandra/k8ssandra

but in the command you posted, you’ve specified it as k8ssandra/.

Please post the output of:

$ helm list --all

so we can have a look at what you’ve actually installed.

If you just want to start over, you can try to helm uninstall whatever you’ve got. Cheers!

Hi Hitendra,

Try using this script in the k8ssandra repository after you’ve made adjustments to your templates.
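For context, the `no template "common.names.fullname"` error usually means that a chart dependency (here the Bitnami `common` library chart that k8ssandra-common includes) has not been downloaded into that chart's charts/ directory. Below is a minimal sketch for spotting that state. It assumes one chart per folder under a common root and dependencies declared in Chart.yaml (Helm 3 style). The function name `missing_deps` is made up for this example and is not part of the k8ssandra repository.

```shell
#!/usr/bin/env bash
# List local charts that declare dependencies in Chart.yaml but have nothing
# vendored under their charts/ directory.
# "missing_deps" and its argument are hypothetical names for this sketch.
missing_deps() {
  local charts_root="$1" chart
  for chart in "$charts_root"/*/; do
    chart="${chart%/}"
    [ -f "$chart/Chart.yaml" ] || continue
    # Only charts with a top-level dependencies: section need vendored subcharts.
    grep -q '^dependencies:' "$chart/Chart.yaml" || continue
    # A missing or empty charts/ directory means nothing has been downloaded yet.
    if [ -z "$(ls -A "$chart/charts" 2>/dev/null)" ]; then
      basename "$chart"
    fi
  done
}

# Usage, from the repository root:
#   missing_deps charts
```

If this prints k8ssandra-common, running the repository's dependency script, or `helm dependency update` on that chart before the charts that include it, should pull in the missing library chart.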

This is my directory; I downloaded k8ssandra-1.2:

➜  charts ls -lrth 
total 0
drwxr-xr-x   7 hitendra.v  149934974   224B Jun  2 20:39 restore
drwxr-xr-x   8 hitendra.v  149934974   256B Jun  2 20:39 reaper-operator
drwxr-xr-x   7 hitendra.v  149934974   224B Jun  2 20:39 medusa-operator
drwxr-xr-x   7 hitendra.v  149934974   224B Jun  2 20:39 k8ssandra-common
drwxr-xr-x   7 hitendra.v  149934974   224B Jun  2 20:39 backup
drwxr-xr-x   8 hitendra.v  149934974   256B Jun 18 18:17 cass-operator
drwxr-xr-x  11 hitendra.v  149934974   352B Jun 18 18:23 k8ssandra
➜  charts cd ..
➜  k8ssandra-1.2.0 ls
CHANGELOG-1.0.md         CONTRIBUTING.md          README.md                contrib                  go.mod                   pkg                      tests
CHANGELOG-1.1.md         LICENSE                  charts                   dev-quick-start.md       go.sum                   scripts
CHANGELOG-1.2.md         Makefile                 cmd                      docs                     package-lock.json        sonar-project.properties
➜  k8ssandra-1.2.0
cassandra:
  version: "3.11.10"
  cassandraLibDirVolume:
    storageClass: gp2
    size: 10Gi
  heap:
   size: 1024M
   newGenSize: 256G
  resources:
    requests:
      cpu: 4000m
      memory: 3Gi
    limits:
      cpu: 1000m
      memory: 3Gi
  datacenters:
  - name: chong
    size: 2
    racks:
    - name: us-west-1a
      affinityLabels:
        topology.kubernetes.io/zone: us-west-1a
    - name: us-west-1b
      affinityLabels:
        topology.kubernetes.io/zone: us-west-1b
stargate:
  enabled: true
  replicas: 2
  heapMB: 1024
  cpuReqMillicores: 1000
  cpuLimMillicores: 1000

medusa:
  enabled: true
  storage: s3
  bucketName: cdstorage-cassandra
  storageSecret: cassandra-admin-secret
  storage_properties:
    region: us-west-1

prometheus:
  enabled: false
#helm3 install -f k8ssandra.yaml k8ssandra k8ssandra/k8ssandra -n k8ssandra

Error: unable to build kubernetes objects from release manifest: error validating “”: error validating data: [ValidationError(Prometheus.spec): unknown field “probeNamespaceSelector” in com.coreos.monitoring.v1.Prometheus.spec, ValidationError(Prometheus.spec): unknown field “probeSelector” in com.coreos.monitoring.v1.Prometheus.spec, ValidationError(Prometheus.spec): unknown field “shards” in com.coreos.monitoring.v1.Prometheus.spec]

Hi Jeff, below are the errors from before and after running the script.

$ helm3 install k8ssandra ../charts/k8ssandra --set size=2 --set clusterName=chong --set k8ssandra.namespace=k8ssandra -n k8ssandra --create-namespace --dry-run

Error: template: k8ssandra/charts/reaper-operator/templates/service_account.yaml:1:3: executing “k8ssandra/charts/reaper-operator/templates/service_account.yaml” at <include “k8ssandra-common.serviceAccount” .>: error calling include: template: k8ssandra/charts/k8ssandra-common/templates/_helpers.tpl:52:11: executing “k8ssandra-common.serviceAccount” at <include “k8ssandra-common.serviceAccountName” .>: error calling include: template: k8ssandra/charts/k8ssandra-common/templates/_helpers.tpl:42:13: executing “k8ssandra-common.serviceAccountName” at <include “k8ssandra-common.fullname” .>: error calling include: template: k8ssandra/charts/k8ssandra-common/templates/_helpers.tpl:14:3: executing “k8ssandra-common.fullname” at <include “common.names.fullname” .>: error calling include: template: no template “common.names.fullname” associated with template “gotpl”

$ sh update-helm-deps.sh

Getting updates for unmanaged Helm repositories…

…Successfully got an update from the “https://charts.bitnami.com/bitnami” chart repository
Hang tight while we grab the latest from your chart repositories…
…Successfully got an update from the “jaegertracing” chart repository
…Successfully got an update from the “k8ssandra” chart repository
…Successfully got an update from the “traefik” chart repository
…Successfully got an update from the “stable” chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 1 charts
Downloading common from repo https://charts.bitnami.com/bitnami
Deleting outdated charts
Hang tight while we grab the latest from your chart repositories…
…Successfully got an update from the “k8ssandra” chart repository
…Successfully got an update from the “jaegertracing” chart repository
…Successfully got an update from the “traefik” chart repository
…Successfully got an update from the “stable” chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 1 charts
Deleting outdated charts
Hang tight while we grab the latest from your chart repositories…
…Successfully got an update from the “jaegertracing” chart repository
…Successfully got an update from the “k8ssandra” chart repository
…Successfully got an update from the “traefik” chart repository
…Successfully got an update from the “stable” chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 1 charts
Deleting outdated charts
Hang tight while we grab the latest from your chart repositories…
…Successfully got an update from the “k8ssandra” chart repository
…Successfully got an update from the “jaegertracing” chart repository
…Successfully got an update from the “traefik” chart repository
…Successfully got an update from the “stable” chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 1 charts
Deleting outdated charts
Getting updates for unmanaged Helm repositories…
…Successfully got an update from the “https://prometheus-community.github.io/helm-charts” chart repository
Hang tight while we grab the latest from your chart repositories…
…Successfully got an update from the “k8ssandra” chart repository
…Successfully got an update from the “jaegertracing” chart repository
…Successfully got an update from the “traefik” chart repository
…Successfully got an update from the “stable” chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 5 charts
Downloading kube-prometheus-stack from repo https://prometheus-community.github.io/helm-charts
Deleting outdated charts

$ helm2 version
Client: &version.Version{SemVer:“v2.14.3”, GitCommit:“0e7f3b6637f7af8fcfddb3d2941fcc7cbebb0085”, GitTreeState:“clean”}

$ helm3 version

version.BuildInfo{Version:“v3.5.3”, GitCommit:“041ce5a2c17a58be0fcd5f5e16fb3e7e95fea622”, GitTreeState:“dirty”, GoVersion:“go1.16”}

$ helm3 install k8ssandra ../charts/k8ssandra --set size=2 --set clusterName=chong --set k8ssandra.namespace=k8ssandra -n k8ssandra --create-namespace --dry-run

Error: unable to build kubernetes objects from release manifest: error validating “”: error validating data: [ValidationError(Prometheus.spec): unknown field “probeNamespaceSelector” in com.coreos.monitoring.v1.Prometheus.spec, ValidationError(Prometheus.spec): unknown field “probeSelector” in com.coreos.monitoring.v1.Prometheus.spec, ValidationError(Prometheus.spec): unknown field “shards” in com.coreos.monitoring.v1.Prometheus.spec]

I suspect you may have some CRDs that are stale.

If a helm delete succeeds and gives you a clean slate, that is a good foundation to start from.

If not, you can selectively delete the CRDs related to Prometheus. Here are some examples of CRD removal that fixed another person’s issue that seems related.

In both cases, once you get a clean slate, I would try the apply again.
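The `unknown field` validation errors above suggest that the Prometheus CRD already installed in the cluster is older than the one the chart expects. As a rough sketch of the selective cleanup, the filter below keeps only the CRDs in the `monitoring.coreos.com` API group from the output of `kubectl get crd -o name`. The function name `prometheus_crds` is made up for this example. Keep in mind that deleting a CRD also deletes every custom resource of that kind, so review the list first.

```shell
#!/usr/bin/env bash
# Keep only Prometheus Operator CRDs (API group monitoring.coreos.com)
# from a list of CRD names read on stdin, one per line.
# "prometheus_crds" is a hypothetical helper name for this sketch.
prometheus_crds() {
  grep 'monitoring\.coreos\.com$' || true   # no match is not treated as an error
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get crd -o name | prometheus_crds        # review the list first
#   kubectl get crd -o name | prometheus_crds | while read -r crd; do
#     kubectl delete "$crd"                          # removes the CRD and its resources
#   done
```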

Great! Thanks, Jeff. Deleting the CRDs using that script works.
But now I am stuck on another weird issue.

I am running k8ssandra without Prometheus/Grafana; I plan to add them later.
At present the following pods are running:

NAME                                        READY   STATUS             RESTARTS   AGE
demo1-demo1-us-west-1a-sts-0                 1/3     CrashLoopBackOff   4          2m45s
demo1-demo1-us-west-1b-sts-0                 2/3     Running            0          2m45s
k8ssandra-cass-operator-86cbb6c884-csfbs     1/1     Running            0          3m6s
k8ssandra-demo1-stargate-7df954c644-mssjf    0/1     Init:0/1           0          3m6s
k8ssandra-demo1-stargate-7df954c644-twdwr    0/1     Init:0/1           0          3m6s
k8ssandra-medusa-operator-854959bfcd-dkqkg   1/1     Running            0          3m6s
k8ssandra-reaper-operator-6ccdddb6f6-t4m2n   1/1     Running            0          3m6s

As shown above, the two Cassandra pods run in different AWS zones, and both are failing their liveness and readiness probes.

| Message | Source | Count | Sub-object | Last seen |
|---|---|---|---|---|
| Readiness probe failed: HTTP probe failed with statuscode: 500 | kubelet ip-10-94-61-36.us-west-1.compute.internal | 26 | spec.containers{cassandra} | 2021-06-21T13:31:31Z |
| Liveness probe failed: | kubelet ip-10-94-61-36.us-west-1.compute.internal | 1 | spec.containers{medusa} | 2021-06-21T13:26:49Z |
| Liveness probe failed: timeout: failed to connect service ":50051" within 1s | kubelet ip-10-94-61-36.us-west-1.compute.internal | 1 | spec.containers{medusa} | 2021-06-21T13:26:39Z |
| Readiness probe failed: timeout: failed to connect service ":50051" within 1s | kubelet ip-10-94-61-36.us-west-1.compute.internal | 2 | spec.containers{medusa} | 2021-06-21T13:26:47Z |

Below is my values.yaml:

cassandra:
  version: "3.11.10"
  cassandraLibDirVolume:
    storageClass: k8ssandra-storage
    size: 10Gi
  heap:
   size: 1024G
   newGenSize: 1024G
  resources:
    requests:
      cpu: 1000m
      memory: 3Gi
    limits:
      cpu: 1000m
      memory: 3Gi
  datacenters:
  - name: demo1
    size: 2
    racks:
    - name: us-west-1a
      affinityLabels:
        topology.kubernetes.io/zone: us-west-1a
    - name: us-west-1b
      affinityLabels:
        topology.kubernetes.io/zone: us-west-1b
stargate:
  enabled: true
  replicas: 2
  heapMB: 2048
  cpuReqMillicores: 1000
  cpuLimMillicores: 1000

medusa:
  enabled: true
  storage: s3
  bucketName: cdstorage-k8ssandra-us
  storageSecret: medusa-s3-credentials
  storage_properties:
    region: us-west-1

kube-prometheus-stack:
  enabled: false

One thing I notice is that you may need to let the install process run a bit longer: the AGE was only 2-3 minutes in your snapshot above. If the Stargate and/or Cassandra pods still do not start, try reducing your replica counts and adjusting your memory settings to get a feel for what works in your environment. Later, you can increase your replicas (and raise or lower memory) to better isolate the issues.

Many times, we find that an OOM error is reported in the Stargate or Cassandra logs, so I would check for that as well.

For example, with Cassandra, you can use something like this to target the server-system-logger.

kubectl logs demo1-demo1-us-west-1a-sts-0 -c server-system-logger
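One memory setting worth checking by hand: the values posted above set the Cassandra heap size to 1024G while the container memory limit is 3Gi. The JVM reads `G` as GiB, so that heap cannot fit inside the limit, and Cassandra also needs some off-heap memory on top of the heap. The sketch below does the comparison. The function names are made up for this example, and it only understands `M`/`G` heap sizes and `Mi`/`Gi` limits.

```shell
#!/usr/bin/env bash
# Compare a JVM heap size with a Kubernetes memory limit.
# All function names here are hypothetical helpers for this sketch.

# Convert a JVM size such as 1024M or 31G to MiB (the JVM treats M/G as powers of 1024).
jvm_to_mib() {
  local v="$1" n="${1%[MmGg]}"
  case "$v" in
    *[Gg]) echo $(( n * 1024 )) ;;
    *[Mm]) echo "$n" ;;
    *) echo "unsupported JVM size: $v" >&2; return 1 ;;
  esac
}

# Convert a Kubernetes quantity such as 3Gi or 512Mi to MiB (binary suffixes only).
k8s_to_mib() {
  local v="$1"
  case "$v" in
    *Gi) echo $(( ${v%Gi} * 1024 )) ;;
    *Mi) echo "${v%Mi}" ;;
    *) echo "unsupported quantity: $v" >&2; return 1 ;;
  esac
}

# Report whether the heap fits inside the container memory limit.
check_heap() {
  local heap limit
  heap=$(jvm_to_mib "$1") || return 1
  limit=$(k8s_to_mib "$2") || return 1
  if [ "$heap" -ge "$limit" ]; then
    echo "heap $1 does not fit in limit $2"
  else
    echo "heap $1 fits in limit $2 with $(( limit - heap )) MiB to spare"
  fi
}

check_heap 1024G 3Gi   # values from the post above: heap 1024G does not fit in limit 3Gi
check_heap 1024M 3Gi   # illustrative alternative: heap 1024M fits in limit 3Gi with 2048 MiB to spare
```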

Thanks, Jeff, for the quick response. After increasing cpu to 7G, both Cassandra pods are running, but they are stuck at pod readiness.

Cassandra Container logs


INFO [epollEventLoopGroup-64-1] 2021-06-22 11:25:05,859 Clock.java:47 - Using native clock for microsecond precision
WARN [epollEventLoopGroup-64-2] 2021-06-22 11:25:05,860 AbstractBootstrap.java:452 - Unknown channel option 'TCP_NODELAY' for channel '[id: 0x15527781]'
WARN [epollEventLoopGroup-64-2] 2021-06-22 11:25:05,861 Loggers.java:39 - [s59] Error connecting to Node(endPoint=/tmp/cassandra.sock, hostId=null, hashCode=1e28f7c5), trying next node (FileNotFoundException: null)
INFO [nioEventLoopGroup-2-1] 2021-06-22 11:25:05,862 Cli.java:617 - address=/10.94.61.38:55490 url=/api/v0/probes/readiness status=500 Internal Server Error

Readiness probe failed: HTTP probe failed with statuscode: 500

Hitendra,

Some suggestions to get you going in the right direction.

  1. Check your environment for proper resources of cpu, memory, and volume size. Include those here if possible.

  2. Provide some detailed logging. For example, attach log outputs from your cassandra and server-system-logger containers. The events output of your pod’s lifecycle will be useful. E.g. kubectl describe pod/<pod-name>

  3. Were you able to try out your test with reduced replica and memory footprint, then scale up to the 7G memory?
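For item 2, when a container has restarted, `kubectl describe pod` prints a `Last State` block with a `Reason` such as `OOMKilled`. That detail is easy to miss in long output, so a small filter can help. This is only a sketch that relies on the indentation `kubectl describe` normally uses, and `last_state_reasons` is a made-up name.

```shell
#!/usr/bin/env bash
# Read `kubectl describe pod` output on stdin and print "<container>: <reason>"
# for every container that reports a Last State (that is, has restarted).
# "last_state_reasons" is a hypothetical helper name for this sketch.
last_state_reasons() {
  awk '
    # Container headers are indented two spaces and consist of a name and a colon.
    /^  [A-Za-z0-9._-]+:$/        { name = $1; sub(/:$/, "", name) }
    # Entering the Last State block of the current container.
    /^    Last State:/            { in_last = 1; next }
    # Any other four-space field ends the Last State block.
    /^    [^ ]/                   { in_last = 0 }
    # The Reason line nested under Last State carries the termination cause.
    in_last && /^      Reason:/   { print name ": " $2 }
  '
}

# Usage against a live cluster (requires kubectl access):
#   kubectl describe pod <pod-name> -n <namespace> | last_state_reasons
```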

Hi Jeff,

Below are the detailed pod description and logs

I am using AWS instance type c5.9xlarge,

| Instance Name | vCPUs | RAM |
|---|---|---|
| c5.9xlarge | 36 | 72 GiB |

My values file:

cat k8ssandra/prod_values.yaml 
cassandra:
  version: "3.11.10"
  cassandraLibDirVolume:
    storageClass: k8ssandra-storage
    size: 10Gi
  heap:
   size: 31G
   newGenSize: 31G
  resources:
    requests:
      cpu: 7000m
      memory: 30Gi
    limits:
      cpu: 7000m
      memory: 30Gi
  datacenters:
  - name: demo1
    size: 2
    racks:
     - name: us-west-1a
       affinityLabels:
         topology.kubernetes.io/zone: us-west-1a
     - name: us-west-1b
       affinityLabels:
         topology.kubernetes.io/zone: us-west-1b
stargate:
  enabled: true
  replicas: 2
  heapMB: 1024
  cpuReqMillicores: 3000
  cpuLimMillicores: 3000

medusa:
  enabled: true
  storage: s3
  bucketName: cdstorage-k8ssandra-us
  storageSecret: medusa-s3-credentials
  storage_properties:
    region: us-west-1

kube-prometheus-stack:
  enabled: false

My Pod description

➜  kubectl describe pod demo1-demo1-us-west-1b-sts-0 -n k8ssandra
Name:         demo1-demo1-us-west-1b-sts-0
Namespace:    k8ssandra
Priority:     0
Node:         ip-10-94-61-100.us-west-1.compute.internal/10.94.61.100
Start Time:   Thu, 24 Jun 2021 13:33:15 +0530
Labels:       app.kubernetes.io/managed-by=cass-operator
              cassandra.datastax.com/cluster=demo1
              cassandra.datastax.com/datacenter=demo1
              cassandra.datastax.com/node-state=Starting
              cassandra.datastax.com/rack=us-west-1b
              cassandra.datastax.com/seed-node=true
              controller-revision-hash=demo1-demo1-us-west-1b-sts-59647dfd7d
              statefulset.kubernetes.io/pod-name=demo1-demo1-us-west-1b-sts-0
Annotations:  cni.projectcalico.org/podIP: 100.96.4.50/32
              cni.projectcalico.org/podIPs: 100.96.4.50/32
Status:       Running
IP:           100.96.4.50
IPs:
  IP:           100.96.4.50
Controlled By:  StatefulSet/demo1-demo1-us-west-1b-sts
Init Containers:
  base-config-init:
    Container ID:  docker://5afe0dc5772b27498ff900768ee975d93b3c5b4e2682975417afc5889e58bc4a
    Image:         k8ssandra/cass-management-api:3.11.10-v0.1.25
    Image ID:      docker-pullable://k8ssandra/cass-management-api@sha256:ef5e007d37b57d905c706c1221c96228c4387abb8a96f994af8aae3423dc9f2a
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      cp -r /etc/cassandra/* /cassandra-base-config/
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 24 Jun 2021 13:33:20 +0530
      Finished:     Thu, 24 Jun 2021 13:33:20 +0530
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /cassandra-base-config/ from cassandra-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
  server-config-init:
    Container ID:   docker://d967421eba795134893fa91efbf65fad2666c341d2140e186cae5e6d1cba09b8
    Image:          docker.io/datastax/cass-config-builder:1.0.4
    Image ID:       docker-pullable://datastax/cass-config-builder@sha256:0cfa1f1270f1c211ae4ac8eb690dd9e909cf690126e5ed5ddb08bba78902d1a1
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 24 Jun 2021 13:33:21 +0530
      Finished:     Thu, 24 Jun 2021 13:33:24 +0530
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     1
      memory:  256M
    Requests:
      cpu:     1
      memory:  256M
    Environment:
      POD_IP:                      (v1:status.podIP)
      HOST_IP:                     (v1:status.hostIP)
      USE_HOST_IP_FOR_BROADCAST:  false
      RACK_NAME:                  us-west-1b
      PRODUCT_VERSION:            3.11.10
      PRODUCT_NAME:               cassandra
      DSE_VERSION:                3.11.10
      CONFIG_FILE_DATA:           {"cassandra-yaml":{"authenticator":"PasswordAuthenticator","authorizer":"CassandraAuthorizer","credentials_update_interval_in_ms":3600000,"credentials_validity_in_ms":3600000,"num_tokens":256,"permissions_update_interval_in_ms":3600000,"permissions_validity_in_ms":3600000,"role_manager":"CassandraRoleManager","roles_update_interval_in_ms":3600000,"roles_validity_in_ms":3600000},"cluster-info":{"name":"demo1","seeds":"demo1-seed-service"},"datacenter-info":{"graph-enabled":0,"name":"demo1","solr-enabled":0,"spark-enabled":0},"jvm-options":{"additional-jvm-opts":["-Dcassandra.system_distributed_replication_dc_names=demo1","-Dcassandra.system_distributed_replication_per_dc=2"],"heap_size_young_generation":"31G","initial_heap_size":"31G","max_heap_size":"31G"}}
    Mounts:
      /config from server-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
  jmx-credentials:
    Container ID:  docker://1ca5c1470bb05258b324f63469de3e89812f6a54b2cb111f7dbb106e70549401
    Image:         busybox
    Image ID:      docker-pullable://busybox@sha256:930490f97e5b921535c153e0e7110d251134cc4b72bbb8133c6a5065cc68580d
    Port:          <none>
    Host Port:     <none>
    Args:
      /bin/sh
      -c
      echo "$REAPER_JMX_USERNAME $REAPER_JMX_PASSWORD" > /config/jmxremote.password && echo "$SUPERUSER_JMX_USERNAME $SUPERUSER_JMX_PASSWORD" >> /config/jmxremote.password
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 24 Jun 2021 13:33:24 +0530
      Finished:     Thu, 24 Jun 2021 13:33:24 +0530
    Ready:          True
    Restart Count:  0
    Environment:
      REAPER_JMX_USERNAME:     <set to the key 'username' in secret 'demo1-reaper-jmx'>  Optional: false
      REAPER_JMX_PASSWORD:     <set to the key 'password' in secret 'demo1-reaper-jmx'>  Optional: false
      SUPERUSER_JMX_USERNAME:  <set to the key 'username' in secret 'demo1-superuser'>   Optional: false
      SUPERUSER_JMX_PASSWORD:  <set to the key 'password' in secret 'demo1-superuser'>   Optional: false
    Mounts:
      /config from server-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
  medusa-restore:
    Container ID:   docker://1c850b6656fbd3b86b453ba820a8a7745c62a0e28200d5e43c756613a17f6faa
    Image:          docker.io/k8ssandra/medusa:0.10.1
    Image ID:       docker-pullable://k8ssandra/medusa@sha256:c7f721dcd3529ee9769393e8c2ab1c4fcfbe5e9480a8a0fca146dff7e495db5b
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 24 Jun 2021 13:33:25 +0530
      Finished:     Thu, 24 Jun 2021 13:33:25 +0530
    Ready:          True
    Restart Count:  0
    Environment:
      MEDUSA_MODE:   RESTORE
      CQL_USERNAME:  <set to the key 'username' in secret 'demo1-medusa'>  Optional: false
      CQL_PASSWORD:  <set to the key 'password' in secret 'demo1-medusa'>  Optional: false
    Mounts:
      /etc/cassandra from server-config (rw)
      /etc/medusa from k8ssandra-medusa (rw)
      /etc/medusa-secrets from medusa-s3-credentials (rw)
      /etc/podinfo from podinfo (rw)
      /var/lib/cassandra from server-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
Containers:
  cassandra:
    Container ID:   docker://8f584f327cc7636c8d8b22f78615ba4e0faa642f86e2fcf4e78f4e68f2f2ee48
    Image:          k8ssandra/cass-management-api:3.11.10-v0.1.25
    Image ID:       docker-pullable://k8ssandra/cass-management-api@sha256:ef5e007d37b57d905c706c1221c96228c4387abb8a96f994af8aae3423dc9f2a
    Ports:          9042/TCP, 9142/TCP, 7000/TCP, 7001/TCP, 7199/TCP, 8080/TCP, 9103/TCP, 9160/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
    State:          Running
      Started:      Thu, 24 Jun 2021 13:36:21 +0530
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Thu, 24 Jun 2021 13:33:26 +0530
      Finished:     Thu, 24 Jun 2021 13:36:20 +0530
    Ready:          False
    Restart Count:  1
    Limits:
      cpu:     7
      memory:  30Gi
    Requests:
      cpu:      7
      memory:   30Gi
    Liveness:   http-get http://:8080/api/v0/probes/liveness delay=15s timeout=1s period=15s #success=1 #failure=3
    Readiness:  http-get http://:8080/api/v0/probes/readiness delay=20s timeout=1s period=10s #success=1 #failure=3
    Environment:
      LOCAL_JMX:                no
      DS_LICENSE:               accept
      DSE_AUTO_CONF_OFF:        all
      USE_MGMT_API:             true
      MGMT_API_EXPLICIT_START:  true
      DSE_MGMT_EXPLICIT_START:  true
    Mounts:
      /config from server-config (rw)
      /etc/cassandra from cassandra-config (rw)
      /etc/encryption/ from encryption-cred-storage (rw)
      /etc/podinfo from podinfo (rw)
      /var/lib/cassandra from server-data (rw)
      /var/log/cassandra from server-logs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
  medusa:
    Container ID:   docker://f8bda03c26dfdd065aaacc343ead5dc924bf7f95f6ffb2a878db2951a16a0a35
    Image:          docker.io/k8ssandra/medusa:0.10.1
    Image ID:       docker-pullable://k8ssandra/medusa@sha256:c7f721dcd3529ee9769393e8c2ab1c4fcfbe5e9480a8a0fca146dff7e495db5b
    Port:           50051/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Thu, 24 Jun 2021 13:33:26 +0530
    Ready:          True
    Restart Count:  0
    Liveness:       exec [/bin/grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:      exec [/bin/grpc_health_probe -addr=:50051] delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:
      MEDUSA_MODE:   GRPC
      CQL_USERNAME:  <set to the key 'username' in secret 'demo1-medusa'>  Optional: false
      CQL_PASSWORD:  <set to the key 'password' in secret 'demo1-medusa'>  Optional: false
    Mounts:
      /etc/cassandra from cassandra-config (rw)
      /etc/medusa from k8ssandra-medusa (rw)
      /etc/medusa-secrets from medusa-s3-credentials (rw)
      /var/lib/cassandra from server-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
  server-system-logger:
    Container ID:   docker://9c06fcb1316d0244ba85e787721f32d8594551ff612177e7a16d4c1200ac6adb
    Image:          k8ssandra/system-logger:9c4c3692
    Image ID:       docker-pullable://k8ssandra/system-logger@sha256:6208a1e3d710d022c9e922c8466fe7d76ca206f97bf92902ff5327114696f8b1
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Thu, 24 Jun 2021 13:33:27 +0530
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  64M
    Requests:
      cpu:        100m
      memory:     64M
    Environment:  <none>
    Mounts:
      /var/log/cassandra from server-logs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qphm4 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  server-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  server-data-demo1-demo1-us-west-1b-sts-0
    ReadOnly:   false
  cassandra-config:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  podinfo:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels -> labels
  k8ssandra-medusa:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      k8ssandra-medusa
    Optional:  false
  medusa-s3-credentials:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  medusa-s3-credentials
    Optional:    false
  server-config:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  server-logs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  encryption-cred-storage:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  demo1-keystore
    Optional:    false
  default-token-qphm4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-qphm4
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               4m33s                 default-scheduler        Successfully assigned k8ssandra/demo1-demo1-us-west-1b-sts-0 to ip-10-94-61-100.us-west-1.compute.internal
  Normal   SuccessfulAttachVolume  4m30s                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-d5eb60d4-2d0c-478d-95f2-3b0379fc59bb"
  Normal   Pulled                  4m27s                 kubelet                  Container image "k8ssandra/cass-management-api:3.11.10-v0.1.25" already present on machine
  Normal   Created                 4m27s                 kubelet                  Created container base-config-init
  Normal   Started                 4m27s                 kubelet                  Started container base-config-init
  Normal   Pulled                  4m26s                 kubelet                  Container image "docker.io/datastax/cass-config-builder:1.0.4" already present on machine
  Normal   Started                 4m26s                 kubelet                  Started container server-config-init
  Normal   Created                 4m26s                 kubelet                  Created container server-config-init
  Normal   Pulled                  4m23s                 kubelet                  Container image "busybox" already present on machine
  Normal   Created                 4m23s                 kubelet                  Created container jmx-credentials
  Normal   Started                 4m23s                 kubelet                  Started container jmx-credentials
  Normal   Pulled                  4m22s                 kubelet                  Container image "docker.io/k8ssandra/medusa:0.10.1" already present on machine
  Normal   Started                 4m22s                 kubelet                  Started container medusa-restore
  Normal   Created                 4m22s                 kubelet                  Created container medusa-restore
  Normal   Pulled                  4m21s                 kubelet                  Container image "k8ssandra/cass-management-api:3.11.10-v0.1.25" already present on machine
  Normal   Created                 4m21s                 kubelet                  Created container cassandra
  Normal   Started                 4m21s                 kubelet                  Started container cassandra
  Normal   Pulled                  4m21s                 kubelet                  Container image "docker.io/k8ssandra/medusa:0.10.1" already present on machine
  Normal   Created                 4m21s                 kubelet                  Created container medusa
  Normal   Started                 4m21s                 kubelet                  Started container medusa
  Normal   Pulled                  4m21s                 kubelet                  Container image "k8ssandra/system-logger:9c4c3692" already present on machine
  Normal   Created                 4m21s                 kubelet                  Created container server-system-logger
  Normal   Started                 4m20s                 kubelet                  Started container server-system-logger
  Warning  Unhealthy               3m31s (x4 over 4m1s)  kubelet                  Readiness probe failed: HTTP probe failed with statuscode: 500

Storage class Yaml

cat storageclass.yaml 
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: k8ssandra-storage
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete

kubectl get sc
NAME                PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
default             kubernetes.io/aws-ebs   Delete          Immediate              false                  57d
gp2 (default)       kubernetes.io/aws-ebs   Delete          Immediate              false                  57d
k8ssandra-storage   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   false                  2d20h
kops-ssd-1-17       kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   true                   57d

storage pv

kubectl get pv | grep k8ssandra
pvc-6a2d5fab-f9cb-4548-a327-ea9d8c95cabd   10Gi       RWO            Delete           Bound    k8ssandra/server-data-demo1-demo1-us-west-1a-sts-0       k8ssandra-storage            25m
pvc-d5eb60d4-2d0c-478d-95f2-3b0379fc59bb   10Gi       RWO            Delete           Bound    k8ssandra/server-data-demo1-demo1-us-west-1b-sts-0       k8ssandra-storage            25m

storage pvc

kubectl get pvc -n k8ssandra
NAME                                       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
server-data-demo1-demo1-us-west-1a-sts-0   Bound    pvc-6a2d5fab-f9cb-4548-a327-ea9d8c95cabd   10Gi       RWO            k8ssandra-storage   26m
server-data-demo1-demo1-us-west-1b-sts-0   Bound    pvc-d5eb60d4-2d0c-478d-95f2-3b0379fc59bb   10Gi       RWO            k8ssandra-storage   26m

Logs from CASSANDRA POD

INFO [main] 2021-06-24 08:36:07,115 ResteasyDeploymentImpl.java:704 - RESTEASY002210: Adding provider singleton io.swagger.v3.jaxrs2.SwaggerSerializers from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2021-06-24 08:36:07,129 IPCController.java:105 - Starting Server
INFO [main] 2021-06-24 08:36:07,131 IPCController.java:114 - Started Server
Started service on file:///tmp/oss-mgmt.sock
INFO [nioEventLoopGroup-2-2] 2021-06-24 08:36:27,323 Cli.java:617 - address=/10.94.61.100:55776 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-3-2] 2021-06-24 08:36:31,543 DefaultMavenCoordinates.java:37 - DataStax Java driver for Apache Cassandra(R) (com.datastax.oss:java-driver-core) version 4.11.1
INFO [epollEventLoopGroup-5-1] 2021-06-24 08:36:31,777 Clock.java:47 - Using native clock for microsecond precision
WARN [epollEventLoopGroup-5-2] 2021-06-24 08:36:31,798 AbstractBootstrap.java:452 - Unknown channel option 'TCP_NODELAY' for channel '[id: 0x056e1301]'
WARN [epollEventLoopGroup-5-2] 2021-06-24 08:36:31,831 Loggers.java:39 - [s0] Error connecting to Node(endPoint=/tmp/cassandra.sock, hostId=null, hashCode=29f8d9b7), trying next node (FileNotFoundException: null)
INFO [nioEventLoopGroup-2-1] 2021-06-24 08:36:31,838 Cli.java:617 - address=/10.94.61.100:55802 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [epollEventLoopGroup-6-1] 2021-06-24 08:36:41,401 Clock.java:47 - Using native clock for microsecond precision
WARN [epollEventLoopGroup-6-2] 2021-06-24 08:36:41,402 AbstractBootstrap.java:452 - Unknown channel option 'TCP_NODELAY' for channel '[id: 0x19007cce]'
WARN [epollEventLoopGroup-6-2] 2021-06-24 08:36:41,404 Loggers.java:39 - [s1] Error connecting to Node(endPoint=/tmp/cassandra.sock, hostId=null, hashCode=469e9be6), trying next node (FileNotFoundException: null)
INFO [nioEventLoopGroup-2-2] 2021-06-24 08:36:41,406 Cli.java:617 - address=/10.94.61.100:55842 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-1] 2021-06-24 08:36:42,221 Cli.java:617 - address=/10.94.61.100:55844 url=/api/v0/probes/liveness status=200 OK

Logs from MEDUSA POD

MEDUSA_MODE = GRPC
sleeping for 0 sec
Starting Medusa gRPC service
/home/cassandra/.local/lib/python3.6/site-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.4) or chardet (3.0.4) doesn't match a supported version!
RequestsDependencyWarning)
INFO:root:Init service
[2021-06-24 08:36:06,429] INFO: Init service
DEBUG:root:Loading storage_provider: s3
[2021-06-24 08:36:06,429] DEBUG: Loading storage_provider: s3
DEBUG:root:Reading AWS credentials from /etc/medusa-secrets/medusa_s3_credentials
[2021-06-24 08:36:06,430] DEBUG: Reading AWS credentials from /etc/medusa-secrets/medusa_s3_credentials
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): s3-us-west-1.amazonaws.com:443
[2021-06-24 08:36:06,763] DEBUG: Starting new HTTPS connection (1): s3-us-west-1.amazonaws.com:443
DEBUG:urllib3.connectionpool:https://s3-us-west-1.amazonaws.com:443 "HEAD /cdstorage-k8ssandra-us HTTP/1.1" 200 0
[2021-06-24 08:36:06,812] DEBUG: https://s3-us-west-1.amazonaws.com:443 "HEAD /cdstorage-k8ssandra-us HTTP/1.1" 200 0
INFO:root:Starting server. Listening on port 50051.
[2021-06-24 08:36:07,143] INFO: Starting server. Listening on port 50051.

Based on the output you last posted, I would tune your memory settings. The Cassandra container appears to be getting OOM-killed.

Containers:
  cassandra:
    Container ID:   docker://8f584f327cc7636c8d8b22f78615ba4e0faa642f86e2fcf4e78f4e68f2f2ee48
    Image:          k8ssandra/cass-management-api:3.11.10-v0.1.25
    Image ID:       docker-pullable://k8ssandra/cass-management-api@sha256:ef5e007d37b57d905c706c1221c96228c4387abb8a96f994af8aae3423dc9f2a
    Ports:          9042/TCP, 9142/TCP, 7000/TCP, 7001/TCP, 7199/TCP, 8080/TCP, 9103/TCP, 9160/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
    State:          Running
      Started:      Thu, 24 Jun 2021 13:36:21 +0530
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
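
For context on the exit code above: on Linux, a container exit code above 128 conventionally means the process was terminated by a signal, and the signal number is the code minus 128. A minimal Python sketch of that decoding follows. The function name `describe_exit_code` is hypothetical and only for illustration, and the signal name lookup assumes a Linux host.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode an exit code using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name, e.g. 9 -> SIGKILL on Linux.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


# 137 - 128 = 9, which is the signal the kernel OOM killer sends.
print(describe_exit_code(137))
```

Together with `Reason: OOMKilled`, this is consistent with the container being killed for exceeding its memory limit rather than Cassandra exiting on its own.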

Take a look at your heap and resource settings to make sure the container has enough memory for your C* instances.

heap:
  size: 31G
  newGenSize: 31G
resources:
  requests:
    cpu: 7000m
    memory: 30Gi
  limits:
    cpu: 7000m
    memory: 30Gi
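
One thing that stands out in these values is that the 31G heap is larger than the 30Gi container memory limit, so the JVM can grow past what the container is allowed to use. Below is a rough Python sketch of that comparison. The helper names `to_bytes` and `heap_fits` and the 25% headroom for off-heap memory are illustrative assumptions, not K8ssandra settings or recommendations.

```python
import re

# Binary multipliers. JVM suffixes (31G) and Kubernetes "i" suffixes (30Gi)
# are both powers of 1024. Note: a plain Kubernetes "G" is decimal (10**9),
# which this sketch does not distinguish.
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(size: str) -> int:
    """Convert sizes such as '31G' or '30Gi' to bytes (binary units)."""
    match = re.fullmatch(r"(\d+)([KMGT])i?", size.strip())
    if match is None:
        raise ValueError(f"unsupported size: {size!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def heap_fits(heap: str, limit: str, headroom: float = 0.25) -> bool:
    """Return True if the heap leaves `headroom` of the limit for off-heap use."""
    return to_bytes(heap) <= to_bytes(limit) * (1 - headroom)


print(heap_fits("31G", "30Gi"))  # False: the heap alone exceeds the limit
print(heap_fits("8G", "30Gi"))   # True under the assumed 25% headroom
```

Cassandra also uses memory outside the heap, so the heap generally needs to sit well below the container limit. The right margin depends on your workload.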

Thanks, everyone, for the help. The K8ssandra pods are now running perfectly.

I was facing an issue related to DNS earlier. Now it's resolved.

Great help. :v:

Good to hear that you got it resolved.

And thanks for circling back and letting us know. It will be valuable to the rest of the community. Cheers! :beers:

@hitendra just an FYI, we have some new channels on Discord that could be useful in the future for your work with K8ssandra.