Using the Cassandra Operator with OpenShift 4.x
I’ve been running Cassandra on OpenShift for some time now via a homebrew stateful set, but it’s a tedious process and prone to failures. Naturally, one of the first things I tried with OpenShift 4.x was to use the Instaclustr Cassandra Operator to automate the provisioning of Cassandra clusters.
Deployment
Just follow the project's instructions for how to Deploy the Operator. First, deploy the custom resource definitions:
❯ oc apply -f deploy/crds.yaml
Then the operator itself:
❯ oc apply -f deploy/bundle.yaml
Once the operator pod was up, I tried to spin up a small 3-node Cassandra cluster using the following YAML:
apiVersion: cassandraoperator.instaclustr.com/v1alpha1
kind: CassandraDataCenter
metadata:
  name: test-cluster-dc1
  labels:
    app: cassandra
    datacenter: dc1
    cluster: test-cluster
spec:
  prometheusSupport: true
  optimizeKernelParams: false
  serviceAccountName: cassandra
  nodes: 3
  cassandraImage: "gcr.io/cassandra-operator/cassandra-3.11.6:latest"
  sidecarImage: "gcr.io/cassandra-operator/cassandra-sidecar:latest"
  imagePullPolicy: Always
  imagePullSecrets:
    - name: regcred
  resources:
    limits:
      memory: 2Gi
    requests:
      memory: 2Gi
  sidecarResources:
    limits:
      memory: 512Mi
    requests:
      memory: 512Mi
  dataVolumeClaimSpec:
    accessModes:
      - ReadWriteOnce
    storageClassName: thick
    resources:
      requests:
        storage: 5Gi
  cassandraAuth:
    authenticator: PasswordAuthenticator
    authorizer: CassandraAuthorizer
    roleManager: CassandraRoleManager
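Assuming the manifest above is saved as, say, test-cluster-dc1.yaml (the filename is mine), it gets applied the same way as the operator manifests:

```shell
❯ oc apply -f test-cluster-dc1.yaml
```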
Easy, right?
❯ oc get pod
NAME READY STATUS RESTARTS AGE
cassandra-operator-55d759bcd-6h5sm 1/1 Running 0 47s
cassandra-test-cluster-dc1-rack1-0 1/2 CrashLoopBackOff 1 22s
CrashLoopBackOff. Let’s see if we can get any meaningful error from the logs:
❯ oc logs cassandra-test-cluster-dc1-rack1-0 -c cassandra
+ '[' unset == true ']'
+ exec /bin/bash -xue /usr/bin/entry-point /tmp/operator-config /tmp/cassandra-rack-config
+ for config_directory in "$@"
+ cd /tmp/operator-config
+ find -L . -name '..*' -prune -o '(' -type f -print0 ')'
+ cpio -pmdLv0 /etc/cassandra
cpio: /etc/cassandra/./cassandra.yaml.d/001-operator-overrides.yaml: Cannot open: Permission denied
cpio: /etc/cassandra/./jvm.options.d/001-jvm-memory-gc.options: Cannot open: Permission denied
cpio: /etc/cassandra/./cassandra-env.sh.d/001-cassandra-exporter.sh: Cannot open: Permission denied
0 blocks
Well, dammit. Permission denied. Aren’t containers supposed to be solving these problems? Let’s work the problem a little further:
❯ oc debug pod/cassandra-test-cluster-dc1-rack1-0
Defaulting container name to cassandra.
Use 'oc describe pod/cassandra-test-cluster-dc1-rack1-0-debug -n akka-ledger' to see all of the containers in this pod.
Starting pod/cassandra-test-cluster-dc1-rack1-0-debug ...
Pod IP: 10.128.4.38
If you don't see a command prompt, try pressing enter.
$ whoami
1000690000
$ cat > /etc/cassandra/hello_world
/bin/sh: 3: cannot create /etc/cassandra/hello_world: Permission denied
$ ls -l /etc/| grep cassandra
drwxr-xr-x. 1 cassandra cassandra 267 Jul 30 14:14 cassandra
$ grep cassandra /etc/passwd
cassandra:x:999:999::/home/cassandra:
So there is the issue: user cassandra (uid 999) owns the /etc/cassandra folder, but when the image runs on OpenShift the user assigned to the pod has uid 1000690000. Getting past this issue requires an understanding of OpenShift SCCs and how they can be adjusted.
Understanding Service Accounts and SCCs
Security Context Constraints, or SCCs, are the means by which administrators in OpenShift control permissions for pods. Here are some relevant links:
- Understanding Service Accounts and SCCs (somewhat outdated)
- Managing Security Context Constraints
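A handy trick while debugging: the admission controller stamps every pod with an openshift.io/scc annotation naming the SCC that admitted it, so you can check which constraints a pod actually got. For our failing pod this should report the default restricted SCC:

```shell
❯ oc get pod cassandra-test-cluster-dc1-rack1-0 \
    -o jsonpath='{.metadata.annotations.openshift\.io/scc}'
```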
What we need is to grant the operator the ability to spawn Cassandra pods using the uid that is baked into the image, instead of a random one from the range assigned to the namespace where the operator is running. See for example the namespace configuration I am using:
❯ oc describe namespace/akka-ledger
Name: akka-ledger
Labels: <none>
Annotations: openshift.io/description:
openshift.io/display-name:
openshift.io/requester: AGeorgiadis
openshift.io/sa.scc.mcs: s0:c26,c20
openshift.io/sa.scc.supplemental-groups: 1000690000/10000
openshift.io/sa.scc.uid-range: 1000690000/10000
Status: Active
No resource quota.
No LimitRange resource.
The openshift.io/sa.scc.uid-range annotation defines the range of uids that will be used (the format is <start>/<size>), hence the value of 1000690000 that I got.
So what can we do? It turns out one of the SCCs defined out of the box on OpenShift is anyuid, which allows the operator to override the uid range defined on the namespace and have the Cassandra server run as uid 999 in the pod. This is an OpenShift-specific issue: the same operator works in a stock Kubernetes installation without any error. We can modify the cassandra Role that has been created and add the following rule to allow use of the anyuid SCC:
- apiGroups:
    - security.openshift.io
  resourceNames:
    - anyuid
  resources:
    - securitycontextconstraints
  verbs:
    - use
This role is bound to the cassandra service account, which is declared in the pod spec of the stateful set that the operator creates to provision the Cassandra server pods.
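If you have cluster-admin rights, an alternative to editing the Role by hand is the oc adm shortcut, which creates the equivalent grant for you (namespace taken from the example above):

```shell
❯ oc adm policy add-scc-to-user anyuid -z cassandra -n akka-ledger
```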
After that, let’s monitor the logs again to verify that the pod starts with the correct uid:
[...]
WARN [main] StartupChecks.java:332 Directory /var/lib/cassandra/data doesn't exist
ERROR [main] CassandraDaemon.java:775 Has no permission to create directory /var/lib/cassandra/data
So the process now successfully updated the /etc/cassandra folder, but hit a new error a few steps later. Again, let's use oc debug to understand the problem:
❯ oc debug pod/cassandra-test-cluster-dc1-rack1-0
Defaulting container name to cassandra.
Use 'oc describe pod/cassandra-test-cluster-dc1-rack1-0-debug -n akka-ledger' to see all of the containers in this pod.
Starting pod/cassandra-test-cluster-dc1-rack1-0-debug ...
Pod IP: 10.128.4.41
If you don't see a command prompt, try pressing enter.
$ whoami
cassandra
$ ls -la /var/lib/cassandra
total 0
drwxr-xr-x. 2 root root 6 Aug 5 14:31 .
drwxr-xr-x. 1 root root 57 Jul 30 14:14 ..
So the pod has been configured with the proper uid. Unfortunately, when the PV was mounted, its permissions only allowed root to modify the filesystem. To resolve this, the operator would have to specify fsGroup: 999 on the stateful set that is used to provision the pods (see e.g. Set the security context for a Pod). Can we achieve the same thing without requiring changes to the operator code?
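For comparison, what the operator would have to emit is a pod-level security context in the stateful set's pod template, roughly like this (a sketch, not actual operator output):

```yaml
# Hypothetical fragment of the stateful set's pod template
spec:
  template:
    spec:
      securityContext:
        # Kubernetes chowns mounted volumes to this gid and adds it as a
        # supplemental group, so the cassandra user can write to the PV
        fsGroup: 999
```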
Turns out, we can: just create a new SCC that presets fsGroup to a suitable range, like for example the following one:

[SCC definition embedded in the original post]
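The embedded definition can be reconstructed by copying the stock anyuid SCC under a new name and pinning fsGroup to the gid baked into the image. A sketch along those lines (field values assumed from anyuid; verify against oc get scc anyuid -o yaml on your cluster):

```yaml
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: cassandra
allowPrivilegedContainer: false
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
runAsUser:
  type: RunAsAny        # as in anyuid: keeps uid 999 from the image
seLinuxContext:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
fsGroup:                # the highlighted change: anyuid has type: RunAsAny here
  type: MustRunAs
  ranges:
    - min: 999
      max: 999
requiredDropCapabilities:
  - MKNOD
volumes:
  - configMap
  - downwardAPI
  - emptyDir
  - persistentVolumeClaim
  - projected
  - secret
```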
This is the same as the anyuid SCC, with the highlighted changes applied in place of a type: RunAsAny policy. Let's edit the cassandra role again, replace anyuid with cassandra in the resourceNames list, and recreate the datacenter CRD cluster definition.
After a while all 3 pods have been created correctly:
❯ oc get pod
NAME READY STATUS RESTARTS AGE
cassandra-operator-55d759bcd-8vtwq 1/1 Running 0 23m
cassandra-test-cluster-dc1-rack1-0 2/2 Running 0 3m43s
cassandra-test-cluster-dc1-rack1-1 2/2 Running 0 2m11s
cassandra-test-cluster-dc1-rack1-2 2/2 Running 0 27s
nodetool status also reports proper cluster formation:
❯ oc exec cassandra-test-cluster-dc1-rack1-0 -c cassandra -- nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.129.4.55 85.2 KiB 256 69.8% 2495ad71-384a-43fe-996f-65d245c15fe4 rack1
UN 10.128.4.42 84.79 KiB 256 65.4% 84011ee3-0dc5-482e-90ab-d06cf0844155 rack1
UN 10.129.2.13 15.5 KiB 256 64.8% 12bb34c0-1ac5-49dc-971d-9c0bc1df7a81 rack1
We can now follow the rest of the guide.
TODO
- Enable optimizeKernelParams