1+ scaling with OCS using local storage

1. Overview

OpenShift Container Platform has been verified to work in conjunction with local storage devices and OpenShift Container Storage (OCS) on AWS EC2, VMware, Azure, and bare metal hosts. These instructions require the ability to install version 4.6 of both OCS and the Local Storage Operator (LSO).

In addition, the method detailed in this document supports 1+ scaling for adding new storage nodes, versus 3+ scaling, where 3 nodes must be added at a time to expand the cluster horizontally. It also allows starting with a number of OCP nodes that is not a multiple of 3. You will still need a minimum of 3 nodes, and these nodes must meet the OCS CPU and memory requirements. The difference is that the initial node count can be any number of at least 3 (i.e., 3, 4, 5, …) rather than a multiple of 3 (3, 6, 9, …) as documented in the official OCS documentation.

2. Installing the Local Storage Operator v4.6

First, you will need to create a namespace for the Local Storage Operator. A self-descriptive name such as openshift-local-storage is recommended.

cat <<EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-local-storage
spec: {}
EOF

Create Operator Group for Local Storage Operator.

cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: local-operator-group
  namespace: openshift-local-storage
spec:
  targetNamespaces:
  - openshift-local-storage
EOF

Subscribe to Local Storage Operator.

cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: local-storage-operator
  namespace: openshift-local-storage
spec:
  channel: "4.6"
  installPlanApproval: Automatic
  name: local-storage-operator
  source: redhat-operators  # <-- Modify the name of the redhat-operators catalogsource if not default
  sourceNamespace: openshift-marketplace
EOF
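Before continuing, you can verify the operator installed successfully by checking its ClusterServiceVersion and pods; the exact CSV version suffix will vary:

oc get csv -n openshift-local-storage
oc get pods -n openshift-local-storage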

2.1. Preparing Nodes

You will need to add the OCS label to each OCP node that has storage devices. The OCS operator looks for this label to know which nodes can be scheduling targets for OCS components. Later, we will configure Local Storage Operator Custom Resources to create PVs from storage devices on nodes with this label. You must have a minimum of three labeled nodes, each with the same number of devices or disks and similar performance capability. Only SSD or NVMe devices can be used for OCS.

To label the nodes use the following command:

oc label node <NodeName> cluster.ocs.openshift.io/openshift-storage=''

You will also need to add a unique rack label to each node that the OCS label was added to. The example below is for the first host; each additional host gets its own unique rack label (i.e., rack1, rack2, rack3, …). A sketch combining both labeling steps follows the command below.

oc label node <NodeName> topology.rook.io/rack=rack0
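If you are labeling several nodes, a small loop saves repetition. This is a sketch only; the worker node names below are hypothetical and must be replaced with your own:

# Hypothetical node names; substitute the nodes that have storage devices
i=0
for node in worker-0 worker-1 worker-2; do
  oc label node "$node" cluster.ocs.openshift.io/openshift-storage=''
  oc label node "$node" topology.rook.io/rack="rack${i}"
  i=$((i+1))
done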

2.2. Manual Method to create Persistent Volumes

This is the manual method of creating PVs for OCS, available starting with OCS v4.4; it still works with OCS v4.6 and LSO v4.6.

If you use this section to create PVs, skip section 2.3, Auto Discovering Devices and creating Persistent Volumes, and proceed to installing OCS.

You will need to know the device names on the nodes labeled for OCS. You can access a node using oc debug node and issue the lsblk command after chroot.

$ oc debug node/<node_name>

sh-4.4# chroot /host
sh-4.4# lsblk
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
nvme0n1                      259:0    0   120G  0 disk
|-nvme0n1p1                  259:1    0   384M  0 part /boot
|-nvme0n1p2                  259:2    0   127M  0 part /boot/efi
|-nvme0n1p3                  259:3    0     1M  0 part
`-nvme0n1p4                  259:4    0 119.5G  0 part
  `-coreos-luks-root-nocrypt 253:0    0 119.5G  0 dm   /sysroot
nvme1n1                      259:5    0  1000G  0 disk
nvme2n1                      259:6    0  1000G  0 disk

Now that you know which local devices are available, in this case nvme1n1 and nvme2n1 (the disks with no partitions or mountpoints), you can find the by-id, a unique name created from the hardware serial number of each device.

sh-4.4# ls -l /dev/disk/by-id/
total 0
lrwxrwxrwx. 1 root root 10 Mar 17 16:24 dm-name-coreos-luks-root-nocrypt -> ../../dm-0
lrwxrwxrwx. 1 root root 13 Mar 17 16:24 nvme-Amazon_EC2_NVMe_Instance_Storage_AWS10382E5D7441494EC -> ../../nvme0n1
lrwxrwxrwx. 1 root root 13 Mar 17 16:24 nvme-Amazon_EC2_NVMe_Instance_Storage_AWS60382E5D7441494EC -> ../../nvme1n1
lrwxrwxrwx. 1 root root 13 Mar 17 16:24 nvme-nvme.1d0f-4157533130333832453544373434313439344543-416d617a6f6e20454332204e564d6520496e7374616e63652053746f72616765-00000001 -> ../../nvme0n1
lrwxrwxrwx. 1 root root 13 Mar 17 16:24 nvme-nvme.1d0f-4157533630333832453544373434313439344543-416d617a6f6e20454332204e564d6520496e7374616e63652053746f72616765-00000001 -> ../../nvme1n1

The article at https://access.redhat.com/solutions/4928841 provides a utility for gathering the /dev/disk/by-id values for all OCP nodes with the OCS label (cluster.ocs.openshift.io/openshift-storage).

The next step is to create the LocalVolume resource using the by-id from each OCP node with the OCS label. In this case there is one device per node, and each node's by-id is added manually under devicePaths: in the localvolume.yaml file.

apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-block
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
        - key: cluster.ocs.openshift.io/openshift-storage
          operator: In
          values:
          - ""
  storageClassDevices:
    - storageClassName: localblock
      volumeMode: Block
      devicePaths:
        - /dev/disk/by-id/nvme-Amazon_EC2_NVMe_Instance_Storage_AWS10382E5D7441494EC   # <-- modify this line
        - /dev/disk/by-id/nvme-Amazon_EC2_NVMe_Instance_Storage_AWS1F45C01D7E84FE3E9   # <-- modify this line
        - /dev/disk/by-id/nvme-Amazon_EC2_NVMe_Instance_Storage_AWS136BC945B4ECB9AE4   # <-- modify this line
oc create -f localvolume.yaml

After the localvolume resource is created, check that an Available PV is created for each device with a by-id in the localvolume.yaml file.
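A quick way to check is to list PVs with the localblock storage class; each PV should show a STATUS of Available until it is claimed:

oc get pv | grep localblock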

2.3. Auto Discovering Devices and creating Persistent Volumes

This is the method available starting with OCS v4.6 and LSO v4.6.

Local Storage Operator v4.6 supports discovery of devices on OCP nodes with the OCS label cluster.ocs.openshift.io/openshift-storage="". After the OCP nodes are labeled, create the LocalVolumeDiscovery resource using this file.

cat <<EOF | oc apply -f -
apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeDiscovery
metadata:
  name: auto-discover-devices
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
      - matchExpressions:
        - key: cluster.ocs.openshift.io/openshift-storage
          operator: In
          values:
            - ""
EOF

After this resource is created, you should see a new localvolumediscoveries resource, and there will be a localvolumediscoveryresults for each OCP node labeled with the OCS label. Each localvolumediscoveryresults contains the details for every disk on the node, including the by-id, size, and type of disk.
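You can list the discovery resources and inspect the results for a particular node; the localvolumediscoveryresults names vary per cluster, so substitute one returned by the second command:

oc get localvolumediscoveries -n openshift-local-storage
oc get localvolumediscoveryresults -n openshift-local-storage
oc describe localvolumediscoveryresults <name> -n openshift-local-storage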

2.3.1. Create LocalVolumeSet

The disks used must be SSD or NVMe devices and must be raw block devices, because the operator creates distinct partitions on the provided raw block devices for the OSD metadata and OSD data.

Use this file, localvolumeset.yaml, to create the LocalVolumeSet. Adjust the commented parameters to meet the needs of your environment; if not required, they can be deleted.

apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeSet
metadata:
  name: local-block
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
      - matchExpressions:
          - key: cluster.ocs.openshift.io/openshift-storage
            operator: In
            values:
              - ""
  storageClassName: localblock
  volumeMode: Block
  fsType: ext4
  maxDeviceCount: 1  # <-- Maximum number of devices per node to be used
  deviceInclusionSpec:
    deviceTypes:
    - disk
    - part   # <-- Remove this if not using partitions
    deviceMechanicalProperties:
    - NonRotational
    #minSize: 0Ti   # <-- Uncomment and modify to limit the minimum size of disk used
    #maxSize: 0Ti   # <-- Uncomment and modify to limit the maximum size of disk used
oc create -f localvolumeset.yaml

After the localvolumesets resource is created, check that Available PVs are created for each disk on OCP nodes with the OCS label.
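As with the manual method, you can confirm the localblock StorageClass exists and list the PVs created from the discovered disks:

oc get sc localblock
oc get pv | grep localblock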

3. Installing OpenShift Container Storage

These instructions apply once OCS is generally available (GA). Installing pre-release OCS requires different instructions as well as access to pre-release entitled registries.

3.1. Install Operator

Create openshift-storage namespace.

cat <<EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  labels:
    openshift.io/cluster-monitoring: "true"
  name: openshift-storage
spec: {}
EOF

Create Operator Group for OCS Operator.

cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-storage-operatorgroup
  namespace: openshift-storage
spec:
  targetNamespaces:
  - openshift-storage
EOF

Subscribe to OCS Operator.

cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ocs-operator
  namespace: openshift-storage
spec:
  channel: "stable-4.6"
  installPlanApproval: Automatic
  name: ocs-operator
  source: redhat-operators  # <-- Modify the name of the redhat-operators catalogsource if not default
  sourceNamespace: openshift-marketplace
EOF
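You can confirm the operator installed successfully before creating the cluster; the CSV should eventually report a PHASE of Succeeded:

oc get csv -n openshift-storage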

3.2. Create Cluster

As described above, this method of deployment does not require the initial deployment to have an OCP node count that is a multiple of 3, nor does it require adding nodes in multiples of 3.

Before creating the storage cluster, all OCP nodes to be used by OCS must be labeled with the OCS label and a unique rack label (i.e., rack0, rack1, rack2, …). There must be a minimum of 3 OCP nodes with storage devices.

The value to modify in the storagecluster.yaml below is count. Set it to the total number of disks, across all of the OCP servers with the OCS and rack labels, that you want to use for your OCS cluster (e.g., 5 servers with 4 disks each gives count = 20).

Under the managedResources section, the default setting for the OCS services (i.e., block, file, object using RGW, object using NooBaa) is manage. This means any changes to the OCS CustomResources (CRs) will always be reconciled back to default values. The other choices are init and ignore. Setting a service (e.g., cephBlockPools) to init means changes to its CR will not be reconciled back to defaults. Setting it to ignore means that particular service will not be deployed.
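For example, if you did not want the RGW object service deployed, a hypothetical variation of the file below would set cephObjectStores to ignore:

  cephObjectStores:
    reconcileStrategy: ignore

The storagecluster.yaml used here leaves every service at the default of manage.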

apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  manageNodes: false
  resources:
    mds:
      limits:
        cpu: "3"
        memory: "8Gi"
      requests:
        cpu: "3"
        memory: "8Gi"
  monDataDirHostPath: /var/lib/rook
  managedResources:
    cephBlockPools:
      reconcileStrategy: manage
    cephFilesystems:
      reconcileStrategy: manage
    cephObjectStoreUsers:
      reconcileStrategy: manage
    cephObjectStores:
      reconcileStrategy: manage
    snapshotClasses:
      reconcileStrategy: manage
    storageClasses:
      reconcileStrategy: manage
  multiCloudGateway:
    reconcileStrategy: manage
  storageDeviceSets:
  - count: 20  # <-- Modify count to number of disks
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: "100Mi"
        storageClassName: localblock
        volumeMode: Block
    name: ocs-deviceset
    placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: cluster.ocs.openshift.io/openshift-storage
              operator: Exists
    portable: false
    replica: 1
    resources:
      limits:
        cpu: "2"
        memory: "5Gi"
      requests:
        cpu: "2"
        memory: "5Gi"
oc create -f storagecluster.yaml
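Deployment takes several minutes. You can watch the OCS pods start and check the StorageCluster resource, which should eventually report a phase of Ready:

watch oc get pods -n openshift-storage
oc get storagecluster -n openshift-storage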

4. Verifying the Installation

Deploy the Rook-Ceph toolbox pod.

oc patch OCSInitialization ocsinit -n openshift-storage --type json --patch  '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'

Establish a remote shell to the toolbox pod.

TOOLS_POD=$(oc get pods -n openshift-storage -l app=rook-ceph-tools -o name)
oc rsh -n openshift-storage $TOOLS_POD

Run ceph status and ceph osd tree to see the status of the Ceph cluster.

sh-4.4# ceph status
sh-4.4# ceph osd tree

4.1. Create test CephRBD PVC and CephFS PVC

Create a test CephRBD PVC.

cat <<EOF | oc apply -f -
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ocs-storagecluster-ceph-rbd
EOF

Validate new PVC is created.

oc get pvc | grep rbd-pvc

Create a test CephFS PVC.

cat <<EOF | oc apply -f -
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: ocs-storagecluster-cephfs
EOF

Validate new PVC is created.

oc get pvc | grep cephfs-pvc
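To go beyond checking that the PVCs bind, you can mount one in a throwaway pod. This is a minimal sketch; the pod name and image are arbitrary choices, not part of the original instructions.

cat <<EOF | oc apply -f -
---
apiVersion: v1
kind: Pod
metadata:
  name: cephfs-test-pod   # <-- arbitrary name for this test
spec:
  containers:
  - name: test
    image: registry.access.redhat.com/ubi8/ubi-minimal
    command: ["sh", "-c", "echo hello > /mnt/test/hello && cat /mnt/test/hello && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /mnt/test
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: cephfs-pvc
EOF

Once the pod reaches Running, the volume is known to be mountable and writable. Delete the test pod and PVCs when finished:

oc delete pod cephfs-test-pod
oc delete pvc rbd-pvc cephfs-pvc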

4.2. Upgrade OCS version (major version)

Validate current version of OCS.

oc get csv -n openshift-storage

Example output.

NAME                  DISPLAY                       VERSION   REPLACES   PHASE
ocs-operator.v4.5.2   OpenShift Container Storage   4.5.2                Succeeded

Verify there is a new OCS stable channel.

oc describe packagemanifests ocs -n openshift-marketplace | grep stable-

Example output.

    Name:           stable-4.5
    Name:           stable-4.6
  Default Channel:  stable-4.6

Apply subscription with new stable-4.6 channel.

cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ocs-operator
  namespace: openshift-storage
spec:
  channel: "stable-4.6"
  installPlanApproval: Automatic
  name: ocs-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF

Validate subscription is updating.

watch oc get csv -n openshift-storage

Example output.

NAME                  DISPLAY                       VERSION   REPLACES              PHASE
ocs-operator.v4.5.2   OpenShift Container Storage   4.5.2                           Replacing
ocs-operator.v4.6.0   OpenShift Container Storage   4.6.0     ocs-operator.v4.5.2   Installing

Validate new version of OCS.

oc get csv -n openshift-storage

Example output.

NAME                  DISPLAY                       VERSION   REPLACES              PHASE
ocs-operator.v4.6.0   OpenShift Container Storage   4.6.0     ocs-operator.v4.5.2   Succeeded

Validate that all pods in openshift-storage eventually reach a Running state after updating. Also verify that Ceph is healthy using the instructions in the prior section.
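For example:

watch oc get pods -n openshift-storage

TOOLS_POD=$(oc get pods -n openshift-storage -l app=rook-ceph-tools -o name)
oc rsh -n openshift-storage $TOOLS_POD ceph status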