Install OCS without the OpenShift Web UI

1. Overview

OpenShift Container Platform has been verified to work in conjunction with local storage devices and OpenShift Container Storage (OCS) on AWS EC2, VMware, Azure, and bare metal hosts.

These instructions require the ability to install version 4.6 of both OCS and the Local Storage Operator (LSO).

2. Installing the Local Storage Operator v4.6

First, you will need to create a namespace for the Local Storage Operator. A self-descriptive openshift-local-storage namespace is recommended.

cat <<EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-local-storage
spec: {}
EOF

Create Operator Group for Local Storage Operator.

cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: local-operator-group
  namespace: openshift-local-storage
spec:
  targetNamespaces:
  - openshift-local-storage
EOF

Subscribe to Local Storage Operator.

cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: local-storage-operator
  namespace: openshift-local-storage
spec:
  channel: "4.6"
  installPlanApproval: Automatic
  name: local-storage-operator
  source: redhat-operators  # <-- Modify the name of the redhat-operators catalogsource if not default
  sourceNamespace: openshift-marketplace
EOF
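
After the Subscription is created, verify that the Local Storage Operator installs successfully; the CSV should eventually report a Succeeded phase.

oc get csv -n openshift-local-storage
oc get pods -n openshift-local-storage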

2.1. Preparing Nodes

You will need to add the OCS label to each OCP node that has storage devices. The OCS operator looks for this label to know which nodes can be scheduling targets for OCS components. Later we will configure Local Storage Operator Custom Resources to create PVs from storage devices on nodes with this label. You must have a minimum of three labeled nodes with the same number of devices or disks and similar performance capability. Only SSDs or NVMe devices can be used for OCS.

To label the nodes use the following command:

oc label node <NodeName> cluster.ocs.openshift.io/openshift-storage=''

If the OCS installation is on AWS, make sure to label OCP nodes in 3 different AWS availability zones. For most other infrastructures (VMware, bare metal, etc.), rack labels are added to create 3 different zones (rack0, rack1, rack2) when the StorageCluster is created.
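
To confirm the label was applied to the expected nodes, list the labeled nodes:

oc get nodes -l cluster.ocs.openshift.io/openshift-storage=''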

2.2. Manual Method to create Persistent Volumes

This is the manual method of creating PVs for OCS. It was introduced with OCS v4.4 and still works with OCS v4.6 and LSO v4.6.

If you use this section to create PVs, skip section 2.3, Auto Discovering Devices and creating Persistent Volumes, and proceed to section 3, Installing OpenShift Container Storage.

You will need to know the device names on the nodes labeled for OCS. You can access the nodes using oc debug node and issue the lsblk command after chroot.

$ oc debug node/<node_name>

sh-4.4# chroot /host
sh-4.4# lsblk
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
nvme0n1                      259:0    0   120G  0 disk
|-nvme0n1p1                  259:1    0   384M  0 part /boot
|-nvme0n1p2                  259:2    0   127M  0 part /boot/efi
|-nvme0n1p3                  259:3    0     1M  0 part
`-nvme0n1p4                  259:4    0 119.5G  0 part
  `-coreos-luks-root-nocrypt 253:0    0 119.5G  0 dm   /sysroot
nvme1n1                      259:5    0  1000G  0 disk
nvme2n1                      259:6    0  1000G  0 disk

After you know which local devices are available, in this case nvme1n1 and nvme2n1, you can find the by-id, a unique name created from the hardware serial number of each device.

sh-4.4# ls -l /dev/disk/by-id/
total 0
lrwxrwxrwx. 1 root root 10 Mar 17 16:24 dm-name-coreos-luks-root-nocrypt -> ../../dm-0
lrwxrwxrwx. 1 root root 13 Mar 17 16:24 nvme-Amazon_EC2_NVMe_Instance_Storage_AWS10382E5D7441494EC -> ../../nvme0n1
lrwxrwxrwx. 1 root root 13 Mar 17 16:24 nvme-Amazon_EC2_NVMe_Instance_Storage_AWS60382E5D7441494EC -> ../../nvme1n1
lrwxrwxrwx. 1 root root 13 Mar 17 16:24 nvme-nvme.1d0f-4157533130333832453544373434313439344543-416d617a6f6e20454332204e564d6520496e7374616e63652053746f72616765-00000001 -> ../../nvme0n1
lrwxrwxrwx. 1 root root 13 Mar 17 16:24 nvme-nvme.1d0f-4157533630333832453544373434313439344543-416d617a6f6e20454332204e564d6520496e7374616e63652053746f72616765-00000001 -> ../../nvme1n1

The following article provides a utility for gathering /dev/disk/by-id for all OCP nodes with the OCS label (cluster.ocs.openshift.io/openshift-storage): https://access.redhat.com/solutions/4928841.
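
As a quick alternative to that utility, the following loop is a minimal sketch that prints the by-id entries for every node carrying the OCS label; it assumes oc debug access to the nodes, and the root/OS disk must still be excluded manually.

for node in $(oc get nodes -l cluster.ocs.openshift.io/openshift-storage='' -o name); do
  echo "### ${node}"
  oc debug ${node} -- chroot /host ls -l /dev/disk/by-id/
done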

The next step is to create the LocalVolume resource using the by-id for each OCP node with the OCS label. In this case there is one device per node, and each by-id is added manually under devicePaths: in the localvolume.yaml file.

apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-block
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
        - key: cluster.ocs.openshift.io/openshift-storage
          operator: In
          values:
          - ""
  storageClassDevices:
    - storageClassName: localblock
      volumeMode: Block
      devicePaths:
        - /dev/disk/by-id/nvme-Amazon_EC2_NVMe_Instance_Storage_AWS10382E5D7441494EC   # <-- modify this line
        - /dev/disk/by-id/nvme-Amazon_EC2_NVMe_Instance_Storage_AWS1F45C01D7E84FE3E9   # <-- modify this line
        - /dev/disk/by-id/nvme-Amazon_EC2_NVMe_Instance_Storage_AWS136BC945B4ECB9AE4   # <-- modify this line
oc create -f localvolume.yaml

After the LocalVolume resource is created, check that Available PVs are created for each device with a by-id in the localvolume.yaml file. It can take a few minutes until all disks appear as PVs while the Local Storage Operator prepares the disks.
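
To check, list the PVs and filter on the localblock storage class; each device should appear as an Available PV.

oc get pv | grep localblock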

2.3. Auto Discovering Devices and creating Persistent Volumes

This is the method available starting with OCS v4.6 and LSO v4.6.

Local Storage Operator v4.6 supports discovery of devices on OCP nodes with the OCS label cluster.ocs.openshift.io/openshift-storage="". Create the LocalVolumeDiscovery resource using the following definition after the OCP nodes are labeled with the OCS label.

cat <<EOF | oc apply -f -
apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeDiscovery
metadata:
  name: auto-discover-devices
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
      - matchExpressions:
        - key: cluster.ocs.openshift.io/openshift-storage
          operator: In
          values:
            - ""
EOF

After this resource is created, you should see a new localvolumediscoveries resource and a localvolumediscoveryresults resource for each OCP node labeled with the OCS label. Each localvolumediscoveryresults resource contains the details for each disk on the node, including the by-id, size, and type of disk.
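
You can list the discovery results and describe one of them to see the per-disk details; <result_name> below is a placeholder for a name returned by the first command.

oc get localvolumediscoveryresults -n openshift-local-storage
oc describe localvolumediscoveryresults <result_name> -n openshift-local-storage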

2.3.1. Create LocalVolumeSet

The disks used must be SSDs or NVMe devices and must be raw block devices. This is because the operator creates distinct partitions on the provided raw block devices for the OSD metadata and OSD data.

Use the following localvolumeset.yaml file to create the LocalVolumeSet. Adjust the parameters with comments to meet the needs of your environment. If not required, the parameters with comments can be deleted.

apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeSet
metadata:
  name: local-block
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
      - matchExpressions:
          - key: cluster.ocs.openshift.io/openshift-storage
            operator: In
            values:
              - ""
  storageClassName: localblock
  volumeMode: Block
  fstype: ext4
  maxDeviceCount: 1  # <-- Maximum number of devices per node to be used
  deviceInclusionSpec:
    deviceTypes:
    - disk
    - part   # <-- Remove this if not using partitions
    deviceMechanicalProperties:
    - NonRotational
    #minSize: 0Ti   # <-- Uncomment and modify to limit the minimum size of disk used
    #maxSize: 0Ti   # <-- Uncomment and modify to limit the maximum size of disk used
oc create -f localvolumeset.yaml

After the LocalVolumeSet resource is created, check that Available PVs are created for each disk on OCP nodes with the OCS label. It can take a few minutes until all disks appear as PVs while the Local Storage Operator prepares the disks.
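
To check, list the LocalVolumeSet and the resulting PVs; each eligible disk should appear as an Available PV with the localblock storage class.

oc get localvolumesets -n openshift-local-storage
oc get pv | grep localblock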

3. Installing OpenShift Container Storage

These instructions apply after OCS is generally available (GA). If you need to install pre-release OCS, different instructions are required, as well as access to pre-release entitled registries.

3.1. Install Operator

Create openshift-storage namespace.

cat <<EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  labels:
    openshift.io/cluster-monitoring: "true"
  name: openshift-storage
spec: {}
EOF

Create Operator Group for OCS Operator.

cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-storage-operatorgroup
  namespace: openshift-storage
spec:
  targetNamespaces:
  - openshift-storage
EOF

Subscribe to OCS Operator.

cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ocs-operator
  namespace: openshift-storage
spec:
  channel: "stable-4.6"
  installPlanApproval: Automatic
  name: ocs-operator
  source: redhat-operators  # <-- Modify the name of the redhat-operators catalogsource if not default
  sourceNamespace: openshift-marketplace
EOF
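
Before creating the cluster, verify that the OCS operator installed successfully; the CSV should eventually report a Succeeded phase.

oc get csv -n openshift-storage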

3.2. Create Cluster

Create the StorageCluster CR using the following storagecluster.yaml file. For each set of 3 OSDs, increment the count.

Under the managedResources section, the default setting is manage for each OCS service (block, file, object using RGW, object using NooBaa). This means any changes to the OCS CustomResources (CRs) will always reconcile back to the default values. The other choices besides manage are init and ignore. Setting init for a service (for example, cephBlockPools) means the CR will not reconcile back to defaults if changes are made to it. Setting ignore will not deploy that particular service.

apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  manageNodes: false
  resources:
    mds:
      limits:
        cpu: "3"
        memory: "8Gi"
      requests:
        cpu: "3"
        memory: "8Gi"
  monDataDirHostPath: /var/lib/rook
  managedResources:
    cephBlockPools:
      reconcileStrategy: manage
    cephFilesystems:
      reconcileStrategy: manage
    cephObjectStoreUsers:
      reconcileStrategy: manage
    cephObjectStores:
      reconcileStrategy: manage
    snapshotClasses:
      reconcileStrategy: manage
    storageClasses:
      reconcileStrategy: manage
  multiCloudGateway:
    reconcileStrategy: manage
  storageDeviceSets:
  - count: 1  # <-- Modify count to desired value. For each set of 3 disks increment the count by 1.
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: "100Mi"
        storageClassName: localblock
        volumeMode: Block
    name: ocs-deviceset
    placement: {}
    portable: false
    replica: 3
    resources:
      limits:
        cpu: "2"
        memory: "5Gi"
      requests:
        cpu: "2"
        memory: "5Gi"
oc create -f storagecluster.yaml
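
Deployment takes several minutes. Watch the StorageCluster and the pods in openshift-storage until the StorageCluster phase is Ready and all pods are Running or Completed.

oc get storagecluster -n openshift-storage
watch oc get pods -n openshift-storage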

4. Verifying the Installation

Deploy the Rook-Ceph toolbox pod.

oc patch OCSInitialization ocsinit -n openshift-storage --type json --patch  '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'

Establish a remote shell to the toolbox pod.

TOOLS_POD=$(oc get pods -n openshift-storage -l app=rook-ceph-tools -o name)
oc rsh -n openshift-storage $TOOLS_POD

Run ceph status and ceph osd tree to see the status of the Ceph cluster.

sh-4.4# ceph status
sh-4.4# ceph osd tree

4.1. Create test CephRBD PVC and CephFS PVC

cat <<EOF | oc apply -f -
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ocs-storagecluster-ceph-rbd
EOF

Validate new PVC is created.

oc get pvc | grep rbd-pvc

Now create the test CephFS PVC.

cat <<EOF | oc apply -f -
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: ocs-storagecluster-cephfs
EOF

Validate new PVC is created.

oc get pvc | grep cephfs-pvc
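
When testing is complete, the test PVCs can be removed.

oc delete pvc rbd-pvc cephfs-pvc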

4.2. Upgrade OCS version (major version)

Validate current version of OCS.

oc get csv -n openshift-storage

Example output.

NAME                  DISPLAY                       VERSION   REPLACES   PHASE
ocs-operator.v4.5.2   OpenShift Container Storage   4.5.2                Succeeded

Verify there is a new OCS stable channel.

oc describe packagemanifests ocs-operator -n openshift-marketplace | grep stable-

Example output.

    Name:           stable-4.5
    Name:           stable-4.6
  Default Channel:  stable-4.6

Apply subscription with new stable-4.6 channel.

cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ocs-operator
  namespace: openshift-storage
spec:
  channel: "stable-4.6"
  installPlanApproval: Automatic
  name: ocs-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF

Validate subscription is updating.

watch oc get csv -n openshift-storage

Example output.

NAME                  DISPLAY                       VERSION   REPLACES              PHASE
ocs-operator.v4.5.2   OpenShift Container Storage   4.5.2                           Replacing
ocs-operator.v4.6.0   OpenShift Container Storage   4.6.0     ocs-operator.v4.5.2   Installing

Validate new version of OCS.

oc get csv -n openshift-storage

Example output.

NAME                  DISPLAY                       VERSION   REPLACES              PHASE
ocs-operator.v4.6.0   OpenShift Container Storage   4.6.0     ocs-operator.v4.5.2   Succeeded

Validate that all pods in openshift-storage are eventually in a running state after updating. Also verify that Ceph is healthy using the instructions in the prior section.
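
For example, watch the pods until they settle, then rerun ceph status from the toolbox pod.

watch oc get pods -n openshift-storage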