
Deploying a Ceph Cluster with Rook (in k8s)

Overview

The Rook project is an orchestrator for deploying cloud-native storage on Kubernetes. See the Rook documentation for details.
Using Rook, we will deploy a Ceph cluster on top of a Kubernetes cluster, create a StorageClass, and test dynamic PVC binding.

The hands-on walkthrough is organized as follows.

  1. Prerequisite
  2. Rook-Ceph Git clone
  3. Install
  4. Test


The cluster specifications are listed below; a Kubernetes cluster must already be deployed.

For Kubernetes deployment, refer to this earlier post.

HOST: 3 (kubernetes cluster)
OS: Ubuntu 20.04
vCPU: 8
RAM: 16G
HDD: vda: 50G, vdb: 100G (for OSDs)
Network: 10.10.0.59, 10.10.0.4, 10.10.0.20

Prerequisite

rbd module load

modprobe rbd
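
To confirm the module is actually loaded on every node, and optionally make it persistent across reboots, something like the following works (the /etc/modules-load.d/rbd.conf path is a conventional choice, not something Rook requires):

# check that the rbd kernel module is loaded
lsmod | grep rbd

# (optional) load it automatically at boot
echo rbd > /etc/modules-load.d/rbd.conf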

Git Clone

git clone --single-branch --branch v1.9.1 https://github.com/rook/rook.git

Install

Rook Operator Install

cd rook/deploy/examples

kubectl create -f crds.yaml -f common.yaml -f operator.yaml

# wait until rook-ceph-operator is Running
kubectl -n rook-ceph get pod
...
rook-ceph-operator-7cdcfd4c8b-smnwr               1/1     Running     0                12s
...
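
If you prefer not to poll the pod list, you can also block until the operator deployment reports Available; a minimal sketch (the timeout value is arbitrary):

# wait for the operator deployment to become available
kubectl -n rook-ceph wait --for=condition=Available deploy/rook-ceph-operator --timeout=300s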

Ceph Cluster Install

kubectl create -f cluster.yaml

kubectl -n rook-ceph get pod
NAME                                              READY   STATUS      RESTARTS        AGE
csi-cephfsplugin-cchrv                            3/3     Running     0               4d9h
csi-cephfsplugin-provisioner-86cb89d98d-c892d     6/6     Running     0               4d9h
csi-cephfsplugin-provisioner-86cb89d98d-t4hl2     6/6     Running     0               4d9h
csi-cephfsplugin-px5th                            3/3     Running     0               4d9h
csi-cephfsplugin-w2h9j                            3/3     Running     0               4d9h
csi-rbdplugin-2hmns                               3/3     Running     0               4d9h
csi-rbdplugin-84fqc                               3/3     Running     0               4d9h
csi-rbdplugin-dc2v8                               3/3     Running     0               4d9h
csi-rbdplugin-provisioner-76987b994f-9tsrn        6/6     Running     0               4d9h
csi-rbdplugin-provisioner-76987b994f-hmtgh        6/6     Running     0               4d9h
rook-ceph-crashcollector-node1-75fd74b7bd-fz9x9   1/1     Running     0               4d9h
rook-ceph-crashcollector-node2-94f67dd77-6ld78    1/1     Running     0               4d9h
rook-ceph-crashcollector-node3-76f65c77d-dbk9t    1/1     Running     0               4d9h
rook-ceph-mgr-a-5969dbf6b7-nlbcm                  1/1     Running     0               4d9h
rook-ceph-mon-a-5c5746f959-hx5d8                  1/1     Running     0               4d9h
rook-ceph-mon-c-75479fdd64-lnlpf                  1/1     Running     0               4d9h
rook-ceph-mon-d-654b4b5c5c-p8dc9                  1/1     Running     0               4d9h
rook-ceph-operator-7cdcfd4c8b-smnwr               1/1     Running     0               4d9h
rook-ceph-osd-0-5c6c564d7-d2c4c                   1/1     Running     0               4d9h
rook-ceph-osd-1-74bf755799-mbnjk                  1/1     Running     0               4d9h
rook-ceph-osd-2-79d895b5df-w2w7c                  1/1     Running     0               4d9h
rook-ceph-osd-prepare-node1-ldpg8                 0/1     Completed   0               8h
rook-ceph-osd-prepare-node2-45htd                 0/1     Completed   0               8h
rook-ceph-osd-prepare-node3-7x2vd                 0/1     Completed   0               8h
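
The example cluster.yaml consumes every node and every empty device by default (useAllNodes / useAllDevices), which is why the empty vdb disks became OSDs here. To see exactly which devices a prepare job selected, you can read its logs (pod name taken from the listing above):

# show which devices the OSD prepare job picked up on node1
kubectl -n rook-ceph logs rook-ceph-osd-prepare-node1-ldpg8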

Toolbox Install

kubectl create -f toolbox.yaml

kubectl -n rook-ceph get pod
NAME                                              READY   STATUS      RESTARTS         AGE
...
rook-ceph-tools-78cbb855d8-p2gwv                  1/1     Running     0                4d9h
...
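
Rather than typing the full pod name for every command, you can also open an interactive shell in the toolbox deployment:

# interactive shell inside the toolbox; ceph/rbd commands can then be run directly
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash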

Cluster Check

kubectl exec -n rook-ceph -it rook-ceph-tools-78cbb855d8-p2gwv -- ceph -s
  cluster:
    id:     75d5519c-2c57-46ae-9cc8-6520d18dc9cd
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,c,d (age 4h)
    mgr: a(active, since 4d)
    osd: 3 osds: 3 up (since 4d), 3 in (since 4d)

  data:
    pools:   2 pools, 33 pgs
    objects: 231 objects, 351 MiB
    usage:   3.7 GiB used, 296 GiB / 300 GiB avail
    pgs:     33 active+clean

  io:
    client:   7.3 KiB/s wr, 0 op/s rd, 0 op/s wr
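
Two more checks that are often useful at this point (run from the same toolbox pod): the OSD tree, which shows how OSDs map to hosts, and the per-pool capacity report.

kubectl exec -n rook-ceph -it rook-ceph-tools-78cbb855d8-p2gwv -- ceph osd tree
kubectl exec -n rook-ceph -it rook-ceph-tools-78cbb855d8-p2gwv -- ceph df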

StorageClass Create

cd csi/rbd

storageclass.yaml

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph # namespace:cluster
spec:
  failureDomain: host
  replicated:
    size: 3
    # Disallow setting pool with replica 1, this could lead to data loss without recovery.
    # Make sure you're *ABSOLUTELY CERTAIN* that is what you want
    requireSafeReplicaSize: true
    # gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool
    # for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size
    #targetSizeRatio: .5
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  # clusterID is the namespace where the rook cluster is running
  # If you change this namespace, also change the namespace below where the secret namespaces are defined
  clusterID: rook-ceph # namespace:cluster

  # If you want to use erasure coded pool with RBD, you need to create
  # two pools. one erasure coded and one replicated.
  # You need to specify the replicated pool here in the `pool` parameter, it is
  # used for the metadata of the images.
  # The erasure coded pool must be set as the `dataPool` parameter below.
  #dataPool: ec-data-pool
  pool: replicapool

  # (optional) mapOptions is a comma-separated list of map options.
  # For krbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
  # For nbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
  # mapOptions: lock_on_read,queue_depth=1024

  # (optional) unmapOptions is a comma-separated list of unmap options.
  # For krbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
  # For nbd options refer
  # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
  # unmapOptions: force

  # (optional) Set it to true to encrypt each volume with encryption keys
  # from a key management system (KMS)
  # encrypted: "true"

  # (optional) Use external key management system (KMS) for encryption key by
  # specifying a unique ID matching a KMS ConfigMap. The ID is only used for
  # correlation to configmap entry.
  # encryptionKMSID: <kms-config-id>

  # RBD image format. Defaults to "2".
  imageFormat: "2"

  # RBD image features. Available for imageFormat: "2". CSI RBD currently supports only `layering` feature.
  imageFeatures: layering

  # The secrets contain Ceph admin credentials. These are generated automatically by the operator
  # in the same namespace as the cluster.
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph # namespace:cluster
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph # namespace:cluster
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph # namespace:cluster
  # Specify the filesystem type of the volume. If not specified, csi-provisioner
  # will set default as `ext4`. Note that `xfs` is not recommended due to potential deadlock
  # in hyperconverged settings where the volume is mounted on the same node as the osds.
  csi.storage.k8s.io/fstype: ext4
# uncomment the following to use rbd-nbd as mounter on supported nodes
# **IMPORTANT**: CephCSI v3.4.0 onwards a volume healer functionality is added to reattach
# the PVC to application pod if nodeplugin pod restart.
# Its still in Alpha support. Therefore, this option is not recommended for production use.
#mounter: rbd-nbd
allowVolumeExpansion: true
reclaimPolicy: Delete

kubectl create -f storageclass.yaml

kubectl get storageclass
NAME              PROVISIONER                  RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
rook-ceph-block   rook-ceph.rbd.csi.ceph.com   Delete          Immediate           true                   4d9h
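
The manifest above also created a CephBlockPool custom resource; you can confirm the operator has reconciled the pool backing the StorageClass:

# the pool backing the StorageClass should appear as a CephBlockPool CR
kubectl -n rook-ceph get cephblockpool replicapool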

Test

StorageClass Test (create a PVC and dynamically provision a PV)

pvc.yaml

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block

kubectl create -f pvc.yaml

kubectl get pvc
NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
rbd-pvc                          Bound    pvc-6d3f82d5-96f9-451f-931a-43bb6ccff196   1Gi        RWO            rook-ceph-block   3s

kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS      REASON   AGE
pvc-6d3f82d5-96f9-451f-931a-43bb6ccff196   1Gi        RWO            Delete           Bound    default/rbd-pvc    rook-ceph-block            3m27s
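
To verify that the volume can actually be attached and mounted, a throwaway pod that consumes the PVC is enough. This is a minimal sketch; the pod name rbd-test and the busybox image are arbitrary choices:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: rbd-test
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "df -h /data && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: rbd-pvc
EOF

# once the pod is Running, the RBD-backed filesystem should be mounted at /data
kubectl exec rbd-test -- df -h /data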

Comparing the StorageClass Pool and the PV's RBD Block Image

kubectl exec -n rook-ceph -it rook-ceph-tools-78cbb855d8-p2gwv -- ceph osd pool ls
...
replicapool
...

kubectl exec -n rook-ceph -it rook-ceph-tools-78cbb855d8-p2gwv -- rbd ls replicapool
csi-vol-23cf0afc-c24e-11ec-951e-8a4620ab752b

kubectl exec -n rook-ceph -it rook-ceph-tools-78cbb855d8-p2gwv -- rbd info replicapool/csi-vol-23cf0afc-c24e-11ec-951e-8a4620ab752b
rbd image 'csi-vol-23cf0afc-c24e-11ec-951e-8a4620ab752b':
	size 1 GiB in 256 objects
	order 22 (4 MiB objects)
	snapshot_count: 0
	id: 149cc96ea0030
	block_name_prefix: rbd_data.149cc96ea0030
	format: 2
	features: layering
	op_features:
	flags:
	create_timestamp: Fri Apr 22 15:09:00 2022
	access_timestamp: Fri Apr 22 15:09:00 2022
	modify_timestamp: Fri Apr 22 15:09:00 2022
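
The mapping between the PV and this image can also be read straight from the PV object: ceph-csi records the image name in the PV's CSI volume attributes (the exact attribute names may vary by ceph-csi version):

# should print the same csi-vol-... name as the rbd ls output above
kubectl get pv pvc-6d3f82d5-96f9-451f-931a-43bb6ccff196 -o jsonpath='{.spec.csi.volumeAttributes.imageName}'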

Closing

With Rook orchestration we were able to deploy cloud-native storage easily. I will also cover deploying a Ceph cluster on on-premises servers with Cephadm in a later post. Next, we will install Argo CD, a deployment tool for Kubernetes.

Reference

  • https://rook.io/docs/rook/v1.9/quickstart.html
  • https://docs.ceph.com/en/latest/rbd/rados-rbd-cmds/?#create-a-block-device-pool
  • https://yjwang.tistory.com/128