The evaluation license of the vCSA in my home lab expired, so I took the chance to check what state OpenShift ends up in under this condition.
Note that the ESXi evaluation license was still valid.
Environment
- Versions
  - OpenShift: 4.6.49
  - vCSA: 7.0.3
  - ESXi: 6.5.0 (Build 5310538)
- vSphere UPI
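For reference, the cluster-side versions above can also be read directly from the cluster:
oc get clusterversion
oc version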
Nodes: no issues
[root@prov ~]# oc get nodes
NAME STATUS ROLES AGE VERSION
master1 Ready master 47d v1.19.14+fcff70a
master2 Ready master 47d v1.19.14+fcff70a
master3 Ready master 47d v1.19.14+fcff70a
worker1 Ready app,worker 47d v1.19.14+fcff70a
worker2 Ready app,worker 47d v1.19.14+fcff70a
[root@prov ~]# oc adm top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
master1 1327m 17% 4474Mi 65%
master2 923m 12% 3396Mi 49%
master3 1750m 23% 5694Mi 83%
worker1 726m 9% 3361Mi 37%
worker2 697m 9% 8393Mi 94%
Pods: no issues (nothing other than Running or Completed)
There were no Pods with a vSphere Volume attached, so it may simply be that nothing had the chance to break.
Unfortunately I could not verify the part that matters most... (one way to list which Pods mount PVCs is sketched after the output below).
[root@prov ~]# oc get pods -A -o wide | egrep -v 'Running|Completed'
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
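Since the real question is whether Pods that actually mount vSphere-backed volumes keep running, here is a minimal sketch of how to list every Pod/PVC pair in the cluster (assuming jq is available on the provisioning host); any Pod whose claim is bound to a vSphere PV would show up here:
oc get pods -A -o json \
  | jq -r '.items[]
      | . as $p
      | ($p.spec.volumes // [])[]
      | select(.persistentVolumeClaim != null)
      | [$p.metadata.namespace, $p.metadata.name, .persistentVolumeClaim.claimName]
      | @tsv'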
PVs: issues found
As far as I could tell by checking the existing PVs with the oc get pv / oc describe pv commands, the dynamically provisioned PVs (created through a StorageClass) had become inaccessible.
On the other hand, the statically provisioned PV, whose VMDK file had been created directly on the ESXi host and mounted as-is, was not marked inaccessible. (A quick way to tell the two kinds apart is sketched after the output below.)
[root@prov ~]# oc get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
image-registry-volume 100Gi RWX Retain Bound openshift-image-registry/image-registry-storage 47d
pvc-6b02d34b-070a-48fd-bf9a-99f9883eef90 100Mi RWO Delete Failed test1/vol1 thin 46d
pvc-fc4a8797-c8b2-49df-a130-d7df626f4fef 100Mi RWO Delete Failed test1/vol2 app 46d
testpv 100Mi RWO Retain Released test1/testpvc testsc 38d
[root@prov ~]# oc describe pv pvc-6b02d34b-070a-48fd-bf9a-99f9883eef90
Name: pvc-6b02d34b-070a-48fd-bf9a-99f9883eef90
Labels: <none>
Annotations: kubernetes.io/createdby: vsphere-volume-dynamic-provisioner
pv.kubernetes.io/bound-by-controller: yes
pv.kubernetes.io/provisioned-by: kubernetes.io/vsphere-volume
Finalizers: [kubernetes.io/pv-protection]
StorageClass: thin
Status: Failed
Claim: test1/vol1
Reclaim Policy: Delete
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 100Mi
Node Affinity: <none>
Message: Datastore 'target0' is not accessible. No connected and accessible host is attached to this datastore.
Source:
Type: vSphereVolume (a Persistent Disk resource in vSphere)
VolumePath: [target0] kubevols/testcl-xf5mb-dynamic-pvc-6b02d34b-070a-48fd-bf9a-99f9883eef90.vmdk
FSType: ext4
StoragePolicyName:
Events: <none>
[root@prov ~]# oc describe pv pvc-fc4a8797-c8b2-49df-a130-d7df626f4fef
Name: pvc-fc4a8797-c8b2-49df-a130-d7df626f4fef
Labels: <none>
Annotations: kubernetes.io/createdby: vsphere-volume-dynamic-provisioner
pv.kubernetes.io/bound-by-controller: yes
pv.kubernetes.io/provisioned-by: kubernetes.io/vsphere-volume
Finalizers: [kubernetes.io/pv-protection]
StorageClass: app
Status: Failed
Claim: test1/vol2
Reclaim Policy: Delete
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 100Mi
Node Affinity: <none>
Message: Datastore 'target1' is not accessible. No connected and accessible host is attached to this datastore.
Source:
Type: vSphereVolume (a Persistent Disk resource in vSphere)
VolumePath: [target1] kubevols/testcl-xf5mb-dynamic-pvc-fc4a8797-c8b2-49df-a130-d7df626f4fef.vmdk
FSType: ext4
StoragePolicyName:
Events: <none>
[root@prov ~]# oc describe pv testpv
Name: testpv
Labels: <none>
Annotations: pv.kubernetes.io/bound-by-controller: yes
Finalizers: [kubernetes.io/pv-protection]
StorageClass: testsc
Status: Released
Claim: test1/testpvc
Reclaim Policy: Retain
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 100Mi
Node Affinity: <none>
Message:
Source:
Type: vSphereVolume (a Persistent Disk resource in vSphere)
VolumePath: [target1] kubevols/test1.vmdk
FSType: ext4
StoragePolicyName:
Events: <none>
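The dynamically provisioned PVs can be told apart from the static one by the pv.kubernetes.io/provisioned-by annotation, which the provisioner sets and which appears in the describe output above only for the provisioner-created volumes. A minimal sketch, again assuming jq is available:
oc get pv -o json \
  | jq -r '.items[]
      | [.metadata.name,
         (.metadata.annotations["pv.kubernetes.io/provisioned-by"] // "static"),
         .status.phase]
      | @tsv'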
- Next I checked whether dynamic provisioning still works; it did not.
oc create -f - <<'EOF'
apiVersion: "v1"
kind: "PersistentVolumeClaim"
metadata:
  name: "vol3"
  namespace: "test1"
spec:
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "100Mi"
  storageClassName: "thin"
EOF
[root@prov ~]# oc get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
vol3 Pending thin 3s
[root@prov ~]# oc get event
LAST SEEN TYPE REASON OBJECT MESSAGE
5s Warning ProvisioningFailed persistentvolumeclaim/vol3 Failed to provision volume with StorageClass "thin": ServerFaultCode: Datastore 'target0' is not accessible. No connected and accessible host with required privilege is attached to this datastore.
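Since the claim never leaves Pending, it can simply be deleted after the test. As an assumption on my part (I did not verify it in this post), the in-tree vSphere provider on a 4.6 UPI cluster reads its vCenter endpoint from the cloud-provider-config ConfigMap in openshift-config and its credentials from the vsphere-creds Secret in kube-system, so those are the places to look when checking which vCenter the provisioner is failing to reach:
oc delete pvc vol3 -n test1
# Assumed object names for the in-tree vSphere provider configuration:
oc get configmap cloud-provider-config -n openshift-config -o yaml
oc get secret vsphere-creds -n kube-system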
- Static provisioning, on the other hand, still worked.
・Create a VMDK file on the ESXi host
[root@esxi1:~] vmkfstools -c 10MB /vmfs/volumes/target0/kubevols/testvol1.vmdk
Create: 100% done.
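As a side note (an assumption, not something I tested here), vmkfstools also accepts a disk format option, so a thin-provisioned variant of the same disk could be created like this:
vmkfstools -c 10MB -d thin /vmfs/volumes/target0/kubevols/testvol1.vmdk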
・Create the PV and PVC
oc create -f - <<'EOF'
apiVersion: "v1"
kind: "PersistentVolume"
metadata:
  name: "testvol1pv"
spec:
  capacity:
    storage: 10Mi
  volumeMode: Filesystem
  accessModes:
    - "ReadWriteOnce"
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  vsphereVolume:
    volumePath: "[target0] kubevols/testvol1.vmdk"
    fsType: ext4
EOF
oc create -f - <<'EOF'
apiVersion: "v1"
kind: "PersistentVolumeClaim"
metadata:
  name: "testvol1pvc"
  namespace: "test1"
spec:
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "10Mi"
  storageClassName: ""
  volumeName: "testvol1pv"
EOF
・Confirm that the PVC has reached the Bound state
[root@prov ~]# oc get pvc testvol1pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
testvol1pvc Bound testvol1pv 10Mi RWO 17s
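The PV side can be checked the same way; it should show the claim it is bound to:
oc get pv testvol1pv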
・Create a test Pod and try to mount the volume; it appears the volume cannot be attached to the Pod
oc new-app mysql-ephemeral -n test1
oc set volumes deploymentconfig/mysql \
--add \
--name vol1 \
--type pvc \
--claim-name testvol1pvc \
--mount-path /vol1
[root@prov ~]# oc get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
mysql-1-deploy 0/1 Completed 0 18m 10.128.3.177 worker2 <none> <none>
mysql-1-hgkhn 1/1 Running 0 3m48s 10.128.3.185 worker2 <none> <none>
mysql-2-deploy 0/1 Error 0 14m 10.128.3.180 worker2 <none> <none>
[root@prov ~]# oc get event | grep mysql-2
22m Normal Scheduled pod/mysql-2-deploy Successfully assigned test1/mysql-2-deploy to worker2
22m Normal AddedInterface pod/mysql-2-deploy Add eth0 [10.128.3.180/23]
22m Normal Pulled pod/mysql-2-deploy Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8ee9e15664f9bc83965fef612596c786bf6e21bd0263c973241b51e7f7f67200" already present on machine
22m Normal Created pod/mysql-2-deploy Created container deployment
22m Normal Started pod/mysql-2-deploy Started container deployment
22m Normal Scheduled pod/mysql-2-zlqmp Successfully assigned test1/mysql-2-zlqmp to worker2
14m Warning FailedAttachVolume pod/mysql-2-zlqmp AttachVolume.Attach failed for volume "testvol1pv" : Unable to communicate with the remote host, since it is disconnected.
18m Warning FailedMount pod/mysql-2-zlqmp Unable to attach or mount volumes: unmounted volumes=[vol1], unattached volumes=[vol1 default-token-7lbld mysql-data]: timed out waiting for the condition
16m Warning FailedMount pod/mysql-2-zlqmp Unable to attach or mount volumes: unmounted volumes=[vol1], unattached volumes=[default-token-7lbld mysql-data vol1]: timed out waiting for the condition
13m Warning FailedMount pod/mysql-2-zlqmp Unable to attach or mount volumes: unmounted volumes=[vol1], unattached volumes=[mysql-data vol1 default-token-7lbld]: timed out waiting for the condition
11m Warning FailedMount pod/mysql-2-zlqmp Unable to attach or mount volumes: unmounted volumes=[vol1 default-token-7lbld mysql-data], unattached volumes=[vol1 default-token-7lbld mysql-data]: timed out waiting for the condition
22m Normal SuccessfulCreate replicationcontroller/mysql-2 Created pod: mysql-2-zlqmp
12m Normal SuccessfulDelete replicationcontroller/mysql-2 Deleted pod: mysql-2-zlqmp
22m Normal DeploymentCreated deploymentconfig/mysql Created new replication controller "mysql-2" for version 2
12m Normal ReplicationControllerScaled deploymentconfig/mysql Scaled replication controller "mysql-2" from 1 to 0
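To recover the DeploymentConfig after this test, the stuck volume can simply be removed again; a minimal cleanup sketch using the names created above:
oc set volumes deploymentconfig/mysql --remove --name vol1 -n test1
oc rollout status deploymentconfig/mysql -n test1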
That's all.