
Verifying Cluster Autoscaler on Container Engine for Kubernetes (OKE)

Posted at 2022-06-30, last updated at 2022-06-30

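Introduction

Cluster Autoscaler automatically adds worker nodes when Pods cannot be scheduled because of insufficient resources.
This article follows the official manual to verify how it behaves on Container Engine for Kubernetes (OKE).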

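Prerequisites

Deploying the metrics server

Deploy the metrics server to monitor cluster metrics, following the official manual.

Creating a dynamic group

Create a dynamic group by following "Step 1: Create a compartment-level dynamic group" in the manual.
This article uses a dynamic group that had already been created.

Setting the policy

Set a policy on the dynamic group you created so that its members are allowed to manage node pools.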

Deploying Cluster Autoscaler

Checking the cluster

OKE registers each worker node in a node pool, and registers node pools in a cluster.
The cluster used here has two node pools with one node each.

$ kubectl get node
NAME          STATUS   ROLES   AGE     VERSION
10.0.10.241   Ready    node    4h21m   v1.23.4
10.0.10.75    Ready    node    29d     v1.23.4

Creating the manifest file

Create the manifest file as follows. Cluster Autoscaler is deployed as a Deployment.
The manifest is taken from the manual as is, except for two changes, which are marked with comments.

cluster-autoscaler.yaml

---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "patch", "update"]
  - apiGroups: [""]
    resources:
      - "pods"
      - "services"
      - "replicationcontrollers"
      - "persistentvolumeclaims"
      - "persistentvolumes"
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create","list","watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - image: iad.ocir.io/oracle/oci-cluster-autoscaler:1.23.0-4 # ① Set according to the Kubernetes version of the OKE cluster
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=oci
            - --max-node-provision-time=25m
            - --nodes=1:2:ocid1.nodepool.oc1.uk-london-xxxxx # ② <min nodes>:<max nodes>:<OCID of the target node pool>
            - --scale-down-delay-after-add=10m
            - --scale-down-unneeded-time=10m
            - --unremovable-node-recheck-timeout=5m
            - --balance-similar-node-groups
            - --balancing-ignore-label=displayName
            - --balancing-ignore-label=hostname
            - --balancing-ignore-label=internal_addr
            - --balancing-ignore-label=oci.oraclecloud.com/fault-domain
          imagePullPolicy: "Always"
          env:
          - name: OKE_USE_INSTANCE_PRINCIPAL
            value: "true"

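① Change the container image path to match the version of the Kubernetes cluster you are running.
- Pulling from a region close to the cluster is faster, but the image is pulled only when a Pod starts, so the region matters little as long as the version is correct.
- The cluster used here is in the London region, but the image in Ashburn (US) is specified.
② Specify the minimum node count, the maximum node count, and the OCID of the node pool to scale.
- The manual has two --nodes lines, but only one node pool is scaled here, so one line was removed.
- Look up the OCID of the node pool in the console and copy it.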
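
The two edited values can be sanity-checked before applying the manifest. The following is a minimal sketch, not part of the manual: the helper names `parse_nodes_flag` and `minor_version` are hypothetical, and it assumes that "matching the cluster version" means the image tag and the cluster share the same major.minor version.

```python
import re
from typing import NamedTuple


class NodeGroupSpec(NamedTuple):
    min_size: int
    max_size: int
    nodepool_ocid: str


def parse_nodes_flag(value: str) -> NodeGroupSpec:
    """Split a '<min>:<max>:<node pool OCID>' value into its three parts."""
    # Split on the first two colons only, so the remainder stays intact as the OCID.
    parts = value.split(":", 2)
    if len(parts) != 3:
        raise ValueError(f"expected <min>:<max>:<ocid>, got {value!r}")
    min_size, max_size, ocid = int(parts[0]), int(parts[1]), parts[2]
    if min_size < 0 or max_size < min_size:
        raise ValueError(f"invalid size range {min_size}..{max_size}")
    if not ocid.startswith("ocid1.nodepool."):
        raise ValueError(f"not a node pool OCID: {ocid!r}")
    return NodeGroupSpec(min_size, max_size, ocid)


def minor_version(version: str) -> str:
    """Return 'major.minor' from strings such as 'v1.23.4' or '1.23.0-4'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return f"{match.group(1)}.{match.group(2)}"


# Values copied from the manifest and from `kubectl get node` in this article.
spec = parse_nodes_flag("1:2:ocid1.nodepool.oc1.uk-london-xxxxx")
image = "iad.ocir.io/oracle/oci-cluster-autoscaler:1.23.0-4"
image_tag = image.rsplit(":", 1)[1]

print(spec.min_size, spec.max_size)
print(minor_version(image_tag) == minor_version("v1.23.4"))
```

Splitting with a limit of two keeps the OCID in one piece even if it were to contain further colons.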

Deployment

Deploy Cluster Autoscaler by applying the manifest file you created.

$ kubectl apply -f cluster-autoscaler.yaml 
serviceaccount/cluster-autoscaler created
clusterrole.rbac.authorization.k8s.io/cluster-autoscaler created
role.rbac.authorization.k8s.io/cluster-autoscaler created
clusterrolebinding.rbac.authorization.k8s.io/cluster-autoscaler created
rolebinding.rbac.authorization.k8s.io/cluster-autoscaler created
deployment.apps/cluster-autoscaler created

Check the result.

$ kubectl -n kube-system get deployment
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
cluster-autoscaler    3/3     3            3           5m11s
coredns               2/2     2            2           29d
kube-dns-autoscaler   1/1     1            1           29d
metrics-server        1/1     1            1           4h20m

Check the log to confirm that no errors appear and that Cluster Autoscaler is running.

$ kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler
...
I0630 05:21:42.688168       1 filter_out_schedulable.go:65] Filtering out schedulables
I0630 05:21:42.688201       1 filter_out_schedulable.go:132] Filtered out 0 pods using hints
I0630 05:21:42.688211       1 filter_out_schedulable.go:170] 0 pods were kept as unschedulable based on caching
I0630 05:21:42.688216       1 filter_out_schedulable.go:171] 0 pods marked as unschedulable can be scheduled.
I0630 05:21:42.688241       1 filter_out_schedulable.go:82] No schedulable pods
I0630 05:21:42.688271       1 static_autoscaler.go:419] No unschedulable pods
I0630 05:21:42.688305       1 static_autoscaler.go:466] Calculating unneeded nodes
...
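
Scanning a long log for errors by eye is tedious. The lines above use the klog header format, in which the first character is the severity (I, W, E, or F), so a saved log can be summarized mechanically. The following is a rough sketch; `count_severities` and `has_errors` are hypothetical helper names, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# klog header: <severity><mmdd> <hh:mm:ss.uuuuuu> <thread id> <file>:<line>] <message>
KLOG_LINE = re.compile(
    r"^([IWEF])\d{4} \d{2}:\d{2}:\d{2}\.\d{6}\s+\d+ [^:\s]+:\d+\] (.*)$"
)


def count_severities(lines):
    """Count klog lines per severity letter; lines in other formats are skipped."""
    counts = Counter()
    for line in lines:
        match = KLOG_LINE.match(line.rstrip("\n"))
        if match:
            counts[match.group(1)] += 1
    return counts


def has_errors(counts):
    """True if any error (E) or fatal (F) line was seen."""
    return counts["E"] + counts["F"] > 0


sample = [
    "I0630 05:21:42.688241       1 filter_out_schedulable.go:82] No schedulable pods",
    "I0630 05:21:42.688271       1 static_autoscaler.go:419] No unschedulable pods",
    "I0630 05:21:42.688305       1 static_autoscaler.go:466] Calculating unneeded nodes",
]
counts = count_severities(sample)
print(dict(counts), has_errors(counts))
```

To use it on real output, redirect the logs command to a file without -f and pass the opened file to `count_severities`.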

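Verifying the behavior

Deploying a test Deployment

Deploy the following Deployment to verify the behavior.
Resource requests must be specified, because Cluster Autoscaler judges whether a Pod fits on a node from its requests.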

dep.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-dep
  name: nginx-dep
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-dep
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx-dep
    spec:
      containers:
      - image: nginx
        name: nginx
        resources: 
          requests:
            memory: "500Mi"

$ kubectl apply -f dep.yaml 
deployment.apps/nginx-dep created
$ kubectl get pod
NAME                        READY   STATUS    RESTARTS   AGE
nginx-dep-5bd48994f-k9p6g   1/1     Running   0          70s
nginx-dep-5bd48994f-lmvh7   1/1     Running   0          70s
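
Because scheduling is decided from requests rather than from actual usage, the number of Pods a node can hold can be estimated with simple arithmetic. The sketch below illustrates this with the 500Mi request from dep.yaml. The allocatable memory and the amount already requested are made-up values, not measurements from this cluster, and `parse_memory` handles only a subset of the Kubernetes quantity suffixes.

```python
# Binary suffixes only; other Kubernetes quantity forms are not handled here.
UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '500Mi' or '13Gi' to bytes."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain bytes


def pods_that_fit(allocatable: str, already_requested: str, per_pod: str) -> int:
    """Number of additional Pods whose memory requests fit on one node."""
    free = parse_memory(allocatable) - parse_memory(already_requested)
    return max(free, 0) // parse_memory(per_pod)


# Hypothetical node: 13Gi allocatable, 1Gi already requested by other Pods.
per_node = pods_that_fit("13Gi", "1Gi", "500Mi")
replicas, nodes = 100, 2
pending = max(replicas - nodes * per_node, 0)

print(per_node)  # 24 Pods per node under these assumptions
print(pending)   # 52 Pods left unschedulable on two such nodes
```

Pods left over in this estimate are the ones that stay Pending and prompt Cluster Autoscaler to add a node. Real nodes also have CPU requests and a per-node Pod limit, which this estimate ignores.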

Scale

Scale the Deployment to 100 replicas.

$ kubectl scale deployment nginx-dep --replicas=100
deployment.apps/nginx-dep scaled

$ kubectl get pod | wc -l
101

The count is 101 because the output includes the header line in addition to the 100 Pods.

Verification

After scaling, the Cluster Autoscaler log shows that it detects the unschedulable Pods and scales the node pool.

$ kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler
...
I0630 05:29:27.142292       1 klogx.go:86] Pod default/nginx-dep-5bd48994f-g6prq is unschedulable
I0630 05:29:27.142298       1 klogx.go:86] Pod default/nginx-dep-5bd48994f-w94ck is unschedulable
...
I0630 05:29:27.144485       1 scale_up.go:468] Best option to resize: ocid1.nodepool.oc1.uk-london-xxxxxxxxxxx
I0630 05:29:27.144499       1 balancing_processor.go:111] Requested scale-up (2) exceeds node group set capacity, capping to 1
I0630 05:29:27.144506       1 scale_up.go:595] Final scale-up plan: [{ocid1.nodepool.oc1.uk-london-xxxxxxxxxxx 1->2 (max: 2)}]
I0630 05:29:27.144516       1 scale_up.go:691] Scale-up: setting group ocid1.nodepool.oc1.uk-london-xxxxxxxxxxx size to 2
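
The "Final scale-up plan" line carries the decision in a compact form: node group, current size, target size, and maximum. As a sketch, it can be extracted with a regular expression. The pattern below is derived only from the line shown above, so it is an assumption about this Cluster Autoscaler version, and `parse_scale_up_plan` is a hypothetical helper.

```python
import re
from typing import List, NamedTuple


class ScaleUpStep(NamedTuple):
    node_group: str
    current: int
    target: int
    max_size: int


# Matches entries such as "{<group id> 1->2 (max: 2)}".
PLAN_ENTRY = re.compile(r"\{(\S+) (\d+)->(\d+) \(max: (\d+)\)\}")


def parse_scale_up_plan(line: str) -> List[ScaleUpStep]:
    """Return the steps in a 'Final scale-up plan' line, or [] for other lines."""
    if "Final scale-up plan:" not in line:
        return []
    return [
        ScaleUpStep(group, int(current), int(target), int(max_size))
        for group, current, target, max_size in PLAN_ENTRY.findall(line)
    ]


line = (
    "I0630 05:29:27.144506       1 scale_up.go:595] Final scale-up plan: "
    "[{ocid1.nodepool.oc1.uk-london-xxxxxxxxxxx 1->2 (max: 2)}]"
)
for step in parse_scale_up_plan(line):
    print(step.node_group, step.current, step.target, step.max_size)
```

Here the target equals the maximum of 2 set with --nodes, which matches the "capping to 1" message: the autoscaler wanted two more nodes but could add only one.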

The scale-up can also be confirmed in the console.

The new node has joined the cluster as well.

$ kubectl get node
NAME          STATUS   ROLES   AGE     VERSION
10.0.10.241   Ready    node    4h56m   v1.23.4
10.0.10.254   Ready    node    56s     v1.23.4
10.0.10.75    Ready    node    29d     v1.23.4

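Filtering the Pod list by the new node's address with grep shows that the Pods that could not be scheduled because of insufficient resources are now running on the added node.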

$ kubectl get pod -o wide |grep 10.0.10.254
nginx-dep-5bd48994f-4pj25   1/1     Running            0          4m49s   10.244.2.27    10.0.10.254   <none>           <none>
nginx-dep-5bd48994f-5cvdx   1/1     Running            0          4m49s   10.244.2.31    10.0.10.254   <none>           <none>
nginx-dep-5bd48994f-5nkjm   1/1     Running            0          4m49s   10.244.2.11    10.0.10.254   <none>           <none>
nginx-dep-5bd48994f-6bc7s   1/1     Running            0          4m49s   10.244.2.29    10.0.10.254   <none>           <none>
nginx-dep-5bd48994f-cpw9q   1/1     Running            0          4m49s   10.244.2.13    10.0.10.254   <none>           <none>
nginx-dep-5bd48994f-dnjlx   1/1     Running            0          4m49s   10.244.2.17    10.0.10.254   <none>           <none>
nginx-dep-5bd48994f-dr5cz   1/1     Running            0          4m50s   10.244.2.7     10.0.10.254   <none>           <none>
nginx-dep-5bd48994f-fbw96   1/1     Running            0          4m50s   10.244.2.4     10.0.10.254   <none>           <none>
nginx-dep-5bd48994f-g6prq   1/1     Running            0          4m49s   10.244.2.20    10.0.10.254   <none>           <none>
...

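Returning the Deployment to its original number of replicas also removes the added worker node.
With the settings used here (--scale-down-unneeded-time=10m and --scale-down-delay-after-add=10m), the node is removed after about 10 minutes.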
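
The timing can be reasoned about with a simplified model: a node becomes a removal candidate once the delay after the last scale-up has passed and the node has been unneeded for the configured time. The sketch below encodes only that rule. It is an approximation, not Cluster Autoscaler's actual algorithm, which also considers utilization thresholds, Pod disruption budgets, and other conditions. The function names and the timestamps are hypothetical.

```python
import re
from datetime import datetime, timedelta

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> timedelta:
    """Parse a simple duration such as '10m' or '1h30m' (hours, minutes, seconds only)."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject input with anything left over, e.g. unsupported units.
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=sum(int(n) * _UNIT_SECONDS[u] for n, u in parts))


def earliest_scale_down(
    node_added: datetime,
    unneeded_since: datetime,
    delay_after_add: str = "10m",  # --scale-down-delay-after-add
    unneeded_time: str = "10m",    # --scale-down-unneeded-time
) -> datetime:
    """Earliest removal time under the simplified two-condition model."""
    return max(
        node_added + parse_duration(delay_after_add),
        unneeded_since + parse_duration(unneeded_time),
    )


# Hypothetical timeline: node added at 05:30, replicas reduced at 05:45.
added = datetime(2022, 6, 30, 5, 30)
idle = datetime(2022, 6, 30, 5, 45)
print(earliest_scale_down(added, idle))
```

In this timeline the delay after the scale-up has already expired when the replicas are reduced, so the unneeded time alone determines the result.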
