Running the Cluster Autoscaler on AKS

Posted at 2019-12-09

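One of the defining features of Kubernetes is its scalability. Scalability here means horizontal autoscaling: Nodes are added automatically when capacity runs short and removed automatically when the load drops. This capability is developed as a Kubernetes add-on called the "Cluster Autoscaler".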

When does the Autoscaler act?

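As we have seen so far, you specify the amount of resources a Pod needs when you create it. Kubernetes then checks whether the Nodes that are currently running have enough unreserved capacity to satisfy that request. In other words, "the criterion for adding a Node is simply whether there is a Pending Pod".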

The Cluster Autoscaler periodically checks whether any Pods are in the Pending state. The default check interval is 10 seconds.
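To make the rule above concrete, here is a minimal Python sketch of the scale-up decision. It is a simplification for illustration, not the actual Cluster Autoscaler code: the real component also simulates whether a new Node would actually let the Pending Pod be scheduled. The `Pod` class, the `should_add_node` function, the Pod names, and the `max_nodes` limit are all hypothetical names introduced here.

```python
from dataclasses import dataclass

# Default check interval described in the text (seconds).
SCAN_INTERVAL_SECONDS = 10


@dataclass
class Pod:
    """Hypothetical, minimal stand-in for a Pod: just a name and a phase."""
    name: str
    phase: str  # for example "Pending" or "Running"


def should_add_node(pods, node_count, max_nodes):
    """Return True if at least one Pod is Pending and the cluster may still grow."""
    has_pending = any(pod.phase == "Pending" for pod in pods)
    return has_pending and node_count < max_nodes


# Hypothetical example data: one Running Pod and one Pending Pod.
pods = [Pod("nginx-a", "Running"), Pod("nginx-b", "Pending")]

print(should_add_node(pods, node_count=1, max_nodes=5))  # True: a Pod is Pending
print(should_add_node(pods, node_count=5, max_nodes=5))  # False: already at the limit
```

In this sketch the check would be repeated every `SCAN_INTERVAL_SECONDS`; the loop itself is omitted to keep the example short.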

Creating a Pending state

Following the Microsoft documentation, we build a cluster with a single Node.

Quickstart: Deploy an Azure Kubernetes Service cluster using the Azure CLI
https://docs.microsoft.com/ja-jp/azure/aks/kubernetes-walkthrough

Specifically, run the following command to create the cluster.

az aks create --resource-group myResourceGroup --name myAKSCluster --node-count 1 --enable-addons monitoring --generate-ssh-keys

Next, run kubectl describe nodes to check the details of the Node.

$ kubectl describe nodes
~omitted~
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                       Requests     Limits
  --------                       --------     ------
  cpu                            615m (32%)   400m (21%)
  memory                         739Mi (16%)  1940Mi (42%)
  ephemeral-storage              0 (0%)       0 (0%)
  attachable-volumes-azure-disk  0            0

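Next, we deliberately create a situation where the Node does not have enough capacity, that is, a Pending Pod.
Apply a manifest like the following, which creates two Pods that each request 1000m of CPU.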

nginx.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            cpu: 1000m

$ kubectl apply -f nginx.yaml
deployment.apps/nginx created

Use the kubectl get pods command to check the status of the Pods.

$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
nginx-5d7cb88cc-74vpz   0/1     Pending   0          42s
nginx-5d7cb88cc-l6dgq   1/1     Running   0          42s

One of the two Pods is in the Pending state. Let's run describe on it to see the details.

$ kubectl describe po nginx-5d7cb88cc-74vpz
Name:           nginx-5d7cb88cc-74vpz
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=nginx
                pod-template-hash=5d7cb88cc
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/nginx-5d7cb88cc
Containers:
  nginx:
    Image:      nginx
    Port:       <none>
    Host Port:  <none>
    Requests:
      cpu:        1
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bjfhs (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-bjfhs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bjfhs
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  54s (x3 over 119s)  default-scheduler  0/1 nodes are available: 1 Insufficient cpu.

From the output above, we can see that scheduling failed because the Node does not have enough CPU.
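The arithmetic behind "Insufficient cpu" can be sketched in a few lines of Python. The 615m already requested and the two 1000m requests come from the output above. The allocatable value of 1900m is an assumption, inferred from kubectl reporting 615m as 32%; the real value appears in the Allocatable section of kubectl describe nodes. The function names and Pod labels are hypothetical, and the real scheduler also considers memory, taints, and other constraints.

```python
def parse_cpu(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "1000m" or "1" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def place_pods(requests, allocatable_m, used_m):
    """Place Pods one by one while their CPU request still fits on the Node.

    Returns (placed names, pending names, total millicores requested afterwards).
    """
    placed, pending = [], []
    for name, cpu in requests:
        need = parse_cpu(cpu)
        if used_m + need <= allocatable_m:
            used_m += need
            placed.append(name)
        else:
            pending.append(name)
    return placed, pending, used_m


ALLOCATABLE_CPU_M = 1900                  # assumption: inferred from 615m = 32%
ALREADY_REQUESTED_M = parse_cpu("615m")   # from kubectl describe nodes

# Two replicas of the nginx Deployment, each requesting 1000m (labels are hypothetical).
nginx_requests = [("nginx-replica-1", "1000m"), ("nginx-replica-2", "1000m")]

placed, pending, used = place_pods(nginx_requests, ALLOCATABLE_CPU_M, ALREADY_REQUESTED_M)
print("placed:", placed)    # the first replica fits: 615m + 1000m = 1615m
print("pending:", pending)  # the second would need 2615m, more than 1900m
```

Under these assumptions the first replica fits and the second does not, which matches the one Running and one Pending Pod seen earlier.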

Trying out the Autoscaler

You can run the Autoscaler by applying a manifest like the following.

cluster-autoscaler.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
- apiGroups: [""]
  resources: ["events","endpoints"]
  verbs: ["create", "patch"]
- apiGroups: [""]
  resources: ["pods/eviction"]
  verbs: ["create"]
- apiGroups: [""]
  resources: ["pods/status"]
  verbs: ["update"]
- apiGroups: [""]
  resources: ["endpoints"]
  resourceNames: ["cluster-autoscaler"]
  verbs: ["get","update"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["watch","list","get","update"]
- apiGroups: [""]
  resources: ["pods","services","replicationcontrollers","persistentvolumeclaims","persistentvolumes"]
  verbs: ["watch","list","get"]
- apiGroups: ["extensions"]
  resources: ["replicasets","daemonsets"]
  verbs: ["watch","list","get"]
- apiGroups: ["policy"]
  resources: ["poddisruptionbudgets"]
  verbs: ["watch","list"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]
  verbs: ["watch","list","get"]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: ["get", "list", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create"]
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["cluster-autoscaler-status"]
  verbs: ["delete","get","update"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
      - image: gcr.io/google-containers/cluster-autoscaler:v1.3.5
        imagePullPolicy: Always
        name: cluster-autoscaler
        resources:
          limits:
            cpu: 100m
            memory: 300Mi
          requests:
            cpu: 100m
            memory: 300Mi
        command:
        - ./cluster-autoscaler
        - --v=3
        - --logtostderr=true
        - --cloud-provider=azure
        - --skip-nodes-with-local-storage=false
        - --nodes=1:5:default
        env:
        - name: ARM_SUBSCRIPTION_ID
          valueFrom:
            secretKeyRef:
              key: SubscriptionID
              name: cluster-autoscaler-azure
        - name: ARM_RESOURCE_GROUP
          valueFrom:
            secretKeyRef:
              key: ResourceGroup
              name: cluster-autoscaler-azure
        - name: ARM_TENANT_ID
          valueFrom:
            secretKeyRef:
              key: TenantID
              name: cluster-autoscaler-azure
        - name: ARM_CLIENT_ID
          valueFrom:
            secretKeyRef:
              key: ClientID
              name: cluster-autoscaler-azure
        - name: ARM_CLIENT_SECRET
          valueFrom:
            secretKeyRef:
              key: ClientSecret
              name: cluster-autoscaler-azure
        - name: ARM_VM_TYPE
          valueFrom:
            secretKeyRef:
              key: VMType
              name: cluster-autoscaler-azure
        - name: AZURE_CLUSTER_NAME
          valueFrom:
            secretKeyRef:
              key: ClusterName
              name: cluster-autoscaler-azure
        - name: AZURE_NODE_RESOURCE_GROUP
          valueFrom:
            secretKeyRef:
              key: NodeResourceGroup
              name: cluster-autoscaler-azure
      restartPolicy: Always

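Note: besides the Autoscaler Deployment itself, this manifest also defines the other resources it depends on (the ServiceAccount and the RBAC roles and bindings). Let's apply the manifest we created.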
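One thing to check before applying: the Deployment reads all of its Azure settings from a Secret named cluster-autoscaler-azure in the kube-system namespace, and the manifest does not create that Secret, so it has to exist for the Autoscaler Pod to start. The following Python sketch builds such a Secret as a JSON manifest. The key names are taken from the secretKeyRef entries in the manifest; every value is a placeholder to replace with your own, and the `build_secret` helper is a hypothetical name used only for this example.

```python
import base64
import json

# Keys referenced through secretKeyRef in cluster-autoscaler.yaml.
SECRET_KEYS = [
    "ClientID",
    "ClientSecret",
    "ResourceGroup",
    "SubscriptionID",
    "TenantID",
    "VMType",
    "ClusterName",
    "NodeResourceGroup",
]


def build_secret(values):
    """Build a Secret manifest (as a dict) with base64-encoded values."""
    missing = [key for key in SECRET_KEYS if key not in values]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    data = {
        key: base64.b64encode(values[key].encode("utf-8")).decode("ascii")
        for key in SECRET_KEYS
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "cluster-autoscaler-azure", "namespace": "kube-system"},
        "type": "Opaque",
        "data": data,
    }


# Placeholder values only; replace each one with the real value for your cluster.
placeholder_values = {key: f"<your-{key}>" for key in SECRET_KEYS}

print(json.dumps(build_secret(placeholder_values), indent=2))
```

The printed JSON can be saved to a file and applied with kubectl apply -f, since kubectl accepts JSON manifests as well as YAML. The Azure README listed in the references describes what each value should contain.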

$ kubectl apply -f cluster-autoscaler.yaml
serviceaccount/cluster-autoscaler created
clusterrole.rbac.authorization.k8s.io/cluster-autoscaler created
role.rbac.authorization.k8s.io/cluster-autoscaler created
clusterrolebinding.rbac.authorization.k8s.io/cluster-autoscaler created
rolebinding.rbac.authorization.k8s.io/cluster-autoscaler created
deployment.apps/cluster-autoscaler created
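How far the Autoscaler may scale is controlled by the --nodes=1:5:default argument in the Deployment, which has the form min:max:node group name. As a rough illustration, the Python sketch below parses that value and clamps a desired Node count to the allowed range. The `NodeGroupBounds`, `parse_nodes_flag`, and `clamp_node_count` names are hypothetical and are not part of the Cluster Autoscaler itself.

```python
from typing import NamedTuple


class NodeGroupBounds(NamedTuple):
    """Scaling limits for one node group (hypothetical helper type)."""
    min_nodes: int
    max_nodes: int
    name: str


def parse_nodes_flag(flag: str) -> NodeGroupBounds:
    """Parse a value such as "--nodes=1:5:default" into its three parts."""
    value = flag.split("=", 1)[1] if "=" in flag else flag
    min_str, max_str, name = value.split(":", 2)
    bounds = NodeGroupBounds(int(min_str), int(max_str), name)
    if bounds.min_nodes > bounds.max_nodes:
        raise ValueError("min must not exceed max")
    return bounds


def clamp_node_count(desired: int, bounds: NodeGroupBounds) -> int:
    """Keep a desired Node count within the configured minimum and maximum."""
    return max(bounds.min_nodes, min(desired, bounds.max_nodes))


bounds = parse_nodes_flag("--nodes=1:5:default")
print(bounds.min_nodes, bounds.max_nodes, bounds.name)  # 1 5 default
print(clamp_node_count(7, bounds))  # 5: never more than the maximum
print(clamp_node_count(0, bounds))  # 1: never fewer than the minimum
```

With the values used in this article, the cluster can grow from its single Node up to at most five.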

After waiting a while, check the status of the Pods again.

$ kubectl get po
NAME                    READY   STATUS    RESTARTS   AGE
nginx-5d7cb88cc-74vpz   1/1     Running   0          44m54s
nginx-5d7cb88cc-l6dgq   1/1     Running   0          44m54s

The Pod that was Pending earlier is now Running. This shows that the Autoscaler can be run from a manifest. In the next article, we will dig into which part of the manifest plays which role.

References

Cluster Autoscaler on Azure
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/azure/README.md
