A Summary of Autoscaling on EKS

Last updated at 2020-05-26. Posted at 2020-05-21.

Setting up autoscaling for EKS on AWS turned out to require quite a few components, so I have summarized them here.

managed nodegroup

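Takes care of the tedious EC2 management for you.
Under the hood it is an Auto Scaling group plus a launch template.
The managed nodegroup feature itself does not support launching spot instances.
- You can get spot instances by tweaking the automatically created Auto Scaling group.
- For some reason, nodegroups created with eksctl do support spot.
You cannot specify a namespace (even though Fargate can?).
desired_size does not change through the managed nodegroup alone; something else, such as cluster autoscaler, has to adjust it.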

Terraform configuration example

resource "aws_eks_node_group" "default" {
  cluster_name    = aws_eks_cluster.default.name
  node_group_name = "default"
  node_role_arn   = module.aws-iam-role-eks-worker.aws_iam_role.this.arn
  instance_types = [
    "t2.small",
  ]
  subnet_ids      = [
    module.vpc.aws_subnet.private[0].id,
    module.vpc.aws_subnet.private[1].id,
  ]

  scaling_config {
    desired_size = 1 # changed dynamically while autoscaling is active (why is it required...)
    min_size     = 1
    max_size     = 10
  }

  release_version = "1.16.8-20200507"
  version = "1.16"
  lifecycle {
    ignore_changes = [
      scaling_config[0].desired_size
    ]
  }
}

fargate profile

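Lets you use Fargate as the data plane.
Easy to use, with almost no complicated configuration.
Spot is not supported at present.
- It is on the roadmap, though, so it should be supported eventually.
- Perhaps ECS capacity providers will be supported?
Scales automatically when a pending pod is detected.
Has functionality similar to cluster autoscaler by default.
You can create Fargate profiles that launch pods per namespace.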

Terraform configuration example

resource "aws_eks_fargate_profile" "example" {
  cluster_name           = aws_eks_cluster.example.name
  fargate_profile_name   = "example"
  pod_execution_role_arn = aws_iam_role.example.arn
  subnet_ids             = aws_subnet.example[*].id

  selector {
    namespace = "example"
  }
}

When managing EKS with Terraform

EKS writes a tag like the following onto the subnets it uses, which shows up as a diff in Terraform:
- kubernetes.io/cluster/CLUSTER_NAME: CLUSTER_NAME
You need to use ignore_changes so that the tag changes are ignored.
Managed nodegroups do not support spot, but a setup like the following gets you spot instances:
- https://qiita.com/toyama0919/items/38dea9446bb3cf062539

cluster autoscaler

Autoscales nodes.
On AWS, it scales (adds) EC2 instances.
Detects pending pods and scales automatically.
Scales down as well when the number of pods decreases.
Similar in function to ECS capacity providers.
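
To make the "pending pods trigger a scale-up" idea concrete, here is a toy sketch. It is not the real cluster autoscaler algorithm (which simulates the scheduler with all of its predicates); it only packs the CPU requests of pending pods onto identical new nodes and clamps the result to the nodegroup's min_size and max_size. The function names and the CPU-only model are my own assumptions for illustration.

```python
def extra_nodes_needed(pending_cpu_requests, node_allocatable_cpu):
    """Count new nodes needed to fit pending pods (first-fit decreasing, CPU only).

    All CPU values are integers in millicores. This is a simplification:
    memory, taints, affinity and so on are ignored.
    """
    free = []  # remaining CPU on each hypothetical new node
    for request in sorted(pending_cpu_requests, reverse=True):
        if request > node_allocatable_cpu:
            continue  # can never fit on this instance type, so it stays pending
        for i, remaining in enumerate(free):
            if request <= remaining:
                free[i] -= request
                break
        else:
            # No existing new node has room, so open another one.
            free.append(node_allocatable_cpu - request)
    return len(free)


def target_node_count(current_nodes, pending_cpu_requests,
                      node_allocatable_cpu, min_size, max_size):
    """Desired node count, clamped to the nodegroup's scaling_config bounds."""
    wanted = current_nodes + extra_nodes_needed(pending_cpu_requests,
                                                node_allocatable_cpu)
    return max(min_size, min(max_size, wanted))
```

For example, with one node running and three pending pods that each request 500m on nodes with 1000m allocatable, target_node_count(1, [500, 500, 500], 1000, 1, 10) asks for three nodes in total.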

The EKS worker nodes need permissions like the following

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeTags",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup",
                "ec2:DescribeLaunchTemplateVersions"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}

Apply it to the cluster with a command like the following

kubectl apply -f <(curl https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml | sed 's/<YOUR CLUSTER NAME>/my-cluster/g')
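
The sed in the command above only swaps a placeholder for the cluster name. A minimal Python sketch of the same substitution follows. The EXCERPT string is my recollection of the auto-discovery flag in the example manifest, so treat it as an assumption and check it against the actual file; render_manifest is a hypothetical helper name.

```python
PLACEHOLDER = "<YOUR CLUSTER NAME>"

# Assumed excerpt of the example manifest (verify against the real file).
EXCERPT = (
    "--node-group-auto-discovery=asg:tag="
    "k8s.io/cluster-autoscaler/enabled,"
    "k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>"
)


def render_manifest(manifest_text, cluster_name):
    """Replace every placeholder occurrence, like sed 's/.../.../g' does."""
    if PLACEHOLDER not in manifest_text:
        # Fail loudly rather than apply a manifest that was not customized.
        raise ValueError("placeholder not found in manifest")
    return manifest_text.replace(PLACEHOLDER, cluster_name)


if __name__ == "__main__":
    print(render_manifest(EXCERPT, "my-cluster"))
```

If the flag really looks like this, the Auto Scaling group has to carry matching tags for the autoscaler to discover it.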

Horizontal Pod Autoscaler

Autoscales pods (deployments).
- It does not scale nodes.
The deployment's replica count is overwritten by the HPA.
Use kind: HorizontalPodAutoscaler.
Specify minReplicas and maxReplicas; it scales between them.

$ kubectl get hpa --all-namespaces
NAMESPACE   NAME                  REFERENCE                     TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
default     sqs-consumer-scaler   Deployment/nginx-deployment   0/10 (avg)   1         100       1          14h

metrics server

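Required for retrieving resource usage of the Kubernetes cluster.
Collects CPU usage, memory usage, and so on.
Installed into kube-system, the system namespace.
External metrics (such as AWS CloudWatch metrics) can also be brought in by using an adapter.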

Deploy metrics server to the cluster

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml

k8s-cloudwatch-adapter

Makes CloudWatch metrics available to Kubernetes through the external metrics API, so that the HPA can scale on them.
https://aws.amazon.com/jp/blogs/compute/scaling-kubernetes-deployments-with-amazon-cloudwatch-metrics/

Deploy the adapter to the cluster

$ kubectl apply -f https://raw.githubusercontent.com/awslabs/k8s-cloudwatch-adapter/master/deploy/adapter.yaml

You can define metrics like the following (an example that exposes an SQS CloudWatch metric to the cluster)

Set kind: ExternalMetric.

apiVersion: metrics.aws/v1alpha1
kind: ExternalMetric
metadata:
  name: hello-queue-length
spec:
  name: hello-queue-length
  resource:
    resource: "deployment"
  queries:
    - id: sqs_helloworld
      metricStat:
        metric:
          namespace: "AWS/SQS"
          metricName: "ApproximateNumberOfMessagesVisible"
          dimensions:
            - name: QueueName
              value: "helloworld"
        period: 60
        stat: Maximum
        unit: Count
      returnData: true
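
The queries section looks very much like a CloudWatch GetMetricData query written in camelCase. As a sketch of that correspondence (my assumption about how the adapter maps it, not something taken from its source), the hypothetical helper below converts one query entry into the dict shape that GetMetricData expects.

```python
def to_metric_data_query(query):
    """Convert one ExternalMetric query (parsed YAML) to a GetMetricData query."""
    stat = query["metricStat"]
    metric = stat["metric"]
    return {
        "Id": query["id"],
        "MetricStat": {
            "Metric": {
                "Namespace": metric["namespace"],
                "MetricName": metric["metricName"],
                "Dimensions": [
                    {"Name": d["name"], "Value": d["value"]}
                    for d in metric.get("dimensions", [])
                ],
            },
            "Period": stat["period"],
            "Stat": stat["stat"],
            "Unit": stat["unit"],
        },
        "ReturnData": query.get("returnData", True),
    }


# The same values as the hello-queue-length example.
HELLO_QUERY = {
    "id": "sqs_helloworld",
    "metricStat": {
        "metric": {
            "namespace": "AWS/SQS",
            "metricName": "ApproximateNumberOfMessagesVisible",
            "dimensions": [{"name": "QueueName", "value": "helloworld"}],
        },
        "period": 60,
        "stat": "Maximum",
        "unit": "Count",
    },
    "returnData": True,
}
```

The result could be passed as one element of MetricDataQueries to boto3's get_metric_data, together with StartTime and EndTime, to check by hand what value CloudWatch returns for the queue.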

To autoscale on this metric, create a HorizontalPodAutoscaler like the following.

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: sqs-consumer-scaler
spec:
  # available from v1.18 (requires apiVersion autoscaling/v2beta2)
  # behavior:
  #   scaleUp:
  #     stabilizationWindowSeconds: 0
  #     policies:
  #     - type: Percent
  #       value: 100
  #       periodSeconds: 15
  #     - type: Pods
  #       value: 4
  #       periodSeconds: 15
  #     selectPolicy: Max
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 1
  maxReplicas: 100
  metrics:
  - type: External
    external:
      metricName: hello-queue-length
      targetAverageValue: 10

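Key points

Set scaleTargetRef.name to the name of the deployment.
Set the metrics type to External (a reference to an external metric).
targetAverageValue is the target value: the metric divided by the number of pods should settle around it.
- When the per-pod average is below 10, it scales down (removes pods).
- When the per-pod average is above 10, it scales up (adds pods).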
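
The arithmetic behind targetAverageValue can be sketched in a few lines. This is a simplified model of what the HPA does for an External metric: it aims for ceil(metric / targetAverageValue) pods, skips the change when the current state is within the default 10% tolerance, and clamps to minReplicas and maxReplicas. The function name is hypothetical, and stabilization windows and rate limits are left out.

```python
import math


def desired_replicas(metric_total, target_average_value, current_replicas,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate replica count for an External metric with targetAverageValue."""
    desired = math.ceil(metric_total / target_average_value)
    if current_replicas > 0:
        # Ratio of the current per-pod average to the target.
        usage_ratio = metric_total / (target_average_value * current_replicas)
        if abs(1.0 - usage_ratio) <= tolerance:
            desired = current_replicas  # close enough, do not scale
    return max(min_replicas, min(max_replicas, desired))
```

With the manifest above (target 10, min 1, max 100), 45 visible messages and 1 pod give desired_replicas(45, 10, 1, 1, 100) == 5, while an empty queue falls back to minReplicas.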

behavior?

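Starting with Kubernetes v1.18, HorizontalPodAutoscaler gains a setting called behavior.
- The latest EKS version is still 1.16, so I hope it arrives soon.
It lets you tune how fast scaling happens, which looks handy (currently the only option is to scale slowly).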
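
As a rough illustration of how the scaleUp policies in the commented-out behavior block combine, the sketch below computes the upper bound on replicas for one period. It assumes no other scaling happened inside the period, so the replica count at the start of the period equals the current count; periodSeconds is ignored for the same reason. The function name is hypothetical.

```python
import math


def scale_up_limit(current_replicas, policies, select_policy="Max"):
    """Highest replica count the scaleUp policies allow within one period."""
    if select_policy == "Disabled":
        return current_replicas  # scaling up is turned off entirely
    limits = []
    for policy in policies:
        if policy["type"] == "Pods":
            # Add at most `value` pods per period.
            limits.append(current_replicas + policy["value"])
        elif policy["type"] == "Percent":
            # Grow by at most `value` percent per period, rounded up.
            limits.append(math.ceil(current_replicas * (1 + policy["value"] / 100)))
        else:
            raise ValueError(f"unknown policy type: {policy['type']}")
    if select_policy == "Max":
        return max(limits)
    if select_policy == "Min":
        return min(limits)
    raise ValueError(f"unknown selectPolicy: {select_policy}")


# The two policies from the commented-out example.
EXAMPLE_POLICIES = [
    {"type": "Percent", "value": 100, "periodSeconds": 15},
    {"type": "Pods", "value": 4, "periodSeconds": 15},
]
```

With selectPolicy Max, a single pod may grow to 5 within a period (the Pods policy wins), while 10 pods may grow to 20 (the Percent policy wins).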

(Bonus) Working with multiple clusters

Switching the config

export KUBECONFIG=$HOME/.kube/config-hogecluster

Pointing kubectl at an EKS cluster

aws eks update-kubeconfig --name hoge-cluster

This command adds or updates the EKS cluster entry in ~/.kube/config (or in the file named by KUBECONFIG, if it is set).
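
When scripting against several clusters, the config switch can be done per command instead of per shell. The sketch below assumes one kubeconfig file per cluster named ~/.kube/config-<cluster>, following the export example above; both helper names are hypothetical.

```python
import os
from pathlib import Path


def kubeconfig_for(cluster, home=None):
    """Path of the per-cluster kubeconfig, e.g. ~/.kube/config-hogecluster."""
    base = Path(home) if home else Path.home()
    return str(base / ".kube" / f"config-{cluster}")


def env_for(cluster, home=None):
    """Copy of the current environment with KUBECONFIG set for one cluster."""
    env = dict(os.environ)
    env["KUBECONFIG"] = kubeconfig_for(cluster, home)
    return env
```

Passing env_for("hogecluster") as the env argument of subprocess.run(["kubectl", "get", "nodes"], ...) runs that one command against the chosen cluster without touching the shell's own KUBECONFIG.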
