Machine Learning on EKS #10: Setting Up Container Insights

Introduction

In this series, I walk through running machine learning workloads on Amazon EKS.

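Series index

Machine Learning on EKS #1: Preparation
Machine Learning on EKS #2: Creating the Cluster
Machine Learning on EKS #3: Creating Managed Worker Nodes
Machine Learning on EKS #4: Creating GPU Managed Worker Nodes
Machine Learning on EKS #5: Configuring Cluster Autoscaler
Machine Learning on EKS #6: Configuring HPA
Machine Learning on EKS #7: Configuring EFS
Machine Learning on EKS #8: Building a CD Environment with Argo CD
Machine Learning on EKS #9: Installing SageMaker Operator
Machine Learning on EKS #10: Setting Up Container Insights (this article)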

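Purpose of this article

AWS provides Container Insights as its platform for collecting and monitoring metrics from container infrastructure.
In this article, I set it up on the EKS cluster.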

Reference documentation

Differences

The referenced documentation attaches the IAM role to the worker nodes,
but since we have the opportunity, this article attaches IAM roles to service accounts instead.

Creating service accounts with IAM roles

Run the following commands.
Container Insights uses a namespace named amazon-cloudwatch,
and inside it we create two service accounts, cloudwatch-agent and fluentd.

First, cloudwatch-agent:

eksctl create iamserviceaccount --cluster=ml --name=cloudwatch-agent --namespace=amazon-cloudwatch --attach-policy-arn=arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy --region=us-west-2 --approve
[ℹ]  eksctl version 0.13.0
[ℹ]  using region us-west-2
[ℹ]  2 iamserviceaccount(s) that already exist (kube-system/cluster-autoscaler,sagemaker-k8s-operator-system/sagemaker-k8s-operator-default) will be excluded
[ℹ]  1 iamserviceaccount (amazon-cloudwatch/cloudwatch-agent) was included (based on the include/exclude rules)
[ℹ]  combined exclude rules: kube-system/cluster-autoscaler,sagemaker-k8s-operator-system/sagemaker-k8s-operator-default
[ℹ]  no iamserviceaccounts present in the current set were excluded by the filter
[!]  serviceaccounts that exists in Kubernetes will be excluded, use --override-existing-serviceaccounts to override
[ℹ]  1 task: { 2 sequential sub-tasks: { create IAM role for serviceaccount "amazon-cloudwatch/cloudwatch-agent", create serviceaccount "amazon-cloudwatch/cloudwatch-agent" } }
[ℹ]  building iamserviceaccount stack "eksctl-ml-addon-iamserviceaccount-amazon-cloudwatch-cloudwatch-agent"
[ℹ]  deploying stack "eksctl-ml-addon-iamserviceaccount-amazon-cloudwatch-cloudwatch-agent"
[ℹ]  created namespace "amazon-cloudwatch"
[ℹ]  created serviceaccount "amazon-cloudwatch/cloudwatch-agent"

Next, fluentd:

eksctl create iamserviceaccount --cluster=ml --name=fluentd --namespace=amazon-cloudwatch --attach-policy-arn=arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy --region=us-west-2 --approve
[ℹ]  eksctl version 0.13.0
[ℹ]  using region us-west-2
[ℹ]  3 iamserviceaccount(s) that already exist (amazon-cloudwatch/cloudwatch-agent,kube-system/cluster-autoscaler,sagemaker-k8s-operator-system/sagemaker-k8s-operator-default) will be excluded
[ℹ]  1 iamserviceaccount (amazon-cloudwatch/fluentd) was included (based on the include/exclude rules)
[ℹ]  combined exclude rules: amazon-cloudwatch/cloudwatch-agent,kube-system/cluster-autoscaler,sagemaker-k8s-operator-system/sagemaker-k8s-operator-default
[ℹ]  no iamserviceaccounts present in the current set were excluded by the filter
[!]  serviceaccounts that exists in Kubernetes will be excluded, use --override-existing-serviceaccounts to override
[ℹ]  1 task: { 2 sequential sub-tasks: { create IAM role for serviceaccount "amazon-cloudwatch/fluentd", create serviceaccount "amazon-cloudwatch/fluentd" } }
[ℹ]  building iamserviceaccount stack "eksctl-ml-addon-iamserviceaccount-amazon-cloudwatch-fluentd"
[ℹ]  deploying stack "eksctl-ml-addon-iamserviceaccount-amazon-cloudwatch-fluentd"
[ℹ]  created serviceaccount "amazon-cloudwatch/fluentd"
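
The two commands above differ only in the --name value, so they could also be driven by a small loop. The following is a minimal sketch under that assumption: the function name create_cw_service_accounts and its arguments are hypothetical, while the eksctl flags are the same ones used above.

```shell
# Hypothetical helper: create both Container Insights service accounts
# with the same eksctl flags used in this article.
create_cw_service_accounts() {
  local cluster="$1"
  local region="$2"
  local policy_arn="arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"
  local name
  for name in cloudwatch-agent fluentd; do
    # Stop at the first failure so a partial setup is noticed.
    eksctl create iamserviceaccount \
      --cluster="${cluster}" \
      --name="${name}" \
      --namespace=amazon-cloudwatch \
      --attach-policy-arn="${policy_arn}" \
      --region="${region}" \
      --approve || return 1
  done
}

# Usage (values from this article):
#   create_cw_service_accounts ml us-west-2
```

Running the commands one at a time, as done above, makes the eksctl output easier to read; the loop is only a convenience when repeating the setup on another cluster.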

Confirm that the namespace was created (k is an alias for kubectl):

k get ns
NAME                            STATUS   AGE
amazon-cloudwatch               Active   105s
argocd                          Active   2d2h
default                         Active   4d23h
kube-node-lease                 Active   4d23h
kube-public                     Active   4d23h
kube-system                     Active   4d23h
sagemaker-k8s-operator-system   Active   2d1h

Confirm that the service accounts were created:

k get sa -n amazon-cloudwatch
NAME               SECRETS   AGE
cloudwatch-agent   1         27m
default            1         27m
fluentd            1         26m
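
eksctl links the IAM role to each service account through the eks.amazonaws.com/role-arn annotation, so the listing above can be complemented by reading that annotation back. The following is a minimal sketch, assuming kubectl is available and pointed at the cluster; the function name sa_role_arn is hypothetical.

```shell
# Hypothetical helper: print the IAM role ARN annotated on a service account.
# Prints an empty line if the annotation is missing.
sa_role_arn() {
  local namespace="$1"
  local name="$2"
  # Dots inside the annotation key must be escaped in kubectl's JSONPath.
  kubectl get serviceaccount "${name}" \
    --namespace "${namespace}" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
  echo
}

# Usage:
#   sa_role_arn amazon-cloudwatch cloudwatch-agent
#   sa_role_arn amazon-cloudwatch fluentd
```

An empty result would suggest that the service account exists but was not created through eksctl with an IAM role attached.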

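Installing the cloudwatch-agent DaemonSet

(Omitted, since the steps are exactly the same as in the documentation.)

Installing the fluentd DaemonSet

The steps are the same as in the documentation, but for some reason the ConfigMap creation step is not provided as YAML, so I wrote one.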

fluentd-configmap.yaml
apiVersion: v1
data:
  # Settings are written as key: value pairs
  cluster.name: ml
  logs.region: us-west-2
kind: ConfigMap
metadata:
  name: cluster-info
  namespace: amazon-cloudwatch
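
If the cluster name or region differs from the values above, the same ConfigMap can be generated from parameters instead of edited by hand. The following is a minimal sketch: render_cluster_info is a hypothetical helper, and its output is meant to be piped into kubectl apply.

```shell
# Hypothetical helper: print the cluster-info ConfigMap for a given
# cluster name and region.
render_cluster_info() {
  local cluster="$1"
  local region="$2"
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-info
  namespace: amazon-cloudwatch
data:
  cluster.name: ${cluster}
  logs.region: ${region}
EOF
}

# Usage (values from this article):
#   render_cluster_info ml us-west-2 | kubectl apply -f -
```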

The rest is the same as the documentation.

Verification

Checking the CloudWatch agent

image.png

image.png

We can see that the metrics are being collected correctly.

Checking fluentd

image.png

Here too, we can see that the Pod logs (standard output) are stored in CloudWatch Logs.
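
The same check can be made from the command line. The following is a minimal sketch, assuming the AWS CLI is configured and that the log groups sit under the /aws/containerinsights/<cluster name>/ prefix that Container Insights uses by default; list_ci_log_groups is a hypothetical name.

```shell
# Hypothetical helper: list the CloudWatch Logs groups under the
# Container Insights prefix for one cluster.
list_ci_log_groups() {
  local cluster="$1"
  local region="$2"
  aws logs describe-log-groups \
    --log-group-name-prefix "/aws/containerinsights/${cluster}/" \
    --region "${region}" \
    --query 'logGroups[].logGroupName' \
    --output text
}

# Usage (values from this article):
#   list_ci_log_groups ml us-west-2
```

If the DaemonSets are running with their default settings, group names ending in application and performance would be expected in the output.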

Summary

I set up Container Insights and collected metrics and logs from the EKS cluster.
