Building Prometheus+Grafana & Elasticsearch+Fluentd+Kibana on AKS and Alerting on Logs with Grafana (1/2)

Posted at 2018-08-19, last updated at 2018-08-19

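Introduction

Resource monitoring, log monitoring, and alert notification are always necessary when operating a system, and running Kubernetes is no exception.
When you build a managed Kubernetes cluster in the cloud, you will usually use the monitoring and alerting services the cloud provides. However, there are cases where you want to set up resource and log monitoring and alerting yourself, for example when you want to reuse know-how from an existing system or when portability matters.

So in this series, I will build resource monitoring with Prometheus & Grafana, log aggregation with Elasticsearch & Fluentd & Kibana, and alert notification for both resources and logs with Grafana, all on Kubernetes running on Microsoft Azure AKS.

Because the article grew long, this part covers only building Prometheus and Grafana on AKS.
The second half, which starts Elasticsearch + Fluentd + Kibana, aggregates the logs of each Node and Pod, and has Grafana notify Slack when a specific string appears in the logs, is covered in the next part.

Part 1: https://qiita.com/nmatsui/items/6d8319f3216bd8786eb9
Part 2: https://qiita.com/nmatsui/items/ef7cf8f5c957f82d2ca1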

Verified environment

Cloud side

	Version
Microsoft Azure AKS	1.11.1

Client side

	Version
kubectl	1.11.2
azure-cli	2.0.44
helm	2.9.1

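The YAML files and other details used in this verification are published on GitHub. See nmatsui/kubernetes-monitoring.

Environment setup

Preparing Microsoft Azure AKS

Use the az command to create a resource group and start AKS. When doing so, specify a VM size that can use Premium Storage, such as the Dsv3-series.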

$ az group create --name k8s --location japaneast
$ az aks create --resource-group k8s --name k8saks --node-count 3 --ssh-key-value $HOME/.ssh/azure.pub --node-vm-size Standard_D2s_v3 --kubernetes-version 1.11.1
$ az aks get-credentials --resource-group k8s --name k8saks

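Preparing Helm

Prometheus and Grafana are installed with the Helm Charts published by CoreOS, coreos/prometheus-operator and coreos/kube-prometheus.
However, on a Kubernetes cluster with RBAC enabled, Helm manipulates many kinds of resources internally and will not work unless it is given fairly strong privileges. Ideally you would work out the minimum privileges required, but here I simply grant superuser privileges (cluster-admin).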

rbac/tiller-rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system

$ kubectl apply -f rbac/tiller-rbac.yaml

$ helm init --service-account tiller
$ helm repo update
$ helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/

Confirm that the tiller pod is running.

$ kubectl get pod --namespace kube-system -l app=helm -l name=tiller
NAME                            READY     STATUS    RESTARTS   AGE
tiller-deploy-759cb9df9-fqcrv   1/1       Running   0          2m

Installing Prometheus & Grafana

Install Prometheus and Grafana using coreos/prometheus-operator and coreos/kube-prometheus.

However, using the default settings as of 2018/08/19 causes the following problems, so I override the settings at install time. Change the storage capacity and similar values as needed.

Prometheus and AlertManager do not use a persistent volume
Grafana 5.0 is installed instead of Grafana 5.2, which supports alerting on Elasticsearch

monitoring/kube-prometheus-azure.yaml

global:
  rbacEnable: true

alertmanager:
  image:
    repository: quay.io/prometheus/alertmanager
    tag: v0.15.1
  storageSpec:
    volumeClaimTemplate:
      metadata:
        name: pg-alertmanager-storage-claim
      spec:
        storageClassName: managed-premium
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 30Gi

prometheus:
  image:
    repository: quay.io/prometheus/prometheus
    tag: v2.3.2
  storageSpec:
    volumeClaimTemplate:
      metadata:
        name: pg-prometheus-storage-claim
      spec:
        storageClassName: managed-premium
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 30Gi

grafana:
  image:
    repository: grafana/grafana
    tag: 5.2.2
  auth:
    anonymous:
      enabled: "false"

To see what the default settings are, check the values.yaml of the relevant Helm charts.
values.yaml of grafana
values.yaml of prometheus
values.yaml of alertmanager
values.yaml of kube-prometheus

Installing prometheus-operator

Install coreos/prometheus-operator, specifying monitoring as the namespace.

$ helm install coreos/prometheus-operator --name pg-op --namespace monitoring

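Depending on network conditions, the watch may be disconnected and an error such as Error: watch closed before Until timeout may appear, but the build continues on the Kubernetes cluster side.
If the build succeeded, one prometheus-operator job should have completed and one pod should be running.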

$ kubectl get jobs --namespace monitoring -l app=prometheus-operator -l release=pg-op
NAME                                      DESIRED   SUCCESSFUL   AGE
pg-op-prometheus-operator-create-sm-job   1         1            5m

$ kubectl get pods --namespace monitoring -l app=prometheus-operator -l release=pg-op
NAME                                            READY     STATUS      RESTARTS   AGE
pg-op-prometheus-operator-688494b68f-lrcst      1/1       Running     0          5m
pg-op-prometheus-operator-create-sm-job-fnsgq   0/1       Completed   0          5m

Installing Prometheus and Grafana

Specify monitoring as the namespace and install Prometheus and Grafana using coreos/kube-prometheus.

$ helm install coreos/kube-prometheus --name pg --namespace monitoring -f monitoring/kube-prometheus-azure.yaml

If the build succeeded, AlertManager, Prometheus, and Grafana should be up, and one node-exporter should be running on each node.

AlertManager

$ kubectl get persistentvolumeclaims --namespace monitoring -l app=alertmanager
NAME                                   STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
alertmanager-pg-db-alertmanager-pg-0   Bound     pvc-3c2ef8c2-a340-11e8-8990-caec6aa008cf   30Gi       RWO            managed-premium   5m

$ kubectl get pods -n monitoring -l app=alertmanager
NAME                READY     STATUS    RESTARTS   AGE
alertmanager-pg-0   2/2       Running   0          5m

Prometheus

$ kubectl get persistentvolumeclaims --namespace monitoring -l app=prometheus
NAME                                                     STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
prometheus-pg-prometheus-db-prometheus-pg-prometheus-0   Bound     pvc-3c7fb880-a340-11e8-8990-caec6aa008cf   30Gi       RWO            managed-premium   6m

$ kubectl get pods --namespace monitoring -l app=prometheus
NAME                         READY     STATUS    RESTARTS   AGE
prometheus-pg-prometheus-0   3/3       Running   1          6m

Grafana

$ kubectl get pods --namespace monitoring -l app=pg-grafana
NAME                          READY     STATUS    RESTARTS   AGE
pg-grafana-75cdf6b96d-njxwb   2/2       Running   0          7m

node-exporter

$ kubectl get daemonsets --namespace monitoring
NAME               DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
pg-exporter-node   3         3         3         3            3           <none>          7m

$ kubectl get pods -o wide -n monitoring -l app=pg-exporter-node
NAME                     READY     STATUS    RESTARTS   AGE       IP           NODE
pg-exporter-node-4wtfd   1/1       Running   0          8m        10.240.0.4   aks-nodepool1-14983502-2
pg-exporter-node-9mjdg   1/1       Running   0          8m        10.240.0.5   aks-nodepool1-14983502-1
pg-exporter-node-l2gnx   1/1       Running   0          8m        10.240.0.6   aks-nodepool1-14983502-0
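
The checks above are done by eye. As a minimal sketch, the same check could be scripted by filtering the table that `kubectl get pods` prints. The function name `check_all_running` is hypothetical and not part of the original setup, and the sketch assumes the default column layout shown above, with STATUS as the third column.

```shell
# check_all_running (hypothetical helper): reads `kubectl get pods` table
# output on stdin, prints every pod whose STATUS is neither Running nor
# Completed, and returns non-zero if any such pod was found.
check_all_running() {
  awk '
    NR > 1 && $3 != "Running" && $3 != "Completed" {
      print "not ready: " $1
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  '
}
```

It could be used as `kubectl get pods --namespace monitoring | check_all_running && echo OK`. It looks only at the STATUS column, so a pod that is Running but not fully READY would still pass.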

Patches for Azure AKS

On GCP this should work as is, but as of 2018/08/19 several patches are needed to fit Azure AKS.

Exporting metrics from `kube-dns-v20`

By default, kube-dns on Azure AKS does not appear to export metrics. Add a sidecar that exports the state of kube-dns to Prometheus.

monitoring/kube-dns-metrics-patch.yaml

spec:
  template:
    spec:
      containers:
      - name: kubedns
        env:
        - name: PROMETHEUS_PORT
          value: "10055"
      - name: sidecar
        image: k8s.gcr.io/k8s-dns-sidecar-amd64:1.14.10
        livenessProbe:
          httpGet:
            path: /metrics
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --v=2
        - --logtostderr
        - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local
        - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local
        ports:
        - containerPort: 10054
          name: metrics
          protocol: TCP
        resources:
          requests:
            memory: 20Mi
            cpu: 10m

$ kubectl patch deployment -n kube-system kube-dns-v20 --patch "$(cat monitoring/kube-dns-metrics-patch.yaml)"

See https://github.com/Azure/AKS/issues/345

Changing the kubelet exporter port from https to http

By default, exporting over https does not seem to work on Azure AKS. Change the port that exports kubelet state to Prometheus from https to http.

$ kubectl get servicemonitors pg-exporter-kubelets --namespace monitoring -o yaml | sed 's/https/http/' | kubectl replace -f -

See https://github.com/coreos/prometheus-operator/issues/926
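
To see what the `sed 's/https/http/'` step does before replacing a live resource, it can be run against a small sample. The fragment below is a hypothetical ServiceMonitor excerpt written for illustration, not the actual output of the chart. Note that without the `g` flag sed rewrites only the first `https` on each line, and that it rewrites every line containing `https`, port names included.

```shell
# Hypothetical ServiceMonitor excerpt (illustrative, not taken from the chart).
sample='  endpoints:
  - port: https-metrics
    scheme: https
    interval: 15s'

# Apply the same substitution as the command above and show the result.
patched=$(printf '%s\n' "$sample" | sed 's/https/http/')
printf '%s\n' "$patched"
```

Running the real pipeline up to the `sed` stage, without the final `kubectl replace -f -`, gives the same kind of preview for the actual resource.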

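Removing the Kubernetes control plane exporter

With the default Azure AKS, the state of the apiserver apparently cannot be retrieved from outside. Unfortunately there seems to be no good workaround as of 2018/08/19, so I give up and delete the exporter for the Kubernetes control plane.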

$ kubectl delete servicemonitor pg-exporter-kubernetes --namespace monitoring

https://github.com/coreos/prometheus-operator/issues/1522

Removing the alerts related to the Kubernetes control plane

Because the control plane exporter has been deleted, several values can no longer be obtained from Kubernetes, and the following alerts configured by coreos/kube-prometheus stay firing.

alert: DeadMansSwitch
alert: K8SApiserverDown
alert: K8SControllerManagerDown
alert: K8SSchedulerDown

The Prometheus exporter and alert rules are defined in the configmap prometheus-pg-prometheus-rulefiles, which can be inspected with the following command. You might think that editing this configmap is enough, but it is not.

$ kubectl get configmap prometheus-pg-prometheus-rulefiles --namespace monitoring -o yaml

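This configmap is generated dynamically by coreos/prometheus-operator, so even if you rewrite it directly it is forcibly reverted to the original.
The correct procedure is to edit PrometheusRule, the custom resource registered by coreos/kube-prometheus.

It took me three hours to realize this...

coreos/kube-prometheus generates ten PrometheusRule resources, which you can list with the following command. Of these, the following four need to be modified.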

$ kubectl get prometheusrules --namespace monitoring

PrometheusRule	Alert to remove
pg-kube-prometheus	DeadMansSwitch
pg-exporter-kubernetes	K8SApiserverDown
pg-exporter-kube-controller-manager	K8SControllerManagerDown
pg-exporter-kube-scheduler	K8SSchedulerDown

$ kubectl edit prometheusrule pg-kube-prometheus --namespace monitoring

pg-kube-prometheus

       for: 10m
       labels:
         severity: warning
-    - alert: DeadMansSwitch
-      annotations:
-        description: This is a DeadMansSwitch meant to ensure that the entire Alerting
-          pipeline is functional.
-        summary: Alerting DeadMansSwitch
-      expr: vector(1)
-      labels:
-        severity: none
     - expr: process_open_fds / process_max_fds
       record: fd_utilization
     - alert: FdExhaustionClose

$ kubectl edit prometheusrule pg-exporter-kubernetes --namespace monitoring

pg-exporter-kubernetes

       for: 10m
       labels:
         severity: critical
-    - alert: K8SApiserverDown
-      annotations:
-        description: No API servers are reachable or all have disappeared from service
-          discovery
-        summary: No API servers are reachable
-      expr: absent(up{job="apiserver"} == 1)
-      for: 20m
-      labels:
-        severity: critical
     - alert: K8sCertificateExpirationNotice
       annotations:
         description: Kubernetes API Certificate is expiring soon (less than 7 days)

$ kubectl edit prometheusrule pg-exporter-kube-controller-manager --namespace monitoring

pg-exporter-kube-controller-manager

 spec:
   groups:
   - name: kube-controller-manager.rules
-    rules:
-    - alert: K8SControllerManagerDown
-      annotations:
-        description: There is no running K8S controller manager. Deployments and replication
-          controllers are not making progress.
-        runbook: https://coreos.com/tectonic/docs/latest/troubleshooting/controller-recovery.html#recovering-a-controller-manager
-        summary: Controller manager is down
-      expr: absent(up{job="kube-controller-manager"} == 1)
-      for: 5m
-      labels:
-        severity: critical
+    rules: []

$ kubectl edit prometheusrule pg-exporter-kube-scheduler --namespace monitoring

pg-exporter-kube-scheduler

       labels:
         quantile: "0.5"
       record: cluster:scheduler_binding_latency_seconds:quantile
-    - alert: K8SSchedulerDown
-      annotations:
-        description: There is no running K8S scheduler. New pods are not being assigned
-          to nodes.
-        runbook: https://coreos.com/tectonic/docs/latest/troubleshooting/controller-recovery.html#recovering-a-scheduler
-        summary: Scheduler is down
-      expr: absent(up{job="kube-scheduler"} == 1)
-      for: 5m
-      labels:
-        severity: critical
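
After the four edits, one way to confirm that none of the removed alerts is still defined is to search the PrometheusRule YAML for their names. This is a minimal sketch, and the helper name `remaining_alerts` is hypothetical.

```shell
# remaining_alerts (hypothetical helper): reads PrometheusRule YAML on stdin
# and prints each of the four control-plane alert names that is still defined.
remaining_alerts() {
  grep -oE 'alert: (DeadMansSwitch|K8SApiserverDown|K8SControllerManagerDown|K8SSchedulerDown)' \
    | sed 's/^alert: //' \
    | sort -u
}
```

For example, `kubectl get prometheusrules --namespace monitoring -o yaml | remaining_alerts` should print nothing once all four alerts are gone.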

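Verification

The environment is finally built. Now open the web consoles of Prometheus and Grafana and check their state.
Note that both Prometheus and Grafana are deployed as ClusterIP services this time, so their web consoles cannot be reached from outside AKS. Use port forwarding for each of them instead.

Checking Prometheus

Forward port 9090 of Prometheus.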

$ kubectl port-forward $(kubectl get pod --namespace monitoring -l prometheus=kube-prometheus -l app=prometheus -o template --template "{{(index .items 0).metadata.name}}") --namespace monitoring 9090:9090

If the environment was built correctly, opening http://localhost:9090/ in a web browser shows the Prometheus web console.
The insert metric at cursor pull-down list shows the metrics that Prometheus is collecting. Metrics for Kubernetes pods and volumes, node metrics, and so on should be listed.

Opening http://localhost:9090/targets shows the targets from which Prometheus collects metrics. Confirm that no target is Down.

Finally, opening http://localhost:9090/alerts shows the list of alerts. Confirm that no alert is firing.

Checking Grafana

Forward port 3000 of Grafana.

$ kubectl port-forward $(kubectl get pod --namespace monitoring -l app=pg-grafana -o template --template "{{(index .items 0).metadata.name}}") --namespace monitoring 3000:3000

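If the environment was built correctly, opening http://localhost:3000/ in a web browser shows the Grafana login screen. The initial credentials are admin/admin (you are prompted to change the password at first login).

After logging in, the Home screen is displayed. First, configure Prometheus from Configuration -> DataSource.

coreos/kube-prometheus registers Prometheus as a DataSource automatically, but for some reason an odd URL is sometimes configured. Check the name of the Prometheus Service and change it to the correct URL (http://pg-prometheus:9090/ in this case).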

$ kubectl get services --namespace monitoring -l app=prometheus -l prometheus=pg
NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
pg-prometheus   ClusterIP   10.0.105.143   <none>        9090/TCP   3h
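
The URL can also be derived from that output. This is a minimal sketch, assuming a ClusterIP Service with a plain `port/protocol` entry in the PORT(S) column and Grafana running in the same namespace, so that the short Service name resolves. `service_url` is a hypothetical helper name.

```shell
# service_url (hypothetical helper): reads `kubectl get services` table output
# on stdin and prints http://<name>:<port>/ for the first listed Service,
# using the first port in the PORT(S) column.
service_url() {
  awk 'NR == 2 { split($5, p, "/"); printf "http://%s:%s/\n", $1, p[1] }'
}
```

Piping the `kubectl get services` command above into `service_url` would print the value to paste into the DataSource URL field.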

Once this is configured, Grafana retrieves metrics from Prometheus and each Dashboard works. Take a look at the Dashboards registered by default.
(Because the exporter for the Kubernetes control plane has been deleted, the Panels related to the control plane show N/A.)

The Dashboards registered by default do not include one that shows Persistent Volume usage, so add one.
Use Grafana's Import feature to import monitoring/dashboard_persistent_volumes.json.

A Dashboard and Panel showing Persistent Volume usage are added.

Next time

So, we built Prometheus and Grafana on Azure AKS and got resource monitoring working nicely.
Next time, I will add Elasticsearch+Fluentd+Kibana to this AKS cluster to aggregate logs from Kubernetes into Elasticsearch, and then integrate Elasticsearch with Grafana so that a notification is sent to Slack when a specific log entry is output.
