
Investigating the resource usage of OpenShift Cluster Monitoring

Posted at 2022-03-31

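Introduction

These are personal experiment notes. They were not written with other readers in mind and are not easy to read, so please bear with them.

Environment

Environment information, captured for the record.

Node information

The version is OpenShift 4.8.26 (Kubernetes 1.21).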

[root@bastion openshift]# oc version
Client Version: 4.8.26
Server Version: 4.8.26
Kubernetes Version: v1.21.6+bb8d50a
[root@bastion openshift]# oc get nodes
NAME                          STATUS   ROLES          AGE     VERSION
ocp48-6vldl-infra-94vjm       Ready    infra,worker   12d     v1.21.6+bb8d50a
ocp48-6vldl-infra-gjgwb       Ready    infra,worker   12d     v1.21.6+bb8d50a
ocp48-6vldl-infra-ocs-dbjbd   Ready    infra,worker   5d12h   v1.21.6+bb8d50a
ocp48-6vldl-infra-ocs-rdt8b   Ready    infra,worker   5d12h   v1.21.6+bb8d50a
ocp48-6vldl-infra-ocs-xdhvn   Ready    infra,worker   5d12h   v1.21.6+bb8d50a
ocp48-6vldl-infra-qvwvk       Ready    infra,worker   12d     v1.21.6+bb8d50a
ocp48-6vldl-master-0          Ready    master         17d     v1.21.6+bb8d50a
ocp48-6vldl-master-1          Ready    master         17d     v1.21.6+bb8d50a
ocp48-6vldl-master-2          Ready    master         17d     v1.21.6+bb8d50a
ocp48-6vldl-worker-85crs      Ready    worker         17d     v1.21.6+bb8d50a
ocp48-6vldl-worker-hdj9r      Ready    worker         17d     v1.21.6+bb8d50a
ocp48-6vldl-worker-xp4bf      Ready    worker         17d     v1.21.6+bb8d50a
[root@bastion openshift]# 

  • Master Node x 3
  • Worker Node x 3
  • Infrastructure Node x 6

Cluster Monitoring version

Cluster Monitoring is delivered as one of the Cluster Operators (CO), so it is already present by default once OpenShift is installed.

[root@bastion openshift]# oc get co | grep monitoring
monitoring                                 4.8.26    True        False         False      16d
[root@bastion openshift]# 

The monitoring version is 4.8.26.

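Pods deployed for Cluster Monitoring

These are the Pods in the openshift-monitoring namespace. A nodeSelector places them on the Infrastructure Nodes; the exceptions visible in the output are node-exporter, which runs on every node, and cluster-monitoring-operator, which runs on a master node.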

[root@bastion openshift]# oc get pods -o wide
NAME                                          READY   STATUS    RESTARTS   AGE     IP             NODE                          NOMINATED NODE   READINESS GATES
alertmanager-main-0                           5/5     Running   0          41m     10.128.4.21    ocp48-6vldl-infra-gjgwb       <none>           <none>
alertmanager-main-1                           5/5     Running   0          42m     10.130.2.17    ocp48-6vldl-infra-qvwvk       <none>           <none>
alertmanager-main-2                           5/5     Running   0          42m     10.131.2.14    ocp48-6vldl-infra-94vjm       <none>           <none>
cluster-monitoring-operator-95674b95b-slbjr   2/2     Running   4          16d     10.129.0.7     ocp48-6vldl-master-2          <none>           <none>
grafana-5666d69fc9-d8plz                      2/2     Running   0          42m     10.130.2.15    ocp48-6vldl-infra-qvwvk       <none>           <none>
kube-state-metrics-5f5f79ccbc-858xx           3/3     Running   0          42m     10.130.2.13    ocp48-6vldl-infra-qvwvk       <none>           <none>
node-exporter-4bcsq                           2/2     Running   0          11d     172.18.0.43    ocp48-6vldl-infra-94vjm       <none>           <none>
node-exporter-68q5d                           2/2     Running   0          4d13h   172.18.0.154   ocp48-6vldl-infra-ocs-xdhvn   <none>           <none>
node-exporter-6nqpj                           2/2     Running   0          4d13h   172.18.0.113   ocp48-6vldl-infra-ocs-rdt8b   <none>           <none>
node-exporter-9fjj4                           2/2     Running   0          4d13h   172.18.0.186   ocp48-6vldl-infra-ocs-dbjbd   <none>           <none>
node-exporter-btbb7                           2/2     Running   0          16d     172.18.0.180   ocp48-6vldl-worker-85crs      <none>           <none>
node-exporter-cr6m2                           2/2     Running   0          11d     172.18.0.67    ocp48-6vldl-infra-qvwvk       <none>           <none>
node-exporter-cwglh                           2/2     Running   0          16d     172.18.0.32    ocp48-6vldl-master-2          <none>           <none>
node-exporter-m6hn9                           2/2     Running   0          16d     172.18.0.25    ocp48-6vldl-master-1          <none>           <none>
node-exporter-mplgz                           2/2     Running   0          11d     172.18.0.37    ocp48-6vldl-infra-gjgwb       <none>           <none>
node-exporter-qtxdl                           2/2     Running   0          16d     172.18.0.124   ocp48-6vldl-master-0          <none>           <none>
node-exporter-vrshn                           2/2     Running   0          16d     172.18.0.126   ocp48-6vldl-worker-hdj9r      <none>           <none>
node-exporter-zgsmz                           2/2     Running   0          16d     172.18.0.53    ocp48-6vldl-worker-xp4bf      <none>           <none>
openshift-state-metrics-5bbdb5896-nnx65       3/3     Running   0          42m     10.130.2.12    ocp48-6vldl-infra-qvwvk       <none>           <none>
prometheus-adapter-7b757d8db7-gm8v2           1/1     Running   0          42m     10.130.2.14    ocp48-6vldl-infra-qvwvk       <none>           <none>
prometheus-adapter-7b757d8db7-h5lt5           1/1     Running   0          42m     10.131.2.12    ocp48-6vldl-infra-94vjm       <none>           <none>
prometheus-k8s-0                              7/7     Running   1          46s     10.131.2.15    ocp48-6vldl-infra-94vjm       <none>           <none>
prometheus-k8s-1                              7/7     Running   1          46s     10.130.2.22    ocp48-6vldl-infra-qvwvk       <none>           <none>
prometheus-operator-7c8f55cc45-qjx6s          2/2     Running   0          43m     10.131.2.11    ocp48-6vldl-infra-94vjm       <none>           <none>
telemeter-client-844fdfd96-xzfm5              3/3     Running   0          42m     10.128.4.17    ocp48-6vldl-infra-gjgwb       <none>           <none>
thanos-querier-68d474b7df-bzqdv               5/5     Running   0          42m     10.130.2.16    ocp48-6vldl-infra-qvwvk       <none>           <none>
thanos-querier-68d474b7df-q5lmw               5/5     Running   0          42m     10.131.2.13    ocp48-6vldl-infra-94vjm       <none>           <none>
[root@bastion openshift]#
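
To confirm the placement without reading every row, the -o wide output can be filtered on the NODE column. This is a minimal sketch, assuming infra nodes can be recognized by "-infra-" in their names, as in this cluster; pods_off_infra is a hypothetical helper name, not an oc feature.

```shell
# pods_off_infra: read "oc get pods -o wide" output on stdin and print
# the Pods whose NODE column (7th field) is not an infra node.
# Assumption: infra node names contain "-infra-", as in this cluster.
pods_off_infra() {
  awk '$1 != "NAME" && $7 !~ /-infra-/ { print $1, $7 }'
}

# Usage (needs a logged-in oc session):
#   oc get pods -n openshift-monitoring -o wide | pods_off_infra
```

For the output above, this would list only cluster-monitoring-operator and the node-exporter Pods running on master and worker nodes.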

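ConfigMap created to configure Cluster Monitoring

After OpenShift is installed, monitoring works with the defaults. To create a PV for storing monitoring data, or to place the Pods on Infrastructure Nodes, you need to create a ConfigMap and put the settings there.

This YAML is configured to use the StorageClass thin, which is the default when installing with IPI on VMware.

ConfigMap for customizing the settings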
[root@bastion openshift]# cat cluster-monitoring-configmap-vm.yaml 
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |+
    alertmanagerMain:
      nodeSelector:            # select the infra nodes with nodeSelector
        node-role.kubernetes.io/infra: ""
      tolerations:              # add tolerations
      - key: infra
        value: reserved
        effect: NoSchedule
      - key: infra
        value: reserved
        effect: NoExecute
    prometheusK8s:
      volumeClaimTemplate:          # PVC template for the Prometheus data
        spec:                       # added
          storageClassName: thin    # VMware in-tree StorageClass
          volumeMode: Filesystem    # Filesystem
          resources:                # added
            requests:               # added
              storage: 40Gi         # size is 40Gi for now
      nodeSelector:                 # select the infra nodes with nodeSelector
        node-role.kubernetes.io/infra: ""
      tolerations:                  # add tolerations
      - key: infra
        value: reserved
        effect: NoSchedule
      - key: infra
        value: reserved
        effect: NoExecute
    prometheusOperator:
      nodeSelector:                # select the infra nodes with nodeSelector
        node-role.kubernetes.io/infra: ""
      tolerations:                 # add tolerations
      - key: infra
        value: reserved
        effect: NoSchedule
      - key: infra
        value: reserved
        effect: NoExecute
    grafana:
      nodeSelector:                 # select the infra nodes with nodeSelector
        node-role.kubernetes.io/infra: ""
      tolerations:                   # add tolerations
      - key: infra
        value: reserved
        effect: NoSchedule
      - key: infra
        value: reserved
        effect: NoExecute
    k8sPrometheusAdapter:
      nodeSelector:                # select the infra nodes with nodeSelector
        node-role.kubernetes.io/infra: ""
      tolerations:                  # add tolerations
      - key: infra
        value: reserved
        effect: NoSchedule
      - key: infra
        value: reserved
        effect: NoExecute
    kubeStateMetrics:
      nodeSelector:                # select the infra nodes with nodeSelector
        node-role.kubernetes.io/infra: ""
      tolerations:                  # add tolerations
      - key: infra
        value: reserved
        effect: NoSchedule
      - key: infra
        value: reserved
        effect: NoExecute
    telemeterClient:
      nodeSelector:               # select the infra nodes with nodeSelector
        node-role.kubernetes.io/infra: ""
      tolerations:                 # add tolerations
      - key: infra
        value: reserved
        effect: NoSchedule
      - key: infra
        value: reserved
        effect: NoExecute
    openshiftStateMetrics:
      nodeSelector:              # select the infra nodes with nodeSelector
        node-role.kubernetes.io/infra: ""
      tolerations:                # add tolerations
      - key: infra
        value: reserved
        effect: NoSchedule
      - key: infra
        value: reserved
        effect: NoExecute
    thanosQuerier:
      nodeSelector:              # select the infra nodes with nodeSelector
        node-role.kubernetes.io/infra: ""
      tolerations:                # add tolerations
      - key: infra
        value: reserved
        effect: NoSchedule
      - key: infra
        value: reserved
        effect: NoExecute
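
Applying the ConfigMap and checking the result could look roughly like the following. It is only a sketch: apply_monitoring_config is a hypothetical wrapper name, and the default file name is the one from the cat command above.

```shell
# apply_monitoring_config: apply the ConfigMap, then show what changed.
# Hypothetical helper; the default file name comes from the cat command above.
apply_monitoring_config() {
  file="${1:-cluster-monitoring-configmap-vm.yaml}"
  # Create or update cluster-monitoring-config in openshift-monitoring.
  oc apply -f "$file" || return 1
  # PVCs requested through volumeClaimTemplate should show up here.
  oc get pvc -n openshift-monitoring
  # Check on which nodes the Pods were rescheduled.
  oc get pods -n openshift-monitoring -o wide
}

# Usage (needs a logged-in oc session with sufficient privileges):
#   apply_monitoring_config
```

The Pods are recreated after the change, so the last two commands may need to be repeated until everything is Running again.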

Requests / Limits of the main Pods

Use oc (kubectl) get pod to check the Requests and Limits set on the Pods in the openshift-monitoring namespace.

Command results

Grepping for Requests and Limits and then working out the container names was tedious, so I put together a way to fetch them with jsonpath.

It retrieves two pairs: .spec.containers[*].name with .spec.containers[*].resources, and .spec.initContainers[*].name with .spec.initContainers[*].resources.

[root@bastion openshift]# oc get pod alertmanager-main-0 -o=jsonpath='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name} {" "}{.resources}{"\n"}{end}'
alertmanager {"requests":{"cpu":"4m","memory":"40Mi"}}      
config-reloader {"requests":{"cpu":"1m","memory":"10Mi"}}   
alertmanager-proxy {"requests":{"cpu":"1m","memory":"20Mi"}}
kube-rbac-proxy {"requests":{"cpu":"1m","memory":"15Mi"}}   
prom-label-proxy {"requests":{"cpu":"1m","memory":"20Mi"}}  
[root@bastion openshift]#
[root@bastion openshift]# oc get pod cluster-monitoring-operator-95674b95b-slbjr -o=jsonpath='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name} {" "}{.resources}{"\n"}{end}'
kube-rbac-proxy {"requests":{"cpu":"1m","memory":"20Mi"}}
cluster-monitoring-operator {"requests":{"cpu":"10m","memory":"75Mi"}}
[root@bastion openshift]# 
[root@bastion openshift]# oc get pod grafana-5666d69fc9-d8plz -o=jsonpath='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name} {" "}{.resources}{"\n"}{end}'
grafana {"requests":{"cpu":"4m","memory":"64Mi"}}      
grafana-proxy {"requests":{"cpu":"1m","memory":"20Mi"}}
[root@bastion openshift]# 
[root@bastion openshift]# oc get pod kube-state-metrics-5f5f79ccbc-858xx  -o=jsonpath='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name} {" "}{.resources}{"\n"}{end}'
kube-state-metrics {"requests":{"cpu":"2m","memory":"80Mi"}}  
kube-rbac-proxy-main {"requests":{"cpu":"1m","memory":"15Mi"}}
kube-rbac-proxy-self {"requests":{"cpu":"1m","memory":"15Mi"}}
[root@bastion openshift]#
[root@bastion openshift]# oc get pod node-exporter-4bcsq -o=jsonpath='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name} {" "}{.resources}{"\n"}{end}'
node-exporter {"requests":{"cpu":"8m","memory":"32Mi"}}  
kube-rbac-proxy {"requests":{"cpu":"1m","memory":"15Mi"}}
init-textfile  {"requests":{"cpu":"1m","memory":"1Mi"}}  
[root@bastion openshift]# 
[root@bastion openshift]# oc get pod openshift-state-metrics-5bbdb5896-nnx65   -o=jsonpath='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name} {" "}{.resources}{"\n"}{end}'
kube-rbac-proxy-main {"requests":{"cpu":"1m","memory":"20Mi"}}   
kube-rbac-proxy-self {"requests":{"cpu":"1m","memory":"20Mi"}}   
openshift-state-metrics {"requests":{"cpu":"1m","memory":"32Mi"}}
[root@bastion openshift]# 
[root@bastion openshift]# oc get pod prometheus-adapter-7b757d8db7-gm8v2 -o=jsonpath='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name} {" "}{.resources}{"\n"}{end}'
prometheus-adapter {"requests":{"cpu":"1m","memory":"40Mi"}}
[root@bastion openshift]# 
[root@bastion openshift]# oc get pod prometheus-k8s-0  -o=jsonpath='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name} {" "}{.resources}{"\n"}{end}'
prometheus {"requests":{"cpu":"70m","memory":"1Gi"}}
config-reloader {"requests":{"cpu":"1m","memory":"10Mi"}}       
thanos-sidecar {"requests":{"cpu":"1m","memory":"25Mi"}}        
prometheus-proxy {"requests":{"cpu":"1m","memory":"20Mi"}}      
kube-rbac-proxy {"requests":{"cpu":"1m","memory":"15Mi"}}       
prom-label-proxy {"requests":{"cpu":"1m","memory":"15Mi"}}      
kube-rbac-proxy-thanos {"requests":{"cpu":"1m","memory":"10Mi"}}
[root@bastion openshift]# 
[root@bastion openshift]# oc get pod prometheus-operator-7c8f55cc45-qjx6s -o=jsonpath='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name} {" "}{.resources}{"\n"}{end}'
prometheus-operator {"requests":{"cpu":"5m","memory":"150Mi"}}
kube-rbac-proxy {"requests":{"cpu":"1m","memory":"15Mi"}}     
[root@bastion openshift]# 
[root@bastion openshift]# oc get pod telemeter-client-844fdfd96-xzfm5 -o=jsonpath='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name} {" "}{.resources}{"\n"}{end}'
telemeter-client {"requests":{"cpu":"1m","memory":"40Mi"}}
reload {"requests":{"cpu":"1m","memory":"10Mi"}}
kube-rbac-proxy {"requests":{"cpu":"1m","memory":"20Mi"}}
[root@bastion openshift]#
[root@bastion openshift]# oc get pod thanos-querier-68d474b7df-bzqdv  -o=jsonpath='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name} {" "}{.resources}{"\n"}{end}'
thanos-query {"requests":{"cpu":"10m","memory":"12Mi"}}
oauth-proxy {"requests":{"cpu":"1m","memory":"20Mi"}}
kube-rbac-proxy {"requests":{"cpu":"1m","memory":"15Mi"}}
prom-label-proxy {"requests":{"cpu":"1m","memory":"15Mi"}}
kube-rbac-proxy-rules {"requests":{"cpu":"1m","memory":"15Mi"}}
[root@bastion openshift]# 
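
The commands above were run one Pod at a time. The same jsonpath template could be wrapped in a loop over all Pods in the namespace. This is a minimal sketch; list_container_resources is a hypothetical helper name, and the template is the one used above apart from dropping the extra space after the init container name.

```shell
# list_container_resources: print every Pod in a namespace, followed by
# "<container> <resources>" lines for its containers and init containers.
list_container_resources() {
  ns="${1:-openshift-monitoring}"
  tmpl='{range .spec.containers[*]}{.name}{" "}{.resources}{"\n"}{end}{range .spec.initContainers[*]}{.name}{" "}{.resources}{"\n"}{end}'
  # "-o name" yields entries such as pod/alertmanager-main-0.
  for pod in $(oc get pods -n "$ns" -o name); do
    echo "$pod"
    oc get -n "$ns" "$pod" -o jsonpath="$tmpl"
  done
}

# Usage (needs a logged-in oc session):
#   list_container_resources openshift-monitoring
```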

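Summary of the command results

A dash (-) means that no value was specified. None of the containers has Limits set.

Pod name                          Container name                  CPU (Limits)  CPU (Requests)  Memory (Limits)  Memory (Requests)
alertmanager-main-n               alertmanager                    -             4m              -                40Mi
alertmanager-main-n               config-reloader                 -             1m              -                10Mi
alertmanager-main-n               alertmanager-proxy              -             1m              -                20Mi
alertmanager-main-n               kube-rbac-proxy                 -             1m              -                15Mi
alertmanager-main-n               prom-label-proxy                -             1m              -                20Mi
cluster-monitoring-operator-xxxx  kube-rbac-proxy                 -             1m              -                20Mi
cluster-monitoring-operator-xxxx  cluster-monitoring-operator     -             10m             -                75Mi
grafana-xxxx                      grafana                         -             4m              -                64Mi
grafana-xxxx                      grafana-proxy                   -             1m              -                20Mi
kube-state-metrics-xxxx           kube-state-metrics              -             2m              -                80Mi
kube-state-metrics-xxxx           kube-rbac-proxy-main            -             1m              -                15Mi
kube-state-metrics-xxxx           kube-rbac-proxy-self            -             1m              -                15Mi
node-exporter-xxxx                node-exporter                   -             8m              -                32Mi
node-exporter-xxxx                kube-rbac-proxy                 -             1m              -                15Mi
node-exporter-xxxx                init-textfile (init container)  -             1m              -                1Mi
openshift-state-metrics-xxxx      kube-rbac-proxy-main            -             1m              -                20Mi
openshift-state-metrics-xxxx      kube-rbac-proxy-self            -             1m              -                20Mi
openshift-state-metrics-xxxx      openshift-state-metrics         -             1m              -                32Mi
prometheus-adapter-xxxx           prometheus-adapter              -             1m              -                40Mi
prometheus-k8s-n                  prometheus                      -             70m             -                1Gi
prometheus-k8s-n                  config-reloader                 -             1m              -                10Mi
prometheus-k8s-n                  thanos-sidecar                  -             1m              -                25Mi
prometheus-k8s-n                  prometheus-proxy                -             1m              -                20Mi
prometheus-k8s-n                  kube-rbac-proxy                 -             1m              -                15Mi
prometheus-k8s-n                  prom-label-proxy                -             1m              -                15Mi
prometheus-k8s-n                  kube-rbac-proxy-thanos          -             1m              -                10Mi
prometheus-operator-xxxx          prometheus-operator             -             5m              -                150Mi
prometheus-operator-xxxx          kube-rbac-proxy                 -             1m              -                15Mi
telemeter-client-xxxx             telemeter-client                -             1m              -                40Mi
telemeter-client-xxxx             reload                          -             1m              -                10Mi
telemeter-client-xxxx             kube-rbac-proxy                 -             1m              -                20Mi
thanos-querier-xxxx               thanos-query                    -             10m             -                12Mi
thanos-querier-xxxx               oauth-proxy                     -             1m              -                20Mi
thanos-querier-xxxx               kube-rbac-proxy                 -             1m              -                15Mi
thanos-querier-xxxx               prom-label-proxy                -             1m              -                15Mi
thanos-querier-xxxx               kube-rbac-proxy-rules           -             1m              -                15Mi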
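
kubectl top pods reports usage per Pod, so it helps to add up the per-container requests before comparing. This is a minimal sketch, assuming input lines of the form "container CPU memory" with CPU in millicores (m) or whole cores and memory in Mi or Gi, which are the only units that appear in these results; sum_requests is a hypothetical helper name.

```shell
# sum_requests: read "<container> <cpu> <memory>" lines on stdin and print
# the total as "<millicores>m <mebibytes>Mi".
# Assumption: only the m, Mi and Gi suffixes are handled.
sum_requests() {
  awk '
    function cpu_m(q) {                 # "70m" -> 70, "1" -> 1000
      if (q ~ /m$/) return substr(q, 1, length(q) - 1) + 0
      return q * 1000
    }
    function mem_mi(q) {                # "40Mi" -> 40, "1Gi" -> 1024
      if (q ~ /Gi$/) return substr(q, 1, length(q) - 2) * 1024
      if (q ~ /Mi$/) return substr(q, 1, length(q) - 2) + 0
      return 0                          # other units are ignored here
    }
    NF >= 3 { cpu += cpu_m($2); mem += mem_mi($3) }
    END { printf "%dm %dMi\n", cpu, mem }
  '
}

# Containers of alertmanager-main-n, values taken from the results above.
sum_requests <<'EOF'
alertmanager 4m 40Mi
config-reloader 1m 10Mi
alertmanager-proxy 1m 20Mi
kube-rbac-proxy 1m 15Mi
prom-label-proxy 1m 20Mi
EOF
# prints: 8m 105Mi
```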

Actual resource usage

Output of kubectl top pods

[root@bastion openshift]# kubectl top pods -n openshift-monitoring --use-protocol-buffers 
NAME                                          CPU(cores)   MEMORY(bytes)   
alertmanager-main-0                           2m           117Mi
alertmanager-main-1                           3m           110Mi
alertmanager-main-2                           2m           104Mi
cluster-monitoring-operator-95674b95b-slbjr   9m           116Mi
grafana-5666d69fc9-d8plz                      3m           136Mi
kube-state-metrics-5f5f79ccbc-858xx           3m           117Mi
node-exporter-4bcsq                           3m           46Mi
node-exporter-68q5d                           5m           52Mi
node-exporter-6nqpj                           6m           55Mi
node-exporter-9fjj4                           3m           58Mi
node-exporter-btbb7                           4m           39Mi
node-exporter-cr6m2                           3m           48Mi
node-exporter-cwglh                           5m           46Mi
node-exporter-m6hn9                           5m           47Mi
node-exporter-mplgz                           5m           48Mi
node-exporter-qtxdl                           3m           39Mi
node-exporter-vrshn                           4m           39Mi
node-exporter-zgsmz                           2m           40Mi
openshift-state-metrics-5bbdb5896-nnx65       0m           57Mi
prometheus-adapter-7b757d8db7-gm8v2           5m           72Mi
prometheus-adapter-7b757d8db7-h5lt5           4m           74Mi
prometheus-k8s-0                              1230m        2425Mi
prometheus-k8s-1                              514m         2513Mi
prometheus-operator-7c8f55cc45-qjx6s          7m           140Mi
telemeter-client-844fdfd96-xzfm5              0m           73Mi
thanos-querier-68d474b7df-bzqdv               4m           121Mi
thanos-querier-68d474b7df-q5lmw               2m           123Mi
[root@bastion openshift]# 

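The memory usage of alertmanager-main-n and prometheus-k8s-n, and the CPU usage of prometheus-k8s-n, exceed the requests values. prometheus-k8s-n stands out: it uses 514m to 1230m of CPU and 2425Mi to 2513Mi of memory, while the requests of its containers add up to only 76m and 1119Mi. alertmanager-main-n uses 104Mi to 117Mi of memory against a total of 105Mi requested by its containers (40Mi for the alertmanager container itself).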
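
The gap can be expressed as a ratio of usage to requests. This is a minimal sketch, assuming kubectl top pods reports CPU in m and memory in Mi as in the output above; usage_vs_requests is a hypothetical helper name, and 76 and 1119 are the summed requests of the prometheus-k8s-n containers (70m + 6 x 1m, and 1Gi + 95Mi).

```shell
# usage_vs_requests CPU_REQ_M MEM_REQ_MI
# Read "kubectl top pods" rows on stdin and print usage / requests per Pod.
# Assumption: CPU is reported in m and memory in Mi.
usage_vs_requests() {
  awk -v cpu_req="$1" -v mem_req="$2" '
    $1 == "NAME" { next }               # skip the header row
    {
      cpu = $2; sub(/m$/, "", cpu)      # "1230m"  -> 1230
      mem = $3; sub(/Mi$/, "", mem)     # "2425Mi" -> 2425
      printf "%s cpu=%.1fx mem=%.1fx\n", $1, cpu / cpu_req, mem / mem_req
    }
  '
}

# Rows taken from the kubectl top pods output above.
usage_vs_requests 76 1119 <<'EOF'
prometheus-k8s-0 1230m 2425Mi
prometheus-k8s-1 514m 2513Mi
EOF
# prints:
#   prometheus-k8s-0 cpu=16.2x mem=2.2x
#   prometheus-k8s-1 cpu=6.8x mem=2.2x
```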
