LoginSignup
8
2

More than 1 year has passed since last update.

[kubernetes] metrics-serverが立ち上がらないのを解消したメモ

Posted at

環境

  • Mac OS: 11.3.1
  • Kubernetes (Docker for Mac): v1.19.7
  • metrics-server: v0.4.4

インストール

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

エラー

インストール後いつまでもPodがReadyにならない

kubectl get po -n kube-system | grep metrics-server  
metrics-server-5fbdc54f8c-4kjrw          0/1     Running   4          118s

Podの状態を見るとReadinessProbeとLivenessProbeが失敗してコンテナのRestartが起こっていることが分かる

kubectl describe po metrics-server-5fbdc54f8c-4kjrw -n kube-system
...
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  2m43s                 default-scheduler  Successfully assigned kube-system/metrics-server-5fbdc54f8c-4kjrw to docker-desktop
  Warning  Unhealthy  2m41s                 kubelet            Liveness probe failed: Get "https://10.1.0.46:4443/livez": dial tcp 10.1.0.46:4443: connect: connection refused
  Normal   Killing    111s (x2 over 2m21s)  kubelet            Container metrics-server failed liveness probe, will be restarted
  Normal   Created    110s (x3 over 2m41s)  kubelet            Created container metrics-server
  Normal   Started    110s (x3 over 2m41s)  kubelet            Started container metrics-server
  Normal   Pulled     110s (x3 over 2m41s)  kubelet            Container image "k8s.gcr.io/metrics-server/metrics-server:v0.4.4" already present on machine
  Warning  Unhealthy  101s (x6 over 2m31s)  kubelet            Liveness probe failed: HTTP probe failed with statuscode: 500
  Warning  Unhealthy  95s (x7 over 2m35s)   kubelet            Readiness probe failed: HTTP probe failed with statuscode: 500

Logを見ると、 unable to fully scrape metrics: unable to fully scrape metrics from node docker-desktop: unable to fetch metrics from node docker-desktop: Get "https://192.168.65.4:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 192.168.65.4 because it doesn't contain any IP SANs というエラーが出ている。

kubectl logs metrics-server-5fbdc54f8c-mq7sp -n kube-system
E0519 00:42:03.826071       1 server.go:132] unable to fully scrape metrics: unable to fully scrape metrics from node docker-desktop: unable to fetch metrics from node docker-desktop: Get "https://192.168.65.4:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 192.168.65.4 because it doesn't contain any IP SANs
I0519 00:42:03.837334       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0519 00:42:03.837378       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0519 00:42:03.837403       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0519 00:42:03.837418       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0519 00:42:03.837429       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0519 00:42:03.837436       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0519 00:42:03.837775       1 secure_serving.go:197] Serving securely on [::]:4443
I0519 00:42:03.838700       1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key
I0519 00:42:03.838931       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0519 00:42:03.937992       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I0519 00:42:03.938075       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I0519 00:42:03.938106       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 

対応

Metrics server issue with hostname resolution of kubelet and apiserver unable to communicate with metric-server clusterIP #131 というGithub Issueがありその中の一つのコメントに沿って

  1. metrics-serverのDeploymentを編集する

    kubectl edit deploy metrics-server -n kube-system
    
  2. args以下に変更

          - args:
            - --cert-dir=/tmp
            - --secure-port=4443
            - --v=2
            - --kubelet-insecure-tls
            - --kubelet-preferred-address-types=InternalIP
    

確認

PodがRunningとなった

kubectl get po -n kube-system | grep metrics-server 
metrics-server-796f7767bb-28xsg          1/1     Running   0          2m16s

Describeしてみても問題なく起動できている

kubectl describe po -n kube-system metrics-server-796f7767bb-28xsg
...
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  2m50s  default-scheduler  Successfully assigned kube-system/metrics-server-796f7767bb-28xsg to docker-desktop
  Normal  Pulled     2m48s  kubelet            Container image "k8s.gcr.io/metrics-server/metrics-server:v0.4.4" already present on machine
  Normal  Created    2m48s  kubelet            Created container metrics-server
  Normal  Started    2m48s  kubelet            Started container metrics-server

以下で、 kubectl top podが使えるようになってることを確認

kubectl top pod metrics-server-796f7767bb-28xsg -n kube-system
NAME                              CPU(cores)   MEMORY(bytes)   
metrics-server-796f7767bb-28xsg   5m           15Mi 
8
2
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
8
2