環境
- Mac OS: 11.3.1
- Kubernetes (Docker for Mac): v1.19.7
- metrics-server: v0.4.4
インストール
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
エラー
インストール後いつまでもPodがReadyにならない
kubectl get po -n kube-system | grep metrics-server
metrics-server-5fbdc54f8c-4kjrw 0/1 Running 4 118s
Podの状態を見るとReadinessProbeとLivenessProbeが失敗してコンテナのRestartが起こっていることが分かる
kubectl describe po metrics-server-5fbdc54f8c-4kjrw -n kube-system
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m43s default-scheduler Successfully assigned kube-system/metrics-server-5fbdc54f8c-4kjrw to docker-desktop
Warning Unhealthy 2m41s kubelet Liveness probe failed: Get "https://10.1.0.46:4443/livez": dial tcp 10.1.0.46:4443: connect: connection refused
Normal Killing 111s (x2 over 2m21s) kubelet Container metrics-server failed liveness probe, will be restarted
Normal Created 110s (x3 over 2m41s) kubelet Created container metrics-server
Normal Started 110s (x3 over 2m41s) kubelet Started container metrics-server
Normal Pulled 110s (x3 over 2m41s) kubelet Container image "k8s.gcr.io/metrics-server/metrics-server:v0.4.4" already present on machine
Warning Unhealthy 101s (x6 over 2m31s) kubelet Liveness probe failed: HTTP probe failed with statuscode: 500
Warning Unhealthy 95s (x7 over 2m35s) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500
Logを見ると、 unable to fully scrape metrics: unable to fully scrape metrics from node docker-desktop: unable to fetch metrics from node docker-desktop: Get "https://192.168.65.4:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 192.168.65.4 because it doesn't contain any IP SANs
というエラーが出ている。
kubectl logs metrics-server-5fbdc54f8c-mq7sp -n kube-system
E0519 00:42:03.826071 1 server.go:132] unable to fully scrape metrics: unable to fully scrape metrics from node docker-desktop: unable to fetch metrics from node docker-desktop: Get "https://192.168.65.4:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 192.168.65.4 because it doesn't contain any IP SANs
I0519 00:42:03.837334 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0519 00:42:03.837378 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0519 00:42:03.837403 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0519 00:42:03.837418 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0519 00:42:03.837429 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0519 00:42:03.837436 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0519 00:42:03.837775 1 secure_serving.go:197] Serving securely on [::]:4443
I0519 00:42:03.838700 1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key
I0519 00:42:03.838931 1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0519 00:42:03.937992 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0519 00:42:03.938075 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0519 00:42:03.938106 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
対応
Metrics server issue with hostname resolution of kubelet and apiserver unable to communicate with metric-server clusterIP #131 というGithub Issueがありその中の一つのコメントに沿って
-
metrics-serverのDeploymentを編集する
kubectl edit deploy metrics-server -n kube-system
-
args以下に変更
- args: - --cert-dir=/tmp - --secure-port=4443 - --v=2 - --kubelet-insecure-tls - --kubelet-preferred-address-types=InternalIP
確認
PodがRunningとなった
kubectl get po -n kube-system | grep metrics-server
metrics-server-796f7767bb-28xsg 1/1 Running 0 2m16s
Describeしてみても問題なく起動できている
kubectl describe po -n kube-system metrics-server-796f7767bb-28xsg
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m50s default-scheduler Successfully assigned kube-system/metrics-server-796f7767bb-28xsg to docker-desktop
Normal Pulled 2m48s kubelet Container image "k8s.gcr.io/metrics-server/metrics-server:v0.4.4" already present on machine
Normal Created 2m48s kubelet Created container metrics-server
Normal Started 2m48s kubelet Started container metrics-server
以下で、 kubectl top pod
が使えるようになってることを確認
kubectl top pod metrics-server-796f7767bb-28xsg -n kube-system
NAME CPU(cores) MEMORY(bytes)
metrics-server-796f7767bb-28xsg 5m 15Mi