#CoreDNS won't start on Kubernetes
##Setup
The cluster runs on Sakura Cloud with the following configuration.
Master | Worker1 | Worker2 |
---|---|---|
Debian 9 | Debian 9 | Debian 9 |
CPU: 1 | CPU: 1 | CPU: 1 |
Memory: 1 GB | Memory: 1 GB | Memory: 1 GB |
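##Kubernetes died
As of the configuration above, only the service on Worker1 was dead, so I kept deleting and recreating the service, but that did not fix it.
To fix it, I tried deleting and recreating the Deployment and the Service, and then removing both Worker1 and Worker2 from the cluster and joining them again.
The result: the service ended up dead on both Worker1 and Worker2...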
#Cause
My guess was that CoreDNS failing to start is the cause (I have not investigated it thoroughly).
First, let's look at the Pods that are created by default.
root@Master:~# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-576cbf47c7-7wqgj 0/1 ContainerCreating 0 4d19h
kube-system coredns-576cbf47c7-p8vrh 1/1 Running 0 4d19h
kube-system etcd-master 1/1 Running 0 19d
kube-system kube-apiserver-master 1/1 Running 0 19d
kube-system kube-controller-manager-master 1/1 Running 0 19d
kube-system kube-flannel-ds-amd64-5x8pd 1/1 Running 0 19d
kube-system kube-flannel-ds-amd64-fm66t 1/1 Running 0 4d18h
kube-system kube-flannel-ds-amd64-r24pl 1/1 Running 0 19d
kube-system kube-proxy-fwtvn 1/1 Running 0 4d18h
kube-system kube-proxy-mhtsl 1/1 Running 0 19d
kube-system kube-proxy-q4nk8 1/1 Running 0 19d
kube-system kube-scheduler-master 1/1 Running 0 19d
One of the coredns Pods is stuck in ContainerCreating.
kube-system coredns-576cbf47c7-7wqgj 0/1 ContainerCreating 0 4d19h
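With more Pods, a stuck one is easy to miss in the listing. Here is a minimal sketch for filtering it, assuming the default column order of `kubectl get pods --all-namespaces` shown above (NAMESPACE, NAME, READY, STATUS, ...). `not_ready_pods` is a hypothetical helper name, not a kubectl feature. It reads the listing from stdin and prints only Pods whose READY count is incomplete or whose STATUS is not Running.

```shell
#!/bin/sh
# Hypothetical helper: print "namespace/name status" for Pods that are not fully ready.
# Assumes the default column order: NAMESPACE NAME READY STATUS RESTARTS AGE.
not_ready_pods() {
  awk 'NR > 1 {
    split($3, ready, "/")            # READY looks like "0/1"
    if (ready[1] != ready[2] || $4 != "Running")
      print $1 "/" $2, $4
  }'
}

# Usage (on the master):
#   kubectl get pods --all-namespaces | not_ready_pods
```

The first line is skipped because it is the header row. Anything the filter prints is a candidate for `kubectl describe`.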
"Rebuilding the cluster will fix it, right? (lol)" With that attitude I rebuilt it, and
then both of them went down.
root@Master:~# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-576cbf47c7-7v2tg 0/1 ContainerCreating 0 2m50s
kube-system coredns-576cbf47c7-zc24d 0/1 ContainerCreating 0 2m50s
kube-system etcd-master 1/1 Running 0 2m
kube-system kube-apiserver-master 1/1 Running 0 2m1s
kube-system kube-controller-manager-master 1/1 Running 0 119s
kube-system kube-flannel-ds-amd64-f5whc 1/1 Running 0 44s
kube-system kube-flannel-ds-amd64-kt6jc 1/1 Running 0 44s
kube-system kube-flannel-ds-amd64-z97zq 1/1 Running 0 44s
kube-system kube-proxy-5nv2w 1/1 Running 0 104s
kube-system kube-proxy-ts5ch 1/1 Running 0 2m50s
kube-system kube-proxy-xb2mr 1/1 Running 0 99s
kube-system kube-scheduler-master 1/1 Running 0 2m24s
##When you're stuck, read the errors
The coredns Pods are stuck in ContainerCreating, so let's start by looking at the details.
root@Master:~# kubectl describe pods --all-namespaces
Name: coredns-576cbf47c7-7v2tg
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: worker2/172.16.10.3
Start Time: Mon, 26 Nov 2018 14:01:23 +0900
Labels: k8s-app=kube-dns
pod-template-hash=576cbf47c7
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/coredns-576cbf47c7
Containers:
coredns:
Container ID:
Image: k8s.gcr.io/coredns:1.2.2
Image ID:
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-dvqkx (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-dvqkx:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-dvqkx
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 3m42s (x8 over 4m41s) default-scheduler 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
Normal Scheduled 3m35s default-scheduler Successfully assigned kube-system/coredns-576cbf47c7-7v2tg to worker2
Warning NetworkNotReady 2m44s (x5 over 3m35s) kubelet, worker2 network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized]
Warning FailedCreatePodSandBox 2m29s kubelet, worker2 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "3e24e718c6cbce355aad4611b3096d16de0616cb69c5fe499b7c2e0a2aa218a8" network for pod "coredns-576cbf47c7-7v2tg": NetworkPlugin cni failed to set up pod "coredns-576cbf47c7-7v2tg_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.6.1/24
Warning FailedCreatePodSandBox 2m28s kubelet, worker2 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "aafe4d9bf9090962c15bbbe315200658a88b17182ef6c4e53988cde7bfcd5f50" network for pod "coredns-576cbf47c7-7v2tg": NetworkPlugin cni failed to set up pod "coredns-576cbf47c7-7v2tg_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
Warning FailedCreatePodSandBox 2m27s kubelet, worker2 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "6fb0f07923f6c503705cdddc460f3fe8b49c5f5f70daf20a42aa35cbbaa355cf" network for pod "coredns-576cbf47c7-7v2tg": NetworkPlugin cni failed to set up pod "coredns-576cbf47c7-7v2tg_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
Warning FailedCreatePodSandBox 2m26s kubelet, worker2 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "65bc520b187d7b4dce733ce81ebe60de950c241579b8cd73ac43946c7436e6d4" network for pod "coredns-576cbf47c7-7v2tg": NetworkPlugin cni failed to set up pod "coredns-576cbf47c7-7v2tg_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
Warning FailedCreatePodSandBox 2m25s kubelet, worker2 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "1ef92d7831131f4fc48f2f77d2213cb5f90b4e80fc5c92e0746196791c13902f" network for pod "coredns-576cbf47c7-7v2tg": NetworkPlugin cni failed to set up pod "coredns-576cbf47c7-7v2tg_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
Warning FailedCreatePodSandBox 2m24s kubelet, worker2 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "c55745906f29dd49560dfc4f7c4b8643f9bf1a18f05e92edef8b8ea0d28cc6ae" network for pod "coredns-576cbf47c7-7v2tg": NetworkPlugin cni failed to set up pod "coredns-576cbf47c7-7v2tg_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
Warning FailedCreatePodSandBox 2m23s kubelet, worker2 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "3b39b003072aa77fd69f3b21db1203da0ebb53da824d8d0815a49e2f075db3b2" network for pod "coredns-576cbf47c7-7v2tg": NetworkPlugin cni failed to set up pod "coredns-576cbf47c7-7v2tg_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
Warning FailedCreatePodSandBox 2m22s kubelet, worker2 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "cdb363c72a64a056c2b32b35da1592d81ed0d2f2bab5fc674aa6a057cdea718b" network for pod "coredns-576cbf47c7-7v2tg": NetworkPlugin cni failed to set up pod "coredns-576cbf47c7-7v2tg_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
Warning FailedCreatePodSandBox 2m21s kubelet, worker2 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "43e225fe1bbd1e02f41dc44bd222073ae420ccc619a836e72d5235680043015e" network for pod "coredns-576cbf47c7-7v2tg": NetworkPlugin cni failed to set up pod "coredns-576cbf47c7-7v2tg_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
Normal SandboxChanged 2m19s (x10 over 2m28s) kubelet, worker2 Pod sandbox changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 2m19s kubelet, worker2 (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "b1f5c8840f1517a7f08d59814aaaf2a2143afdea473dc0121d9cf6f71abbf562" network for pod "coredns-576cbf47c7-7v2tg": NetworkPlugin cni failed to set up pod "coredns-576cbf47c7-7v2tg_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
That's a lot of errors. Reading through them, the key part is:
NetworkPlugin cni failed to set up pod "coredns-576cbf47c7-7v2tg_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.1.1/24
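The network seems to have gotten into a messy state.
I think (probably) the error means that the "cni0" interface created by flannel already has an IP address, and that it differs from the one now being assigned.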
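That reading can be checked on a Worker by comparing the address currently on `cni0` with the one flannel expects. This is a rough sketch that assumes flannel writes the node's subnet as `FLANNEL_SUBNET` to `/run/flannel/subnet.env` (its usual location, but worth verifying on your own node). `compare_bridge_addr` and `current_cni0_addr` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: compare the address on cni0 with the one flannel expects.
#   $1 = address currently on cni0, e.g. "10.244.1.1/24" (empty if none)
#   $2 = expected address, e.g. the FLANNEL_SUBNET value
compare_bridge_addr() {
  if [ -z "$1" ]; then
    echo "cni0 has no IPv4 address"
  elif [ "$1" = "$2" ]; then
    echo "match: $1"
  else
    echo "mismatch: cni0 has $1, expected $2"
  fi
}

# Read the current IPv4 address of cni0 (prints nothing if the interface is absent).
current_cni0_addr() {
  ip -4 -o addr show dev cni0 2>/dev/null | awk '{ print $4; exit }'
}

# Usage on a worker (assumes flannel's subnet file is at this path):
#   . /run/flannel/subnet.env
#   compare_bridge_addr "$(current_cni0_addr)" "$FLANNEL_SUBNET"
```

A "mismatch" line would be consistent with the `failed to set bridge addr` events above: the bridge still carries an address from an earlier cluster.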
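##Solution
By this point you can pretty much guess what happened.
I had assumed that removing a Worker from the cluster and running `kubeadm reset` on it would leave it completely clean.
In reality, the "cni0" interface created by flannel was left behind, and that seems to have been getting in the way.
So let's delete the leftover interfaces on the Worker that has been removed from the cluster, using `ip link delete flannel.1` and `ip link delete cni0`. The interfaces in question: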
5: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
link/ether 2a:37:e7:65:d7:ad brd ff:ff:ff:ff:ff:ff
inet 10.244.2.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::2837:e7ff:fe65:d7ad/64 scope link
valid_lft forever preferred_lft forever
6: cni0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 0a:58:0a:f4:01:01 brd ff:ff:ff:ff:ff:ff
inet 10.244.1.1/24 scope global cni0
valid_lft forever preferred_lft forever
inet6 fe80::f014:26ff:fe4a:3f3c/64 scope link
valid_lft forever preferred_lft forever
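The cleanup can be sketched as a small script to run on each Worker after it has left the cluster. It assumes the interface names `cni0` and `flannel.1` seen in the output above. `remove_stale_links` and the `DRY_RUN` switch are hypothetical names introduced here for illustration. The function only deletes interfaces that actually exist, so it can safely be run again.

```shell
#!/bin/sh
# Hypothetical helper: delete leftover flannel/CNI interfaces on a worker.
# With DRY_RUN=1 it only prints the commands it would run.
remove_stale_links() {
  for dev in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "ip link delete $dev"
    elif ip link show "$dev" >/dev/null 2>&1; then
      ip link delete "$dev"
    else
      echo "$dev not present, skipping"
    fi
  done
}

# Usage on the worker, as root, after it has been removed from the cluster:
#   kubeadm reset
#   remove_stale_links cni0 flannel.1
```

Running it once with `DRY_RUN=1` first shows what would be removed before anything is touched.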
Then I built a new cluster, joined the Workers again, and...
It worked!
root@Master:~# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-576cbf47c7-8qd58 1/1 Running 0 2m38s
kube-system coredns-576cbf47c7-ggp9h 1/1 Running 0 2m38s
kube-system etcd-master 1/1 Running 0 112s
kube-system kube-apiserver-master 1/1 Running 0 2m
kube-system kube-controller-manager-master 1/1 Running 0 2m9s
kube-system kube-flannel-ds-amd64-6lfkx 1/1 Running 0 30s
kube-system kube-flannel-ds-amd64-dhscs 1/1 Running 0 30s
kube-system kube-flannel-ds-amd64-pg8h7 1/1 Running 0 30s
kube-system kube-proxy-hkzs7 1/1 Running 0 2m38s
kube-system kube-proxy-ngqr9 1/1 Running 0 82s
kube-system kube-proxy-qjvm4 1/1 Running 0 87s
kube-system kube-scheduler-master 1/1 Running 0 105s
Even after deploying a service, the containers no longer cycle between starting and stopping.
root@Worker1:~# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
95488509ba1d bc26f1ed35cf "nginx -g 'daemon of…" About an hour ago Up About an hour k8s_nginx_nginx2-6868b4858d-z22c4_default_335e2e5b-f13b-11e8-828b-9ca3ba3154aa_0
76bf7e0bb079 k8s.gcr.io/pause:3.1 "/pause" About an hour ago Up About an hour k8s_POD_nginx2-6868b4858d-z22c4_default_335e2e5b-f13b-11e8-828b-9ca3ba3154aa_0
7216efb109f0 367cdc8433a4 "/coredns -conf /etc…" About an hour ago Up About an hour k8s_coredns_coredns-576cbf47c7-ggp9h_kube-system_a0128f69-f13a-11e8-828b-9ca3ba3154aa_0
570b5908d4f9 k8s.gcr.io/pause:3.1 "/pause" About an hour ago Up About an hour k8s_POD_coredns-576cbf47c7-ggp9h_kube-system_a0128f69-f13a-11e8-828b-9ca3ba3154aa_0
e3a1e4dbc8a2 367cdc8433a4 "/coredns -conf /etc…" About an hour ago Up About an hour k8s_coredns_coredns-576cbf47c7-8qd58_kube-system_a00897df-f13a-11e8-828b-9ca3ba3154aa_0
0964ddee9476 k8s.gcr.io/pause:3.1 "/pause" About an hour ago Up About an hour k8s_POD_coredns-576cbf47c7-8qd58_kube-system_a00897df-f13a-11e8-828b-9ca3ba3154aa_0
032ea0484428 f0fad859c909 "/opt/bin/flanneld -…" About an hour ago Up About an hour k8s_kube-flannel_kube-flannel-ds-amd64-dhscs_kube-system_ec8d24cb-f13a-11e8-828b-9ca3ba3154aa_0
3c2c18d4d341 f0fad859c909 "cp -f /etc/kube-fla…" About an hour ago Exited (0) About an hour ago k8s_install-cni_kube-flannel-ds-amd64-dhscs_kube-system_ec8d24cb-f13a-11e8-828b-9ca3ba3154aa_0
49cbe9004657 k8s.gcr.io/pause:3.1 "/pause" About an hour ago Up About an hour k8s_POD_kube-flannel-ds-amd64-dhscs_kube-system_ec8d24cb-f13a-11e8-828b-9ca3ba3154aa_0
7a2fe82d651e 15e9da1ca195 "/usr/local/bin/kube…" About an hour ago Up About an hour k8s_kube-proxy_kube-proxy-qjvm4_kube-system_cab7a89d-f13a-11e8-828b-9ca3ba3154aa_0
7320287c2d27 k8s.gcr.io/pause:3.1 "/pause" About an hour ago Up About an hour k8s_POD_kube-proxy-qjvm4_kube-system_cab7a89d-f13a-11e8-828b-9ca3ba3154aa_0
##Summary
When you rebuild a cluster, make sure to delete the flannel interfaces as well!