What to check when CoreDNS won't start on Kubernetes

The other day, after swapping the network plugin on a Kubernetes cluster from calico to flannel, CoreDNS stopped starting. This post records the cause and the fix.

Concretely, the CoreDNS pods were stuck with STATUS ContainerCreating, as shown below, and never reached Running.

sho@Desktop $ kubectl get po -n kube-system                                                                       [~/workspace/k8s]
NAME                              READY   STATUS              RESTARTS   AGE
coredns-66bff467f8-mbqc6          0/1     ContainerCreating   0          6m18s
coredns-66bff467f8-xjr6j          0/1     ContainerCreating   0          4m33s
etcd-vagrant                      1/1     Running             8          21h
kube-apiserver-vagrant            1/1     Running             8          21h
kube-controller-manager-vagrant   1/1     Running             8          21h
kube-flannel-ds-amd64-gbwbm       1/1     Running             0          5m15s
kube-proxy-8zfjf                  1/1     Running             8          21h
kube-scheduler-vagrant            1/1     Running             8          21h

Describing the pod shows "networkPlugin cni failed to set up".
Apparently the creation of the cni0 interface, which should be created when flannel starts, is failing.

sho@Desktop $ k describe po -n kube-system coredns-66bff467f8-mbqc6                                         [~/workspace/k8s]
Name:                 coredns-66bff467f8-mbqc6
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 vagrant/10.0.2.15
Start Time:           Sun, 12 Apr 2020 21:48:08 +0900
Labels:               k8s-app=kube-dns
                      pod-template-hash=66bff467f8
Annotations:          <none>
Status:               Pending
IP:                   
Controlled By:        ReplicaSet/coredns-66bff467f8
Containers:
  coredns:
    Container ID:  
    Image:         k8s.gcr.io/coredns:1.6.7
    Image ID:      
    Ports:         53/UDP, 53/TCP, 9153/TCP
    Host Ports:    0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-rrhhr (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  coredns-token-rrhhr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  coredns-token-rrhhr
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     CriticalAddonsOnly
                 node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                  From               Message
  ----     ------                  ----                 ----               -------
  Normal   Scheduled               3m26s                default-scheduler  Successfully assigned kube-system/coredns-66bff467f8-mbqc6 to vagrant
  Warning  FailedCreatePodSandBox  3m24s                kubelet, vagrant   Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "414eadf5bc603455533b031b30c86a22b3b3fb8aae0b4114842d6928bc7e9a86" network for pod "coredns-66bff467f8-mbqc6": networkPlugin cni failed to set up pod "coredns-66bff467f8-mbqc6_kube-system" network: error getting ClusterInformation: connection is unauthorized: Unauthorized, failed to clean up sandbox container "414eadf5bc603455533b031b30c86a22b3b3fb8aae0b4114842d6928bc7e9a86" network for pod "coredns-66bff467f8-mbqc6": networkPlugin cni failed to teardown pod "coredns-66bff467f8-mbqc6_kube-system" network: error getting ClusterInformation: connection is unauthorized: Unauthorized]
  Normal   SandboxChanged          8s (x16 over 3m24s)  kubelet, vagrant   Pod sandbox changed, it will be killed and re-created.
sho@Desktop $ 
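When the describe output is this long, it can help to filter it down to just the failure lines. A minimal sketch (the pod name is from this article's run; substitute your own):

```shell
# Show only the warning/failure lines from the pod description
# (pod name below is specific to this run)
kubectl describe pod -n kube-system coredns-66bff467f8-mbqc6 \
  | grep -E 'Warning|Failed'
```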

Logging in to the VM and listing the network interfaces confirms that cni0 is indeed missing.
(Kubernetes is installed in a Vagrant environment; the kubectl commands are run from a Mac.)

vagrant@vagrant:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:76:b9:09 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic eth0
       valid_lft 85863sec preferred_lft 85863sec
    inet6 fe80::a00:27ff:fe76:b909/64 scope link 
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:0c:8f:7c brd ff:ff:ff:ff:ff:ff
    inet 192.168.33.11/24 brd 192.168.33.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe0c:8f7c/64 scope link 
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:e6:58:68:1b brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
6: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
    link/ether e6:89:0e:e0:86:99 brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::e489:eff:fee0:8699/64 scope link 
       valid_lft forever preferred_lft forever
vagrant@vagrant:~$ 
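Instead of scanning the full `ip a` output, you can check for the one interface directly. A small sketch via sysfs (works on any Linux node; not from the original article):

```shell
# Check whether the cni0 bridge exists by probing sysfs
if [ -d /sys/class/net/cni0 ]; then
  echo "cni0 exists"
else
  echo "cni0 is missing"
fi
```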

At first I had no idea why the creation of cni0 was failing, but after some digging I found that the cause was the network plugin used previously (calico):
it had left behind files that were getting in flannel's way.

vagrant@vagrant:~$ ll /etc/cni/net.d/
total 20
drwxr-xr-x 2 root root 4096 Apr 12 12:49 ./
drwxr-xr-x 3 root root 4096 Apr 11 14:56 ../
-rw-rw-r-- 1 root root  526 Apr 12 12:01 10-calico.conflist
-rw-r--r-- 1 root root  292 Apr 12 12:49 10-flannel.conflist
-rw------- 1 root root 2623 Apr 12 12:01 calico-kubeconfig
vagrant@vagrant:~$ 
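This listing explains the failure: the container runtime loads the lexicographically first CNI config file in `/etc/cni/net.d`, so the stale `10-calico.conflist` shadows `10-flannel.conflist` and calico's CNI plugin keeps being invoked. A quick way to see which config wins:

```shell
# The CNI runtime picks the lexicographically first config file,
# so the leftover calico config takes priority over the flannel one
ls /etc/cni/net.d/ | grep conflist | sort | head -n 1
```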

After deleting "10-calico.conflist" and "calico-kubeconfig", the CoreDNS pods' STATUS became Running.
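The cleanup can be scripted roughly as follows. The `rm` is the fix described above; restarting kubelet afterwards is an extra precaution I'm adding here (an assumption, not something the original steps required):

```shell
# Run on the node: remove the leftover calico CNI config and kubeconfig
sudo rm /etc/cni/net.d/10-calico.conflist /etc/cni/net.d/calico-kubeconfig
# Precaution (assumption): restart kubelet so the CNI config is re-read
sudo systemctl restart kubelet
```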
