Getting to "Welcome to nginx!" on a Raspberry Pi cluster turned out to be quite a struggle

I set out to build a cluster from two Raspberry Pi 4 (4GB) boards, use type: LoadBalancer via MetalLB, and get nginx running on it.

Building the cluster

↓ I used this article as a reference ↓
https://qiita.com/yasthon/items/c29d0b9ce34d66eab3ec

Configuring MetalLB

↓ I set up MetalLB with the following configuration ↓

metallb-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 10.155.0.51-10.155.0.253
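
For reference, installing MetalLB v0.9.6 and feeding it this ConfigMap looks roughly like the following (a sketch based on the upstream v0.9 install steps; adjust to however you actually installed it):

kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/metallb.yaml
# the memberlist secret is required on first install
kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"
# apply the address pool defined above
kubectl apply -f metallb-config.yaml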

And then things went wrong

For some reason, the LoadBalancer Service I created never got an External IP.
Specifically, EXTERNAL-IP just sat at pending forever.

Let's look at the MetalLB controller's logs.

ubuntu@raspi-01:~/.kube$ kubectl logs controller-64f86798cc-x8xcr -n metallb-system
{"branch":"HEAD","caller":"main.go:142","commit":"v0.9.6","msg":"MetalLB controller starting version 0.9.6 (commit v0.9.6, branch HEAD)","ts":"2021-05-07T00:17:43.878059907Z","version":"0.9.6"}
I0507 00:18:14.179787       1 trace.go:205] Trace[1298498081]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:17:44.078) (total time: 30101ms):
Trace[1298498081]: [30.101140624s] [30.101140624s] END
E0507 00:18:14.180080       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:18:14.277716       1 trace.go:205] Trace[1427131847]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:17:44.078) (total time: 30198ms):
Trace[1427131847]: [30.198637811s] [30.198637811s] END
E0507 00:18:14.277873       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get https://10.96.0.1:443/api/v1/namespaces/metallb-system/configmaps?fieldSelector=metadata.name%3Dconfig&limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:18:45.429836       1 trace.go:205] Trace[911902081]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:18:15.428) (total time: 30001ms):
Trace[911902081]: [30.001390571s] [30.001390571s] END
E0507 00:18:45.429960       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get https://10.96.0.1:443/api/v1/namespaces/metallb-system/configmaps?fieldSelector=metadata.name%3Dconfig&limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:18:45.734039       1 trace.go:205] Trace[140954425]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:18:15.732) (total time: 30001ms):
Trace[140954425]: [30.001163647s] [30.001163647s] END
E0507 00:18:45.734146       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:19:17.586176       1 trace.go:205] Trace[208240456]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:18:47.584) (total time: 30001ms):
Trace[208240456]: [30.001208258s] [30.001208258s] END
E0507 00:19:17.586324       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:19:18.130903       1 trace.go:205] Trace[1106410694]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:18:48.129) (total time: 30001ms):
Trace[1106410694]: [30.001594995s] [30.001594995s] END
E0507 00:19:18.131033       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get https://10.96.0.1:443/api/v1/namespaces/metallb-system/configmaps?fieldSelector=metadata.name%3Dconfig&limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:19:51.751033       1 trace.go:205] Trace[460128162]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:19:21.749) (total time: 30001ms):
Trace[460128162]: [30.001316224s] [30.001316224s] END
E0507 00:19:51.751166       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:19:53.936670       1 trace.go:205] Trace[683024728]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:19:23.935) (total time: 30001ms):
Trace[683024728]: [30.001505965s] [30.001505965s] END
E0507 00:19:53.936788       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get https://10.96.0.1:443/api/v1/namespaces/metallb-system/configmaps?fieldSelector=metadata.name%3Dconfig&limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

↓ This looks like the culprit ↓

Trace[683024728]: [30.001505965s] [30.001505965s] END
E0507 00:19:53.936788       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get https://10.96.0.1:443/api/v1/namespaces/metallb-system/configmaps?fieldSelector=metadata.name%3Dconfig&limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

??? A timeout ???

It looks like the controller can't reach the cluster.
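
The 10.96.0.1:443 in those errors is the ClusterIP of the kubernetes Service, i.e. the in-cluster address of the API server, which gets the first address of the default --service-cluster-ip-range (10.96.0.0/12). A quick way to confirm that (the output here is illustrative):

ubuntu@raspi-01:~$ kubectl get svc kubernetes
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   16m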

Rebuilt with different Pod/Service IP ranges → no change

The network the Raspberry Pis sit on is 10.155.0.0/16, so I wondered whether a clash with that was the cause and changed the Pod and Service IP address ranges to 172.16.0.0/16.

→ No change.

Solution

↓ Then I found this article ↓
https://blog.net.ist.i.kyoto-u.ac.jp/2019/11/06/kubernetes-%E6%97%A5%E8%A8%98-2019-11-05/

The cause was that the default --service-cluster-ip-range (10.96.0.0/12) overlapped with the --pod-network-cidr I had specified (10.100.0.0/16); as a result, there were no /24 ranges left to allocate to the nodes, and no PodCIDR was ever assigned.

Huh???

Some of the flannel pods were erroring out

ubuntu@raspi-01:~$ kubectl get pods -A
NAMESPACE        NAME                                                         READY   STATUS             RESTARTS   AGE
kube-system      coredns-558bd4d5db-6mwsx                                     0/1     Running            0          14m
kube-system      coredns-558bd4d5db-kj287                                     0/1     Running            0          14m
kube-system      etcd-raspi-01.clusters.local                                 1/1     Running            0          14m
kube-system      kube-apiserver-raspi-01.clusters.local                       1/1     Running            0          14m
kube-system      kube-controller-manager-raspi-01.clusters.local              1/1     Running            0          14m
kube-system      kube-flannel-ds-sgd6f                                        0/1     CrashLoopBackOff   5          5m47s
kube-system      kube-flannel-ds-x6gvz                                        0/1     CrashLoopBackOff   5          5m47s
kube-system      kube-proxy-8gvwb                                             1/1     Running            0          12m
kube-system      kube-proxy-qp59q                                             1/1     Running            0          14m
kube-system      kube-scheduler-raspi-01.clusters.local                       1/1     Running            0          14m
metallb-system   controller-64f86798cc-d7wbq                                  1/1     Running            0          3m20s
metallb-system   speaker-gvsck                                                1/1     Running            0          3m20s
metallb-system   speaker-j4x6s                                                1/1     Running            0          3m20s
ubuntu@raspi-01:~$ kubectl get events
LAST SEEN   TYPE     REASON                    OBJECT                                    MESSAGE
16m         Normal   NodeHasSufficientMemory   node/raspi-01.clusters.local              Node raspi-01.clusters.local status is now: NodeHasSufficientMemory
16m         Normal   NodeHasNoDiskPressure     node/raspi-01.clusters.local              Node raspi-01.clusters.local status is now: NodeHasNoDiskPressure
16m         Normal   NodeHasSufficientPID      node/raspi-01.clusters.local              Node raspi-01.clusters.local status is now: NodeHasSufficientPID
16m         Normal   NodeAllocatableEnforced   node/raspi-01.clusters.local              Updated Node Allocatable limit across pods
104s        Normal   CIDRNotAvailable          node/raspi-01.clusters.local              Node raspi-01.clusters.local status is now: CIDRNotAvailable
15m         Normal   RegisteredNode            node/raspi-01.clusters.local              Node raspi-01.clusters.local event: Registered Node raspi-01.clusters.local in Controller

So apparently the Service IP range and the Pod IP range must not overlap...
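
The CIDRNotAvailable event above is that problem showing up: the controller manager has run out of ranges to carve node PodCIDRs from. A direct way to check is to look at each node's spec.podCIDR (node list and output here are illustrative):

ubuntu@raspi-01:~$ kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR
NAME                      PODCIDR
raspi-01.clusters.local   <none>

With no PodCIDR on the node, flannel cannot obtain its subnet lease, which is why the kube-flannel-ds pods end up in CrashLoopBackOff.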

So I rebuilt the cluster, this time making sure the IP ranges don't overlap.

ubuntu@raspi-01:~$ sudo kubeadm init --apiserver-advertise-address=10.155.0.2 --pod-network-cidr=172.16.0.0/16 --service-cidr=172.18.0.0/16
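
One caveat when the Pod CIDR is changed like this (an assumption on my part, since this cluster uses flannel as its CNI): flannel's kube-flannel.yml ships with "Network": "10.244.0.0/16" in its net-conf.json, so that value needs to be edited to match --pod-network-cidr before applying the manifest. Roughly:

kube-flannel.yml (excerpt)
  # edit Network to match the --pod-network-cidr passed to kubeadm init
  net-conf.json: |
    {
      "Network": "172.16.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }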

This time the pods came up Running properly.

ubuntu@raspi-01:~$ kubectl get pods -A
NAMESPACE     NAME                                                         READY   STATUS    RESTARTS   AGE
kube-system   coredns-558bd4d5db-69th7                                     1/1     Running   0          114s
kube-system   coredns-558bd4d5db-wwmfr                                     1/1     Running   0          114s
kube-system   etcd-raspi-01.clusters.local                                 1/1     Running   0          111s
kube-system   kube-apiserver-raspi-01.clusters.local                       1/1     Running   0          111s
kube-system   kube-controller-manager-raspi-01.clusters.local              1/1     Running   0          111s
kube-system   kube-proxy-6cx9f                                             1/1     Running   0          114s
kube-system   kube-proxy-sf2q2                                             1/1     Running   0          77s
kube-system   kube-scheduler-raspi-01.clusters.local                       1/1     Running   0          111s

And after setting up MetalLB again...
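
For completeness, the "Welcome to nginx!" part itself can be reproduced with something like the following (a sketch; the deployment commands and output are illustrative, not the exact ones I ran, but the EXTERNAL-IP should come out of the 10.155.0.51-10.155.0.253 pool configured earlier):

ubuntu@raspi-01:~$ kubectl create deployment nginx --image=nginx
ubuntu@raspi-01:~$ kubectl expose deployment nginx --port=80 --type=LoadBalancer
ubuntu@raspi-01:~$ kubectl get svc nginx
NAME    TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
nginx   LoadBalancer   172.18.123.45   10.155.0.51   80:30080/TCP   12s
ubuntu@raspi-01:~$ curl -s http://10.155.0.51/ | grep title
<title>Welcome to nginx!</title>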

It took me three whole days...
