I put together a cluster from two 4GB Raspberry Pi 4s, set up MetalLB so that type: LoadBalancer Services work, and got nginx running on it.
Building the cluster
↓I followed this article↓
https://qiita.com/yasthon/items/c29d0b9ce34d66eab3ec
Configuring MetalLB
↓I configured MetalLB with the following↓
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 10.155.0.51-10.155.0.253
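The install steps themselves aren't shown here, but assuming the standard MetalLB v0.9.x manifest install from the official docs, the whole setup would look roughly like this (metallb-config.yaml is my own placeholder name for the ConfigMap above):

# Sketch of a MetalLB v0.9.x install: namespace + main manifests, the
# memberlist secret required on first install, then the address-pool ConfigMap
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/metallb.yaml
kubectl create secret generic -n metallb-system memberlist \
  --from-literal=secretkey="$(openssl rand -base64 128)"
kubectl apply -f metallb-config.yaml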
And then the trouble started
Still not giving up on the Raspberry Pi cluster, but the External-IP just won't come...! pic.twitter.com/4mo3dneDDs
— べあ🐻 (@beah_s1) May 6, 2021
For some reason, the LoadBalancer Service I created never gets an External-IP. Concretely, EXTERNAL-IP stays stuck at pending forever.
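For context, the Service being tested was something along these lines (a minimal sketch reconstructed for illustration; the exact nginx manifest isn't in the post):

apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80

With MetalLB healthy, kubectl get svc nginx should show an EXTERNAL-IP drawn from the 10.155.0.51-10.155.0.253 pool; instead it just sat at <pending>.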
Checking the MetalLB controller's logs
ubuntu@raspi-01:~/.kube$ kubectl logs controller-64f86798cc-x8xcr -n metallb-system
{"branch":"HEAD","caller":"main.go:142","commit":"v0.9.6","msg":"MetalLB controller starting version 0.9.6 (commit v0.9.6, branch HEAD)","ts":"2021-05-07T00:17:43.878059907Z","version":"0.9.6"}
I0507 00:18:14.179787 1 trace.go:205] Trace[1298498081]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:17:44.078) (total time: 30101ms):
Trace[1298498081]: [30.101140624s] [30.101140624s] END
E0507 00:18:14.180080 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:18:14.277716 1 trace.go:205] Trace[1427131847]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:17:44.078) (total time: 30198ms):
Trace[1427131847]: [30.198637811s] [30.198637811s] END
E0507 00:18:14.277873 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get https://10.96.0.1:443/api/v1/namespaces/metallb-system/configmaps?fieldSelector=metadata.name%3Dconfig&limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:18:45.429836 1 trace.go:205] Trace[911902081]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:18:15.428) (total time: 30001ms):
Trace[911902081]: [30.001390571s] [30.001390571s] END
E0507 00:18:45.429960 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get https://10.96.0.1:443/api/v1/namespaces/metallb-system/configmaps?fieldSelector=metadata.name%3Dconfig&limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:18:45.734039 1 trace.go:205] Trace[140954425]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:18:15.732) (total time: 30001ms):
Trace[140954425]: [30.001163647s] [30.001163647s] END
E0507 00:18:45.734146 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:19:17.586176 1 trace.go:205] Trace[208240456]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:18:47.584) (total time: 30001ms):
Trace[208240456]: [30.001208258s] [30.001208258s] END
E0507 00:19:17.586324 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:19:18.130903 1 trace.go:205] Trace[1106410694]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:18:48.129) (total time: 30001ms):
Trace[1106410694]: [30.001594995s] [30.001594995s] END
E0507 00:19:18.131033 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get https://10.96.0.1:443/api/v1/namespaces/metallb-system/configmaps?fieldSelector=metadata.name%3Dconfig&limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:19:51.751033 1 trace.go:205] Trace[460128162]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:19:21.749) (total time: 30001ms):
Trace[460128162]: [30.001316224s] [30.001316224s] END
E0507 00:19:51.751166 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0507 00:19:53.936670 1 trace.go:205] Trace[683024728]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167 (07-May-2021 00:19:23.935) (total time: 30001ms):
Trace[683024728]: [30.001505965s] [30.001505965s] END
E0507 00:19:53.936788 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get https://10.96.0.1:443/api/v1/namespaces/metallb-system/configmaps?fieldSelector=metadata.name%3Dconfig&limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
↓This looks like the culprit↓
Trace[683024728]: [30.001505965s] [30.001505965s] END
E0507 00:19:53.936788 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get https://10.96.0.1:443/api/v1/namespaces/metallb-system/configmaps?fieldSelector=metadata.name%3Dconfig&limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
???A timeout???
It looks like the controller can't reach the cluster's API server (the 10.96.0.1:443 in the logs is the ClusterIP of the kubernetes Service).
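A quick way to confirm that kind of symptom (my own debugging sketch, not a step from the original post) is to hit the API server's ClusterIP from a throwaway pod; if pod networking is broken, this times out exactly like the controller does:

# Spin up a one-off curl pod and probe the kubernetes Service VIP
# (-k skips TLS verification, -m 5 caps the attempt at 5 seconds)
kubectl run api-check --rm -it --restart=Never \
  --image=curlimages/curl --command -- \
  curl -k -m 5 https://10.96.0.1:443/version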
Changing the Pod/Service IP ranges and rebuilding → no change
The network the Pis are connected to is 10.155.0.0/16, so I suspected an overlap with that might be the cause and tried changing the Pod and Service IP address ranges to 172.16.0.0/16.
→ No change.
The fix
↓Then I found this article↓
https://blog.net.ist.i.kyoto-u.ac.jp/2019/11/06/kubernetes-%E6%97%A5%E8%A8%98-2019-11-05/
The cause was that the default --service-cluster-ip-range (10.96.0.0/12) overlapped with the --pod-network-cidr that was specified (10.100.0.0/16); as a result, there were no /24 ranges left to allocate to nodes, and PodCIDR never got assigned.
Huh???
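To spell the overlap out: 10.96.0.0/12 covers 10.96.0.0 through 10.111.255.255, so a pod CIDR like 10.100.0.0/16 sits entirely inside the default Service range. The node CIDR allocator then has no free /24 blocks left to hand out, nodes never get a PodCIDR, and flannel (which reads the node's PodCIDR at startup) crashes, which is exactly what showed up next.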
Some of the flannel pods were failing
ubuntu@raspi-01:~$ kubectl get pods -A
NAMESPACE        NAME                                              READY   STATUS             RESTARTS   AGE
kube-system      coredns-558bd4d5db-6mwsx                          0/1     Running            0          14m
kube-system      coredns-558bd4d5db-kj287                          0/1     Running            0          14m
kube-system      etcd-raspi-01.clusters.local                      1/1     Running            0          14m
kube-system      kube-apiserver-raspi-01.clusters.local            1/1     Running            0          14m
kube-system      kube-controller-manager-raspi-01.clusters.local   1/1     Running            0          14m
kube-system      kube-flannel-ds-sgd6f                             0/1     CrashLoopBackOff   5          5m47s
kube-system      kube-flannel-ds-x6gvz                             0/1     CrashLoopBackOff   5          5m47s
kube-system      kube-proxy-8gvwb                                  1/1     Running            0          12m
kube-system      kube-proxy-qp59q                                  1/1     Running            0          14m
kube-system      kube-scheduler-raspi-01.clusters.local            1/1     Running            0          14m
metallb-system   controller-64f86798cc-d7wbq                       1/1     Running            0          3m20s
metallb-system   speaker-gvsck                                     1/1     Running            0          3m20s
metallb-system   speaker-j4x6s                                     1/1     Running            0          3m20s
ubuntu@raspi-01:~$ kubectl get events
LAST SEEN   TYPE     REASON                    OBJECT                         MESSAGE
16m         Normal   NodeHasSufficientMemory   node/raspi-01.clusters.local   Node raspi-01.clusters.local status is now: NodeHasSufficientMemory
16m         Normal   NodeHasNoDiskPressure     node/raspi-01.clusters.local   Node raspi-01.clusters.local status is now: NodeHasNoDiskPressure
16m         Normal   NodeHasSufficientPID      node/raspi-01.clusters.local   Node raspi-01.clusters.local status is now: NodeHasSufficientPID
16m         Normal   NodeAllocatableEnforced   node/raspi-01.clusters.local   Updated Node Allocatable limit across pods
104s        Normal   CIDRNotAvailable          node/raspi-01.clusters.local   Node raspi-01.clusters.local status is now: CIDRNotAvailable
15m         Normal   RegisteredNode            node/raspi-01.clusters.local   Node raspi-01.clusters.local event: Registered Node raspi-01.clusters.local in Controller
So the Service IP range and the Pod IP range must not overlap... which presumably also explains why my earlier rebuild changed nothing: both ranges were set to the same 172.16.0.0/16, still overlapping each other.
I rebuilt the cluster once more, this time keeping the two ranges apart:
ubuntu@raspi-01:~$ sudo kubeadm init --apiserver-advertise-address=10.155.0.2 --pod-network-cidr=172.16.0.0/16 --service-cidr=172.18.0.0/16
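One step to watch with a non-default pod CIDR (not shown in the post): flannel's stock kube-flannel.yml hard-codes 10.244.0.0/16 in its net-conf.json, so that field has to be edited to match --pod-network-cidr before applying the manifest:

# Excerpt of the ConfigMap in kube-flannel.yml, with Network changed
# to match the 172.16.0.0/16 passed to kubeadm init
net-conf.json: |
  {
    "Network": "172.16.0.0/16",
    "Backend": {
      "Type": "vxlan"
    }
  }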
This time everything comes up Running properly.
ubuntu@raspi-01:~$ kubectl get pods -A
NAMESPACE     NAME                                              READY   STATUS    RESTARTS   AGE
kube-system   coredns-558bd4d5db-69th7                          1/1     Running   0          114s
kube-system   coredns-558bd4d5db-wwmfr                          1/1     Running   0          114s
kube-system   etcd-raspi-01.clusters.local                      1/1     Running   0          111s
kube-system   kube-apiserver-raspi-01.clusters.local            1/1     Running   0          111s
kube-system   kube-controller-manager-raspi-01.clusters.local   1/1     Running   0          111s
kube-system   kube-proxy-6cx9f                                  1/1     Running   0          114s
kube-system   kube-proxy-sf2q2                                  1/1     Running   0          77s
kube-system   kube-scheduler-raspi-01.clusters.local            1/1     Running   0          111s
Then, after re-applying the MetalLB config...
WOOOO IT'S HEEERE pic.twitter.com/9WMp7gyCEa
— べあ🐻 (@beah_s1) May 7, 2021
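For completeness, "it's here" means the Service finally got an address out of the MetalLB pool; kubectl get svc at this point shows something along these lines (illustrative values, not a capture from the cluster):

NAME    TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
nginx   LoadBalancer   172.18.x.x    10.155.0.51   80:3xxxx/TCP   1m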
That ate up three whole days...