
Experimenting with Building a k8s Environment Using Kubeadm on Ubuntu (4)

Posted at 2024-11-03

Continued from Experimenting with Building a k8s Environment Using Kubeadm on Ubuntu (3).

Last time ended up being entirely about repairing a cluster that had gotten into a bad state, so this time I finally want to add worker node 2 from scratch.
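As an aside, the join command used below can be regenerated on the master node at any time with a standard kubeadm subcommand; the token and hash in this article are redacted:

# On the master node: print a join command with a fresh token
sudo kubeadm token create --print-join-command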

ubuntu@Worker-Node2:~$ sudo kubeadm join 10.0.11.67:6443 --token <TOKEN> --discovery-token-ca-cert-hash sha256:<HASH>
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 524.612332ms
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Since the necessary setup was already done in the previous parts, the node joined the cluster with the newly issued join command.
Let's confirm.

ubuntu@Worker-Node2:~$ kubectl get nodes
E1103 18:11:38.097921   66164 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://10.0.11.67:6443/api?timeout=32s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
E1103 18:11:38.107360   66164 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://10.0.11.67:6443/api?timeout=32s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
E1103 18:11:38.115637   66164 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://10.0.11.67:6443/api?timeout=32s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
E1103 18:11:38.124100   66164 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://10.0.11.67:6443/api?timeout=32s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
E1103 18:11:38.132349   66164 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://10.0.11.67:6443/api?timeout=32s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
Unable to connect to the server: tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

Hmm, that's odd.
Let me check from the master node.

ubuntu@Master-Node:~$ kubectl get nodes
NAME           STATUS     ROLES           AGE     VERSION
master-node    Ready      control-plane   18h     v1.31.2
worker-node    Ready      <none>          16h     v1.31.2
worker-node2   NotReady   <none>          4m10s   v1.31.2

The STATUS is NotReady.
Looking at .kube/config, the client-key-data differs from the one on the master node.
I change it to match the master node's, then check again.

ubuntu@Worker-Node2:~$ kubectl get nodes
error: tls: private key does not match public key
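This error means the client certificate and client key embedded in .kube/config are no longer a matching pair, which fits, since I only swapped in the key. One way to confirm (a sketch using standard openssl commands; the temporary paths are my own) is to compare the public keys derived from each:

# Pull the client cert and key out of the kubeconfig
grep client-certificate-data $HOME/.kube/config | awk '{print $2}' | base64 -d > /tmp/client.crt
grep client-key-data $HOME/.kube/config | awk '{print $2}' | base64 -d > /tmp/client.key
# The two digests must match for a valid pair; here they will not
openssl x509 -in /tmp/client.crt -pubkey -noout | openssl md5
openssl pkey -in /tmp/client.key -pubout | openssl md5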

Hmm. I'd like to remove worker-node2 from the cluster once and start over.
On the master node, run the following commands.

ubuntu@Master-Node:~$ kubectl drain worker-node2 --force --ignore-daemonsets
node/worker-node2 cordoned
Warning: ignoring DaemonSet-managed Pods: kube-system/kube-proxy-z72lc
node/worker-node2 drained
ubuntu@Master-Node:~$ kubectl get nodes
NAME           STATUS                        ROLES           AGE   VERSION
master-node    Ready                         control-plane   19h   v1.31.2
worker-node    Ready                         <none>          17h   v1.31.2
worker-node2   NotReady,SchedulingDisabled   <none>          22m   v1.31.2
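The SchedulingDisabled status comes from the cordon that drain applies. If I had changed my mind at this point, it could be undone with a standard command:

# Only if you decide to keep the node after all:
kubectl uncordon worker-node2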

On worker node 2, run the following command.

ubuntu@Worker-Node2:~$ sudo kubeadm reset
W1103 18:33:03.636816   68759 preflight.go:56] [reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W1103 18:33:05.753008   68759 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Deleted contents of the etcd data directory: /var/lib/etcd
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /var/lib/kubelet /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
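As the output says, kubeadm reset leaves a few things behind that must be cleaned up by hand. Roughly, following the messages above (the iptables flush is the usual blunt recipe; skip it if you have other rules you care about):

# On worker-node2, after kubeadm reset:
sudo rm -rf /etc/cni/net.d                    # leftover CNI configuration
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X
rm -f $HOME/.kube/config                      # stale kubeconfig, if any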

Then delete worker-node2 from the master node.

ubuntu@Master-Node:~$ kubectl delete node worker-node2
node "worker-node2" deleted
ubuntu@Master-Node:~$ kubectl get nodes
NAME          STATUS   ROLES           AGE   VERSION
master-node   Ready    control-plane   19h   v1.31.2
worker-node   Ready    <none>          17h   v1.31.2

I ran join again, but nothing changed.
On inspection, /etc/kubernetes/admin.conf does not exist on the worker node.
I copy the one from the master node over, then run this command.

ubuntu@Worker-Node2:~$  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
cp: overwrite '/home/ubuntu/.kube/config'? y
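The standard kubeadm setup steps also fix the ownership so that kubectl can read the file without sudo:

sudo chown $(id -u):$(id -g) $HOME/.kube/config

(Strictly speaking, copying admin.conf to a worker just to run kubectl there is a convenience for this experiment; normally kubectl would be run from the control plane or an admin machine.)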

Let's check.

ubuntu@Worker-Node2:~$ kubectl get nodes
NAME           STATUS     ROLES           AGE    VERSION
master-node    Ready      control-plane   19h    v1.31.2
worker-node    Ready      <none>          17h    v1.31.2
worker-node2   NotReady   <none>          9m2s   v1.31.2

Why is worker-node2 NotReady? Let's describe the node.

ubuntu@Worker-Node2:~$ kubectl describe node worker-node2
Name:               worker-node2
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=worker-node2
                    kubernetes.io/os=linux
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sun, 03 Nov 2024 18:55:10 +0000
Taints:             node.kubernetes.io/not-ready:NoExecute
                    node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  worker-node2
  AcquireTime:     <unset>
  RenewTime:       Sun, 03 Nov 2024 19:15:44 +0000
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Sun, 03 Nov 2024 19:14:22 +0000   Sun, 03 Nov 2024 18:55:10 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Sun, 03 Nov 2024 19:14:22 +0000   Sun, 03 Nov 2024 18:55:10 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Sun, 03 Nov 2024 19:14:22 +0000   Sun, 03 Nov 2024 18:55:10 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Sun, 03 Nov 2024 19:14:22 +0000   Sun, 03 Nov 2024 18:55:10 +0000   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
Addresses:
  InternalIP:  10.0.12.218
  Hostname:    worker-node2
Capacity:
  cpu:                2
  ephemeral-storage:  7034376Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             936100Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  6482880911
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             833700Ki
  pods:               110
System Info:
  Machine ID:                 ec29f53375b92e24d6186a6cad778ae0
  System UUID:                ec29f533-75b9-2e24-d618-6a6cad778ae0
  Boot ID:                    e39c4aa5-dda2-493d-94b3-130e61668978
  Kernel Version:             6.8.0-1016-aws
  OS Image:                   Ubuntu 24.04.1 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://2.0.0-rc.6
  Kubelet Version:            v1.31.2
  Kube-Proxy Version:         v1.31.2
PodCIDR:                      192.168.4.0/24
PodCIDRs:                     192.168.4.0/24
Non-terminated Pods:          (1 in total)
  Namespace                   Name                CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                ------------  ----------  ---------------  -------------  ---
  kube-system                 kube-proxy-hk462    0 (0%)        0 (0%)      0 (0%)           0 (0%)         20m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests  Limits
  --------           --------  ------
  cpu                0 (0%)    0 (0%)
  memory             0 (0%)    0 (0%)
  ephemeral-storage  0 (0%)    0 (0%)
  hugepages-1Gi      0 (0%)    0 (0%)
  hugepages-2Mi      0 (0%)    0 (0%)
Events:
  Type     Reason                   Age                From             Message
  ----     ------                   ----               ----             -------
  Normal   Starting                 51m                kubelet          Starting kubelet.
  Warning  InvalidDiskCapacity      51m                kubelet          invalid capacity 0 on image filesystem
  Normal   NodeAllocatableEnforced  51m                kubelet          Updated Node Allocatable limit across pods
  Normal   NodeHasSufficientMemory  51m                kubelet          Node worker-node2 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    51m                kubelet          Node worker-node2 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     51m                kubelet          Node worker-node2 status is now: NodeHasSufficientPID
  Normal   NodeNotSchedulable       44m                kubelet          Node worker-node2 status is now: NodeNotSchedulable
  Warning  InvalidDiskCapacity      37m                kubelet          invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  37m (x2 over 37m)  kubelet          Node worker-node2 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    37m (x2 over 37m)  kubelet          Node worker-node2 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     37m (x2 over 37m)  kubelet          Node worker-node2 status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  37m                kubelet          Updated Node Allocatable limit across pods
  Normal   Starting                 31m                kubelet          Starting kubelet.
  Warning  InvalidDiskCapacity      31m                kubelet          invalid capacity 0 on image filesystem
  Normal   NodeAllocatableEnforced  31m                kubelet          Updated Node Allocatable limit across pods
  Normal   NodeNotSchedulable       23m                kubelet          Node worker-node2 status is now: NodeNotSchedulable
  Normal   NodeSchedulable          22m                kubelet          Node worker-node2 status is now: NodeSchedulable
  Normal   NodeHasSufficientMemory  22m (x7 over 31m)  kubelet          Node worker-node2 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    22m (x7 over 31m)  kubelet          Node worker-node2 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     22m (x7 over 31m)  kubelet          Node worker-node2 status is now: NodeHasSufficientPID
  Warning  InvalidDiskCapacity      20m                kubelet          invalid capacity 0 on image filesystem
  Normal   Starting                 20m                kubelet          Starting kubelet.
  Normal   NodeHasNoDiskPressure    20m (x2 over 20m)  kubelet          Node worker-node2 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientMemory  20m (x2 over 20m)  kubelet          Node worker-node2 status is now: NodeHasSufficientMemory
  Normal   NodeHasSufficientPID     20m (x2 over 20m)  kubelet          Node worker-node2 status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  20m                kubelet          Updated Node Allocatable limit across pods
  Normal   RegisteredNode           20m                node-controller  Node worker-node2 event: Registered Node worker-node2 in Controller
  Normal   Starting                 11m                kubelet          Starting kubelet.
  Warning  InvalidDiskCapacity      11m                kubelet          invalid capacity 0 on image filesystem
  Normal   NodeAllocatableEnforced  11m                kubelet          Updated Node Allocatable limit across pods
  Normal   NodeHasSufficientMemory  11m                kubelet          Node worker-node2 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    11m                kubelet          Node worker-node2 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     11m                kubelet          Node worker-node2 status is now: NodeHasSufficientPID

For comparison, this is what the healthy worker-node looks like. (I ran the describe for worker-node from worker-node2.)

ubuntu@Worker-Node2:~$ kubectl describe node worker-node
Name:               worker-node
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=worker-node
                    kubernetes.io/os=linux
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sun, 03 Nov 2024 01:20:35 +0000
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  worker-node
  AcquireTime:     <unset>
  RenewTime:       Sun, 03 Nov 2024 19:20:06 +0000
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Sun, 03 Nov 2024 19:16:57 +0000   Sun, 03 Nov 2024 01:20:35 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Sun, 03 Nov 2024 19:16:57 +0000   Sun, 03 Nov 2024 01:20:35 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Sun, 03 Nov 2024 19:16:57 +0000   Sun, 03 Nov 2024 01:20:35 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Sun, 03 Nov 2024 19:16:57 +0000   Sun, 03 Nov 2024 01:20:36 +0000   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.0.11.173
  Hostname:    worker-node
Capacity:
  cpu:                2
  ephemeral-storage:  7034376Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             936104Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  6482880911
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             833704Ki
  pods:               110
System Info:
  Machine ID:                 ec2ee19a652380319af6e6df73b52a9d
  System UUID:                ec2ee19a-6523-8031-9af6-e6df73b52a9d
  Boot ID:                    692f111f-94a0-4cd7-9bec-8f8003122da5
  Kernel Version:             6.8.0-1017-aws
  OS Image:                   Ubuntu 24.04.1 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://2.0.0-rc.6
  Kubelet Version:            v1.31.2
  Kube-Proxy Version:         v1.31.2
PodCIDR:                      192.168.1.0/24
PodCIDRs:                     192.168.1.0/24
Non-terminated Pods:          (1 in total)
  Namespace                   Name                CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                ------------  ----------  ---------------  -------------  ---
  kube-system                 kube-proxy-qdpkp    0 (0%)        0 (0%)      0 (0%)           0 (0%)         17h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests  Limits
  --------           --------  ------
  cpu                0 (0%)    0 (0%)
  memory             0 (0%)    0 (0%)
  ephemeral-storage  0 (0%)    0 (0%)
  hugepages-1Gi      0 (0%)    0 (0%)
  hugepages-2Mi      0 (0%)    0 (0%)
Events:              <none>

All sorts of things show up in Events. The most telling line, though, seems to be the Ready condition above: "container runtime network not ready: NetworkReady=false ... cni plugin not initialized". So the CNI plugin looks like the culprit. How do I get this node to Ready? What exactly is wrong?
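Given that message, a plausible first check (a sketch with standard paths and commands, not a verified fix) would be whether any CNI configuration and the network plugin's pod actually exist on this node:

# On worker-node2: is there any CNI configuration at the standard path?
ls /etc/cni/net.d/
# From a node with a working kubeconfig: is the CNI plugin's pod
# (e.g. a Calico or Flannel DaemonSet pod) scheduled on worker-node2?
kubectl get pods -n kube-system -o wide | grep worker-node2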

(References)
https://monowar-mukul.medium.com/kubernetes-remove-worker-node-from-the-cluster-and-completely-uninstall-af41e00c1244
https://komodor.com/learn/how-to-fix-kubernetes-node-not-ready-error/

Continued in Experimenting with Building a k8s Environment Using Kubeadm on Ubuntu (5).
