Experimenting with Building a k8s Environment with Kubeadm on Ubuntu (3)

Posted at 2024-11-03

This is a continuation of Experimenting with Building a k8s Environment with Kubeadm on Ubuntu (2).

When I tried to add a second worker node, kubeadm join failed with an error.

 kubeadm join 10.0.11.67:6443 --token <TOKEN> \
        --discovery-token-ca-cert-hash sha256:<HASH>
[preflight] Running pre-flight checks
error execution phase preflight: couldn't validate the identity of the API Server: failed to request the cluster-info ConfigMap: Get "https://10.0.11.67:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": context deadline exceeded
To see the stack trace of this error execute with --v=5 or higher

10.0.11.67 is the master node's IP address.
Has the master node broken down?
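Before assuming the worst, it is worth checking whether the API server port is reachable at all from the worker. A quick sketch of such checks (these commands are my addition, not part of the original session; /healthz is served to unauthenticated clients by default):

nc -vz 10.0.11.67 6443                    # does anything answer on the API server port?
curl -k https://10.0.11.67:6443/healthz   # TLS handshake plus the health endpoint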

Far from adding a new worker node, the master node itself seems to have gone wrong.
I'll give up on adding the new worker for now; the master node has to be fixed somehow first.

On the master node, run sudo kubeadm init again.
If kube-apiserver, kube-controller-manager, kube-scheduler, or etcd are still running, kill them first with sudo kill -9 <PID>.
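A minimal sketch of that cleanup (my commands, not from the original session; note that a running kubelet may simply restart these static pods, so stopping it first is safer):

sudo systemctl stop kubelet
ps aux | grep -E 'kube-apiserver|kube-controller-manager|kube-scheduler|etcd'
sudo pkill -9 -f 'kube-apiserver|kube-controller-manager|kube-scheduler|etcd'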

ubuntu@Master-Node:~$ sudo kubeadm init --pod-network-cidr=192.168.0.0/16
[init] Using Kubernetes version: v1.31.2
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
        [ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

Passing the preflight checks might have been enough, but since I didn't know what to give the option, I deleted each of the offending files instead.
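For the record, the names inside the [ERROR ...] brackets appear to be exactly what --ignore-preflight-errors expects, so something like the following would presumably also get past the checks (untested here):

# list every failing check by the name shown in its [ERROR ...] label, e.g.:
sudo kubeadm init --pod-network-cidr=192.168.0.0/16 \
    --ignore-preflight-errors=FileAvailable--etc-kubernetes-manifests-etcd.yaml,DirAvailable--var-lib-etcd

In hindsight, sudo kubeadm reset on the master would also have removed these manifests and /var/lib/etcd in one step. The manual deletions I actually ran: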

ubuntu@Master-Node:~$ sudo rm /etc/kubernetes/manifests/kube-apiserver.yaml
ubuntu@Master-Node:~$ sudo rm /etc/kubernetes/manifests/kube-controller-manager.yaml
ubuntu@Master-Node:~$ sudo rm /etc/kubernetes/manifests/kube-scheduler.yaml
ubuntu@Master-Node:~$ sudo rm /etc/kubernetes/manifests/etcd.yaml
ubuntu@Master-Node:~$ sudo rm -rf /var/lib/etcd

Run kubeadm init once more.

ubuntu@Master-Node:~$ sudo kubeadm init --pod-network-cidr=192.168.0.0/16
[init] Using Kubernetes version: v1.31.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
W1102 23:24:18.467182   11919 checks.go:846] detected that the sandbox image "registry.k8s.io/pause:3.8" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "registry.k8s.io/pause:3.10" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/super-admin.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/scheduler.conf"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 501.670093ms
[api-check] Waiting for a healthy API server. This can take up to 4m0s
[api-check] The API server is healthy after 20.502683967s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master-node as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node master-node as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: <TOKEN>
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.0.11.67:6443 --token <TOKEN> \
        --discovery-token-ca-cert-hash sha256:<HASH>

Run this command from the init output:

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
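Note that this is only the second of the three lines from the init output; if ~/.kube does not exist yet, the full sequence is:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config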

It seems the master node is fixed, more or less.

$ kubectl get nodes
NAME          STATUS   ROLES           AGE     VERSION
master-node   Ready    control-plane   5m14s   v1.31.2
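As an aside, if the join command from the init output is ever lost, it can be regenerated on the master with a standard kubeadm subcommand (not shown in the original session):

kubeadm token create --print-join-command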

Now, let's try running kubeadm join on worker node 1.

ubuntu@Worker-Node:~$ sudo kubeadm join 10.0.11.67:6443 --token <TOKEN>         --discovery-token-ca-cert-hash sha256:<HASH>
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
        [ERROR Port-10250]: Port 10250 is in use
        [ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

Delete the files it complains about.

ubuntu@Worker-Node:~$ sudo rm -rf /etc/kubernetes/kubelet.conf
ubuntu@Worker-Node:~$ sudo rm -rf /etc/kubernetes/pki/ca.crt

Port 10250 is being used by the kubelet.
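One way to confirm what is listening on that port (my addition; assumes the ss tool from iproute2 is installed):

sudo ss -tlnp | grep 10250

So stop the kubelet: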

ubuntu@Worker-Node:~$ sudo systemctl stop kubelet

Run the join once more.

ubuntu@Worker-Node:~$ sudo kubeadm join 10.0.11.67:6443 --token <TOKEN>         --discovery-token-ca-cert-hash sha256:<HASH>
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 502.248124ms
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap
error execution phase kubelet-start: error uploading crisocket: Unauthorized
To see the stack trace of this error execute with --v=5 or higher

At this point, let's run kubectl get nodes on the worker.

ubuntu@Worker-Node:~$ kubectl get nodes
E1102 23:57:03.708258    2443 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://10.0.11.67:6443/api?timeout=32s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
E1102 23:57:03.714075    2443 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://10.0.11.67:6443/api?timeout=32s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
E1102 23:57:03.718729    2443 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://10.0.11.67:6443/api?timeout=32s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
E1102 23:57:03.723403    2443 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://10.0.11.67:6443/api?timeout=32s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
E1102 23:57:03.728156    2443 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://10.0.11.67:6443/api?timeout=32s\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
Unable to connect to the server: tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

No good.
Judging from the errors, the worker's kubeconfig still pointed at the old cluster's CA, so the new API server's certificate could not be verified.
I copied admin.conf over from the master node.
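The copy itself isn't shown above; here is a minimal sketch of one way to do it (the /tmp staging path is my assumption; admin.conf on the master is readable only by root, and the staged copy grants cluster-admin access, so it should be removed afterwards):

# on the master: stage a readable copy
sudo cp /etc/kubernetes/admin.conf /tmp/admin.conf && sudo chmod 644 /tmp/admin.conf
# on the worker: pull it to where the commands below expect it
scp ubuntu@10.0.11.67:/tmp/admin.conf /tmp/admin.conf
sudo mv /tmp/admin.conf /etc/kubernetes/admin.conf

I then ran these commands as well: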

ubuntu@Worker-Node:~$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
ubuntu@Worker-Node:~$ sudo rm /etc/kubernetes/kubelet.conf
ubuntu@Worker-Node:~$ sudo rm /etc/kubernetes/pki/ca.crt
ubuntu@Worker-Node:~$ sudo systemctl stop kubelet

Running the join command again produced this error:

error execution phase kubelet-start: error uploading crisocket: Unauthorized

When I reran it with --v=5 added, it looked like containerd's socket file was getting in the way.
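That is, the same join command with verbose logging (token and hash redacted as before):

sudo kubeadm join 10.0.11.67:6443 --token <TOKEN> \
    --discovery-token-ca-cert-hash sha256:<HASH> --v=5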

I1103 00:17:53.042825    3287 initconfiguration.go:123] detected and using CRI socket: unix:///var/run/containerd/containerd.sock

Delete it. (Restarting containerd recreates the socket.)

ubuntu@Worker-Node:~$ sudo rm /var/run/containerd/containerd.sock
ubuntu@Worker-Node:~$ sudo systemctl restart containerd

Trying the join again still resulted in an error, so I ran these commands:

ubuntu@Worker-Node:~$ sudo swapoff -a    # will turn off the swap 
ubuntu@Worker-Node:~$ sudo kubeadm reset
ubuntu@Worker-Node:~$ sudo systemctl daemon-reload
ubuntu@Worker-Node:~$ sudo systemctl restart kubelet
ubuntu@Worker-Node:~$ sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X
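Before retrying the join, a few quick sanity checks can confirm the node is actually clean (my additions, not part of the original session):

sudo ls /etc/kubernetes             # kubelet.conf and pki/ca.crt should be gone
swapon --show                       # prints nothing if swap is fully off
sudo iptables -S | head             # only the default -P policy lines should remain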

Now, will the join finally work?

ubuntu@Worker-Node:~$ sudo kubeadm join 10.0.11.67:6443 --token <TOKEN>         --discovery-token-ca-cert-hash sha256:<HASH>
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.001234209s
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

It worked!
Now let's confirm.

ubuntu@Worker-Node:~$ kubectl get nodes
NAME          STATUS   ROLES           AGE    VERSION
master-node   Ready    control-plane   116m   v1.31.2
worker-node   Ready    <none>          13s    v1.31.2

I had intended to add one more node, but instead the whole session went to fixing the existing breakage.

(Reference) Creating a cluster with kubeadm
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
(Reference) Kubernetes error: error uploading crisocket: timed out waiting for the condition (Stack Overflow)
https://stackoverflow.com/questions/53525975/kubernetes-error-uploading-crisocket-timed-out-waiting-for-the-condition/54540512#54540512

Continued in Experimenting with Building a k8s Environment with Kubeadm on Ubuntu (4).
