
Resolving OAuthServerRouteEndpointAccessibleControllerAvailable during an OpenShift 4.12 installation


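When I tried to stand up OpenShift temporarily for evaluation on low-spec machines, the installation often reached an inconsistent state partway through and stopped making progress. I suspect the machines lacked the processing power for the containers to coordinate with each other, but I have not confirmed this.

This article does not identify the cause or describe a root-cause fix, because I have not been able to investigate either. It records the procedure that cleared OAuthServerRouteEndpointAccessibleControllerAvailable so that the installation could proceed.

An OpenShift installation roughly follows the flow below. As noted above, on higher-spec machines the installation may well go through without much attention to the order in which the nodes are started.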

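1. Start the bootstrap node
2. Start the master nodes
3. Approve the master nodes (approve csr)
4. Check the bootstrap progress
5. Stop load balancing to the bootstrap node
6. Start the worker nodes
7. Approve the worker nodes (approve csr)
8. Check the installation progress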

Approving the master nodes

Some time after the master nodes start, CSRs in the Pending state appear as shown below. Approve each of them.

$ oc get csr|grep -v Approved
NAME        AGE    SIGNERNAME                                    REQUESTOR                                                                   REQUESTEDDURATION   CONDITION
csr-trxr4   102m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Pending
csr-6zx9n   111m   kubernetes.io/kubelet-serving                 system:node:master02                                                        <none>              Pending
csr-q5fs5   117m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Pending
(output omitted)
$ oc adm certificate approve csr-dklhf
certificatesigningrequest.certificates.k8s.io/csr-dklhf approved
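
Approving CSRs one at a time gets tedious when many are pending. The following is a minimal sketch of a helper that picks the pending names out of the `oc get csr` table. The function name `pending_csr_names` is hypothetical, and the sketch assumes the CONDITION column is the last column, as in the output above.

```shell
# Hypothetical helper (not part of oc): reads `oc get csr` table output
# on stdin and prints the NAME of every row whose last column
# (CONDITION) is exactly "Pending". The header row and rows such as
# "Approved,Issued" do not match and are skipped.
pending_csr_names() {
  awk '$NF == "Pending" { print $1 }'
}

# Usage sketch (review the list before approving; -r assumes GNU xargs
# and skips the approve call when nothing is pending):
#   oc get csr | pending_csr_names
#   oc get csr | pending_csr_names | xargs -r oc adm certificate approve
```

New CSRs can appear after earlier ones are approved, so the command may need to be repeated until no Pending entries remain.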

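Checking the bootstrap progress

Some time after the master node CSRs are approved, the command prints "DEBUG Bootstrap status: complete" near the end of its output, as shown below. At that point, stop load balancing to the bootstrap node (the installer reports that it is safe to remove the bootstrap resources).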

$ openshift-install --dir=work/openshift wait-for bootstrap-complete --log-level=debug
DEBUG OpenShift Installer 4.12.10                  
DEBUG Built from commit 5ba9890f636459b49a927a3bcb35a81929a0fc73 
INFO Waiting up to 20m0s (until 2:17PM) for the Kubernetes API at https://api.cluster.example.local:6443... 
DEBUG Loading Agent Config...
DEBUG Still waiting for the Kubernetes API: the server has asked for the client to provide credentials 
DEBUG Still waiting for the Kubernetes API: the server has asked for the client to provide credentials 
DEBUG Still waiting for the Kubernetes API: the server has asked for the client to provide credentials 
DEBUG Still waiting for the Kubernetes API: the server has asked for the client to provide credentials 
INFO API v1.25.7+eab9cc9 up                       
DEBUG Loading Install Config...                    
DEBUG   Loading SSH Key...                         
DEBUG   Loading Base Domain...                     
DEBUG     Loading Platform...                      
DEBUG   Loading Cluster Name...                    
DEBUG     Loading Base Domain...                   
DEBUG     Loading Platform...                      
DEBUG   Loading Networking...                      
DEBUG     Loading Platform...                      
DEBUG   Loading Pull Secret...                     
DEBUG   Loading Platform...      
DEBUG Using Install Config loaded from state file  
INFO Waiting up to 30m0s (until 2:27PM) for bootstrapping to complete... 
DEBUG Bootstrap status: complete                   
INFO It is now safe to remove the bootstrap resources 
INFO Time elapsed: 0s                             
$ 

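Approving the worker nodes

Same procedure as for the master nodes.

Checking the installation progress

The installation stopped making progress at the state below and eventually ended with a timeout error.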

$ openshift-install --dir=work/openshift wait-for install-complete --log-level=debug
DEBUG OpenShift Installer 4.12.10                  
DEBUG Built from commit 5ba9890f636459b49a927a3bcb35a81929a0fc73 
DEBUG Loading Install Config...                    
DEBUG   Loading SSH Key...                         
DEBUG   Loading Base Domain...                     
DEBUG     Loading Platform...                      
DEBUG   Loading Cluster Name...                    
DEBUG     Loading Base Domain...                   
DEBUG     Loading Platform...                      
DEBUG   Loading Networking...                      
DEBUG     Loading Platform...                      
DEBUG   Loading Pull Secret...                     
DEBUG   Loading Platform...                        
DEBUG Using Install Config loaded from state file  
DEBUG Loading Agent Config...                      
INFO Waiting up to 40m0s (until 3:32PM) for the cluster at https://api.cluster.example.local:6443 to initialize...                       

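Checking the ClusterOperator resources

As shown below, three operators (authentication, console, and ingress) appeared not to be working correctly.
From the various logs, I inferred that they probably depend on each other in the order console -> authentication -> ingress.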

$ oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.12.10   False       False         True       104m    OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.cluster.example.local/healthz": EOF
baremetal                                  4.12.10   True        False         False      103m    
cloud-controller-manager                   4.12.10   True        False         False      105m    
cloud-credential                           4.12.10   True        False         False      113m    
cluster-autoscaler                         4.12.10   True        False         False      101m    
config-operator                            4.12.10   True        False         False      104m    
console                                    4.12.10   False       True          False      84m     DeploymentAvailable: 0 replicas available for console deployment...
control-plane-machine-set                  4.12.10   True        False         False      101m    
csi-snapshot-controller                    4.12.10   True        False         False      99m     
dns                                        4.12.10   True        False         False      99m     
etcd                                       4.12.10   True        False         False      9m1s    
image-registry                             4.12.10   True        False         False      81m     
ingress                                    4.12.10   True        False         True       89m     The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)
insights                                   4.12.10   True        False         False      91m     
kube-apiserver                             4.12.10   True        False         False      81m     
kube-controller-manager                    4.12.10   True        False         False      99m     
kube-scheduler                             4.12.10   True        False         False      101m    
kube-storage-version-migrator              4.12.10   True        False         False      99m     
machine-api                                4.12.10   True        False         False      102m    
machine-approver                           4.12.10   True        False         False      103m    
machine-config                             4.12.10   True        False         False      19m     
(output omitted)
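
Spotting the unhealthy operators in a long table is easier with a filter. The sketch below prints every operator that is not in the steady state AVAILABLE=True, PROGRESSING=False, DEGRADED=False. The function name `unhealthy_operators` is hypothetical, and the sketch assumes the VERSION column is populated on every row, as in the output above; rows with an empty VERSION would shift the columns and be misread.

```shell
# Hypothetical helper (not part of oc): reads `oc get co` table output
# on stdin and prints the NAME of each operator whose
# AVAILABLE/PROGRESSING/DEGRADED columns are not True/False/False.
# Assumes columns 3-5 are AVAILABLE, PROGRESSING, DEGRADED.
unhealthy_operators() {
  awk '$1 != "NAME" && NF >= 5 &&
       ($3 != "True" || $4 != "False" || $5 != "False") { print $1 }'
}

# Usage sketch:
#   oc get co | unhealthy_operators
```

Run against the table above, this filter would list authentication, console, and ingress among the rows shown.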

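After some trial and error, I found that deliberately deleting one of the router-default-XXX pods below, so that it was recreated, moved the ClusterOperators above back to a healthy state.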

$ oc get pods -n openshift-ingress
NAME                             READY   STATUS    RESTARTS   AGE
router-default-98bf9dc84-xbrhn   1/1     Running   0          3h11m
router-default-98bf9dc84-zc67g   1/1     Running   0          87m
$ oc delete pods router-default-98bf9dc84-zc67g -n openshift-ingress
pod "router-default-98bf9dc84-zc67g" deleted
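
After the pod is recreated, recovery is not instant, so polling can be more convenient than rerunning checks by hand. The following is a minimal sketch of a generic polling helper; the name `wait_until` and its argument order are hypothetical. The usage line assumes the workstation can resolve and reach the same health endpoint that the authentication operator reported as failing.

```shell
# Hypothetical helper: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it succeeds (returns 0) or until
# TIMEOUT_SECONDS have elapsed (returns 1). Written for POSIX sh.
wait_until() {
  _wu_timeout=$1
  _wu_interval=$2
  shift 2
  _wu_deadline=$(( $(date +%s) + _wu_timeout ))
  until "$@"; do
    # Give up once the deadline has passed.
    if [ "$(date +%s)" -ge "$_wu_deadline" ]; then
      return 1
    fi
    sleep "$_wu_interval"
  done
  return 0
}

# Usage sketch: poll the OAuth health endpoint every 15 s for up to
# 10 minutes (-k skips TLS verification, -f fails on HTTP errors):
#   wait_until 600 15 curl -ksf -o /dev/null \
#     https://oauth-openshift.apps.cluster.example.local/healthz
```

A successful response from the endpoint only shows that the route is reachable again; `oc get co` remains the check for whether the operators themselves have recovered.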

Confirming that the installation completed

$ openshift-install --dir=work/openshift wait-for install-complete --log-level=debug
DEBUG OpenShift Installer 4.12.10                  
(output omitted)
INFO Waiting up to 40m0s (until 3:45PM) for the cluster at https://api.cluster.example.local:6443 to initialize... 
DEBUG Cluster is initialized                       
INFO Checking to see if there is a route at openshift-console/console... 
DEBUG Route found in openshift-console namespace: console 
DEBUG OpenShift console route is admitted          
INFO Install complete!                            
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=work/openshift/auth/kubeconfig' 
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.cluster.example.com 
INFO Login to the console with user: "kubeadmin", and password: "ABCDEFGHIJKLMNOPQRSTUVW" 
INFO Time elapsed: 0s                             
$