Memo: How to deploy the components of the OpenShift AI DataScienceCluster CR to infra nodes

Posted at 2024-10-11

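Three-line summary

  • To use OpenShift AI, you need to create the custom resource "DataScienceCluster"
  • Node placement (nodeSelector and so on) cannot be specified when creating the DataScienceCluster
  • Instead, set the nodeSelector and related settings after each component has been deployed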

Assume the infra nodes carry the following label and taint

metadata:
  labels:
    node-role.kubernetes.io/infra: ''
(omitted)
spec:
  taints:
    - key: node-role.kubernetes.io/infra
      effect: NoSchedule
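
If a node does not carry this label and taint yet, something along these lines could add them. This is only a sketch: the function name `mark_infra_node`, the `OC` override variable, and the node name `worker-0` are hypothetical and not part of the original setup. Note that a NoSchedule taint keeps any Pod without a matching toleration off the node.

```shell
# Hypothetical helper: give one node the infra label and taint shown above.
# OC can be overridden (e.g. OC=echo) to print the commands instead of running them.
mark_infra_node() {
  node="${1:?usage: mark_infra_node NODE}"
  # Empty label value, matching "node-role.kubernetes.io/infra: ''"
  "${OC:-oc}" label node "$node" node-role.kubernetes.io/infra=
  # Key and effect only (no value), matching the taints block above
  "${OC:-oc}" adm taint nodes "$node" node-role.kubernetes.io/infra:NoSchedule
}

# Usage (worker-0 is a placeholder node name):
#   OC=echo mark_infra_node worker-0   # preview the commands
#   mark_infra_node worker-0           # run them
```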

Install the OpenShift AI Operator, then create the DataScienceCluster

Note: in this example it is created with the default settings

Each component is deployed to the namespace "redhat-ods-applications", so wait a while.
Note: expect to wait roughly 5 to 6 minutes.
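
The wait can also be scripted instead of watched. The sketch below relies on the standard `oc wait` subcommand; the function name `wait_for_components`, the `OC` override variable, and the 600-second default timeout are illustrative assumptions, not part of the original procedure.

```shell
# Hypothetical helper: block until every Deployment in the namespace reports
# the Available condition, or until the timeout expires.
# OC can be overridden (e.g. OC=echo) to print the command instead of running it.
wait_for_components() {
  namespace="${1:-redhat-ods-applications}"
  timeout="${2:-600s}"   # assumption: some headroom over the 5 to 6 minutes observed
  "${OC:-oc}" wait --for=condition=Available deployment --all \
    -n "$namespace" --timeout="$timeout"
}

# Usage:
#   wait_for_components                      # default namespace and timeout
#   OC=echo wait_for_components my-ns 120s   # print the command only
```

If the Deployments have not been created yet, the command may fail right away, so it may need to be re-run a little later.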


Switch to the project with the oc command

oc project redhat-ods-applications

Just to be sure, check each component with oc get deployment

~ % oc get deployment 
NAME                                                 READY   UP-TO-DATE   AVAILABLE   AGE
codeflare-operator-manager                           1/1     1            1           37m
data-science-pipelines-operator-controller-manager   1/1     1            1           40m
etcd                                                 1/1     1            1           40m
kserve-controller-manager                            1/1     1            1           39m
kubeflow-training-operator                           1/1     1            1           34m
kuberay-operator                                     1/1     1            1           35m
kueue-controller-manager                             1/1     1            1           39m
modelmesh-controller                                 3/3     3            3           40m
notebook-controller-deployment                       1/1     1            1           41m
odh-model-controller                                 3/3     3            3           40m
odh-notebook-controller-manager                      1/1     1            1           41m
rhods-dashboard                                      4/5     5            4           42m
trustyai-service-operator-controller-manager         1/1     1            1           34m

Run the following commands

# Patch every Deployment in the current project (redhat-ods-applications)
for deployment in $(kubectl get deployments -o jsonpath='{.items[*].metadata.name}'); do
  # JSON Patch "add" sets nodeSelector and tolerations, replacing any existing values
  kubectl patch deployment "$deployment" --type='json' -p='[{"op": "add", "path": "/spec/template/spec/nodeSelector", "value": {"node-role.kubernetes.io/infra": ""}}, {"op": "add", "path": "/spec/template/spec/tolerations", "value": [{"key": "node-role.kubernetes.io/infra", "operator": "Exists", "effect": "NoSchedule"}]}]'
done
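
The same JSON Patch can also be produced by a small function, so the key is written once and the change can be previewed before it is applied. This is a sketch rather than the original procedure: `build_infra_patch`, `patch_all_deployments`, and the `KUBECTL` override variable are hypothetical names, and the preview relies on the `--dry-run=server` option of `kubectl patch`.

```shell
# Hypothetical helper: print the JSON Patch used above for a given label/taint key.
build_infra_patch() {
  key="${1:-node-role.kubernetes.io/infra}"
  printf '[{"op": "add", "path": "/spec/template/spec/nodeSelector", "value": {"%s": ""}}, ' "$key"
  printf '{"op": "add", "path": "/spec/template/spec/tolerations", "value": [{"key": "%s", "operator": "Exists", "effect": "NoSchedule"}]}]\n' "$key"
}

# Hypothetical helper: patch every Deployment in the current project.
# Extra arguments are passed through to kubectl patch, e.g. --dry-run=server.
patch_all_deployments() {
  patch="$(build_infra_patch)"
  for deployment in $("${KUBECTL:-kubectl}" get deployments -o jsonpath='{.items[*].metadata.name}'); do
    "${KUBECTL:-kubectl}" patch deployment "$deployment" --type='json' -p="$patch" "$@"
  done
}

# Usage:
#   patch_all_deployments --dry-run=server   # preview, nothing is changed
#   patch_all_deployments                    # apply the patch
```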

A log line is printed for each Deployment that was patched.

deployment.apps/codeflare-operator-manager patched
deployment.apps/data-science-pipelines-operator-controller-manager patched
deployment.apps/etcd patched
deployment.apps/kserve-controller-manager patched
deployment.apps/kueue-controller-manager patched
deployment.apps/modelmesh-controller patched
deployment.apps/notebook-controller-deployment patched
deployment.apps/odh-model-controller patched
deployment.apps/odh-notebook-controller-manager patched
deployment.apps/rhods-dashboard patched

The components start being rescheduled onto the infra nodes, so wait a while.
As a result of this operation, the YAML of the Deployment "rhods-dashboard", for example, gains parameters like the following.

rhods-dashboard.yaml
spec:
  template:
    spec:
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      tolerations:
        - key: node-role.kubernetes.io/infra
          operator: Exists
          effect: NoSchedule
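
To confirm the result, it may help to read these fields back and check which node each Pod landed on. A minimal sketch using standard `oc get` options; the function name `check_infra_placement` and the `OC` override variable are hypothetical.

```shell
# Hypothetical helper: show the scheduling settings of one Deployment and
# the node each Pod in the namespace is running on.
# OC can be overridden (e.g. OC=echo) to print the commands instead of running them.
check_infra_placement() {
  deployment="${1:-rhods-dashboard}"
  namespace="${2:-redhat-ods-applications}"
  # Print the nodeSelector and tolerations currently set on the Pod template
  "${OC:-oc}" get deployment "$deployment" -n "$namespace" \
    -o jsonpath='{.spec.template.spec.nodeSelector}{"\n"}{.spec.template.spec.tolerations}{"\n"}'
  # The NODE column shows where each Pod is scheduled
  "${OC:-oc}" get pods -n "$namespace" -o wide
}

# Usage:
#   check_infra_placement                    # rhods-dashboard in redhat-ods-applications
#   check_infra_placement modelmesh-controller
```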

Closing

I searched but could not find information that covered exactly this case, so I hope this helps someone out there...
