ZOZO Advent Calendar 2024

Private access to a Ray Cluster on GKE via an internal Load Balancer

Posted at 2024-12-01 (last updated at 2024-12-01)

This is the day 2 article of ZOZO Advent Calendar 2024, calendar Vol.9.

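Introduction

Yesterday I posted an article about Ray Cluster titled "Accessing a Ray Cluster on GKE via an external Load Balancer."
Today's post continues on the same topic. That article gives a brief overview of Ray Cluster, so please refer to it for background.

This article shows how to add private access through an internal Load Balancer on top of the external Load Balancer access to the Ray Cluster described yesterday.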

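The bottleneck in configuring two Load Balancers for the Head Node

Yesterday's article listed "create the Service and the Head Node one-to-one" as a caveat when configuring Ingress.

KubeRay Operator, the custom controller for Ray Cluster, creates one Service resource for the Head Node when a RayCluster object is deployed. Settings for that Service, such as annotations, can be applied by writing them in the RayCluster manifest.

On Google Kubernetes Engine (GKE), however, configuring both an external and an internal Load Balancer requires attaching separate external and internal annotations to the corresponding Service resources.

It might seem that creating your own Service in addition to the auto-created one, and pointing it at the same Head Node, would solve this, but that approach is not available. By KubeRay Operator's design, only one Service may reference the Head Node Pod. (reference)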

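Configuring two Load Balancers for the Head Node

We solved the bottleneck above by introducing Istio.
A detailed explanation of Istio is out of scope here, but Istio makes flexible routing possible.

We create one Service resource annotated for the external Load Balancer and one annotated for the internal Load Balancer, and an Istio Virtual Service references the Service resource for the Head Node. This sidesteps the limitation that KubeRay Operator automatically creates only one Service for the Head Node.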
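
As a rough mental model of this layout, the following Python sketch mimics host-based routing: both load balancer host names are matched by one route, and that route forwards to the single head Service. The class, the function names, and the example host names are hypothetical and exist only for illustration; the `<RayCluster name>-head-svc` naming reflects how KubeRay names the auto-created head Service.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class VirtualServiceRoute:
    """Toy stand-in for an Istio VirtualService: matched hosts and one destination."""
    hosts: Tuple[str, ...]
    destination_host: str
    destination_port: int


def head_service_name(cluster_name: str) -> str:
    # KubeRay names the auto-created head Service "<RayCluster name>-head-svc".
    return f"{cluster_name}-head-svc"


def resolve(route: VirtualServiceRoute, request_host: str) -> Optional[Tuple[str, int]]:
    """Return the backend (service, port) for a request, or None if no host matches."""
    if request_host in route.hosts:
        return (route.destination_host, route.destination_port)
    return None


# Hypothetical host names, one per load balancer.
EXTERNAL_HOST = "ray.example.com"
INTERNAL_HOST = "ray.internal.example.com"

route = VirtualServiceRoute(
    hosts=(EXTERNAL_HOST, INTERNAL_HOST),
    destination_host=head_service_name("raycluster-autoscaler"),
    destination_port=8265,  # Ray Dashboard port on the head Pod
)

for host in (EXTERNAL_HOST, INTERNAL_HOST, "unknown.example.com"):
    print(host, "->", resolve(route, host))
```

Both known hosts resolve to the same backend, which is the property the real configuration relies on: the load balancers differ, but the head Service stays single.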

Creating the Ray Cluster object

Create the RayCluster resource without adding any Service annotations.

ray-cluster.yaml

apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: raycluster-autoscaler
  labels:
    app.kubernetes.io/instance: raycluster-autoscaler
spec:
  # The version of Ray you are using. Make sure all Ray containers are running this version of Ray.
  rayVersion: '2.22.0'
  # If `enableInTreeAutoscaling` is true, the Autoscaler sidecar will be added to the Ray head pod.
  # Ray Autoscaler integration is Beta with KubeRay >= 0.3.0 and Ray >= 2.0.0.
  enableInTreeAutoscaling: true
  # `autoscalerOptions` is an OPTIONAL field specifying configuration overrides for the Ray Autoscaler.
  # The example configuration shown below represents the DEFAULT values.
  # (You may delete autoscalerOptions if the defaults are suitable.)
  autoscalerOptions:
    # `upscalingMode` is "Conservative", "Default", or "Aggressive."
    # Conservative: Upscaling is rate-limited; the number of pending worker pods is at most the size of the Ray cluster.
    # Default: Upscaling is not rate-limited.
    # Aggressive: An alias for Default; upscaling is not rate-limited.
    upscalingMode: Default
    # `idleTimeoutSeconds` is the number of seconds to wait before scaling down a worker pod which is not using Ray resources.
    idleTimeoutSeconds: 60
    # `image` optionally overrides the Autoscaler's container image. The Autoscaler uses the same image as the Ray container by default.
    ## image: "my-repo/my-custom-autoscaler-image:tag"
    # `imagePullPolicy` optionally overrides the Autoscaler container's default image pull policy (IfNotPresent).
    imagePullPolicy: IfNotPresent
    # Optionally specify the Autoscaler container's securityContext.
    securityContext: {}
    env: []
    envFrom: []
    # resources specifies optional resource request and limit overrides for the Autoscaler container.
    # The default Autoscaler resource limits and requests should be sufficient for production use-cases.
    # However, for large Ray clusters, we recommend monitoring container resource usage to determine if overriding the defaults is required.
    resources:
      limits:
        cpu: "500m"
        memory: "512Mi"
      requests:
        cpu: "500m"
        memory: "512Mi"
  # Ray head pod template
  headGroupSpec:
    # The `rayStartParams` are used to configure the `ray start` command.
    # See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
    # See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
    rayStartParams:
      # Setting "num-cpus: 0" to avoid any Ray actors or tasks being scheduled on the Ray head Pod.
      num-cpus: "0"
      # Use `resources` to optionally specify custom resource annotations for the Ray node.
      # The value of `resources` is a string-integer mapping.
      # Currently, `resources` must be provided in the specific format demonstrated below:
      # resources: '"{\"Custom1\": 1, \"Custom2\": 5}"'
      dashboard-host: '0.0.0.0'
    # Pod template
    template:
      spec:
        containers:
        # The Ray head container
        - name: ray-head
          image: rayproject/ray:2.22.0-py311
          ports:
          - containerPort: 6379
            name: gcs
          - containerPort: 8265
            name: dashboard
          - containerPort: 10001
            name: client
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh","-c","ray stop"]
          resources:
            limits:
              cpu: "1"
              memory: "2G"
            requests:
              cpu: "1"
              memory: "2G"
  workerGroupSpecs:
  # Number of Pod replicas in this worker group
  - replicas: 0
    minReplicas: 0
    maxReplicas: 3
    # Logical group name (here small-group); it can also describe the group's function
    groupName: small-group
    # If worker pods need to be added, Ray Autoscaler can increment the `replicas`.
    # If worker pods need to be removed, Ray Autoscaler decrements the replicas, and populates the `workersToDelete` list.
    # KubeRay operator will remove Pods from the list until the desired number of replicas is satisfied.
    #scaleStrategy:
    #  workersToDelete:
    #  - raycluster-complete-worker-small-group-bdtwh
    #  - raycluster-complete-worker-small-group-hv457
    #  - raycluster-complete-worker-small-group-k8tj7
    rayStartParams: {}
    # Pod template
    template:
      spec:
        containers:
        - name: ray-worker
          image: rayproject/ray:2.22.0-py311
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh","-c","ray stop"]
          resources:
            limits:
              cpu: "1"
              memory: "1G"
            requests:
              cpu: "1"
              memory: "1G"

Creating the ASM Ingress Gateway

The setup uses Anthos Service Mesh (ASM), Google Cloud's managed Istio service.
ASM installation steps are omitted here; please refer to the official procedure.

The ASM Ingress Gateway is an Envoy proxy that routes requests to the services in the Istio mesh.

ingress-gateway.yaml

# created from GCP official sample code
# https://github.com/GoogleCloudPlatform/anthos-service-mesh-samples/tree/156d4f0e41bbdce44f799a68ed8dfdc26219efac/docs/shared/asm-ingress-gateway
apiVersion: v1
kind: ServiceAccount
metadata:
  name: asm-ingressgateway
  labels:
    app.kubernetes.io/instance: asm
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: asm-ingressgateway
  labels:
    app.kubernetes.io/instance: asm
spec:
  replicas: 2
  selector:
    matchLabels:
      asm: ingressgateway
  template:
    metadata:
      annotations:
        # This is required to tell Anthos Service Mesh to inject the gateway with the
        # required configuration.
        inject.istio.io/templates: gateway
      labels:
        asm: ingressgateway
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values:
                - "asm-ingressgateway-20240711-v1"
      tolerations:
        - key: "dedicated"
          operator: "Equal"
          value: "asm-ingressgateway-20240711-v1"
          effect: "NoSchedule"
      containers:
      - name: istio-proxy
        image: auto # The image will automatically update each time the pod starts.
        resources:
          limits:
            cpu: 2000m
            memory: 1024Mi
          requests:
            cpu: 100m
            memory: 128Mi
      serviceAccountName: asm-ingressgateway
---

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: asm-ingressgateway
  labels:
    app.kubernetes.io/instance: asm
spec:
  maxReplicas: 5
  metrics:
  - resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
    type: Resource
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: asm-ingressgateway
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: asm-ingressgateway
  labels:
    app.kubernetes.io/instance: asm
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      # Must match the Pod template labels of the Deployment above
      asm: ingressgateway
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: asm-ingressgateway
  labels:
    app.kubernetes.io/instance: asm
spec:
  selector:
    asm: ingressgateway # ASM ingress gateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"

Next, create the Service and Ingress objects used to reach the ASM Ingress Gateway Pods created above.

service-ingress.yaml

# For Google Cloud Armor
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: backend-config
  labels:
    app.kubernetes.io/instance: asm-ingressgateway
spec:
  # envoy proxy health check endpoint
  healthCheck:
    requestPath: /healthz/ready
    port: 15021
    type: HTTP
  # For Google Cloud Armor
  securityPolicy:
    name: <sample-waf-rule>
---
 
# ref. https://cloud.google.com/kubernetes-engine/docs/how-to/ingress-configuration?hl=ja#ssl
apiVersion: networking.gke.io/v1beta1
kind: FrontendConfig
metadata:
  name: asm-ingeressgateway-frontend-config
spec:
  sslPolicy: <sample-ssl-policy>
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: internal-lb
  labels:
    app.kubernetes.io/instance: asm-ingressgateway
spec:
  healthCheck:
    requestPath: /healthz/ready
    port: 15021
    type: HTTP
---
apiVersion: v1
kind: Service
metadata:
  name: asm-ingressgateway-external-ingress
  labels:
    app.kubernetes.io/instance: asm-ingressgateway
  annotations:
    cloud.google.com/neg: '{"ingress":true,"exposed_ports":{"80":{}}}' # ref. https://cloud.google.com/kubernetes-engine/docs/how-to/container-native-load-balancing
    beta.cloud.google.com/backend-config: '{"ports": {"80":"backend-config"}}'
spec:
  type: NodePort
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
  # status-port exposes a envoy proxy /healthz/ready endpoint that can be used with GKE Ingress health checks
  - name: status-port
    port: 15021
    protocol: TCP
    targetPort: 15021
  selector:
    asm: ingressgateway
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: asm-ingressgateway-external-ingress
  labels:
    app.kubernetes.io/instance: asm-ingressgateway
  annotations:
    kubernetes.io/ingress.allow-http: "false"
    kubernetes.io/ingress.global-static-ip-name: "ray-global-ip"
    ingress.gcp.kubernetes.io/pre-shared-cert: "ray-external-cert"
    networking.gke.io/v1beta1.FrontendConfig: "asm-ingeressgateway-frontend-config"
spec:
  defaultBackend:
    service:
      name: asm-ingressgateway-external-ingress
      port:
        number: 80
---
apiVersion: v1
kind: Service
metadata:
  name: asm-ingressgateway-internal-ingress
  labels:
    app.kubernetes.io/instance: asm-ingressgateway
  annotations:
    cloud.google.com/neg: '{"ingress":true,"exposed_ports":{"80":{}}}'
    beta.cloud.google.com/backend-config: '{"ports": {"80":"internal-lb"}}'
spec:
  type: NodePort
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
  # status-port exposes a envoy proxy /healthz/ready endpoint that can be used with GKE Ingress health checks
  - name: status-port
    port: 15021
    protocol: TCP
    targetPort: 15021
  selector:
    asm: ingressgateway
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: asm-ingressgateway-internal-ingress
  labels:
    app.kubernetes.io/instance: asm-ingressgateway
  annotations:
    kubernetes.io/ingress.class: "gce-internal"
    kubernetes.io/ingress.regional-static-ip-name: "ray-internal"
spec:
  defaultBackend:
    service:
      name: asm-ingressgateway-internal-ingress
      port:
        number: 80
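
Both BackendConfig objects above point the load balancer health check at `/healthz/ready` on port 15021. As a rough way to confirm that the gateway answers on that path before waiting for the load balancers to report healthy backends, here is a minimal Python sketch. It assumes the status port has been made reachable from your machine, for example with `kubectl port-forward`; the function names are illustrative and not part of any GKE or Istio tooling.

```python
import urllib.error
import urllib.request


def readiness_url(host: str, port: int = 15021, path: str = "/healthz/ready") -> str:
    """Build the URL that the BackendConfig health checks are pointed at."""
    return f"http://{host}:{port}{path}"


def is_ready(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers HTTP 200, False on any error or other status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection failures, timeouts, and non-2xx responses (HTTPError).
        return False


if __name__ == "__main__":
    # Assumes the status port was forwarded to localhost beforehand, e.g.
    #   kubectl port-forward deploy/asm-ingressgateway 15021:15021
    # (add the namespace flag that matches your installation).
    url = readiness_url("localhost")
    print(url, "ready" if is_ready(url) else "not ready")
```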

Creating the Virtual Service

Next, create a Virtual Service object to route requests from the ASM Ingress Gateway to the Head Node of the Ray Cluster.
Set spec.hosts to the host names assigned to the external and internal Load Balancers.

service.yaml

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: raycluster-autoscaler
  labels:
    app.kubernetes.io/instance: raycluster-autoscaler
spec:
  gateways:
  - asm-ingress/asm-ingressgateway
  hosts:
  - <external-host>
  - <internal-host>
  http:
  - route:
    - destination:
        host: raycluster-autoscaler-head-svc
        port:
          number: 8265
      weight: 100
    rewrite:
      uri: /
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: raycluster-autoscaler
  labels:
    app.kubernetes.io/instance: raycluster-autoscaler
spec:
  host: raycluster-autoscaler-head-svc
  trafficPolicy:
    tls:
      mode: DISABLE
    loadBalancer:
      simple: RANDOM

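This leaves the external and internal Load Balancers both pointing at the same Head Node.

Verifying private access to the Ray Cluster via the internal Load Balancer

To try private access to the Ray Cluster, we check whether a Job can be submitted from Vertex Pipelines to the Ray Cluster through the internal Load Balancer.
As a prerequisite, private services access must be configured on the VPC where GKE was built. Also, when creating a Vertex Pipelines run, specify the peered VPC.

For configuring private services access on a VPC, see the official documentation.
For the private services access settings in Vertex Pipelines, see the official documentation as well.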
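
As one possible way to specify the peered VPC when creating a run, the sketch below uses the `google-cloud-aiplatform` SDK, whose `PipelineJob.submit` accepts a `network` argument. It is a minimal sketch, not the setup used in this article: the display name and the example project number and network name are hypothetical, and the SDK import is deferred so that the name-building helper works without the SDK installed.

```python
def network_resource_name(project_number: str, network: str) -> str:
    """Full VPC resource name; Vertex AI expects the project *number* here."""
    return f"projects/{project_number}/global/networks/{network}"


def submit_pipeline(template_path: str, project: str, location: str, network: str) -> None:
    """Submit a compiled pipeline so that its steps run attached to the peered VPC."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    job = aiplatform.PipelineJob(
        display_name="submit-job-to-ray",  # hypothetical display name
        template_path=template_path,
    )
    job.submit(network=network)


if __name__ == "__main__":
    # Hypothetical project number and VPC name.
    print(network_resource_name("123456789012", "my-vpc"))
```

Passing the result of `network_resource_name` as `network` to `submit_pipeline` is the intended usage; the pipeline steps can then reach the internal Load Balancer address, provided the peering is in place.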

For verification, I created the following Kubeflow Pipelines YAML component.
For RAY_ADDRESS, specify the host name of the internal Load Balancer you created.

component.yaml

name: submit job to ray
description: |
  Run with Ray
implementation:
  container:
    image: us-docker.pkg.dev/repo:sample
    command:
      - bash
      - -cuex
      - |
        echo "start to submit ray job"
        RAY_ADDRESS="<host>" ray job submit --working-dir ./scripts -- python scripts.py
        echo "submitted ray job has been finished"
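
The same submission can be driven from Python by assembling the `ray job submit` command line. This is a minimal sketch under two assumptions: the internal Load Balancer serves plain HTTP, and the address is written with a scheme, as in the Ray Jobs CLI documentation. The helper names and the example host name are hypothetical; `--address` and `--working-dir` are existing `ray job submit` options.

```python
import shlex
from typing import List


def ray_dashboard_address(host: str, scheme: str = "http") -> str:
    """Turn a bare host name into the URL form used with the Ray Jobs CLI."""
    if host.startswith(("http://", "https://")):
        return host.rstrip("/")
    return f"{scheme}://{host}".rstrip("/")


def build_submit_command(address: str, working_dir: str, entrypoint: List[str]) -> List[str]:
    """Build the argv for `ray job submit`; everything after `--` is the entrypoint."""
    return [
        "ray", "job", "submit",
        "--address", address,
        "--working-dir", working_dir,
        "--",
        *entrypoint,
    ]


if __name__ == "__main__":
    # "ray.internal.example.com" is a hypothetical internal Load Balancer host name.
    address = ray_dashboard_address("ray.internal.example.com")
    command = build_submit_command(address, "./scripts", ["python", "scripts.py"])
    print(shlex.join(command))
    # To actually submit (requires the `ray` CLI and network access to the cluster):
    #   import subprocess
    #   subprocess.run(command, check=True)
```

Passing `--address` explicitly is equivalent in intent to exporting `RAY_ADDRESS` as the component does; building the argument list separately keeps the address handling easy to test.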

pyproject.toml

[tool.poetry]
name = "submit-job-to-ray"
version = "0.1.0"
description = ""
authors = [""]
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.11"
ray = {version = "2.22.0", extras = ["default"]}


[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

scripts.py

import ray
import time

def main():
    @ray.remote
    def _hello_world():
        return "Hello World"
    start = time.time()
    ray.init()
    ray.get([_hello_world.remote() for _ in range(30)])
    print(f"{time.time() - start} seconds")
    ray.shutdown()


if __name__ == "__main__":
    main()

Dockerfile

FROM python:3.11 AS base

RUN pip install poetry==1.8.3
COPY pyproject.toml poetry.lock ./
RUN poetry config virtualenvs.create false \
  && poetry install

RUN mkdir scripts
COPY scripts.py ./scripts/

After running the component above on Vertex Pipelines, you can see the submitted Ray Job in the Ray Cluster Dashboard.

This gives a setup in which Google Cloud services submit Jobs over private networking, while the Ray Cluster Dashboard remains accessible under access restrictions.
