Analyzing an Autonomous Driving Research Dataset on AWS #4

Introduction

The Redash dashboard was completed in #3.
In this post, #4, I codify the infrastructure with AWS CDK and build a CI/CD pipeline with GitHub Actions.

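Architecture (added in #4)

CDK vs. Terraform

| Item                | CDK                                      | Terraform                                             |
|---------------------|------------------------------------------|-------------------------------------------------------|
| Language            | Python, TypeScript, etc.                 | HCL                                                   |
| Supported clouds    | AWS only                                 | Multi-cloud                                           |
| Equivalent of plan  | cdk synth (cdk diff for a change preview) | terraform plan                                        |
| Equivalent of apply | cdk deploy                               | terraform apply                                       |
| Prerequisites       | cdk bootstrap (first time only)          | None in the AWS account (only a local terraform init) |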

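Pitfalls

① CDK v1 and v2 packages get mixed

pip install aws-cdk.aws-s3 installs a v1 package. In v2, the stable modules are all consolidated into aws-cdk-lib, so no additional per-service install is needed. For CDK itself, listing only aws-cdk-lib in requirements.txt is enough.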
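
One way to catch a stray v1 package early is to scan requirements.txt for the per-service naming pattern (aws-cdk.<service>). The following is a minimal sketch under that assumption; the function name find_v1_cdk_packages is hypothetical and not part of CDK or the original project. Names ending in -alpha are skipped because v2 publishes its experimental modules under that pattern.

```python
import re

# v1 published one package per service, e.g. "aws-cdk.aws-s3" or "aws-cdk.core".
# v2 ships the stable modules in the single "aws-cdk-lib" package.
_V1_PATTERN = re.compile(r"^aws[-_]cdk\.[A-Za-z0-9_.-]+")


def find_v1_cdk_packages(requirements_text: str) -> list[str]:
    """Return the v1-style CDK package names found in a requirements file."""
    found = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        match = _V1_PATTERN.match(line)
        # v2 experimental modules look like "aws-cdk.aws-glue-alpha"; skip them.
        if match and not match.group(0).endswith("-alpha"):
            found.append(match.group(0))
    return found


if __name__ == "__main__":
    sample = "aws-cdk-lib==2.180.0\naws-cdk.aws-s3\nconstructs>=10.0.0,<11.0.0\n"
    print(find_v1_cdk_packages(sample))
```

Running a check like this before uv pip install would flag the aws-cdk.aws-s3 line while leaving aws-cdk-lib and constructs alone.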

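② Names collide between existing resources and CDK

Defining a resource in CDK with the same name as an IAM role or Glue job that was already created by hand makes the deployment fail. I worked around this by giving the CDK-managed resources different names (a -cdk suffix).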
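
The suffix workaround can be sketched as a small naming helper. This is only an illustration: cdk_resource_name is a hypothetical function, and the set of existing names is a hand-written assumption (in practice it might be collected from the console or the AWS CLI).

```python
def cdk_resource_name(desired: str, existing: set[str], suffix: str = "-cdk") -> str:
    """Return `desired` if it is free, otherwise `desired` plus `suffix`.

    Raises ValueError if the suffixed name is taken as well, since silently
    picking yet another name would hide the collision.
    """
    if desired not in existing:
        return desired
    candidate = desired + suffix
    if candidate in existing:
        raise ValueError(f"both {desired!r} and {candidate!r} already exist")
    return candidate


if __name__ == "__main__":
    # Hypothetical names of resources that were created by hand earlier.
    manually_created = {"GlueAdasEtlRole", "RedshiftS3Role"}
    print(cdk_resource_name("GlueAdasEtlRole", manually_created))
    print(cdk_resource_name("BrandNewRole", manually_created))
```

The resulting string is what would be passed as role_name or name in the stack definition.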

③ uv is not on the PATH in GitHub Actions

Installing uv with curl and then running source $HOME/.local/bin/env fails in the GitHub Actions environment. Using the official astral-sh/setup-uv@v5 action is the right approach.
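
One likely reason is that each run step in a workflow starts its own shell, so a PATH change made by sourcing a file in one step is gone in the next. The sketch below reproduces that behavior locally with two separate sh processes; it assumes a POSIX sh is available, and /tmp/fake-uv/bin is a made-up directory used only for illustration.

```python
import subprocess


def run_step(script: str) -> str:
    """Run `script` in a fresh shell, similar to how each workflow step gets its own shell."""
    result = subprocess.run(["sh", "-c", script], capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Step 1: extend PATH inside this step's shell (like sourcing an env file).
step1 = run_step('export PATH="/tmp/fake-uv/bin:$PATH"; echo "$PATH"')
# Step 2: a new shell starts from the parent's environment again.
step2 = run_step('echo "$PATH"')

print("/tmp/fake-uv/bin" in step1)  # the change is visible inside step 1
print("/tmp/fake-uv/bin" in step2)  # but it does not carry over to step 2
```

The setup action avoids this by registering the install directory with the runner, so later steps see uv without any manual sourcing.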

Steps

Step 1: Install Node.js and CDK

# Node.js 22 (20 is EOL)
curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
sudo apt install -y nodejs

sudo npm install -g aws-cdk
cdk --version  # 2.x.x

Step 2: Initialize the project

mkdir -p ~/a2d2-driving-data-analytics-aws && cd ~/a2d2-driving-data-analytics-aws
cdk init app --language python

# Manage the venv with uv
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv
source .venv/bin/activate

cat > requirements.txt << 'EOF'
aws-cdk-lib==2.180.0
constructs>=10.0.0,<11.0.0
ruff>=0.9.0
EOF

uv pip install -r requirements.txt

Next, add the ruff settings to pyproject.toml:

[tool.ruff]
line-length = 120
exclude = [".venv"]

[tool.ruff.lint]
select = ["E", "F", "I"]

Step 3: Define the stack

# a2d2_driving_data_analytics_aws/a2d2_driving_data_analytics_aws_stack.py
from aws_cdk import (
    Stack,
    RemovalPolicy,
    aws_s3 as s3,
    aws_iam as iam,
    aws_glue as glue,
)
from constructs import Construct


class A2D2DrivingDataAnalyticsAwsStack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        s3.Bucket(
            self, "RawBucket",
            bucket_name="your-adas-portfolio",
            removal_policy=RemovalPolicy.RETAIN,
        )

        s3.Bucket(
            self, "AthenaBucket",
            bucket_name="your-adas-portfolio-athena",
            removal_policy=RemovalPolicy.RETAIN,
        )

        s3.Bucket(
            self, "GlueBucket",
            bucket_name="your-adas-portfolio-glue",
            removal_policy=RemovalPolicy.RETAIN,
        )

        glue_role = iam.Role(
            self, "GlueEtlRole",
            role_name="GlueAdasEtlRole-cdk",
            assumed_by=iam.ServicePrincipal("glue.amazonaws.com"),
            managed_policies=[
                iam.ManagedPolicy.from_aws_managed_policy_name("AmazonS3FullAccess"),
                iam.ManagedPolicy.from_aws_managed_policy_name("service-role/AWSGlueServiceRole"),
            ],
        )

        iam.Role(
            self, "RedshiftS3Role",
            role_name="RedshiftS3Role-cdk",
            assumed_by=iam.ServicePrincipal("redshift.amazonaws.com"),
            managed_policies=[
                iam.ManagedPolicy.from_aws_managed_policy_name("AmazonS3ReadOnlyAccess"),
            ],
        )

        glue.CfnJob(
            self, "A2D2EtlJob",
            name="adas-a2d2-etl-cdk",
            role=glue_role.role_arn,
            command=glue.CfnJob.JobCommandProperty(
                name="glueetl",
                script_location="s3://your-adas-portfolio-glue/scripts/glue_etl_a2d2.py",
                python_version="3",
            ),
            default_arguments={
                "--RAW_BUCKET": "your-adas-portfolio",
                "--BUS_TAR_KEY": "raw/a2d2/camera_lidar-20190401121727_bus_signals.tar",
                "--CAM_TAR_KEY": "raw/a2d2/camera_lidar-20190401121727_camera_frontcenter.tar",
                "--LIDAR_TAR_KEY": "raw/a2d2/camera_lidar-20190401121727_lidar_frontcenter.tar",
                "--PROCESSED_BUCKET": "your-adas-portfolio",
                "--DB_NAME": "adas_portfolio",
            },
            glue_version="4.0",
            number_of_workers=6,
            worker_type="G.1X",
        )
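
The three tar keys in default_arguments share the same recording prefix, so the dictionary could also be generated from the sequence ID. The following is a minimal sketch of that idea; the helper names a2d2_tar_keys and glue_default_arguments are hypothetical and do not appear in the original stack.

```python
def a2d2_tar_keys(sequence_id: str, prefix: str = "raw/a2d2") -> dict[str, str]:
    """Build the S3 keys of the three tar archives for one A2D2 recording."""
    base = f"{prefix}/camera_lidar-{sequence_id}"
    return {
        "--BUS_TAR_KEY": f"{base}_bus_signals.tar",
        "--CAM_TAR_KEY": f"{base}_camera_frontcenter.tar",
        "--LIDAR_TAR_KEY": f"{base}_lidar_frontcenter.tar",
    }


def glue_default_arguments(bucket: str, sequence_id: str, db_name: str) -> dict[str, str]:
    """Assemble the Glue job arguments, using one bucket for raw and processed data."""
    return {
        "--RAW_BUCKET": bucket,
        **a2d2_tar_keys(sequence_id),
        "--PROCESSED_BUCKET": bucket,
        "--DB_NAME": db_name,
    }


if __name__ == "__main__":
    args = glue_default_arguments("your-adas-portfolio", "20190401121727", "adas_portfolio")
    for key, value in args.items():
        print(key, value)
```

The returned dictionary has the same keys and values as the literal in the stack above and could be passed to default_arguments unchanged, which makes switching to another recording a one-argument change.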

Step 4: Lint locally

ruff check . --exclude .venv
ruff format --check . --exclude .venv
cdk synth 2>/dev/null | head -10

Step 5: bootstrap & deploy

cdk bootstrap  # first time only
cdk deploy

Result:

| Name                       | Status          | Type             |
|----------------------------|-----------------|------------------|
| adas-a2d2-etl-cdk          | CREATE_COMPLETE | AWS::Glue::Job   |
| your-adas-portfolio-athena | CREATE_COMPLETE | AWS::S3::Bucket  |
| your-adas-portfolio-glue   | CREATE_COMPLETE | AWS::S3::Bucket  |
| GlueAdasEtlRole-cdk        | CREATE_COMPLETE | AWS::IAM::Role   |
| your-adas-portfolio        | CREATE_COMPLETE | AWS::S3::Bucket  |
| RedshiftS3Role-cdk         | CREATE_COMPLETE | AWS::IAM::Role   |
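
A table like this can also be checked programmatically. The sketch below assumes resource records shaped like the StackResources entries returned by CloudFormation's DescribeStackResources call; the function name unfinished_resources is hypothetical, and the sample data is copied from the table above rather than fetched from AWS.

```python
# Statuses that mean a resource finished deploying successfully.
_OK_STATUSES = {"CREATE_COMPLETE", "UPDATE_COMPLETE"}


def unfinished_resources(stack_resources: list[dict]) -> list[str]:
    """Return "<physical id> (<status>)" for every resource not in an OK status."""
    return [
        f'{r["PhysicalResourceId"]} ({r["ResourceStatus"]})'
        for r in stack_resources
        if r["ResourceStatus"] not in _OK_STATUSES
    ]


# Sample data mirroring the result table. With boto3 installed and credentials
# configured, the same shape would come from (not run here):
#   boto3.client("cloudformation").describe_stack_resources(StackName=...)["StackResources"]
sample = [
    {"PhysicalResourceId": "adas-a2d2-etl-cdk", "ResourceStatus": "CREATE_COMPLETE", "ResourceType": "AWS::Glue::Job"},
    {"PhysicalResourceId": "your-adas-portfolio-athena", "ResourceStatus": "CREATE_COMPLETE", "ResourceType": "AWS::S3::Bucket"},
    {"PhysicalResourceId": "your-adas-portfolio-glue", "ResourceStatus": "CREATE_COMPLETE", "ResourceType": "AWS::S3::Bucket"},
    {"PhysicalResourceId": "GlueAdasEtlRole-cdk", "ResourceStatus": "CREATE_COMPLETE", "ResourceType": "AWS::IAM::Role"},
    {"PhysicalResourceId": "your-adas-portfolio", "ResourceStatus": "CREATE_COMPLETE", "ResourceType": "AWS::S3::Bucket"},
    {"PhysicalResourceId": "RedshiftS3Role-cdk", "ResourceStatus": "CREATE_COMPLETE", "ResourceType": "AWS::IAM::Role"},
]

print(unfinished_resources(sample))  # prints []
```

An empty list means every resource reached a completed state; anything else names the resource to look at in the CloudFormation console.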

Step 6: Configure GitHub Actions

# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  lint-and-synth:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install uv
        uses: astral-sh/setup-uv@v5

      - name: Install dependencies
        run: |
          uv venv
          source .venv/bin/activate
          uv pip install -r requirements.txt

      - name: Ruff lint
        run: |
          source .venv/bin/activate
          ruff check . --exclude .venv

      - name: Ruff format check
        run: |
          source .venv/bin/activate
          ruff format --check . --exclude .venv


      - name: Install Node.js
        uses: actions/setup-node@v4
        with:
          node-version: "22"

      - name: Install CDK
        run: npm install -g aws-cdk

      - name: CDK Synth
        run: |
          source .venv/bin/activate
          cdk synth
        env:
          AWS_DEFAULT_REGION: ap-northeast-1
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

Register the following under GitHub → Settings → Secrets and variables → Actions:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY

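#4 Summary

| Item           | Details                                        |
|----------------|------------------------------------------------|
| CDK deployment | S3 × 3 / IAM roles × 2 / Glue job × 1          |
| GitHub Actions | ruff lint + cdk synth, succeeded in 43 seconds |
| Repository     | a2d2-driving-data-analytics-aws                |

In the next post, #5, I will build an agent with Bedrock that analyzes driving data through natural language.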
