
【Terraform de Azure】 Creating an Azure Databricks Workspace

Posted at 2022-01-13

Overview

In the spirit of "Infrastructure as Code", this article uses Terraform to build Azure Databricks in an Azure environment.

Local environment

  • macOS Monterey 12.1
  • Azure CLI 2.28.0
  • terraform v1.0.11

Prerequisites

  1. An Azure environment (tenant/subscription) is already available
  2. The Azure CLI is installed locally
  3. Terraform is set up locally
  4. A service principal for provisioning Azure resources with Terraform has been created, and the local environment variables for Terraform are defined
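Prerequisite 4 can be satisfied, for example, by exporting the service principal credentials as `TF_VAR_`-prefixed environment variables so that Terraform picks them up as input variables. The values below are placeholders (a sketch, not the author's actual setup); in this article the same values are instead set in terraform.tfvars.

```shell
# Placeholder values; use the output of `az ad sp create-for-rbac` for your tenant
export TF_VAR_ARM_TENANT_ID="00000000-0000-0000-0000-000000000000"
export TF_VAR_ARM_CLIENT_ID="00000000-0000-0000-0000-000000000000"
export TF_VAR_ARM_CLIENT_SECRET="<service-principal-secret>"
```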

Creating Azure Databricks with Terraform

Creating the Terraform definition files

Main definition file

main.tf
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 2.33"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 2.2"
    }
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.4.4"
    }
  }
}

provider "azurerm" {
  features {}
  tenant_id     = var.ARM_TENANT_ID
  client_id     = var.ARM_CLIENT_ID
  client_secret = var.ARM_CLIENT_SECRET
}

Variable definition file

variables.tf
# Environment variables (Azure service principal)
variable "ARM_TENANT_ID" {}
variable "ARM_CLIENT_ID" {}
variable "ARM_CLIENT_SECRET" {}

# Tag information
variable "tags_def" {
  default = {
    owner       = "ituru"
    period      = "2022-03-31"
    CostCenter  = "psg2"
    Environment = "Demo"
  }
}

# Other parameters
variable "resource_group_name" {}
variable "region" {}
variable "email_notifier" {
  description = "The email addresses to send job status to"
  type        = list(string)
}

Variable values file

terraform.tfvars
# Environment variable values (Azure service principal)
ARM_TENANT_ID       = "zzzzzzzz-cccc-4645-5757-zzzzzzzzzzzz"
ARM_CLIENT_ID       = "xxxxxxxx-xxxx-4444-9922-xxxxxxxxxxxx"
ARM_CLIENT_SECRET   = "hogehogehogehogehogehogehogehogege"

# Parameter values
resource_group_name = "rg_ituru_bricks01"     // Resource group name
region              = "japaneast"             // Region to use
email_notifier      = ["hogehoge@gegege.com"] // Notification email addresses (list)
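Because terraform.tfvars holds the service principal's client secret in plain text, it is safest to keep the file out of version control. One simple way (illustrative, not part of the original article) is:

```shell
# Exclude the secret-bearing tfvars file from git (illustrative)
echo "terraform.tfvars" >> .gitignore
```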

Databricks definition file

databricks.tf
# Databricks provider
provider "databricks" {
  host = azurerm_databricks_workspace.this.workspace_url
}

# Get information about the Databricks user that is calling
# the Databricks API.
data "databricks_current_user" "me" {
  depends_on = [azurerm_databricks_workspace.this]
}

# Get the latest Spark version to use for the cluster.
data "databricks_spark_version" "latest" {
  long_term_support = true
  depends_on = [azurerm_databricks_workspace.this]
}

# Get the smallest available node type to use for the cluster. Choose
# only from among available node types with local storage.
data "databricks_node_type" "smallest" {
  local_disk = true
  depends_on = [azurerm_databricks_workspace.this]
}

# Generate a random string
resource "random_string" "naming" {
  special = false
  upper   = false
  length  = 6
}

# The prefix to use when naming the Databricks workspace
locals {
  prefix = "databricksdemo-${random_string.naming.result}"
}

# Resource group
resource "azurerm_resource_group" "this" {
  name     = var.resource_group_name
  location = var.region
  tags     = var.tags_def
}

# Databricks workspace
resource "azurerm_databricks_workspace" "this" {
  name                        = "ituru-${local.prefix}-workspace"
  resource_group_name         = azurerm_resource_group.this.name
  location                    = azurerm_resource_group.this.location
  sku                         = "trial"
  managed_resource_group_name = "ituru-${local.prefix}-workspace-rg"
  tags                        = var.tags_def
}

# Secret scope
resource "databricks_secret_scope" "this" {
  name = "demo-${data.databricks_current_user.me.alphanumeric}"
}

# Databricks personal access token
resource "databricks_token" "pat" {
  comment          = "Created from ${abspath(path.module)}"
  lifetime_seconds = 3600
}

# Store the token as a secret
resource "databricks_secret" "token" {
  string_value = databricks_token.pat.token_value
  scope        = databricks_secret_scope.this.name
  key          = "token"
}


# Create a simple, sample notebook. Store it in a subfolder within
# the Databricks current user's folder. The notebook contains the
# following basic Spark code in Python.
resource "databricks_notebook" "this" {
  path     = "${data.databricks_current_user.me.home}/Terraform"
  language = "PYTHON"
  content_base64 = base64encode(<<-EOT
    token = dbutils.secrets.get('${databricks_secret_scope.this.name}', '${databricks_secret.token.key}')
    print(f'This should be redacted: {token}')
    EOT
  )
}

# Create a job to run the sample notebook. The job will create
# a cluster to run on. The cluster will use the smallest available
# node type and run the latest version of Spark.
resource "databricks_job" "this" {
  name = "Terraform Demo (${data.databricks_current_user.me.alphanumeric})"
  new_cluster {
    num_workers   = 1
    spark_version = data.databricks_spark_version.latest.id
    node_type_id  = data.databricks_node_type.smallest.id
  }

  notebook_task {
    notebook_path = databricks_notebook.this.path
  }

  email_notifications {
    on_success = var.email_notifier
    on_failure = var.email_notifier
  }

  depends_on = [azurerm_databricks_workspace.this, databricks_notebook.this]
}

# Cluster
resource "databricks_cluster" "this" {
  cluster_name            = "Exploration (${data.databricks_current_user.me.alphanumeric})"
  spark_version           = data.databricks_spark_version.latest.id
  instance_pool_id        = databricks_instance_pool.smallest_nodes.id
  autotermination_minutes = 20
  autoscale {
    min_workers = 1
    max_workers = 10
  }
}

# Cluster policy
resource "databricks_cluster_policy" "this" {
  name = "Minimal (${data.databricks_current_user.me.alphanumeric})"
  definition = jsonencode({
    "dbus_per_hour" : {
      "type" : "range",
      "maxValue" : 10
    },
    "autotermination_minutes" : {
      "type" : "fixed",
      "value" : 20,
      "hidden" : true
    }
  })
}
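The policy above is defined but not attached to the cluster resource earlier in this file. To actually enforce it, a cluster can reference it through `policy_id`; the resource below is a hypothetical sketch, not part of the original configuration:

```hcl
# Hypothetical example: a cluster constrained by the "Minimal" policy
resource "databricks_cluster" "policied" {
  cluster_name            = "Policied (${data.databricks_current_user.me.alphanumeric})"
  spark_version           = data.databricks_spark_version.latest.id
  node_type_id            = data.databricks_node_type.smallest.id
  policy_id               = databricks_cluster_policy.this.id
  num_workers             = 1
  autotermination_minutes = 20
}
```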

# Instance pool
resource "databricks_instance_pool" "smallest_nodes" {
  instance_pool_name = "Smallest Nodes (${data.databricks_current_user.me.alphanumeric})"
  min_idle_instances = 0
  max_capacity       = 30
  node_type_id       = data.databricks_node_type.smallest.id
  preloaded_spark_versions = [
    data.databricks_spark_version.latest.id
  ]

  idle_instance_autotermination_minutes = 20
}

Output definition file

outputs.tf
# Print the URL to the Databricks workspace.
output "databricks_url" {
  value = "https://${azurerm_databricks_workspace.this.workspace_url}/"
}

# Print the URL to the notebook.
output "notebook_url" {
  value = databricks_notebook.this.url
}

# Print the URL to the job.
output "job_url" {
  value = databricks_job.this.url
}

Running Terraform

## init
$ terraform init

## plan
$ terraform plan

## apply
$ terraform apply

Verifying the results

## Check the created resource group
$ az group show --name rg_ituru_bricks01
{
  "id": "/subscriptions/nnnnnnnn-1717-4334-9779-mmmmmmmmmmmm/resourceGroups/rg_ituru_bricks01",
  "location": "japaneast",
  "managedBy": null,
  "name": "rg_ituru_bricks01",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": {
    "CostCenter": "psg2",
    "Environment": "Demo",
    "owner": "ituru",
    "period": "2022-03-31"
  },
  "type": "Microsoft.Resources/resourceGroups"
}

## Check the created Azure Databricks workspace
$ az databricks workspace list --resource-group rg_ituru_bricks01 -o table
CreatedDateTime                   Location    ManagedResourceGroupId                                                                                       Name                                   ProvisioningState    ResourceGroup      WorkspaceId       WorkspaceUrl
--------------------------------  ----------  -----------------------------------------------------------------------------------------------------------  -------------------------------------  -------------------  -----------------  ----------------  ------------------------------------------
2022-01-13T00:11:27.479190+00:00  japaneast   /subscriptions/nnnnnnnn-1717-4334-9779-mmmmmmmmmmmm/resourceGroups/ituru-databricksdemo-7ke24v-workspace-rg  ituru-databricksdemo-7ke24v-workspace  Succeeded            rg_ituru_bricks01  1975197519751975  adb-1975197519751975.6.azuredatabricks.net

Local working directory after the run

$ tree -a
.
├── .terraform
│   ├── providers
│   │   └── registry.terraform.io
│   │       ├── databrickslabs
│   │       │   └── databricks
│   │       │       ├── 0.3.9
│   │       │       │   └── darwin_amd64
│   │       │       │       ├── CHANGELOG.md
│   │       │       │       ├── LICENSE
│   │       │       │       ├── NOTICE
│   │       │       │       └── terraform-provider-databricks_v0.3.9
│   │       │       ├── 0.4.0
│   │       │       │   └── darwin_amd64
│   │       │       │       ├── CHANGELOG.md
│   │       │       │       ├── LICENSE
│   │       │       │       ├── NOTICE
│   │       │       │       └── terraform-provider-databricks_v0.4.0
│   │       │       └── 0.4.4
│   │       │           └── darwin_amd64
│   │       │               ├── CHANGELOG.md
│   │       │               ├── LICENSE
│   │       │               ├── NOTICE
│   │       │               └── terraform-provider-databricks_v0.4.4
│   │       └── hashicorp
│   │           ├── azurerm
│   │           │   ├── 2.88.1
│   │           │   │   └── darwin_amd64
│   │           │   │       └── terraform-provider-azurerm_v2.88.1_x5
│   │           │   └── 2.91.0
│   │           │       └── darwin_amd64
│   │           │           └── terraform-provider-azurerm_v2.91.0_x5
│   │           ├── external
│   │           │   └── 2.1.0
│   │           │       └── darwin_amd64
│   │           │           └── terraform-provider-external_v2.1.0_x5
│   │           └── random
│   │               └── 2.3.1
│   │                   └── darwin_amd64
│   │                       └── terraform-provider-random_v2.3.1_x4
│   └── terraform.tfstate
├── .terraform.lock.hcl
├── databricks.tf
├── main.tf
├── outputs.tf
├── terraform.tfstate
├── terraform.tfstate.backup
├── terraform.tfvars
└── variables.tf

Deleting the created resources with Terraform

$ terraform destroy

Summary

With this, Azure Databricks could be provisioned on Azure quickly using Terraform.

