Managing BigQuery User-Defined Functions with Terraform

Posted at 2023-12-16, last updated at 2023-12-16

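This post takes the approach described in the following article and applies it to BigQuery User Defined Functions (UDFs).

The UDFs are stored in a directory structure like this, with one directory per dataset and one SQL file per UDF.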

udfs
├── dataset_name_1
│   ├── udf_name_1.sql
│   └── udf_name_2.sql
├── dataset_name_2
│   └── udf_name_3.sql
└── dataset_name_3
    └── udf_name_4.sql

Each SQL file then contains a create function statement like the following.

create function `project_id.dataset_name.udf_name`(arg1 type, arg2 type) returns type as (
  -- write the UDF body here
);

Besides create function, create procedure is also supported, so stored procedures can be managed the same way.

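Then, applying the Terraform file below, which runs on sheer fighting spirit, reflects UDFs in BigQuery as SQL files are added.
Because google_bigquery_routine expects the function's arguments as a variable number of arguments blocks, an equally spirited regular expression parses that part of the create statement and expands it inside the resource.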

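locals {
  # Regex that parses a CREATE FUNCTION / CREATE PROCEDURE statement,
  # written one fragment per line for readability.
  # Limitations: the argument list must be non-empty and may contain only word
  # characters, commas, and whitespace (so ARRAY<INT64> does not match),
  # and the RETURNS type must be a single word.
  extract_udf_from_arg_and_def_regexp = <<-EOT
  (?i:CREATE)\s+(?P<routine_type>(?i:FUNCTION|PROCEDURE))\s+
  `?[-_\w]+`?\.`?(?P<dataset_name>[-_\w]+)`?\.`?(?P<function_name>[-_\w]+)`?\s*
  \((?P<arguments>[\w_,\s]+)\)\s+
  (?P<returns>(?i:RETURNS)\s+\w+\s+)?
  (?P<as_or_begin>(?i:AS\s+\(|BEGIN\s+))
  (?P<definition>(?s:.+))
  (?P<end>(?i:\)|END))
  ;?
  EOT

  # Join the fragments above into a single-line pattern.
  format_regexp = join("", [for line in split("\n", local.extract_udf_from_arg_and_def_regexp) : chomp(line)])
}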

resource "google_bigquery_routine" "udf" {
  for_each = fileset("${path.module}/udfs", "*/*.sql")

  dataset_id   = split("/", each.value)[0]
  routine_id   = trimsuffix(split("/", each.value)[1], ".sql")
  routine_type = upper(regex(local.format_regexp, file("${path.module}/udfs/${each.value}"))["routine_type"]) == "FUNCTION" ? "SCALAR_FUNCTION" : "PROCEDURE"
  language     = "SQL"
  # One arguments block per comma-separated "name type" pair in the create statement.
  dynamic "arguments" {
    for_each = [for argument in split(",", regex(local.format_regexp, file("${path.module}/udfs/${each.value}"))["arguments"]) : trimspace(argument)]
    content {
      name          = split(" ", arguments.value)[0]
      # "name ANY TYPE" becomes an ANY_TYPE argument with no data_type; anything else is a fixed type.
      argument_kind = split(" ", arguments.value)[1] != "ANY" ? "FIXED_TYPE" : "ANY_TYPE"
      data_type     = split(" ", arguments.value)[1] != "ANY" ? jsonencode({ "typeKind" : upper(split(" ", arguments.value)[1]) }) : null
    }
  }
  definition_body = regex(local.format_regexp, file("${path.module}/udfs/${each.value}"))["definition"]

  lifecycle {
    # Fail if the dataset or function name in the create statement does not match the directory layout
    precondition {
      condition     = split("/", each.value)[0] == regex(local.format_regexp, file("${path.module}/udfs/${each.value}"))["dataset_name"] && trimsuffix(split("/", each.value)[1], ".sql") == regex(local.format_regexp, file("${path.module}/udfs/${each.value}"))["function_name"]
      error_message = "The dataset or function name does not match the file path."
    }
  }
}
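
Since everything hinges on the regular expression matching each file, it can help to look at what it extracts before applying. The following is a minimal sketch, assuming it is placed in the same module as the configuration above so that local.format_regexp is available. The names udf_files and parsed_udfs are hypothetical and not part of the original configuration.

```hcl
# Hypothetical helper locals (not part of the original configuration).
locals {
  udf_files = fileset("${path.module}/udfs", "*/*.sql")

  # Parse every SQL file once. With named capture groups, regex() returns
  # the captured substrings keyed by group name.
  parsed_udfs = {
    for f in local.udf_files :
    f => regex(local.format_regexp, file("${path.module}/udfs/${f}"))
  }
}

# Shows the parsed pieces per file so they can be compared with the SQL source.
output "parsed_udfs" {
  value = {
    for f, m in local.parsed_udfs :
    f => {
      routine_type  = upper(m["routine_type"])
      dataset_name  = m["dataset_name"]
      function_name = m["function_name"]
      arguments     = [for a in split(",", m["arguments"]) : trimspace(a)]
      definition    = m["definition"]
    }
  }
}
```

If a file does not match the pattern, regex() raises an error during planning, which points at statements the pattern cannot handle. The same parsed_udfs map could also be referenced from the resource instead of calling regex() repeatedly, though that is a design choice rather than something the original configuration does.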

References:
