BigQueryのメタデータを収集するためによく使うものを備忘録としてまとめました。
テーブルの作成日時や件数
SELECT
project_id
, dataset_id
, table_id
, FORMAT_TIMESTAMP('%Y-%m-%d %H:%M:%S', TIMESTAMP_MILLIS(creation_time), 'Asia/Tokyo') as creation_time
, FORMAT_TIMESTAMP('%Y-%m-%d %H:%M:%S', TIMESTAMP_MILLIS(last_modified_time), 'Asia/Tokyo') as last_modified_time
, row_count
, size_bytes
FROM
<dataset_id>.__TABLES__
ORDER BY
table_id
-- +-------------+-------------+----------+---------------------+---------------------+-----------+------------+
-- | project_id | dataset_id | table_id | creation_time | last_modified_time | row_count | size_bytes |
-- +-------------+-------------+----------+---------------------+---------------------+-----------+------------+
-- | example_pj | example_ds | table_a | 2022-01-01 10:00:00 | 2022-01-01 10:00:00 | 100 | 2000 |
-- +-------------+-------------+----------+---------------------+---------------------+-----------+------------+
-- | example_pj | example_ds | table_b | 2022-01-01 11:00:00 | 2022-01-01 11:00:00 | 500 | 30000 |
-- +-------------+-------------+----------+---------------------+---------------------+-----------+------------+
テーブルのカラム情報取得
SELECT
*
FROM
<dataset_id>.INFORMATION_SCHEMA.COLUMNS
WHERE
table_name = '<table_name>'
ORDER BY
ordinal_position
-- +---------------+--------------+------------+-------------+------------------+-----------
-- | table_catalog | table_schema | table_name | column_name | ordinal_position | ・・・
-- +---------------+--------------+------------+-------------+------------------+-----------
-- | example_pj | example_ds | table_a | column_a | 1 | ・・・
-- +---------------+--------------+------------+-------------+------------------+-----------
-- | example_pj | example_ds | table_a | column_b | 2 | ・・・
-- +---------------+--------------+------------+-------------+------------------+-----------
テーブルのカラム情報をカンマ区切りでつなげて取得(クエリでカラム名を列挙する場合に使用)
SELECT
STRING_AGG(column_name, ", " ORDER BY ordinal_position) as columns
, CONCAT("'", STRING_AGG(column_name, "', '" ORDER BY ordinal_position), "'") as columns_with_singlequote
FROM
<dataset_id>.INFORMATION_SCHEMA.COLUMNS
WHERE
table_name = '<table_name>'
-- +-----------------------------------+------------------------------------+
-- | columns | columns_with_singlequote |
-- +-----------------------------------+------------------------------------+
-- | column_a, column_b, column_c | 'column_a', 'column_b', 'column_c' |
-- +-----------------------------------+------------------------------------+
テーブルのカラム情報詳細(ネスト列や説明 等)の取得
SELECT
*
FROM
<dataset_id>.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE
table_name = '<table_name>'
-- +---------------+--------------+------------+-------------+----------------+-----------
-- | table_catalog | table_schema | table_name | column_name | field_path | ・・・
-- +---------------+--------------+------------+-------------+----------------+-----------
-- | example_pj | example_ds | table_a | column_a | column_a | ・・・
-- +---------------+--------------+------------+-------------+----------------+-----------
-- | example_pj | example_ds | table_a | column_b | column_b.id | ・・・
-- +---------------+--------------+------------+-------------+----------------+-----------
-- | example_pj | example_ds | table_a | column_b | column_b.name | ・・・
-- +---------------+--------------+------------+-------------+----------------+-----------
テーブルの CREATE TABLE 文取得
SELECT
table_name
, ddl
FROM
<dataset_id>.INFORMATION_SCHEMA.TABLES
WHERE
table_name = '<table_name>'
データセット一覧の取得
-- 東京リージョンのデータセットの場合
SELECT
schema_name
FROM
`<プロジェクトID>`.`region-asia-northeast1`.INFORMATION_SCHEMA.SCHEMATA
ORDER BY 1;
-- USリージョンのデータセットの場合
SELECT
schema_name
FROM
`<プロジェクトID>`.INFORMATION_SCHEMA.SCHEMATA
ORDER BY 1;
特定のデータセット配下のテーブルが参照されているかの確認
SELECT
*
FROM
`region-asia-northeast1`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE
DATE(creation_time) BETWEEN DATE_ADD(CURRENT_DATE('Asia/Tokyo'), INTERVAL -60 DAY )
AND CURRENT_DATE('Asia/Tokyo')
AND REGEXP_CONTAINS(query, r'`<プロジェクトID>.<データセット>.*`')
ORDER BY creation_time DESC
この記事は以下の情報を参考にして執筆しました。
https://cloud.google.com/bigquery/docs/information-schema-tables?hl=ja
https://www.niandc.co.jp/tech/20200923_1893/