1
3

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

BigQuery の メタデータ収集

Last updated at Posted at 2022-04-27

BigQueryのメタデータを収集するためによく使うものを備忘録としてまとめました。

テーブルの作成日時や件数
SELECT
  project_id
  , dataset_id
  , table_id
  , FORMAT_TIMESTAMP('%Y-%m-%d %H:%M:%S', TIMESTAMP_MILLIS(creation_time), 'Asia/Tokyo') as creation_time
  , FORMAT_TIMESTAMP('%Y-%m-%d %H:%M:%S', TIMESTAMP_MILLIS(last_modified_time), 'Asia/Tokyo') as last_modified_time
  , row_count
  , size_bytes
FROM
  <dataset_id>.__TABLES__
ORDER BY
  table_id


--  +-------------+-------------+----------+---------------------+---------------------+-----------+------------+
--  | project_id  | dataset_id  | table_id |  creation_time      | last_modified_time  | row_count | size_bytes |
--  +-------------+-------------+----------+---------------------+---------------------+-----------+------------+
--  | example_pj  | example_ds  | table_a  | 2022-01-01 10:00:00 | 2022-01-01 10:00:00 |  100      | 2000       |
--  +-------------+-------------+----------+---------------------+---------------------+-----------+------------+
--  | example_pj  | example_ds  | table_b  | 2022-01-01 11:00:00 | 2022-01-01 11:00:00 |  500      | 30000       |
--  +-------------+-------------+----------+---------------------+---------------------+-----------+------------+
テーブルのカラム情報取得
SELECT
  *
FROM
  <dataset_id>.INFORMATION_SCHEMA.COLUMNS
WHERE
  table_name = '<table_name>'
ORDER BY
  ordinal_position


--  +---------------+--------------+------------+-------------+------------------+-----------
--  | table_catalog | table_schema | table_name | column_name | ordinal_position | ・・・
--  +---------------+--------------+------------+-------------+------------------+-----------
--  | example_pj    | example_ds   | table_a    | column_a    | 1                | ・・・
--  +---------------+--------------+------------+-------------+------------------+-----------
--  | example_pj    | example_ds   | table_a    | column_b    | 2                | ・・・
--  +---------------+--------------+------------+-------------+------------------+-----------
テーブルのカラム情報をカンマ区切りでつなげて取得(クエリでカラム名を列挙する場合に使用)
SELECT
  STRING_AGG(column_name, ", " ORDER BY ordinal_position) as columns
  , CONCAT("'", STRING_AGG(column_name, "', '" ORDER BY ordinal_position), "'") as columns_with_singlequote
FROM
  <dataset_id>.INFORMATION_SCHEMA.COLUMNS
WHERE
  table_name = '<table_name>'


--  +-----------------------------------+------------------------------------+
--  | columns                           | columns_with_singlequote           |
--  +-----------------------------------+------------------------------------+
--  | column_a, column_b, column_c      | 'column_a', 'column_b', 'column_c' | 
--  +-----------------------------------+------------------------------------+
テーブルのカラム情報詳細(ネスト列や説明 等)の取得
SELECT
  *
FROM
  <dataset_id>.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE
  table_name = '<table_name>'


--  +---------------+--------------+------------+-------------+----------------+-----------
--  | table_catalog | table_schema | table_name | column_name | field_path     | ・・・
--  +---------------+--------------+------------+-------------+----------------+-----------
--  | example_pj    | example_ds   | table_a    | column_a    | column_a                  | ・・・
--  +---------------+--------------+------------+-------------+----------------+-----------
--  | example_pj    | example_ds   | table_a    | column_b    | column_b.id    | ・・・
--  +---------------+--------------+------------+-------------+----------------+-----------
--  | example_pj    | example_ds   | table_a    | column_b    | column_b.name  | ・・・
--  +---------------+--------------+------------+-------------+----------------+-----------

テーブルの CREATE TABLE 文取得
SELECT
  table_name
  , ddl
FROM
  <dataset_id>.INFORMATION_SCHEMA.TABLES
WHERE
  table_name = '<table_name>'
データセット一覧の取得
-- 東京リージョンのデータセットの場合
SELECT 
  schema_name
FROM
  `<プロジェクトID>`.`region-asia-northeast1`.INFORMATION_SCHEMA.SCHEMATA
ORDER BY 1;

-- USリージョンのデータセットの場合
SELECT 
  schema_name
FROM
  `<プロジェクトID>`.INFORMATION_SCHEMA.SCHEMATA
ORDER BY 1;
特定のデータセット配下のテーブルが参照されているかの確認
SELECT
  *
FROM
  `region-asia-northeast1`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE
  DATE(creation_time) BETWEEN DATE_ADD(CURRENT_DATE('Asia/Tokyo'), INTERVAL -60 DAY )
  AND CURRENT_DATE('Asia/Tokyo')
  AND REGEXP_CONTAINS(query, r'`<プロジェクトID>.<データセット>.*`')
 ORDER BY creation_time DESC

この記事は以下の情報を参考にして執筆しました。
https://cloud.google.com/bigquery/docs/information-schema-tables?hl=ja
https://www.niandc.co.jp/tech/20200923_1893/

1
3
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
1
3

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?