0
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

BigQueryのウィンドウ関数徹底活用!便利関数をまとめてみた

Posted at

LAGやLEADばかり活用していましたが、それ以外にも便利な関数がたくさんあることに気づき、まとめてみました。


LAG / LEAD

使いどころ:

  • 直前・直後の値を取りたいとき
  • 時系列データの前後比較
SELECT
  user_id,
  event_date,
  value,
  LAG(value, 1) OVER(PARTITION BY user_id ORDER BY event_date) AS prev_value,
  LEAD(value, 1) OVER(PARTITION BY user_id ORDER BY event_date) AS next_value
FROM
  my_table

FIRST_VALUE / LAST_VALUE

使いどころ:

  • セッション開始時の値を知りたい
  • 最新の値を取得したい
SELECT
  user_id,
  event_date,
  value,
  FIRST_VALUE(value) OVER(PARTITION BY user_id ORDER BY event_date) AS first_val,
  LAST_VALUE(value) OVER(PARTITION BY user_id ORDER BY event_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_val
FROM
  my_table

NTH_VALUE

使いどころ:

  • 例えば「3番目の訪問日の金額が知りたい」ときに便利!
SELECT
  user_id,
  event_date,
  NTH_VALUE(value, 3) OVER(PARTITION BY user_id ORDER BY event_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS third_val
FROM
  my_table

ROW_NUMBER / RANK / DENSE_RANK

使いどころ:

  • 順位付けや、最新データの抽出などに大活躍!
SELECT
  user_id,
  value,
  ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY event_date DESC) AS row_num,
  RANK() OVER(PARTITION BY user_id ORDER BY value DESC) AS rank_val,
  DENSE_RANK() OVER(PARTITION BY user_id ORDER BY value DESC) AS dense_rank_val
FROM
  my_table

PERCENT_RANK / CUME_DIST

使いどころ:

  • スコアの相対位置を知るのに便利。0〜1の範囲で順位感がわかる!
SELECT
  user_id,
  value,
  PERCENT_RANK() OVER(PARTITION BY user_id ORDER BY value) AS percent_rank_val,
  CUME_DIST() OVER(PARTITION BY user_id ORDER BY value) AS cume_dist_val
FROM
  my_table

NTILE

使いどころ:

  • 分位点(例:4等分でグループ分け)をサクッと計算!
SELECT
  user_id,
  value,
  NTILE(4) OVER(PARTITION BY user_id ORDER BY value) AS quartile
FROM
  my_table

ARRAY_AGG OVER

使いどころ:

  • ウィンドウ内の値を配列でまとめられる!
SELECT
  user_id,
  event_date,
  ARRAY_AGG(value) OVER(PARTITION BY user_id ORDER BY event_date) AS value_array
FROM
  my_table

SUM / AVG / MIN / MAX OVER

使いどころ:

  • 移動平均や累積和をサクッと出せる
SELECT
  user_id,
  event_date,
  value,
  SUM(value) OVER(PARTITION BY user_id ORDER BY event_date) AS cumulative_sum,
  AVG(value) OVER(PARTITION BY user_id ORDER BY event_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM
  my_table

まとめ

これらを使いこなすことで、SQLの表現力が一気に広がると思います。

0
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
0
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?