10
3

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

BigQueryで重複IDをもつ行をひとつにするやつ

Posted at

概要

重複してほしくないカラムがあるとき、その重複行を削除するクエリ

SELECT AS VALUE
  ARRAY_AGG(t)[OFFSET(0)]
FROM `hoge.fuga.piyo` AS t
GROUP BY choufuku_column

SELECT句で呼ぶテーブル名ってSTRUCTなんですね。今まで何気なく使ってたが初めて認識してちょっと感動した。

ちなみに公式で推奨されるもの。

# standardSQL
SELECT
  * EXCEPT(row_number)
FROM (
  SELECT
    *,
    ROW_NUMBER()
          OVER (PARTITION BY ID_COLUMN) row_number
  FROM
    `TABLE_NAME`)
WHERE
  row_number = 1

余談

ストリーミングAPIで格納した重複検知用のID付のテーブルをマテリアライズドビューで重複除外しようと考えた。
しかし、マテリアライズドビューにはROW_NUMBER()が使えないとのこと。
なのでARRAY_AGGで書き直したが、結局ダメだった。

10
3
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
10
3

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?