More than 1 year has passed since last update.

Redshiftでテーブル更新する方法（ステージング）

redshift

Posted at 2024-05-21

0.参考教材

英語ではありますが、Redshiftについて詳しく記載されているのでとても良い教材かと思います。

1.Redshiftでは、テーブル同士のmergeやupsertができない

ステージングを活用する

salesテーブルはマスター

sales_updateテーブルは更新後のテーブル

create table sales_update as 

select * from sales

この段階では、

salesとsales_updateテーブル行は同じ

次に更新テーブルに対して、2008-11-30以降のデータを更新する

**update sales_update

set qtysold = qtysold *2,

pricepaid = pricepaid * 0,

commission = commission *1.1

where saletime > ‘2008-11-30’

and mod(sallerid,5) = 0;**

行の全セラーの中から５番目のセラーIDを選び、販売数量を２の倍数で更新し、

支払い価格を０の倍数で更新する。

2008-11-30以降のsalleridが4で割り切れるデータ分salesidを追加

**insert into sales_update

select (salesid + 172456) as salesid,listid,sellerid,……,

from sales

where saletime > ‘2008-11-30’

and mod(sellerid,4) =0;**

sales_updateに新たなデータを入れている。

結果833行を追加

結果として、新たな売り上げデータは、

sales_updateテーブルに格納されている。

sales_updateは、

seller_idが5で割り切れる行が更新されている
seller_idが4で割り切れている行（833行）に対して、sales_idのみが変更されているデータを追加

元テーブルをどのように更新するか？

→ステージングテーブルの助けを用いてマージする

--ステージングの作成
create temp table stagesales as 
select * from sales_update
--更新されたデータのみを抽出している。
where sales_update.saletime > '2008-11-30'
and sales_update.salesid = (select sales.salesid from sales 
where sales.salesid = sales_update.salesid
and sales.listid = sales_update.listid
and sales_update.qtysold !=sales.qtysold
or sales_update.pricepaid !=sales.pricepaid);

insert into stagesales
select su.salesid,su.listid,.......su.saletime
from sales_update.su left join sales on su.salesid = sales.saleid
where sales.salesid is null

-- sales_updateの2つ目の処理で、salesidを変更している。もし異なれば、
-- nullになっているはず。
--つまり、新たな情報をstagesalesに加えている。

ここまでのstagesalesの状態

更新された2008年11月30日以降のデータと、追加データのみが入っている。

--更新予定のsalesidを元テーブルから消す
delete from sales
using stagesales
where sales.salesid = stagesales.salesid
and sales.listid = stagesales.listid
and sales.saletime > '2008-11-30'

--insert
--これには新しいデータしか入っていない
insert into sales
select * from stagesales;

ステージングテーブルを作成することで、

最新テーブルへの結合が可能となる。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up