PIVOT clause (Databricks SQL) | Databricks on AWS [2022/4/25時点]の翻訳です
本書は抄訳であり内容の正確性を保証するものではありません。正確な内容に関しては原文を参照ください。
指定されたカラムリストのユニークな値を別々のカラムに回転させることでFROM
句の中間結果セットを変換します。
構文
PIVOT ( { aggregate_expression [ [ AS ] agg_column_alias ] } [, ...]
FOR column_list IN ( expression_list ) )
column_list
{ column_name |
( column_name [, ...] ) }
expression_list
{ expression [ AS ] [ column_alias ] |
{ ( expression [, ...] ) [ AS ] [ column_alias] } [, ...] ) }
パラメーター
-
FROM
句へのすべてのカラムのリファレンスが集計関数の引数となる任意のタイプの表現です。 -
集計結果のエイリアスのオプションです。エイリアスが指定されない場合、
PIVOT
はaggregate_expression
に基づいてエイリアスを生成します。 -
column_list
回転するカラムのセットです。
-
FROM
句のカラムです。
-
-
expression_list
column_list
からカラムエイリアスに値をマッピングします。-
少なくともそれぞれの
column_name
と共通の型を持つリテラル表現です。それぞれのタプルのexpressionの数は、
column_list
のcolumn_names
の数と一致する必要があります。 -
生成されたカラムの名前を指定するオプションのエイリアスです。エイリアスが指定されない場合、
PIVOT
はexpression
に基づいてエイリアスを生成します。
-
結果
一時テーブルは以下の形態を取ります。
-
aggregate_expression
やcolumn_list
で指定されなかったFROM
句の中間セットから構成されるすべてのカラム。 -
それぞれの
expression
タプルとaggregate_expression
の組み合わせに対して、PIVOT
は一つのカラムを生成します。型はaggregate_expression
の型になります。aggregate_expression
が一つのみの場合、column_alias
を用いてカラム名が付けられます。それ以外の場合には、column_alias_agg_column_alias
で名前付けが行われます。それぞれのセルの値は、
FILTER ( WHERE column_list IN (expression, ...)
を用いたaggregation_expression
の結果となります。
サンプル
-- A very basic PIVOT
-- Given a table with sales by quarter, return a table that returns sales across quarters per year.
> CREATE TEMP VIEW sales(year, quarter, region, sales) AS
VALUES (2018, 1, 'east', 100),
(2018, 2, 'east', 20),
(2018, 3, 'east', 40),
(2018, 4, 'east', 40),
(2019, 1, 'east', 120),
(2019, 2, 'east', 110),
(2019, 3, 'east', 80),
(2019, 4, 'east', 60),
(2018, 1, 'west', 105),
(2018, 2, 'west', 25),
(2018, 3, 'west', 45),
(2018, 4, 'west', 45),
(2019, 1, 'west', 125),
(2019, 2, 'west', 115),
(2019, 3, 'west', 85),
(2019, 4, 'west', 65);
> SELECT year, region, q1, q2, q3, q4
FROM sales
PIVOT (sum(sales) AS sales
FOR quarter
IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
2018 east 100 20 40 40
2019 east 120 110 80 60
2018 west 105 25 45 45
2019 west 125 115 85 65
-- The same query written without PIVOT
> SELECT year, region,
sum(sales) FILTER(WHERE quarter = 1) AS q1,
sum(sales) FILTER(WHERE quarter = 2) AS q2,
sum(sales) FILTER(WHERE quarter = 3) AS q2,
sum(sales) FILTER(WHERE quarter = 4) AS q4
FROM sales
GROUP BY year, region;
2018 east 100 20 40 40
2019 east 120 110 80 60
2018 west 105 25 45 45
2019 west 125 115 85 65
-- Also PIVOT on region
> SELECT year, q1_east, q1_west, q2_east, q2_west, q3_east, q3_west, q4_east, q4_west
FROM sales
PIVOT (sum(sales) AS sales
FOR (quarter, region)
IN ((1, 'east') AS q1_east, (1, 'west') AS q1_west, (2, 'east') AS q2_east, (2, 'west') AS q2_west,
(3, 'east') AS q3_east, (3, 'west') AS q3_west, (4, 'east') AS q4_east, (4, 'west') AS q4_west));
2018 100 105 20 25 40 45 40 45
2019 120 125 110 115 80 85 60 65
-- The same query written without PIVOT
> SELECT year,
sum(sales) FILTER(WHERE (quarter, region) = (1, 'east')) AS q1_east,
sum(sales) FILTER(WHERE (quarter, region) = (1, 'west')) AS q1_west,
sum(sales) FILTER(WHERE (quarter, region) = (2, 'east')) AS q2_east,
sum(sales) FILTER(WHERE (quarter, region) = (2, 'west')) AS q2_west,
sum(sales) FILTER(WHERE (quarter, region) = (3, 'east')) AS q3_east,
sum(sales) FILTER(WHERE (quarter, region) = (3, 'west')) AS q3_west,
sum(sales) FILTER(WHERE (quarter, region) = (4, 'east')) AS q4_east,
sum(sales) FILTER(WHERE (quarter, region) = (4, 'west')) AS q4_west
FROM sales
GROUP BY year, region;
2018 100 105 20 25 40 45 40 45
2019 120 125 110 115 80 85 60 65
-- To aggregate across regions the column must be removed from the input.
> SELECT year, q1, q2, q3, q4
FROM (SELECT year, quarter, sales FROM sales) AS s
PIVOT (sum(sales) AS sales
FOR quarter
IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
2018 205 45 85 85
2019 245 225 165 125
-- The same query without PIVOT
> SELECT year,
sum(sales) FILTER(WHERE quarter = 1) AS q1,
sum(sales) FILTER(WHERE quarter = 2) AS q2,
sum(sales) FILTER(WHERE quarter = 3) AS q3,
sum(sales) FILTER(WHERE quarter = 4) AS q4
FROM sales
GROUP BY year;
-- A PIVOT with multiple aggregations
> SELECT year, q1_total, q1_avg, q2_total, q2_avg, q3_total, q3_avg, q4_total, q4_avg
FROM (SELECT year, quarter, sales FROM sales) AS s
PIVOT (sum(sales) AS total, avg(sales) AS avg
FOR quarter
IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
2018 205 102.5 45 22.5 85 42.5 85 42.5
2019 245 122.5 225 112.5 165 82.5 125 62.5
-- The same query without PIVOT
> SELECT year,
sum(sales) FILTER(WHERE quarter = 1) AS q1_total,
avg(sales) FILTER(WHERE quarter = 1) AS q1_avg,
sum(sales) FILTER(WHERE quarter = 1) AS q2_total,
avg(sales) FILTER(WHERE quarter = 1) AS q2_avg,
sum(sales) FILTER(WHERE quarter = 1) AS q3_total,
avg(sales) FILTER(WHERE quarter = 1) AS q3_avg,
sum(sales) FILTER(WHERE quarter = 1) AS q4_total,
avg(sales) FILTER(WHERE quarter = 1) AS q4_avg
FROM sales
GROUP BY year;
> CREATE TEMP VIEW person (id, name, age, class, address) AS
VALUES (100, 'John', 30, 1, 'Street 1'),
(200, 'Mary', NULL, 1, 'Street 2'),
(300, 'Mike', 80, 3, 'Street 3'),
(400, 'Dan', 50, 4, 'Street 4');
2018 205 102.5 45 22.5 85 42.5 85 42.5
2019 245 122.5 225 112.5 165 82.5 125 62.5