6
6

More than 5 years have passed since last update.

PostgreSQLで行を列に変換して月別集計する

Posted at

月別集計+行を列に変換するSQLを書いたのでメモ。

create table sales_category(
  sales_category_id int primary key,
  sales_name        text not null
);

insert into sales_category values(1, 'ダウンロード販売');
insert into sales_category values(2, 'ポイント購入');
insert into sales_category values(3, '物販');

create table customer_class(
  customer_class_id   int primary key,
  customer_class_name text not null
);
insert into customer_class values(1, '非会員');
insert into customer_class values(2, '一般会員');
insert into customer_class values(3, 'プレミア会員');

create table sales(
  sales_category_id int references sales_category,
  customer_class_id int references customer_class,
  period            date not null check(extract(day from period) = 1),
  amount            int not null,
  primary key(sales_category_id, customer_class_id, period)
);
insert into sales
  select s, c, p, s * c * extract(year from p) * extract(month from p) from
    generate_series(1,3) as t1(s),
    generate_series(1,3) as t2(c),
    generate_series('2014-04-01'::date, '2015-04-01', '1 month'::interval) as t3(p);
select * from sales;
 sales_category_id | customer_class_id |   period   | amount 
-------------------+-------------------+------------+--------
                 1 |                 1 | 2014-04-01 |   8056
                 1 |                 2 | 2014-04-01 |  16112
                 1 |                 3 | 2014-04-01 |  24168
                 2 |                 1 | 2014-04-01 |  16112
                 2 |                 2 | 2014-04-01 |  32224
                 2 |                 3 | 2014-04-01 |  48336
                 3 |                 1 | 2014-04-01 |  24168
                 3 |                 2 | 2014-04-01 |  48336
                 3 |                 3 | 2014-04-01 |  72504
                 1 |                 1 | 2014-05-01 |  10070
                 1 |                 2 | 2014-05-01 |  20140
                 1 |                 3 | 2014-05-01 |  30210
                 2 |                 1 | 2014-05-01 |  20140
                 2 |                 2 | 2014-05-01 |  40280
                 2 |                 3 | 2014-05-01 |  60420
                 3 |                 1 | 2014-05-01 |  30210
                 3 |                 2 | 2014-05-01 |  60420
                 3 |                 3 | 2014-05-01 |  90630
                 1 |                 1 | 2014-06-01 |  12084
...

MySQLでは年月データをintにする場合があるようだけれども、PostgreSQLではcheck制約や関数インデックスが使えるのでdateでやった。
普通はどうするんだろう。

https://postgresql.g.hatena.ne.jp/pgsql/20110416
ここを参考に、販売種別ごとの月別集計を表にする。

まず、ある月の集計は以下のようになる。

SELECT period, sales_name, customer_class_name, amount
FROM sales_category
LEFT OUTER JOIN sales USING(sales_category_id)
LEFT OUTER JOIN customer_class USING(customer_class_id)
WHERE period = '2015-01-01'
ORDER BY sales_category_id, customer_class_id
;
   period   |    sales_name    | customer_class_name | amount 
------------+------------------+---------------------+--------
 2015-01-01 | ダウンロード販売 | 非会員              |   2015
 2015-01-01 | ダウンロード販売 | 一般会員            |   4030
 2015-01-01 | ダウンロード販売 | プレミア会員        |   6045
 2015-01-01 | ポイント購入     | 非会員              |   4030
 2015-01-01 | ポイント購入     | 一般会員            |   8060
 2015-01-01 | ポイント購入     | プレミア会員        |  12090
 2015-01-01 | 物販             | 非会員              |   6045
 2015-01-01 | 物販             | 一般会員            |  12090
 2015-01-01 | 物販             | プレミア会員        |  18135

販売種別ごとの集計

SELECT period, sales_name, sum(amount)
FROM sales_category
LEFT OUTER JOIN sales USING(sales_category_id)
WHERE period = '2015-01-01'
GROUP BY sales_category_id, sales_name, period
ORDER BY sales_category_id
;
   period   |    sales_name    |  sum  
------------+------------------+-------
 2015-01-01 | ダウンロード販売 | 12090
 2015-01-01 | ポイント購入     | 24180
 2015-01-01 | 物販             | 36270

ここから、列を行に変換してExcelのような表にする。

SELECT sales_name
  ,sum(CASE ym WHEN '2015-01-01' THEN amount END) AS "2015/01"
  ,sum(CASE ym WHEN '2015-02-01' THEN amount END) AS "2015/02"
  ,sum(CASE ym WHEN '2015-03-01' THEN amount END) AS "2015/03"
  ,sum(CASE ym WHEN '2015-04-01' THEN amount END) AS "2015/04"
FROM
  (SELECT * FROM
    (SELECT generate_series('2014-04-01'::date,
                            '2015-04-01', '1 month'::interval) AS ym
    ) AS foo,
    (SELECT * from sales_category) as bar
  ) as s
  LEFT OUTER JOIN sales
    ON s.sales_category_id = sales.sales_category_id
    AND ym = period
GROUP BY s.sales_category_id, sales_name
ORDER BY s.sales_category_id
;
    sales_name    | 2015/01 | 2015/02 | 2015/03 | 2015/04 
------------------+---------+---------+---------+---------
 ダウンロード販売 |   12090 |   24180 |   36270 |   48360
 ポイント購入     |   24180 |   48360 |   72540 |   96720
 物販             |   36270 |   72540 |  108810 |  145080

まずFROMの最初でCROSS JOINを使って行列の初期値を出している。
salesやsales_categoryを元にOUTER JOINしていくと、データが空の時に行が抜け落ちる。Excel的に、データ0件でも3行固定にしたかったので。

更に、月別購入回数のテーブルがあるとき

create table sales_count(
  sales_category_id int references sales_category,
  period            date not null check(extract(day from period) = 1),
  sales_count       int not null,
  primary key(sales_category_id, period)
);
insert into sales_count
  select s, p, s * (extract(year from p) + extract(month from p)) from
    generate_series(1,3) as t1(s),
    generate_series('2014-04-01'::date, '2015-04-01', '1 month'::interval) as t3(p);

select * from sales_count ;
 sales_category_id |   period   | sales_count 
-------------------+------------+-------------
                 1 | 2014-04-01 |        2018
                 1 | 2014-05-01 |        2019
                 1 | 2014-06-01 |        2020
                 1 | 2014-07-01 |        2021
                 1 | 2014-08-01 |        2022
                 1 | 2014-09-01 |        2023
                 1 | 2014-10-01 |        2024
                 1 | 2014-11-01 |        2025
                 1 | 2014-12-01 |        2026
                 1 | 2015-01-01 |        2016
                 1 | 2015-02-01 |        2017
                 1 | 2015-03-01 |        2018
                 1 | 2015-04-01 |        2019
                 2 | 2014-04-01 |        4036
                 2 | 2014-05-01 |        4038
                 2 | 2014-06-01 |        4040
                 2 | 2014-07-01 |        4042

これも合わせて表示する。

SELECT sales_name
  ,sum(CASE ym WHEN '2015-01-01' THEN amount      END) AS "a2015/01"
  ,max(CASE ym WHEN '2015-01-01' THEN sales_count END) AS "c2015/01"
  ,sum(CASE ym WHEN '2015-02-01' THEN amount      END) AS "a2015/02"
  ,max(CASE ym WHEN '2015-02-01' THEN sales_count END) AS "c2015/02"
FROM
  (SELECT * FROM
    (SELECT generate_series('2014-04-01'::date,
                            '2015-04-01', '1 month'::interval) AS ym
    ) AS foo,
    (SELECT * from sales_category) as bar
  ) as s
  LEFT OUTER JOIN sales
    ON s.sales_category_id = sales.sales_category_id
    AND ym = sales.period
  LEFT OUTER JOIN sales_count
    ON s.sales_category_id = sales_count.sales_category_id
    AND ym = sales_count.period
GROUP BY s.sales_category_id, sales_name
ORDER BY s.sales_category_id
;
    sales_name    | a2015/01 | c2015/01 | a2015/02 | c2015/02 
------------------+----------+----------+----------+----------
 ダウンロード販売 |    12090 |     2016 |    24180 |     2017
 ポイント購入     |    24180 |     4032 |    48360 |     4034
 物販             |    36270 |     6048 |    72540 |     6051

amountは、customer_classごとの合計値なのでsumを使い、sales_countは1件のデータが欲しいだけなのでmaxまたはminを使う。
(sales_countが customer_class と CROSS JOIN されてしまうので、一行にまとめている。)

あとはCROSS JOINに気をつけて、重複をうまく消すように結合すれば色々出力できる。

SELECT sales_name
  ,sum(CASE ym WHEN '2015-03-01' THEN sales.amount END) AS "2015/03"
  ,sum(CASE ym WHEN '2015-03-01' THEN last.amount  END) AS "前年"
  ,max(CASE ym WHEN '2015-03-01' THEN sales_count  END) AS "件数"
  ,sum(CASE ym WHEN '2015-04-01' THEN sales.amount END) AS "2015/04"
  ,sum(CASE ym WHEN '2015-04-01' THEN last.amount  END) AS "前年"
  ,max(CASE ym WHEN '2015-04-01' THEN sales_count  END) AS "件数"
FROM
  (SELECT * FROM
    (SELECT generate_series('2014-04-01'::date,
                            '2015-04-01', '1 month'::interval) AS ym
    ) AS foo,
    (SELECT * from sales_category) as bar
  ) as s
  LEFT OUTER JOIN sales
    ON s.sales_category_id = sales.sales_category_id
    AND ym = sales.period
  LEFT OUTER JOIN sales_count
    ON s.sales_category_id = sales_count.sales_category_id
    AND ym = sales_count.period
  LEFT OUTER JOIN sales AS last
    ON s.sales_category_id = last.sales_category_id
    AND sales.customer_class_id = last.customer_class_id
    AND ym = last.period + '1 year'::interval
GROUP BY s.sales_category_id, sales_name
ORDER BY s.sales_category_id
;
    sales_name    | 2015/03 | 前年 | 件数 | 2015/04 |  前年  | 件数 
------------------+---------+------+------+---------+--------+------
 ダウンロード販売 |   36270 |      | 2018 |   48360 |  48336 | 2019
 ポイント購入     |   72540 |      | 4036 |   96720 |  96672 | 4038
 物販             |  108810 |      | 6054 |  145080 | 145008 | 6057
6
6
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
6
6