2
2

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 1 year has passed since last update.

Sequelizeで "SELECT count(DISTINCT(`user_id`))"を実現する

Last updated at Posted at 2022-05-12

Node.js + Sequelize で開発中・・・
重複したデータを除外したデータ数が欲しくなった。
「期間内にサインインしたユニークユーザ数が知りたい」とかそんな感じのこと。

ドキュメント探すのに時間がかかったので、未来の自分と誰かのために書き残します。

テーブル

こんな感じのテーブルを使う想定です。

mysql> select * from access_logs;
+----+-----------+---------+---------+---------------------+
| id | tenant_id | user_id | type    | created_at          |
+----+-----------+---------+---------+---------------------+
|  1 |         1 |       2 | signin  | 2022-04-13 15:55:50 |
|  2 |         1 |       1 | signout | 2022-04-13 15:55:59 |
|  3 |         1 |       1 | signin  | 2022-04-01 00:00:00 |
|  4 |         1 |       2 | signout | 2022-04-14 09:52:43 |
|  5 |         2 |       2 | signin  | 2022-04-14 15:31:28 |
+----+-----------+---------+---------+---------------------+

(型やモデルは・・・推論してください☆)

方法1 distinct:true

一定の期間にサインインしたユーザ数(ユニーク)を取得したいなら、こんな感じで書けばOK。

await AccessLog.count({
  where: {
    tenant_id,
    type: 'signin',
    created_at: { [Op.between]: ['2022-04-01', '2022-04-30'] },
  },
  distinct: true,  // <-重要
  col: 'user_id',  // <-重要
})

distinct: true を入れることでDISTINCT句を再現できます
col: 'user_id' は、primary key 以外のカラムでユニークにしたい場合に使います

↓SQL的にはこんな感じになってるはずです

SELECT
    count(DISTINCT(`user_id`)) AS `count`
FROM
    `access_logs` AS `AccessLog`
WHERE
    `AccessLog`.`tenant_id` = 1
AND `AccessLog`.`type` = 'signin'
AND `AccessLog`.`created_at` BETWEEN '2022-04-01' AND '2022-04-30'
;

col: 'user_id' を抜いたらどうなるか・・

primary keyでDISTINCTされます (count(DISTINCT(id))) 。
↓SQL的にはこんな感じです。

SELECT
    count(DISTINCT(`id`)) AS `count`
FROM
    `access_logs` AS `AccessLog`
WHERE
    
;

方法2 literalを使う

↓例えばユニークなユーザ数を日毎に集計したい場合にはこう書きます。

await AccessLog.findAll({
  attributes: [
    [sequelize.fn('DATE_FORMAT', sequelize.col('created_at'), '%Y/%m/%d'), 'label'],
    [sequelize.literal(`COUNT(DISTINCT user_id)`), 'value1'],
    [sequelize.fn('COUNT', sequelize.col('id')), 'value2'],
  ],
  where,
  group: 'label',
  raw: true,
})

↓SQL的にはこんな感じです。

SELECT
    DATE_FORMAT(`created_at`, '%Y/%m/%d') AS `label`,
    COUNT(DISTINCT user_id) AS `value1`,
    COUNT(`id`) AS `value2`
FROM
    `access_logs` AS `AccessLog`
WHERE
    `AccessLog`.`tenant_id` = 1
AND `AccessLog`.`created_at` BETWEEN '2022-04-18 23:59:59' AND '2023-01-15 23:59:59'
GROUP BY
    `label`
;

↓こんな感じのデータが取れます

+------------+--------+--------+
| label      | value1 | value2 |
+------------+--------+--------+
| 2022/04/30 |      1 |      1 |
| 2022/05/02 |      2 |     11 |
| 2022/05/03 |      1 |     12 |
| 2022/05/06 |      2 |      9 |
| 2022/05/09 |      2 |     16 |
| 2022/05/10 |      1 |      1 |
| 2022/05/11 |      1 |      1 |
| 2022/05/16 |      1 |      1 |
| 2022/05/17 |      1 |      8 |
| 2022/05/19 |      2 |      3 |
| 2022/05/20 |      1 |      3 |
| 2022/05/24 |      1 |      1 |
| 2022/05/25 |      1 |      1 |
| 2022/05/27 |      2 |      9 |
| 2022/05/30 |      2 |      8 |
| 2022/05/31 |      1 |      1 |
| 2022/06/01 |      1 |      1 |
| 2022/06/02 |      1 |      1 |
| 2022/06/10 |      1 |     13 |
| 2022/06/13 |      1 |      1 |
| 2022/06/14 |      1 |      1 |
| 2022/06/15 |      2 |      7 |
| 2022/06/16 |      1 |      2 |
+------------+--------+--------+
23 rows in set (0.00 sec)

まとめ

SequelizeでDISTINCTする方法を探してるとき、リテラルで書くなどの複雑な記事ばかり当ててしまい、この簡単な結論に行き着くのに時間がかかりました。
このメモが未来の自分やどなたかの助けになれば幸いです。

参考:

2
2
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
2
2

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?