4
1

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

PostgreSQLCursor を使って order しつつ find_each 的なことを行う

Last updated at Posted at 2016-09-18

概要

find_each の問題点

ActiveRecord::Batches#find_each 便利ですよね。大量のレコードを処理する場合に、レコードの取得を分割することができます。ただ find_each には無視できない制約があります。それは id の昇順でしかレコードを取得できないことです。

User.order(:name).find_each(batch_size: 10).map(&:name)
Scoped order and limit are ignored, it's forced to be batch order and batch size.
  User Load (0.4ms)  SELECT  "users".* FROM "users" ORDER BY "users"."id" ASC LIMIT $1  [["LIMIT", 10]]
  User Load (0.3ms)  SELECT  "users".* FROM "users" WHERE ("users"."id" > 10) ORDER BY "users"."id" ASC LIMIT $1  [["LIMIT", 10]]
  User Load (0.3ms)  SELECT  "users".* FROM "users" WHERE ("users"."id" > 20) ORDER BY "users"."id" ASC LIMIT $1  [["LIMIT", 10]]
  User Load (0.3ms)  SELECT  "users".* FROM "users" WHERE ("users"."id" > 30) ORDER BY "users"."id" ASC LIMIT $1  [["LIMIT", 10]]
  User Load (0.3ms)  SELECT  "users".* FROM "users" WHERE ("users"."id" > 40) ORDER BY "users"."id" ASC LIMIT $1  [["LIMIT", 10]]
  User Load (0.3ms)  SELECT  "users".* FROM "users" WHERE ("users"."id" > 50) ORDER BY "users"."id" ASC LIMIT $1  [["LIMIT", 10]]

=> ["にしもと たまほ",
 "まつざき あやこ",
 "さかた らんな",
 "しまむら りさ",
 "どい めい",
 "あさの すみな",
 "たかぎ まもる",
 "かわかみ ひろみ",
 "なかやま かい",
 "たにがわ さやか",
 …

Scoped order and limit are ignored, it's forced to be batch order and batch size. という警告が表示され、発行された SQL に ORDER BY name が付与されていないのが分かります。

解決策

PostgreSQLCursor という Gem を使って、レコード分割しながらかつ任意の並び順で取得します。ここで、find_each の代わりに each_instance というメソッドを使用します。

一度の取得数を指定する場合は、batch_size ではなく block_size というキーで指定します。

User.order(:name).each_instance(block_size: 10).map(&:name)
   (0.1ms)  BEGIN
   (0.3ms)  declare cursor_cc610a9c45d24652ae951cfd572712a6 cursor for SELECT "users".* FROM "users" ORDER BY "users"."name" ASC
   (0.2ms)  fetch 10 from cursor_cc610a9c45d24652ae951cfd572712a6
   (0.1ms)  fetch 10 from cursor_cc610a9c45d24652ae951cfd572712a6
   (0.2ms)  fetch 10 from cursor_cc610a9c45d24652ae951cfd572712a6
   (0.1ms)  fetch 10 from cursor_cc610a9c45d24652ae951cfd572712a6
   (0.1ms)  fetch 10 from cursor_cc610a9c45d24652ae951cfd572712a6
   (0.1ms)  fetch 10 from cursor_cc610a9c45d24652ae951cfd572712a6
   (0.1ms)  fetch 10 from cursor_cc610a9c45d24652ae951cfd572712a6
   (0.1ms)  close cursor_cc610a9c45d24652ae951cfd572712a6
   (0.1ms)  COMMIT

=> ["あさの こうぞう",
 "あさの すみな",
 "あずま やすあき",
 "あだち ひろみつ",
 "あべ あきお",
 "いしい りさ",
 "いしはら やすゆき",
 "いなば なお",
 "うえはら うり",
 "うえはら かなこ",
 …

正しくソートされていますね :blush:

なお、PostgreSQLCursor は Postgresql の カーソル という機能を使って、レコードの分割取得を実現しています 。
Using Cursors

そして order が使用可能になること以外にも find_each が抱えるデメリットを解消しています。
Background: Why PostgreSQL Cursors?

4
1
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
4
1

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?