n+1問題などの解消のためにeager loadingが使われますが、その時に使うメソッドと言えばeager_load
やinclude
、preload
あたり。
本記事の内容は僕の上司から教えてもらった記事から引用します。時々僕の解釈も入れます。翻訳違ってたら教えてください。
いつもどれを使うかと悩んだ時に判断基準をあまり用意してなかったので今回備忘録的にまとめてみました。
どれ使えばいいの?
備忘録も兼ねてるので僕なりの結論から先に書きます。多分見返した時こっちの方が分かりやすいので。
-
preload
はクエリを二つ発行し、eager_load
は結合を使って一つのクエリを発行する。preload
のIN句はデータ量が多いとボトルネックになるかも。 -
where
メソッドとpreload
の組み合わせはsqlの例外が発生する。eager_load
使おう -
include
はRail任せなメソッド。eager_load
かpreload
どっちを使うか決めよう
使うデータ、モデル
サンプルコード丸々使うのは怒られそうなので、適当に。
class Customer < ActiveRecord::Base
has_many :books
end
class Book < ActiveRecord::Base
belongs_to :customer
end
yama = Customer.create!(name: "山田太郎", email: "mail@mail.com")
sato = Customer.create!(name: "佐藤次郎", email: "mail2@mail.com")
yama.books.create!(title: "こころ", author: "夏目漱石", published_at: Date.today.ago(110.years), price: 1000)
yama.books.create!(title: "ノルウェーの森", author: "村上春樹", published_at: Date.today.ago(100.years), price: 1111)
sato.books.create!(title: "ワンピース", author: "尾田栄一郎", published_at: Date.today.ago(10.years), price: 1212)
本当に適当です。
各メソッドの違い
単純なアソシエーションがあるデータでのクエリ
Customer.includes(:books)
# SELECT "customers".* FROM "customers"
# SELECT "books".* FROM "books" WHERE "books"."customer_id" IN (1, 2)
Customer.preload(:books)
# SELECT "customers".* FROM "customers"
# SELECT "books".* FROM "books" WHERE "books"."customer_id" IN (1, 2)
結果は同じ。二つのクエリが発行されているようです。
eager_loadは上二つとかなり違った結果になっています。左結合を使ってクエリが一つにまとまっています。
Customer.eager_load(:books)
# SELECT
# "customers"."id" AS t0_r0, "customers"."name" AS t0_r1, "customers"."email" AS t0_r2, "customers"."created_at" AS t0_r3, "customers"."updated_at" AS t0_r4,
# "books"."id" AS t1_r0, "books"."customer_id" AS t1_r1, "books"."title" AS t1_r2, "books"."author" AS t1_r3, "books"."published_at" AS t1_r4, "books"."price" AS t1_r5, "books"."created_at" AS t1_r6, "books"."updated_at" AS t1_r7
# FROM "customers"
# LEFT OUTER JOIN "books" ON "books"."customer_id" = "customers"."id"
一つのクエリか二つのクエリかどちらかを発行するかによって使い分けをするといいでしょう。
加えてinclude
はプリロードされたモデルの値を使ったwhereメソッドと一緒に呼び出すとeager_loadingと同じ動きをします。
Customer.includes(:books).where("books.title = ?", "こころ")
Customer.eager_load(:books).where("books.title = ?", "こころ")
# SELECT
# "customers"."id" AS t0_r0, "customers"."name" AS t0_r1, "customers"."email" AS t0_r2, "customers"."created_at" AS t0_r3, "customers"."updated_at" AS t0_r4,
# "books"."id" AS t1_r0, "books"."customer_id" AS t1_r1, "books"."title" AS t1_r2, "books"."author" AS t1_r3, "books"."published_at" AS t1_r4, "books"."price" AS t1_r5, "books"."created_at" AS t1_r6, "books"."updated_at" AS t1_r7
# FROM "customers"
# LEFT OUTER JOIN "books"
# ON "books"."customer_id" = "customers"."id"
# WHERE (books.title = 'こころ')
preload
は結合を使わないのでこういったことはできません。where句と組み合わせるならeager_load使いましょう。
Customer.preload(:books).where("books.title = ?", "こころ")
# SELECT "customers".* FROM "customers" WHERE (books.title = 'こころ')
#
# SQLite3::SQLException: no such column: books.title
「こころ」を持つユーザーを取り出すと仮定しましょう。実現するための例としてはこんなコードがあります。
Customer.includes(:books).where("books.title = ?", "こころ")
では「『こころ』を持つユーザー、かつそのユーザーの持つ全ての本をプリロードする」場合には、どのメソッドが有効でしょうか。
r = Customer.joins(:books).where("books.title = ?", "こころ").includes(:books)
r[0]
# => #<Customer id: 1, name: "山田太郎", email: "mail@mail.com", created_at: "2013-12-08 11:26:24", updated_at: "2013-12-08 11:26:24">
r[0].books
# [
# #<Book id: 1, customer_id: 1, title: "こころ", author: "夏目漱石", published_at: Mon, 06 Mar 1911 00:00:00 JST +09:00, price: 1000, created_at: "2013-12-08 11:26:50", updated_at: "2013-12-08 11:26:50">
# ]
# --------------------------------------------------
# SELECT
# "customers"."id" AS t0_r0, "customers"."name" AS t0_r1, "customers"."email" AS t0_r2, "customers"."created_at" AS t0_r3, "customers"."updated_at" AS t0_r4,
# "books"."id" AS t1_r0, "books"."customer_id" AS t1_r1, "books"."title" AS t1_r2, "books"."author" AS t1_r3, "books"."published_at" AS t1_r4, "books"."price" AS t1_r5, "books"."created_at" AS t1_r6, "books"."updated_at" AS t1_r7
# FROM "customers"
# INNER JOIN "books"
# ON "books"."customer_id" = "customers"."id"
# WHERE (books.title = 'こころ')
山田太郎は2つBooksを持っていますが、その全ての本は取れていませんね。
それに今回はINNER JOINが使われています。
r = Customer.joins(:books).where("books.title = ?", "こころ").preload(:books)
# SELECT "customers".* FROM "customers"
# INNER JOIN "books" ON "books"."customer_id" = "customers"."id"
# WHERE (books.title = 'こころ')
# SELECT "books".* FROM "books" WHERE "books"."customer_id" IN (1)
r[0]
# [#<Customer id: 1, name: "山田太郎", email: "mail@mail.com", created_at: "2013-12-08 11:26:24", updated_at: "2013-12-08 11:26:24">]
r[0].books
# [
# <Book id: 1, customer_id: 1, title: "こころ", author: "夏目漱石", published_at: Mon, 06 Mar 1911 00:00:00 JST +09:00, price: 1000, created_at: "2013-12-08 11:26:50", updated_at: "2013-12-08 11:26:50">,
# <Book id: 3, customer_id: 1, title: "ノルウェーの森", author: "村上春樹", published_at: Sun, 06 Mar 1921 00:00:00 JST +09:00, price: 1111, created_at: "2013-12-08 11:36:30", updated_at: "2013-12-08 11:36:30">]
# ]
preload
を使えば実現できます。二つ目のクエリでユーザーの持つ全ての本を取得していますね。
「全てのユーザーと彼らの『こころ』という本を取り出す」にはどうすればいいでしょうか。
subset of association
と記事にはあるのでアソシエーション先のテーブルのデータを一部取り出すという感じでしょう。
has_many
のcondition
とscope
にはラムダを使う
class Customer < ActiveRecord::Base
has_many :books
has_many :kokoro_books, conditions: {title: "こころ"}, class_name: "Book"
end
r = Customer.eager_load(:kokoro_books)
r = Customer.preload(:kokoro_books)
eager_loadとpreloadのどちらでもできます。
Rails4以降conditions
はラムダを使った方がいいらしいです。
class Customer < ActiveRecord::Base
has_many :books
has_many :kokoro_books, -> {where(title: "こころ")}, class_name: "Book"
end
scope
も同様です。僕は実務ではscopeの方をよく使ってます。
include
とwhere
の組み合わせが良くないらしい
めちゃ長い警告が出ます。
Customer.includes(:books).where("books.title = ?", "こころ")
# DEPRECATION WARNING: It looks like you are eager loading table(s)
# (one of: customers, books) that are referenced in a string SQL
# snippet. For example:
#
# Post.includes(:comments).where("comments.title = 'foo'")
#
# Currently, Active Record recognizes the table in the string, and knows
# to JOIN the comments table to the query, rather than loading comments
# in a separate query. However, doing this without writing a full-blown
# SQL parser is inherently flawed. Since we don't want to write an SQL
# parser, we are removing this functionality. From now on, you must explicitly
# tell Active Record when you are referencing a table from a string:
#
# Post.includes(:comments).where("comments.title = 'foo'").references(:comments)
#
# If you don't rely on implicit join references you can disable the
# feature entirely by setting `config.active_record.disable_implicit_join_references = true`. (
# SELECT "customers"."id" AS t0_r0, "customers"."name" AS t0_r1, "customers"."email" AS t0_r2, "customers"."created_at" AS t0_r3, "customers"."updated_at" AS t0_r4,
# "books"."id" AS t1_r0, "books"."customer_id" AS t1_r1, "books"."title" AS t1_r2, "books"."author" AS t1_r3, "books"."published_at" AS t1_r4, "books"."price" AS t1_r5, "books"."created_at" AS t1_r6, "books"."updated_at" AS t1_r7
# FROM "customers"
# LEFT OUTER JOIN "books" ON "books"."customer_id" = "customers"."id"
# WHERE (books.title = 'こころ')
代わりにeager_loadを使うか、referencesメソッドと併せて使ってねと言われてます。Railsを賢すぎないようにしたいのと、どのアルゴリズムを使うか知るためにwhere
条件を監視しないようにしたいからです。
Customer.includes(:books, :tags).where("books.title = ?", "こころ").references(:books)
# SELECT "customers"."id" AS t0_r0, "customers"."name" AS t0_r1, "customers"."email" AS t0_r2, "customers"."created_at" AS t0_r3, "customers"."updated_at" AS t0_r4,
# "books"."id" AS t1_r0, "books"."customer_id" AS t1_r1, "books"."title" AS t1_r2, "books"."author" AS t1_r3, "books"."published_at" AS t1_r4, "books"."price" AS t1_r5, "books"."created_at" AS t1_r6, "books"."updated_at" AS t1_r7,
# "tags"."id" AS t2_r0, "tags"."customer_id" AS t2_r1, "tags"."name" AS t2_r2, "tags"."created_at" AS t2_r3, "tags"."updated_at" AS t2_r4
# FROM "customers"
# LEFT OUTER JOIN "books" ON "books"."customer_id" = "customers"."id"
# LEFT OUTER JOIN "tags" ON "tags"."customer_id" = "customers"."id"
# WHERE (books.title = 'こころ')
booksテーブルはeager_load
メソッドのアルゴリズム(LEFT JOIN)が使われており、tagsテーブルにはpreload
メソッドのアルゴリズム(二つのクエリを発行する方法)が使われています。
以上です。includeはあまり好ましくなく、preload
、eager_load
、追加的にjoin
を状況に併せて使っていくべきでしょう。
preloadとeager_load、この状況ではこっちの方がいいよといった知識がある方、ぜひ教えていただければと思います。