Deepening My Understanding of ActiveRecord's Connection Pool

Last updated at 2017-12-12. Posted at 2017-12-11.

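Following on from yesterday, today's post again comes from an alumnus of the now-defunct Department of Industrial Engineering and Management (nicknamed "i科").
About six months ago I ran into something that made me go "huh, interesting" and wrote it up as an internal memo at work. This seemed like a good occasion to dig it out and tidy it up.

How it started

Rails 3 → Rails 4
Rails 4 → Rails 5

I once worked steadily through these Rails upgrades, and while going from Rails 4 to Rails 5 I hit an error from the connection pool.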

The error message looked like this:

could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); 
all pooled connections were in use (ActiveRecord::ConnectionTimeoutError)

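As the message says, checking out a connection timed out. Retrying gave the same result every time, so I had to look into it.
The short version: raising pool in database.yml from 5 to 10 fixed it. I wanted to know why this had happened, so I kept digging.
For reference, our stack is Rails + Unicorn + Nginx + MySQL, running on AWS.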
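For illustration, here is a minimal sketch of where that pool value lives. The YAML fragment is a hypothetical stand-in for a real database.yml (the adapter and database name are made up), and it is parsed with Ruby's standard YAML library rather than through Rails itself.

```ruby
require "yaml"

# Hypothetical database.yml fragment; only the `pool` key matters here.
DATABASE_YML = <<~YAML
  production:
    adapter: mysql2
    database: app_production
    pool: 10
YAML

# Parse the fragment and read the pool size for one environment.
config = YAML.safe_load(DATABASE_YML)
pool_size = config.fetch("production").fetch("pool")
puts "pool size for production: #{pool_size}"
```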

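What is a connection pool, anyway?

A connection pool keeps established database connections around and reuses them, which saves the cost of opening a new connection every time. (I had heard that connection pool settings only exist for RDBs, but client libraries for other data stores pool connections too.) Back to the point: when the pool runs out of connections, further requests for a DB connection have to wait. If a request is still waiting after a set amount of time, ActiveRecord::ConnectionTimeoutError is raised. That is exactly what happened here.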
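To make the "wait, then time out" behaviour concrete, here is a toy pool in plain Ruby. It is only a sketch of the general idea, not ActiveRecord's implementation: TinyPool and PoolTimeoutError are hypothetical names, and the "connections" are plain objects.

```ruby
# Hypothetical error class for this sketch (not ActiveRecord's).
class PoolTimeoutError < StandardError; end

# A toy fixed-size pool: lazily creates up to `size` connections and makes
# callers wait up to `timeout` seconds when all of them are checked out.
class TinyPool
  def initialize(size:, timeout:, &factory)
    @size = size
    @timeout = timeout
    @factory = factory   # block that opens a new "connection"
    @idle = []           # connections ready for reuse
    @created = 0         # how many connections exist in total
    @mutex = Mutex.new
    @cond = ConditionVariable.new
  end

  def checkout
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + @timeout
    @mutex.synchronize do
      loop do
        # Reuse an idle connection if one is available.
        return @idle.pop unless @idle.empty?

        # Otherwise create a new one while we are under the limit.
        if @created < @size
          @created += 1
          return @factory.call
        end

        # Pool exhausted: wait for a checkin until the deadline passes.
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        if remaining <= 0
          raise PoolTimeoutError, "could not obtain a connection within #{@timeout} seconds"
        end
        @cond.wait(@mutex, remaining)
      end
    end
  end

  def checkin(conn)
    @mutex.synchronize do
      @idle.push(conn)
      @cond.signal   # wake one waiting caller
    end
  end
end

pool = TinyPool.new(size: 1, timeout: 0.1) { Object.new }
conn = pool.checkout
begin
  pool.checkout   # the only connection is in use, so this waits and then fails
rescue PoolTimeoutError => e
  puts e.message
end
pool.checkin(conn)
```

The pool starts empty and only creates connections on demand, so the limit is felt only once more callers want a connection at the same time than the pool allows.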

The main topic

Here is what the documentation says.
Database pooling

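Active Record database connections are managed by ActiveRecord::ConnectionAdapters::ConnectionPool, which ensures that a connection pool synchronizes the amount of thread access to a limited number of database connections. This limit defaults to 5 and can be configured in database.yml.

Since the connection pooling is handled inside of Active Record by default, all application servers (Thin, mongrel, Unicorn etc.) should behave the same. The database connection pool is initially empty. As demand for connections increases it will create them until it reaches the connection pool limit.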

I sort of get it, and sort of don't...

Active Record database connections are managed by ActiveRecord::ConnectionAdapters::ConnectionPool

That is what it says, so, like a proper engineer, let's go and read the actual code.

Following the code

ActiveRecord::ConnectionAdapters::ConnectionPool

The documentation says connections are "managed by ActiveRecord::ConnectionAdapters::ConnectionPool", so let's jump there first.
http://api.rubyonrails.org/v5.0/classes/ActiveRecord/ConnectionAdapters/ConnectionPool.html

I skimmed the "Introduction" and "Obtaining (checking out) a connection" sections, but honestly they did not make things much clearer either. What the "Obtaining (checking out) a connection" section does show is that everything connection-related goes through methods on ActiveRecord::Base. So let's look there next.

ActiveRecord::Base

It has this to say about connections:

Connections are usually created through ActiveRecord::Base.establish_connection and retrieved by ActiveRecord::Base.connection. All classes inheriting from ActiveRecord::Base will use this connection. But you can also set a class-specific connection. For example, if Course is an ActiveRecord::Base, but resides in a different database, you can just say Course.establish_connection and Course and all of its subclasses will use this connection instead.

This feature is implemented by keeping a connection pool in ActiveRecord::Base that is a hash indexed by the class. If a connection is requested, the ActiveRecord::Base.retrieve_connection method will go up the class-hierarchy until a connection is found in the connection pool.

So connections are created with ActiveRecord::Base.establish_connection and retrieved with ActiveRecord::Base.connection.
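The "hash indexed by the class" lookup described in that quote can be sketched in plain Ruby. This is a toy model under my own assumptions, not Rails code: the Base class below is a stand-in rather than ActiveRecord::Base, the stored values are just labels instead of real pools, and User and Lecture are hypothetical models.

```ruby
# Toy stand-in for ActiveRecord::Base: keeps a registry of class => pool.
class Base
  POOLS = {}

  # Register a "pool" (here just a label) for the receiving class.
  def self.establish_connection(pool_label)
    POOLS[self] = pool_label
  end

  # Walk up the class hierarchy until a class with a registered pool is found.
  def self.retrieve_connection
    klass = self
    until klass.nil?
      return POOLS[klass] if POOLS.key?(klass)
      klass = klass.superclass
    end
    raise "no connection pool found for #{name}"
  end
end

class User < Base; end      # hypothetical model using the default database
class Course < Base; end    # lives in a different database
class Lecture < Course; end # subclass of Course

Base.establish_connection(:primary)
Course.establish_connection(:courses_db)

p User.retrieve_connection     # falls back to Base's pool
p Lecture.retrieve_connection  # inherits Course's pool
```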

ActiveRecord::Base.establish_connection

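I was not that interested in how connections get created, so for now I will only follow how they are retrieved (lol).
If it turns out that something on the creation side is simply tweaking a default, then the key is there and I will come back to it.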

ActiveRecord::Base.connection

Googling turns up this:
https://apidock.com/rails/ActiveRecord/Base/connection

This method is deprecated or moved on the latest stable version. The last existing version (v3.2.13) is shown here.

These similar methods exist in v4.2.7:

ActiveRecord::ConnectionAdapters::ConnectionPool#connection
ActiveRecord::Migration#connection
ActiveRecord::ConnectionHandling#connection
ActiveRecord::Migration::CheckPending#connection

So it has been moved...

ActiveRecord::ConnectionAdapters::ConnectionPool#connection

looks like the most likely candidate, so I will start there.

ActiveRecord::ConnectionAdapters::ConnectionPool#connection

Found it.
https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb

Because initialize sets @lock_thread = false, the expression (@lock_thread || Thread.current) evaluates to Thread.current.

      def connection
        @thread_cached_conns[connection_cache_key(@lock_thread || Thread.current)] ||= checkout
      end

connection_cache_key simply returns its argument.

        def connection_cache_key(thread)
          thread
        end
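Put together, those two methods amount to "cache one connection per thread, and check one out only the first time a thread asks". Here is a rough sketch of that shape in plain Ruby. PerThreadPool is a hypothetical name, the "connections" are just strings, and a Hash guarded by a Mutex stands in for the Concurrent::Map that the real code uses.

```ruby
# Toy pool that hands out numbered "connections" and caches one per thread.
class PerThreadPool
  def initialize
    @cache = {}         # thread => connection
    @mutex = Mutex.new  # a plain Hash needs a lock under concurrent writes
    @next_id = 0
  end

  # Same shape as ConnectionPool#connection: look up by the current thread
  # and check out a new connection only on that thread's first call.
  def connection
    @mutex.synchronize { @cache[Thread.current] ||= checkout }
  end

  private

  # Stand-in for the real checkout: just mints a new label.
  def checkout
    @next_id += 1
    "conn-#{@next_id}"
  end
end

pool = PerThreadPool.new
main_first  = pool.connection
main_second = pool.connection
other       = Thread.new { pool.connection }.value

puts main_first.equal?(main_second)  # the same thread reuses its connection
puts main_first == other             # a different thread gets its own
```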

@thread_cached_conns is created in initialize as a Concurrent::Map whose initial capacity is @size.
@size itself falls back to the default of 5 when pool is not specified in the config.

      def initialize(spec)
        super()

        @spec = spec

        @checkout_timeout = (spec.config[:checkout_timeout] && spec.config[:checkout_timeout].to_f) || 5
        if @idle_timeout = spec.config.fetch(:idle_timeout, 300)
          @idle_timeout = @idle_timeout.to_f
          @idle_timeout = nil if @idle_timeout <= 0
        end

        # default max pool size to 5
        @size = (spec.config[:pool] && spec.config[:pool].to_i) || 5

        # This variable tracks the cache of threads mapped to reserved connections, with the
        # sole purpose of speeding up the +connection+ method. It is not the authoritative
        # registry of which thread owns which connection. Connection ownership is tracked by
        # the +connection.owner+ attr on each +connection+ instance.
        # The invariant works like this: if there is mapping of <tt>thread => conn</tt>,
        # then that +thread+ does indeed own that +conn+. However, an absence of a such
        # mapping does not mean that the +thread+ doesn't own the said connection. In
        # that case +conn.owner+ attr should be consulted.
        # Access and modification of <tt>@thread_cached_conns</tt> does not require
        # synchronization.
        @thread_cached_conns = Concurrent::Map.new(initial_capacity: @size)
        # ... (rest omitted)
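The defaults in that excerpt are easy to check in isolation. The sketch below copies only the three settings read at the top of initialize into a standalone method. The method name resolve_pool_settings is hypothetical, and a plain Hash stands in for spec.config.

```ruby
# Mirrors the three settings read at the top of ConnectionPool#initialize.
# `config` stands in for spec.config; the method name is hypothetical.
def resolve_pool_settings(config)
  checkout_timeout = (config[:checkout_timeout] && config[:checkout_timeout].to_f) || 5

  idle_timeout = config.fetch(:idle_timeout, 300)
  if idle_timeout
    idle_timeout = idle_timeout.to_f
    idle_timeout = nil if idle_timeout <= 0  # zero or negative becomes nil
  end

  # Default max pool size is 5 when `pool` is missing.
  size = (config[:pool] && config[:pool].to_i) || 5

  { size: size, checkout_timeout: checkout_timeout, idle_timeout: idle_timeout }
end

p resolve_pool_settings({})
p resolve_pool_settings({ pool: "10", idle_timeout: 0 })
```

With an empty config the pool size and the checkout timeout both come out as 5, which matches the 5.000 seconds in the error message at the top of this post.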

As Heroku's documentation also says, the basic rule is one connection per thread (or per worker process).

Active Record will only create a connection when a new thread or process attempts to talk to the database through a SQL query.

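So each thread that talks to the database gets its own connection from the pool, up to the pool size. And pulling in Concurrent shows this is built squarely with concurrency in mind.
As you all know, Puma has been the default web server since Rails 5. Perhaps that is the influence here. Personally, I could not help reading it as a message: "time to graduate from Unicorn!"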

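Summary

I skipped over the last part this time, but reading the changelog is also interesting because you can see exactly what changed. This is not limited to Rails: following the code like this shows you what is actually going on inside, so do make friends with the documentation!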

[References]
Rails4.2のコネクションプールの実装を理解する (Understanding the connection pool implementation in Rails 4.2)

Finally...

At Atrae, we are looking for teammates to fight alongside us. If you are interested, please get in touch!
https://www.green-japan.com/job/51812?jo=blur_pc
