
A quick and easy way to cache fetched pages when writing a web crawler in Rails

Posted at 2015-02-06, last updated at 2015-02-06

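Overview

・On a cache hit, the page is returned immediately.
・When a page actually has to be fetched, a sleep(1) follows the request to keep the load on the remote server down.
・It also works on Heroku.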

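# Place this somewhere like app/utils/http_utils.rb
require 'open-uri'
require 'digest/md5'

class HttpUtils
  # Returns the body of the page at raw_url, served from Rails.cache when possible.
  def self.get(raw_url)
    url = raw_url.strip
    Rails.cache.fetch("html_caches.#{Digest::MD5.hexdigest(url)}", expires_in: 1.day) do
      # Cache miss: fetch the page. URI.open needs Ruby 2.5+; older Rubies use open(url) from open-uri.
      body = URI.open(url).read
      sleep(1) # wait one second so we do not hammer the remote server
      body
    end
  end
end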

Usage

HttpUtils.get('https://google.com') #=> fetches the page & sleep(1)
HttpUtils.get('https://google.com') #=> cache hit & no sleep!
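
To see why the second call skips both the request and the sleep, here is a minimal offline sketch of the same flow. It assumes nothing about Rails: `TinyCache` is a hypothetical in-memory stand-in for `Rails.cache`, and `cached_get` is a hypothetical helper that mirrors `HttpUtils.get` but takes the fetcher and the delay as arguments so it can run without network access.

```ruby
require 'digest/md5'

# Hypothetical in-memory stand-in for Rails.cache, for illustration only.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @entries = {}
  end

  # Returns the cached value for key, or runs the block and stores its result.
  def fetch(key, expires_in:)
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > Time.now

    value = yield
    @entries[key] = Entry.new(value, Time.now + expires_in)
    value
  end
end

# Hypothetical helper mirroring HttpUtils.get. The fetcher is any callable
# that takes a URL and returns the page body.
def cached_get(cache, raw_url, fetcher, delay: 1)
  url = raw_url.strip
  key = "html_caches.#{Digest::MD5.hexdigest(url)}"
  cache.fetch(key, expires_in: 24 * 60 * 60) do
    body = fetcher.call(url)
    sleep(delay) # only reached on a cache miss
    body
  end
end

calls = 0
fake_fetcher = ->(url) { calls += 1; "<html>#{url}</html>" }
cache = TinyCache.new

cached_get(cache, 'https://example.com', fake_fetcher, delay: 0)
cached_get(cache, ' https://example.com ', fake_fetcher, delay: 0) # same key after strip
puts calls # prints 1
```

Because the sleep lives inside the block passed to `fetch`, it is only paid when the block runs, that is, on a cache miss.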

That's all.

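Caveats

If you run this implementation in parallel, several cache misses can hit the same domain within a short time, because each worker only sleeps after its own request. If your requirements include parallel crawling, build a proper throttle instead!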
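
As one possible direction, the sketch below spaces out requests per host across threads. It is not part of the original implementation: `HostThrottle` and `wait_turn` are hypothetical names, and the approach only coordinates threads inside a single process, so several processes or dynos would still need shared state somewhere else.

```ruby
require 'uri'

# Hypothetical helper: hands out request slots per host so that slots for
# the same host are at least min_interval seconds apart.
class HostThrottle
  def initialize(min_interval)
    @min_interval = min_interval
    @next_slot = Hash.new(0.0) # host => monotonic time of the next free slot
    @lock = Mutex.new
  end

  # Blocks until this caller's slot for the URL's host arrives, then yields.
  def wait_turn(url)
    host = URI.parse(url).host
    wait = @lock.synchronize do
      now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      slot = [@next_slot[host], now].max
      @next_slot[host] = slot + @min_interval
      slot - now
    end
    sleep(wait) if wait > 0
    yield
  end
end

# Three threads asking for the same host are served roughly 0.1 s apart.
throttle = HostThrottle.new(0.1)
threads = 3.times.map do
  Thread.new do
    throttle.wait_turn('https://example.com/page') { :fetched }
  end
end
threads.each(&:join)
```

In `HttpUtils.get`, a shared throttle like this could replace the `sleep(1)` inside the cache block, wrapping the actual page fetch, so that cache hits stay instant and only real requests wait. Thread scheduling adds some jitter, so treat the interval as approximate.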
