Checking the user-agent sent by requests-html

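Requests-HTML is an extremely handy library for scraping dynamic web pages.

Even for pages that cannot be scraped without executing JavaScript, you do not need to set up headless Chrome and Selenium yourself.

In addition, requests-html sets the user-agent header to a browser-like value by default.

This post simply confirms that behavior.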

In [1]: import requests

In [2]: from requests_html import HTMLSession

In [3]: session = HTMLSession()

In [4]: session.get('http://httpbin.org/user-agent').text
Out[4]: '{\n  "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/603.3.8 (KHTML, like Gecko) Version/10.1.2 Safari/603.3.8"\n}\n'

In [5]: requests.get('http://httpbin.org/user-agent').text
Out[5]: '{\n  "user-agent": "python-requests/2.18.4"\n}\n'
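The `.text` values above are raw JSON strings, so the header value can be pulled out with the standard `json` module. The following is a minimal sketch that works on the two response bodies copied from the session above; `extract_user_agent` is a hypothetical helper name used only for illustration.

```python
import json

# Response bodies copied verbatim from Out[4] and Out[5] above.
html_session_body = '{\n  "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/603.3.8 (KHTML, like Gecko) Version/10.1.2 Safari/603.3.8"\n}\n'
requests_body = '{\n  "user-agent": "python-requests/2.18.4"\n}\n'


def extract_user_agent(body):
    """Parse an httpbin /user-agent response body and return the header value."""
    return json.loads(body)["user-agent"]


print(extract_user_agent(html_session_body))
print(extract_user_agent(requests_body))  # python-requests/2.18.4
```

On a live response object, calling `.json()["user-agent"]` instead of `.text` should give the same value without the manual parsing step.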

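The user-agent is set to Safari on macOS, even though this was run on Windows 10, which suggests the value is a built-in default rather than something derived from the local machine.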
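The header can also be checked without contacting httpbin at all, because a session stores its default headers locally. The sketch below uses a plain `requests.Session` as a stand-in; `sent_user_agent` and the `my-scraper/0.1` value are hypothetical names chosen for illustration, not part of either library.

```python
import requests


def sent_user_agent(session, url="http://httpbin.org/user-agent"):
    """Return the User-Agent the session would send, without any network I/O."""
    # prepare_request merges the session-level headers into the outgoing request.
    prepared = session.prepare_request(requests.Request("GET", url))
    return prepared.headers["User-Agent"]


plain = requests.Session()
print(sent_user_agent(plain))  # starts with "python-requests/"

# Overriding the session header changes what every later request sends.
plain.headers["User-Agent"] = "my-scraper/0.1"  # hypothetical value
print(sent_user_agent(plain))  # my-scraper/0.1
```

Assuming `HTMLSession` behaves like the `requests.Session` it is built on, passing `HTMLSession()` to the same helper should report the browser-like default seen in Out[4], and assigning to `session.headers["User-Agent"]` should replace it.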

dlpyvim
Likes Python