2
3

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 3 years have passed since last update.

Seleniumでajax(xhr)のデータをとる

Posted at

SeleniumでajaxでリクエストしてるJSONとか取りたい

いろいろ調べるとgithubで公開してくれているページがヒット。
https://gist.github.com/lorey/079c5e178c9c9d3c30ad87df7f70491d#file-selenium_xhr_requests_via_performance_logging-py

ただ今のバージョンだとこれが少し違うくて、

こちら参照

get_log.py
import json
from time import sleep

from selenium import webdriver
from selenium.webdriver import DesiredCapabilities

# make chrome log requests
capabilities = DesiredCapabilities.CHROME
# caps['goog:loggingPrefs']
capabilities["goog:loggingPrefs"] = {"performance": "ALL"}  # newer: goog:loggingPrefs
driver = webdriver.Chrome(
    desired_capabilities=capabilities
)

# fetch a site that does xhr requests
driver.get("https://ja.reactjs.org/docs/faq-ajax.html")
sleep(5)  # wait for the requests to take place

# extract requests from logs
logs_raw = driver.get_log("performance")
logs = [json.loads(lr["message"])["message"] for lr in logs_raw]

def log_filter(log_):
    return (
        # is an actual response
        log_["method"] == "Network.responseReceived"
        # and json
        and "json" in log_["params"]["response"]["mimeType"]
    )

for log in filter(log_filter, logs):
    request_id = log["params"]["requestId"]
    resp_url = log["params"]["response"]["url"]
    print(f"Caught {resp_url}")
    print(driver.execute_cdp_cmd("Network.getResponseBody", {"requestId": request_id}))

でちゃんとJsonレスポンスが取れるようになりました。

一言

chromedriverの仕様はよく変わる…
capabilityとかドキュメントに載ってないものありそうだし。

2
3
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
2
3

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?