
Saving a page after a click with Selenium (python3)

Posted at 2018-08-25, last updated at 2018-08-25

This is a way to capture the information that appears after clicking an element on an SPA (single-page application) page.

click_save.py

#! /usr/bin/python
# -*- coding: utf-8 -*-
#
#	click_save.py
#
#					Aug/25/2018
#
# ------------------------------------------------------------------
import sys
from selenium.webdriver import Firefox
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.support.wait import WebDriverWait

# ------------------------------------------------------------------
# Write a string to a file as UTF-8
def file_write_proc(file_name,str_out):
	with open(file_name,mode='w',encoding='utf-8') as fp_out:
		fp_out.write(str_out)
#
# ------------------------------------------------------------------
sys.stderr.write("*** start ***\n")
url_target = sys.argv[1]	# URL of the page to open
idx = sys.argv[2]		# id attribute of the element to click
file_html = sys.argv[3]		# output file for the HTML after the click
#
# Run Firefox headless. Selenium 4 style: Service and options= replace
# the older executable_path= and firefox_options= arguments.
options = Options()
options.add_argument('-headless')
service = Service(executable_path='/usr/bin/geckodriver')
driver = Firefox(service=service, options=options)
try:
	ttx = 100
	wait = WebDriverWait(driver, timeout=ttx)
	driver.get(url_target)
	# Wait up to ttx seconds for the element to appear, then click it
	button = wait.until(lambda drv: drv.find_element(By.ID, idx))
	button.click()
	driver.save_screenshot("out.png")
	html = driver.page_source
finally:
	# Close the browser even if a step above fails
	driver.quit()
#
file_write_proc(file_html,html)
sys.stderr.write("*** end ***\n")
# ------------------------------------------------------------------

Usage example

./click_save.py https://ekzemplaro.org/audio mpg321 mpg321.html
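The script takes three positional arguments: the URL, the id of the element to click, and the output file. As a minimal sketch, the same arguments could be read with `argparse` so that a missing argument yields a usage message instead of an `IndexError`. The function name `parse_args` and the help strings are illustrative assumptions, not part of the original script.

```python
import argparse


def parse_args(argv):
    """Parse the three positional arguments used by click_save.py.

    parse_args is a hypothetical helper; the argument names mirror the
    variables in the original script.
    """
    parser = argparse.ArgumentParser(
        prog="click_save.py",
        description="Click an element on a page and save the resulting HTML.",
    )
    parser.add_argument("url_target", help="URL of the page to open")
    parser.add_argument("idx", help="id attribute of the element to click")
    parser.add_argument("file_html", help="file to write the page source to")
    return parser.parse_args(argv)


# Same values as the usage example
args = parse_args(["https://ekzemplaro.org/audio", "mpg321", "mpg321.html"])
print(args.url_target, args.idx, args.file_html)
```

In the script itself this would be called as `parse_args(sys.argv[1:])`.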

For the usage example, the page before the click:

The page after the click:

This post-click page is what gets saved.
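On an SPA the content may arrive a moment after the click, so reading `page_source` immediately could still return the pre-click HTML. One possible guard, sketched below, is to poll until the source differs from a snapshot taken before the click. The helper `wait_for_source_change` and the `FakeDriver` stand-in are hypothetical names introduced here so the sketch runs without a browser; the only driver attribute assumed is `page_source`.

```python
import time


def wait_for_source_change(driver, before, timeout=10.0, poll=0.2):
    """Poll driver.page_source until it differs from `before`.

    Hypothetical helper: returns the new source, or raises TimeoutError
    if nothing changed within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = driver.page_source
        if current != before:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError("page source did not change after the click")
        time.sleep(poll)


# Stand-in for a Selenium driver (an assumption for demonstration only)
class FakeDriver:
    def __init__(self, pages):
        self._pages = list(pages)

    @property
    def page_source(self):
        # Return successive snapshots; repeat the last one when exhausted.
        if len(self._pages) > 1:
            return self._pages.pop(0)
        return self._pages[0]


driver = FakeDriver(["<p>before</p>", "<p>before</p>", "<p>after</p>"])
before = driver.page_source
html = wait_for_source_change(driver, before, timeout=2.0, poll=0.01)
print(html)
```

With a real driver the order would be: take `before = driver.page_source`, call `button.click()`, then `html = wait_for_source_change(driver, before)`. A changed source is only a coarse signal; waiting for a specific element with `WebDriverWait` is more precise when the expected content is known.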
