
Downloading files with Windows 10 + Python3 + selenium + chromedriver + headless chrome

Last updated at 2020-06-09, posted at 2020-06-03

Purpose

Try downloading a large binary file and an image file.
Note: passing an explicit destination file name to urllib.request.urlretrieve appears to be sufficient.
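
As a minimal sketch of the note above: the second argument of urllib.request.urlretrieve is the destination path, and the file name can be taken from the URL itself. The helpers filename_from_url and download_to are hypothetical names introduced here for illustration, and the demo uses a local file:// URL (an assumption, so it runs without a web server).

```python
import os
import posixpath
import tempfile
import urllib.request
from pathlib import Path
from urllib.parse import unquote, urlsplit


def filename_from_url(url):
    """Return the last path segment of a URL (hypothetical helper)."""
    # Use only the path component so a query string or fragment
    # does not leak into the file name.
    return posixpath.basename(unquote(urlsplit(url).path))


def download_to(url, dest_dir):
    """Save url into dest_dir under the file name taken from the URL."""
    name = filename_from_url(url)
    if not name:
        raise ValueError(f"cannot derive a file name from {url!r}")
    dest = os.path.join(dest_dir, name)
    # Passing the destination explicitly makes urlretrieve write there
    # instead of to a temporary file.
    saved_path, _headers = urllib.request.urlretrieve(url, dest)
    return saved_path


if __name__ == "__main__":
    # Offline demo: "download" a local file through a file:// URL.
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        sample = Path(src) / "sample.bin"
        sample.write_bytes(b"\x00\x01" * 1024)
        print(download_to(sample.as_uri(), dst))
```

With the page used in this article, the same call would take the href or src value returned by selenium as url and os.getcwd() as dest_dir.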

Sample code

Excerpt of the target HTML


<a href="./LibreOffice_6.4.4_Win_x64.msi" target="_blank">LibreOffice_6.4.4_Win_x64.msi</a>
<img src="./eudv015s.jpg">

# On Windows, set the environment variable PYTHONIOENCODING=UTF-8 and restart VS Code

import os
import time
import urllib.request
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait, Select
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
import chromedriver_binary  # adds chromedriver to PATH; omitting this import causes an error

# Launch the browser
options = Options()
options.binary_location = 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe'
options.add_argument('--headless')
driver = webdriver.Chrome(options=options)

driver.get('http://localhost:8080/')

try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, 'a'))
    )

    # find_element(By.TAG_NAME, ...) replaces find_element_by_tag_name,
    # which was removed in Selenium 4
    elm = driver.find_element(By.TAG_NAME, "a").get_attribute("href")
    print(elm)

    # Name of the file to download
    print(os.path.basename(elm))

    # Destination folder
    print(os.getcwd())

    # Full destination path (folder + file name)
    print(os.path.join(os.getcwd(), os.path.basename(elm)))

    # Download the file
    urllib.request.urlretrieve(
        elm, os.path.join(os.getcwd(), os.path.basename(elm)))

    img = driver.find_element(By.TAG_NAME, "img").get_attribute("src")
    print(img)

    # Download the image
    urllib.request.urlretrieve(
        img, os.path.join(os.getcwd(), os.path.basename(img)))

    # time.sleep(2)  # wait for visually checking the page
finally:
    print('done')
    driver.quit()
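
Since one of the targets is a large binary file, it may help to see progress while the download runs. The sketch below relies on the reporthook parameter of urllib.request.urlretrieve, which is called with the block count, the block size, and the total size. percent_done and print_progress are hypothetical helper names that do not appear in the original script.

```python
import urllib.request


def percent_done(block_num, block_size, total_size):
    """Return progress in percent, or None when the total size is unknown."""
    # urlretrieve reports a non-positive total size when the server
    # does not send a Content-Length header.
    if total_size <= 0:
        return None
    # The last block is usually partial, so clamp the value at 100.
    return min(100.0, block_num * block_size * 100.0 / total_size)


def print_progress(block_num, block_size, total_size):
    """reporthook callback that prints the progress on a single line."""
    pct = percent_done(block_num, block_size, total_size)
    if pct is not None:
        print(f"\r{pct:5.1f}%", end="", flush=True)
```

In the script above, it could be passed as the reporthook argument, for example urllib.request.urlretrieve(elm, dest, reporthook=print_progress), where dest stands for the destination path built with os.path.join.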

Reference sites

urllib.request --- Extensible library for opening URLs
Downloading files from the web with Python
