Running Selenium Offline (Part 2)

Posted at 2022-10-02

Introduction

Up to now this automation has run on Chrome, but it looks like it will have to run on Edge, so this is a feasibility study.

Reference sites

◆Selenium API (reverse lookup)
  https://www.seleniumqref.com/api/webdriver_gyaku.html
◆How to automate website screenshots
  https://zenn.dev/kazuki_tam/articles/6c3cf0729c5b847cc2a4


◆Launching Chrome and Edge headless with Windows 10 + Python 3 + Selenium 4
  https://qiita.com/tabizou/items/1a7789d88ba853cd6081
◆Dealing with the deprecation of executable_path for specifying the Selenium WebDriver
  https://qiita.com/ryohassay/items/2ca3ee7091e4c0cec9ca
◆Settings for launching Edge headless with Python + Selenium 4
  https://qiita.com/baku2san/items/87e78da028c3f8ed577e
◆Changing the default download folder with Edgedriver in Selenium 4
  https://qiita.com/pm00/items/1eaf7d76ad68a4dcdf4d
◆Trying Edge (Chromium) + Selenium (Python): Setup
  https://masayoshi-9a7ee.hatenablog.com/entry/2022/01/07/182223


◆Using multiple versions of Python on Windows
  https://qiita.com/landwarrior/items/1b5e0f9af5316a025fe0
◆Switching between multiple Python environments with the Python launcher
  https://astherier.com/blog/2020/07/usage-of-python-launcher/
◆Switching Python versions with the py command
  https://itpc.blog.fc2.com/blog-entry-226.html

Installed versions

■Edge
  ◆Download and configure Microsoft Edge for Business
    https://www.microsoft.com/ja-jp/edge/business/download
      ※ "Windows 64-bit" (v105.0.1343.53): MicrosoftEdgeEnterpriseX64.msi
  ◆Microsoft Edge WebDriver
    https://developer.microsoft.com/ja-jp/microsoft-edge/tools/webdriver/
      ※ "Stable channel x64" (v105.0.1343.53): edgedriver_win64.zip
■Python
  ◆Python Release Python 3.10.7 | Python.org
    https://www.python.org/downloads/release/python-3107/
      ※ "Windows installer (64-bit)" (v3.10.7): python-3.10.7-amd64.exe
■Selenium
  ◆selenium · PyPI
    https://pypi.org/project/selenium/4.5.0/#files
      ※ (v4.5.0): selenium-4.5.0-py3-none-any.whl
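
Edge and Edge WebDriver are listed above at the same version (v105.0.1343.53). As a minimal sketch of checking that pairing before a run, the helper below compares the major component of two version strings. The function names are hypothetical, and comparing only the major component is an assumption made for this example, not a rule taken from this article.

```python
def major_version(version: str) -> int:
    """Return the leading numeric component of a dotted version string."""
    return int(version.strip().split(".")[0])


def same_major(browser_version: str, driver_version: str) -> bool:
    """True when both version strings share the same major component."""
    return major_version(browser_version) == major_version(driver_version)


if __name__ == "__main__":
    # Versions listed in this article for Edge and Edge WebDriver
    print(same_major("105.0.1343.53", "105.0.1343.53"))
```

The version strings can be obtained by hand, for example from `msedgedriver.exe --version`, and passed in as plain text.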

Python v3.11.0 was released recently, so the download information for it is listed below...

download.txt
https://www.python.org/ftp/python/3.11.0/python-3.11.0-amd64.exe
https://files.pythonhosted.org/packages/c7/35/721fde638ea1ed2f32851f89544d04dab573ff629496f0275f4e7d5f5f29/selenium-4.6.0-py3-none-any.whl
https://files.pythonhosted.org/packages/6f/de/5be2e3eed8426f871b170663333a0f627fc2924cc386cd41be065e7ea870/urllib3-1.26.12-py2.py3-none-any.whl
https://files.pythonhosted.org/packages/f1/ed/3623a910f9bb7a31b067d6baef476ed6e294e92a245f94ab992988e4a666/trio-0.22.0-py3-none-any.whl
https://files.pythonhosted.org/packages/db/c5/b5e8bc1f40568a354f2a9cc296b8892605a9d2f22e725290fc33836dd2a3/trio_websocket-0.9.2-py3-none-any.whl
https://files.pythonhosted.org/packages/1d/38/fa96a426e0c0e68aabc68e896584b83ad1eec779265a028e156ce509630e/certifi-2022.9.24-py3-none-any.whl
https://files.pythonhosted.org/packages/f2/bc/d817287d1aa01878af07c19505fafd1165cd6a119e9d0821ca1d1c20312d/attrs-22.1.0-py2.py3-none-any.whl
https://files.pythonhosted.org/packages/32/46/9cb0e58b2deb7f82b84065f37f3bffeb12413f947f9388e4cac22c4621ce/sortedcontainers-2.4.0-py2.py3-none-any.whl
https://files.pythonhosted.org/packages/71/52/39d20e03abd0ac9159c162ec24b93fbcaa111e8400308f2465432495ca2b/async_generator-1.10-py3-none-any.whl
https://files.pythonhosted.org/packages/fc/34/3030de6f1370931b9dbb4dad48f6ab1015ab1d32447850b9fc94e60097be/idna-3.4-py3-none-any.whl
https://files.pythonhosted.org/packages/e9/4f/2f2d3f65d851852712b4de3fd0cfdcec9c5e9a9c347430e004ba770ef4db/outcome-1.2.0-py2.py3-none-any.whl
https://files.pythonhosted.org/packages/c3/a0/5dba8ed157b0136607c7f2151db695885606968d1fae123dc3391e0cfdbf/sniffio-1.3.0-py3-none-any.whl
https://files.pythonhosted.org/packages/43/a0/cc7370ef72b6ee586369bacd3961089ab3d94ae712febf07a244f1448ffd/cffi-1.15.1-cp311-cp311-win_amd64.whl
https://files.pythonhosted.org/packages/78/58/e860788190eba3bcce367f74d29c4675466ce8dddfba85f7827588416f01/wsproto-1.2.0-py3-none-any.whl
https://files.pythonhosted.org/packages/8d/59/b4572118e098ac8e46e399a1dd0f2d85403ce8bbaad9ec79373ed6badaf9/PySocks-1.7.1-py3-none-any.whl
https://files.pythonhosted.org/packages/62/d5/5f610ebe421e85889f2e55e33b7f9a6795bd982198517d912eb1c76e1a53/pycparser-2.21-py2.py3-none-any.whl
https://files.pythonhosted.org/packages/95/04/ff642e65ad6b90db43e668d70ffb6736436c7ce41fcc549f4e9472234127/h11-0.14.0-py3-none-any.whl
https://files.pythonhosted.org/packages/cf/a0/b881b63a17a59d9d07f5c0cc91a29182c8e8a9aa2bde5b3b2b16519c02f4/flake8-5.0.4-py2.py3-none-any.whl
https://files.pythonhosted.org/packages/27/1a/1f68f9ba0c207934b35b86a8ca3aad8395a3d6dd7921c0686e23853ff5a9/mccabe-0.7.0-py2.py3-none-any.whl
https://files.pythonhosted.org/packages/67/e4/fc77f1039c34b3612c4867b69cbb2b8a4e569720b1f19b0637002ee03aff/pycodestyle-2.9.1-py2.py3-none-any.whl
https://files.pythonhosted.org/packages/dc/13/63178f59f74e53acc2165aee4b002619a3cfa7eeaeac989a9eb41edf364e/pyflakes-2.5.0-py2.py3-none-any.whl
https://files.pythonhosted.org/packages/2a/c0/a1372bd95578778aae8dc5359e4f8ec03a94479d2f36e30543298d30c48a/autopep8-2.0.0-py2.py3-none-any.whl
https://files.pythonhosted.org/packages/97/75/10a9ebee3fd790d20926a90a2547f0bf78f371b2f13aa822c759680ca7b9/tomli-2.0.1-py3-none-any.whl
https://globalcdn.nuget.org/packages/system.reflection.emit.4.7.0.nupkg
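
On a machine that does have network access, the URLs in download.txt could be fetched in one pass and then carried over to the offline machine. The following is only a sketch using the standard library: `filename_from_url`, `download_all`, and the `downloads` folder name are hypothetical names chosen for this example.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlsplit


def filename_from_url(url: str) -> str:
    """Return the last path component of a URL, used as the local file name."""
    return posixpath.basename(urlsplit(url).path)


def download_all(list_path: str, dest_dir: str) -> None:
    """Download every non-empty URL line in list_path into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    with open(list_path, "r", encoding="UTF-8") as f:
        urls = [line.strip() for line in f if line.strip()]
    for url in urls:
        target = os.path.join(dest_dir, filename_from_url(url))
        # urlretrieve saves the response body to the given local path
        urllib.request.urlretrieve(url, target)


if __name__ == "__main__":
    # "downloads" is a hypothetical folder name for this example
    download_all("download.txt", "downloads")
```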

Installation

▼Python
  ★[Advanced Options] screen
    ・Check [Install for all users]!!
    ・[Customize install location]: C:\Python3_10_7
  × Adding environment variables (I found out this should not be done...)
    C:\Python3_10_7
    C:\Python3_10_7\Scripts
  ◆How to handle multiple Python versions (on Windows)
    https://gammasoft.jp/python/python-version-management/
▼Selenium

test.py
py -3.10 -m pip install .\selenium-4.5.0-py3-none-any.whl
py -3.10 edgeOperation.py
offlineInstall.py
pip download selenium --dest pipPackage1 --verbose --log download1.log
pip install --no-index --find-links=./pipPackage1 selenium
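
The offline install step above can also be driven from a script. This is a minimal sketch: `build_offline_install_cmd` is a hypothetical helper, and running pip through `sys.executable -m pip` (so that it targets the interpreter running the script) is an assumption of this example, since the commands above call `pip` directly.

```python
import subprocess
import sys


def build_offline_install_cmd(package: str, wheel_dir: str) -> list[str]:
    """Build a pip command that installs only from a local wheel folder."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",                  # never contact a package index
        f"--find-links={wheel_dir}",   # resolve packages from this folder
        package,
    ]


if __name__ == "__main__":
    cmd = build_offline_install_cmd("selenium", "./pipPackage1")
    # check=True raises CalledProcessError if pip exits with an error
    subprocess.run(cmd, check=True)
```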
getDownloadURL.py
# Commands that produce the logs the download URLs are extracted from
#  pip download selenium --dest pipPackage1 --verbose --log download1.log
#  pip download flake8 --dest pipPackage2 --verbose --log download2.log
#  pip download autopep8 --dest pipPackage3 --verbose --log download3.log

# Open the input log and the output file; "with" closes both even on error
with open('download1.log', 'r', encoding='UTF-8') as fr, \
     open('downloadURL1.txt', 'w', encoding='UTF-8') as fw:
    # Read one line at a time
    for data in fr:
        # Keep only lookup lines that reference a .whl or .zip file
        if 'Looking up' in data and ('.whl"' in data or '.zip"' in data):
            # Extract the download URL between the first pair of double quotes
            target = '"'
            idx = data.find(target)
            workData = data[idx + 1:]
            idx = workData.find(target)
            urlData = workData[:idx]
            fw.write(urlData + '\n')
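
The same extraction can be expressed with a regular expression, which makes it easy to try on a single line. This is a sketch and not a replacement for getDownloadURL.py: `extract_url` is a hypothetical name, the sample line in the usage note is an assumed log format inferred from the conditions the script checks, and unlike the script it returns the first quoted string that itself ends in .whl or .zip.

```python
import re
from typing import Optional

# Matches the first double-quoted string ending in .whl or .zip on a line
URL_PATTERN = re.compile(r'"([^"]+\.(?:whl|zip))"')


def extract_url(line: str) -> Optional[str]:
    """Return the quoted .whl/.zip URL from a 'Looking up' line, else None."""
    if "Looking up" not in line:
        return None
    match = URL_PATTERN.search(line)
    return match.group(1) if match else None
```

For an assumed line such as `Looking up "https://example.invalid/pkg/selenium-4.5.0-py3-none-any.whl" in the cache`, the function returns the text between the quotes; lines without a matching quoted file name return `None`.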

Operation check

※For now this reuses code from the reference sites (I will revise it as I use it)

edgeOperation.py
#!python3
# -*- coding: utf-8 -*-
# PS> py --version
# Python 3.11.0

# site-packages
# PS> py -m pip list --path "C:\wakasama\Python\wakasama311\Lib\site-packages" | findstr selenium
# selenium         4.6.0
import sys
print(sys.path)
# Raw string so the backslashes in the Windows path are not treated as escapes
sys.path.append(r'C:\wakasama\Python\wakasama311\Lib\site-packages')

import os
import shutil
import time
from selenium import webdriver
# TODO: later, make this work with the common modules instead of the edge ones!!
from selenium.webdriver.edge.options import Options
from selenium.webdriver.edge.service import Service
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

browser_conf = {
    # PS> (get-item ("C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe")).VersionInfo.FileVersion
    # 112.0.1722.48
    # https://docs.microsoft.com/ja-jp/microsoft-edge/webdriver-chromium/
    # PS> .\msedgedriver.exe --version
    # MSEdgeDriver 100.0.1185.29 (1feddedadb2184120dec3f8943e514a865a2930a)
    # PS> .\msedgedriver.exe --help
    "browser_driver": os.getcwd() + "\\msedgedriver.exe",
    "browser_options" : [
        # Headless mode ON
        "headless",
    ],
    "implicitly_wait": 10
}

def main():

    # Initialize the browser
    browser = init_browser()

    # Recreate the folder for captures
    folderCheck = os.path.exists('capture')
    if folderCheck:
        shutil.rmtree('capture')
    os.mkdir('capture')

    # Navigate to a site and take capture 1
    browser.get('https://www.yahoo.co.jp')
    browser.save_screenshot(os.getcwd() + "\\screen_captuer.png")

    # Scroll the scrollbar down
    # This attempt failed...
    #element = browser.find_element(By.ID, 'point')
    #actions = ActionChains(browser).click_and_hold(element).move_by_offset(0,200)
    #actions.perform()
    #browser.get_screenshot_as_file(os.getcwd() +"\\screen_captuer1.png")

    browser.execute_script('window.scrollTo(0, document.body.scrollHeight);')
    time.sleep(1)
    browser.get_screenshot_as_file(os.getcwd() + "\\capture" + "\\screen_captuer1.png")

    browser.find_element(By.TAG_NAME, 'body').send_keys(Keys.END)
    time.sleep(1)
    browser.find_element(By.TAG_NAME, 'html').send_keys(Keys.END)
    time.sleep(1)
    browser.get_screenshot_as_file(os.getcwd() + "\\capture" + "\\screen_captuer2.png")

    # Scroll the scrollbar up
    # TODO: Keys.UP/Keys.DOWN, Keys.PAGE_UP/Keys.PAGE_DOWN
    browser.find_element(By.TAG_NAME, 'body').send_keys(Keys.HOME)
    time.sleep(1)
    browser.get_screenshot_as_file(os.getcwd() + "\\capture" + "\\screen_captuer3.png")

    # Navigate to sites and take capture 2
    # Target URLs (defined here but not visited yet)
    urls = ['http://www.chiseki.go.jp/', 'https://www.yahoo.co.jp']

    # Close the browser and stop the msedgedriver process
    browser.quit()


def init_browser():

    browser_service = Service(executable_path=browser_conf["browser_driver"])

    browser_opts = Options()
    for tmp_opt in browser_conf["browser_options"]:
        browser_opts.add_argument(tmp_opt)

    browser = webdriver.Edge(service=browser_service, options=browser_opts)

    # Wait up to implicitly_wait seconds (10 here) for an element to be found
    browser.implicitly_wait(browser_conf["implicitly_wait"])

    return browser

if __name__ == '__main__':
    main()
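
main() defines `urls` for a second round of captures but does not visit them. One way that loop could look is sketched below. `capture_urls` and the `site_N.png` file naming are hypothetical, and the function relies only on the `get` and `save_screenshot` methods already used in edgeOperation.py, so any object providing those two methods can be passed in.

```python
import os


def capture_urls(browser, urls, out_dir="capture"):
    """Visit each URL and save one screenshot per URL; return the saved paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        browser.get(url)
        path = os.path.join(out_dir, f"site_{index}.png")
        # save_screenshot returns False when the file could not be written
        if browser.save_screenshot(path):
            saved.append(path)
    return saved
```

In main(), it could be called as `capture_urls(browser, urls)` right after `urls` is defined.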