Get Selenium Running in 15 Minutes

Last updated at 2020-03-02, posted at 2020-03-02

I found the article "10分で理解する Selenium" (Understanding Selenium in 10 Minutes) so useful that I took the liberty of writing a small sequel to it.

Goal of this article

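If you follow Method 1 in "Understanding Selenium in 10 Minutes", you connect to a Selenium Docker container that you created and started beforehand, so you can experiment with Selenium without struggling with environment setup.
However, as it stands, the Docker container keeps running after the Selenium script has finished, and you have to stop it yourself.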

This article automates everything up to stopping and removing the Docker container once the script has run. So **calling this a Selenium article is a lie.** The title is misleading. Sorry.

Using Selenium as the example, this article shows a convenient way to operate Docker containers with Docker SDK for Python.

TL;DR

First, install Docker SDK for Python.

pip install docker

import docker
import time
from selenium import webdriver
from contextlib import contextmanager

SELENIUM_IMAGE = 'selenium/standalone-chrome:3.141.59-xenon'


@contextmanager
def selenium_container():
    client = docker.from_env()
    try:
        selenium_image = client.images.get(SELENIUM_IMAGE)
    except docker.errors.ImageNotFound:
        selenium_image = client.images.pull(SELENIUM_IMAGE)

    container = client.containers.run(
        selenium_image, detach=True, remove=True,
        ports={4444: 4444}, shm_size='2g')
    try:
        while b'Selenium Server is up' not in container.logs():
            time.sleep(0.1)
        yield container
    finally:
        container.kill()


@contextmanager
def selenium_driver():
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    driver = webdriver.Remote(
        command_executor='http://localhost:4444/wd/hub',
        desired_capabilities=options.to_capabilities(),
        options=options,
    )
    try:
        yield driver
    finally:
        driver.quit()


def test():
    with selenium_container(), selenium_driver() as driver:
        driver.get('https://qiita.com')
        print(driver.current_url)


if __name__ == '__main__':
    test()

Explanation

contextmanager

@contextmanager
def selenium_container():

To make sure the container is shut down properly even when an error occurs, I used contextlib.contextmanager. The same applies to the webdriver.
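As a minimal sketch of why this pattern helps, the following uses a hypothetical `FakeResource` class in place of a real container or WebDriver (the class and the `managed` and `run_and_fail` helpers are illustrative names, not part of any library). It shows that the `finally` clause runs even when the body of the `with` block raises.

```python
from contextlib import contextmanager


class FakeResource:
    """Hypothetical stand-in for a container or WebDriver (not a real API)."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


@contextmanager
def managed(resource):
    try:
        yield resource
    finally:
        # Runs on normal exit and also when the with block raises.
        resource.close()


def run_and_fail(resource):
    """Raise inside the with block and report whether the error propagated."""
    try:
        with managed(resource):
            raise RuntimeError("simulated failure")
    except RuntimeError:
        return True
    return False
```

The exception still propagates to the caller; the context manager only guarantees that the cleanup step has run by then.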

Getting the Docker image

    client = docker.from_env()
    try:
        selenium_image = client.images.get(SELENIUM_IMAGE)
    except docker.errors.ImageNotFound:
        selenium_image = client.images.pull(SELENIUM_IMAGE)

First, get the Selenium Docker image. If it has not been pulled yet, pull it first.

docker run

    container = client.containers.run(
        selenium_image, detach=True, remove=True,
        ports={4444: 4444}, shm_size='2g')

This corresponds to docker run -d --rm -p 4444:4444 --shm-size=2g selenium/standalone-chrome:3.141.59-xenon (remove=True maps to --rm).

docker kill

    try:
        while b'Selenium Server is up' not in container.logs():
            time.sleep(0.1)
        yield container
    finally:
        container.kill()

Because detach=True is passed, client.containers.run() moves on to the next statement as soon as the Docker container has been started. At this point the WebDriver is therefore not ready yet, so we first have to wait until it is.
Once it is ready, the Docker container log contains a line such as 06:53:00.628 INFO [SeleniumServer.boot] - Selenium Server is up and running on port 4444, so we spin in a while loop until that output appears¹.
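The loop in the listing waits indefinitely if the server never starts. As a hedged sketch, the same polling idea can be given a timeout. The helper name `wait_for_log` and its `timeout` and `interval` parameters are assumptions made for illustration; it accepts any zero-argument callable, such as `container.logs`, and tolerates both `bytes` and `str` results.

```python
import time


def wait_for_log(get_logs, marker, timeout=30.0, interval=0.1):
    """Poll get_logs() until marker appears; raise TimeoutError otherwise.

    get_logs is any zero-argument callable returning bytes or str.
    All names here are illustrative, not part of Docker SDK for Python.
    """
    if isinstance(marker, str):
        marker = marker.encode()
    deadline = time.monotonic() + timeout
    while True:
        logs = get_logs()
        if isinstance(logs, str):
            # Normalize so the comparison works for either return type.
            logs = logs.encode()
        if marker in logs:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{marker!r} not found within {timeout} s")
        time.sleep(interval)
```

Under these assumptions, the while loop in selenium_container could be written as `wait_for_log(container.logs, b'Selenium Server is up')`, so a container that never becomes ready raises an error instead of hanging.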

Once it is ready, the container is yielded. The yielded container can be referenced as in with selenium_container() as container:.
Execution of selenium_container pauses at the point where it yields the container and waits until the with block is exited.
When the with block is exited, either because the work finished or because an error occurred inside it, selenium_container resumes and runs container.kill() in the finally block.
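The order of events can be traced with a small sketch. The `step` context manager and the `events` list below are hypothetical names used only for illustration; the two steps stand in for selenium_container and selenium_driver combined in a single with statement, as in test().

```python
from contextlib import contextmanager

events = []


@contextmanager
def step(name):
    events.append(f"enter {name}")
    try:
        # Execution pauses here while the with block runs.
        yield name
    finally:
        events.append(f"exit {name}")


def demo():
    events.clear()
    with step("container"), step("driver") as driver:
        events.append(f"body uses {driver}")
    return list(events)


print(demo())
# ['enter container', 'enter driver', 'body uses driver',
#  'exit driver', 'exit container']
```

The context managers exit in reverse order, so the driver is quit before the container is killed.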

Summary

With this, you no longer have to worry about forgetting to shut down the Docker container or the WebDriver.
Enjoy your Selenium life!

¹ The Docker SDK for Python documentation says that logs() returns str, but at the time of writing it returned bytes for some reason, so be careful.
