
More than 5 years have passed since last update.

Selenium and google-chrome sample (python3)


I switched the driver in the following program from Firefox to google-chrome.
Getting the number of open jobs on Crowdworks (Beautifulsoup4)

development_chrome.py
#! /usr/bin/python
# -*- coding: utf-8 -*-
#
#   development_chrome.py
#
#                   Sep/10/2018
#
# ------------------------------------------------------------------
import sys
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
#
from bs4 import BeautifulSoup
#
# ------------------------------------------------------------------
def file_write_proc(file_name,str_out):
    # The with-block closes the file even if write() raises.
    with open(file_name,mode='w',encoding='utf-8') as fp_out:
        fp_out.write(str_out)
#
# ------------------------------------------------------------------
def page_ready_wait_proc(driver):
    ttx = 100
    WebDriverWait(driver, ttx).until(
        EC.presence_of_element_located((By.CLASS_NAME,'result_count'))
        )
    sys.stderr.write("*** page_ready_wait_proc *** end ***\n")
# ------------------------------------------------------------------
sys.stderr.write("*** start ***\n")
url_target = "https://crowdworks.jp/public/jobs/group/development/u/professionals?order=new"
file_html = "tmp001.html"
#
#
options = Options()
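# --headless is the documented form of Chrome's headless switch.
options.add_argument('--headless')
# Selenium 3 style call; Selenium 4 passes the driver path through a Service object instead.
driver = Chrome(executable_path='/opt/chromedriver/chromedriver', options=options)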
ttx = 100
wait = WebDriverWait(driver, timeout=ttx)
driver.get(url_target)
#
idx="filter_hide_expired"
box_check = driver.find_element(By.ID, idx)
box_check.click()
page_ready_wait_proc(driver)
#
idx="filter_hide_budget_pending"
box_check = driver.find_element(By.ID, idx)
box_check.click()
page_ready_wait_proc(driver)
#
page_ready_wait_proc(driver)
html = driver.page_source
#
driver.quit()
#
file_write_proc(file_html,html)
#
soup = BeautifulSoup(html, "html.parser")
ccx=soup.find(class_="result_count")
ccy=ccx.find("span")
count=ccy.get_text()
sys.stderr.write("count = " + count + "\n")
#
sys.stderr.write("*** end ***\n")
# ------------------------------------------------------------------
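
The script above prints the count as text. If a number is needed for further processing, the conversion could look roughly like the sketch below. The helper name `count_text_to_int` is hypothetical, and the idea that the site's text may contain thousands separators or surrounding whitespace is an assumption, not something confirmed here.

```python
import re


def count_text_to_int(text):
    """Convert scraped count text such as ' 1,234 ' to an int.

    Returns None when the text contains no digits at all.
    (Hypothetical helper; the exact text format of the site is an assumption.)
    """
    # Drop commas, spaces, and any other non-digit characters.
    digits = re.sub(r"\D", "", text)
    return int(digits) if digits else None


# Example use with the variable from the script above:
#   count_number = count_text_to_int(count)
```

Returning None instead of raising keeps the caller in control when the page layout changes and the element no longer holds a number.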

How to get the driver

wget https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip
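
The downloaded file is a zip archive, while the script expects an executable at /opt/chromedriver/chromedriver. One possible way to unpack it from Python is sketched below. The function name `install_chromedriver` is hypothetical, and the sketch assumes the archive contains a single member named `chromedriver`.

```python
import os
import stat
import zipfile


def install_chromedriver(zip_path, dest_dir):
    """Extract 'chromedriver' from zip_path into dest_dir and make it executable.

    Assumes the archive holds a member named 'chromedriver' (not verified here).
    Returns the path of the extracted file.
    """
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extract("chromedriver", path=dest_dir)
    driver_path = os.path.join(dest_dir, "chromedriver")
    # zipfile does not restore Unix permission bits, so add the executable bits explicitly.
    mode = os.stat(driver_path).st_mode
    os.chmod(driver_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return driver_path


# Example (writing under /opt usually requires root privileges):
#   install_chromedriver("chromedriver_linux64.zip", "/opt/chromedriver")
```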

I verified that it works in the following environment.

$ uname -a
Linux ***** 4.18.6-arch1-1-ARCH #1 SMP PREEMPT Wed Sep 5 11:54:09 UTC 2018 x86_64 GNU/Linux
$ python --version
Python 3.7.0
$ which google-chrome-stable
/usr/bin/google-chrome-stable
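
A similar check can be done from Python before starting Selenium. The sketch below is only an illustration: the function name `check_environment` is hypothetical, and it tests just two things, whether the browser command is on PATH and whether the driver file exists and is executable.

```python
import os
import shutil


def check_environment(browser_cmd, driver_path):
    """Return a list of human-readable problems; an empty list means both checks passed."""
    problems = []
    # shutil.which returns None when the command is not found on PATH.
    if shutil.which(browser_cmd) is None:
        problems.append("browser command not found on PATH: " + browser_cmd)
    if not os.path.isfile(driver_path):
        problems.append("driver not found: " + driver_path)
    elif not os.access(driver_path, os.X_OK):
        problems.append("driver is not executable: " + driver_path)
    return problems


# Example with the names used in this article:
#   check_environment("google-chrome-stable", "/opt/chromedriver/chromedriver")
```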