More than 5 years have passed since last update.

A first try at scraping with Headless Chrome on Windows 10

I wish web browsers would do more of the work for me…

Environment

Windows 10
Google Chrome 63.0.3239.132 (ChromeDriver 2.35)
Python 3.6.3 (via Anaconda)
selenium 3.8.0
beautifulsoup4 4.6.0

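Setup

Install Anaconda, then run pip freeze in the Anaconda Prompt to confirm that selenium is installed.
Download the Chrome driver (ChromeDriver).
Unzip it and place it in the same folder as the script. (There is surely a better place to put it, but I don't know how, so I left it as is.)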
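One possible way to make the driver location less fragile is to look for it explicitly: first next to the script, then on PATH. The following is a minimal sketch using only the standard library. The function name `find_chromedriver` and the default file name `chromedriver.exe` are assumptions made for illustration and are not part of the original setup.

```python
import shutil
from pathlib import Path


def find_chromedriver(script_dir, name="chromedriver.exe"):
    """Return a path to the driver as a string, or None if it is not found.

    Hypothetical helper: checks the script's folder first, then PATH.
    """
    local = Path(script_dir) / name
    if local.is_file():
        return str(local)
    # shutil.which searches the directories listed in the PATH variable
    return shutil.which(name)
```

In a script, `script_dir` could be `Path(__file__).parent`. In the selenium 3.x line used here, the returned path could then be given to `webdriver.Chrome` through its `executable_path` argument; treat that as an assumption about that version line, since the article itself relies on the driver sitting next to the script.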

Code

There is a Stack Overflow question that covers exactly this, so for now I copied its code as-is and added BeautifulSoup.
https://stackoverflow.com/questions/45364102/how-do-i-use-headless-chrome-in-chrome-60-on-windows-10

test.py
from selenium import webdriver
from bs4 import BeautifulSoup

# Run Chrome without a visible window
options = webdriver.ChromeOptions()
options.add_argument("headless")
driver = webdriver.Chrome(chrome_options=options)

# Load the page, parse the rendered HTML, then close the browser
driver.get('https://news.yahoo.co.jp/')
soup = BeautifulSoup(driver.page_source, "lxml")
driver.quit()

# Print the text of every <p class="ttl"> element (the news titles)
hoges = soup.find_all('p', class_='ttl')
for hoge in hoges:
    print(hoge.text)
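
The parsing step in the script above can be tried on its own, without launching Chrome, by feeding BeautifulSoup a fixed HTML string. This is only a sketch: the function name `extract_titles` and the sample markup are made up for illustration, and it uses the built-in `html.parser` instead of `lxml` so that it needs no extra parser package. Only the `p` tag and the `ttl` class come from the script.

```python
from bs4 import BeautifulSoup


def extract_titles(html):
    """Return the text of every <p> element whose class list contains 'ttl'."""
    soup = BeautifulSoup(html, "html.parser")
    return [p.get_text(strip=True) for p in soup.find_all("p", class_="ttl")]


# Hypothetical markup, not taken from the real page
sample_html = (
    '<div>'
    '<p class="ttl">First headline</p>'
    '<p class="other">Not a title</p>'
    '<p class="ttl big"> Second headline </p>'
    '</div>'
)

for title in extract_titles(sample_html):
    print(title)
```

Keeping the extraction in a function like this makes it possible to check the selector against saved HTML before pointing it at `driver.page_source`.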

Result

For now I was able to get the news titles, so I'd like to build on this from here.
