
Notes from trying out Beautiful Soup

Posted at 2020-07-18, last updated at 2020-09-11

Trying it out

Install it with pip install beautifulsoup4.
I wasn't sure what a "parser" really was, so I figured the default would do: instead of lxml or the like, I used html.parser, which ships with Python.

import requests
from bs4 import BeautifulSoup

url = input()  # URL of the page to scrape
response = requests.get(url)  # download the page
soup = BeautifulSoup(response.content, "html.parser")  # parse the response body

This should be enough for the basics.
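As a slightly more defensive variant of the setup above, here is a minimal sketch. The helper names `parse_html` and `fetch_soup` and the 10-second timeout are my own assumptions, not part of the original notes. Splitting parsing from downloading also makes it possible to try things out without network access.

```python
import requests
from bs4 import BeautifulSoup


def parse_html(markup):
    """Parse HTML (str or bytes) with Python's built-in html.parser."""
    return BeautifulSoup(markup, "html.parser")


def fetch_soup(url, timeout=10):
    """Download a page and parse it (hypothetical helper).

    Raises requests.HTTPError for 4xx/5xx responses instead of
    silently parsing an error page.
    """
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return parse_html(response.content)


# Offline check with a made-up document, so no network access is needed.
sample = parse_html(
    "<html><head><title>Sample page</title></head><body></body></html>"
)
print(sample.title.string)
```

For a real page, `fetch_soup(input())` would play the same role as the three lines above.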

・Search by id (returns only one element, the first match)
soup.find(id="id-name")
・Search by CSS selector (returns only one element, the first match)
soup.select_one("css-selector")
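The two single-element lookups can be tried on a small inline document. This is only a sketch: the HTML, the id `main`, and the class `intro` are made up for illustration. Note that both calls return `None` when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this example.
html = """
<div id="main">
  <p class="intro">first</p>
  <p class="intro">second</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# find(id=...) returns the first element with that id.
by_id = soup.find(id="main")

# select_one(...) returns the first element matching the CSS selector,
# even though two <p class="intro"> elements exist.
first_intro = soup.select_one("p.intro")

# Both methods return None when there is no match.
missing = soup.find(id="no-such-id")

print(by_id.name)
print(first_intro.get_text())
print(missing)
```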

To find every element that matches the search:
by id
soup.find_all(id="id-name")
by CSS selector
soup.select(".class-name")
Reference: [Beautiful Soup のfind_all( ) と select( ) の使い方の違い](https://gammasoft.jp/blog/difference-find-and-select-in-beautiful-soup-of-python/) (on the differences between find_all() and select() in Beautiful Soup)
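To compare the two multi-element searches side by side, here is a small sketch on made-up markup (the list items and the class names `item` and `other` are assumptions for illustration). Both return list-like results, and an empty result when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this example.
html = """
<ul>
  <li class="item">apple</li>
  <li class="item">banana</li>
  <li class="other">cherry</li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all filters by tag name and attributes; class_ has a trailing
# underscore because "class" is a reserved word in Python.
items_by_find = soup.find_all("li", class_="item")

# select takes a CSS selector string.
items_by_select = soup.select("li.item")

texts_find = [li.get_text() for li in items_by_find]
texts_select = [li.get_text() for li in items_by_select]
print(texts_find)
print(texts_select)

# No match gives an empty result rather than None.
print(len(soup.select(".no-such-class")))
```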

To use select on something like <h3 class="A B"> (an element with multiple class attributes), chain the classes with no space between them: select_one(".A.B").
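A short sketch of the multi-class case, again on made-up markup. It also shows why the missing space matters: `.A.B` means "one element with both classes", while `.A .B` (with a space) means "a `.B` element inside a `.A` element".

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this example.
html = """
<h3 class="A B">both</h3>
<h3 class="A">only A</h3>
<h3 class="B">only B</h3>
"""
soup = BeautifulSoup(html, "html.parser")

# Chained classes: the element must carry both A and B.
both = soup.select_one(".A.B")
print(both.get_text())
print(both["class"])  # class is a multi-valued attribute, so this is a list

# A single class matches every element that has it, including "A B".
has_a = [h.get_text() for h in soup.select("h3.A")]
print(has_a)

# With a space this becomes a descendant selector; no .B element is
# nested inside a .A element here, so nothing matches.
nested = soup.select(".A .B")
print(len(nested))
```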