
python beautifulsoup requests glob find_all

Last updated at 2020-03-25. Posted at 2020-03-25.

Sample code 1 (specifying a URL)

import requests
from bs4 import BeautifulSoup

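url = 'https://xxx'  # placeholder: replace with the page to fetch
r = requests.get(url)
r.raise_for_status()  # stop early on an HTTP error response

soup = BeautifulSoup(r.text, 'html.parser')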

# Print the text of each p tag
tag_p = soup.find_all('p')
for p in tag_p:
    print(p.text)

# --- Examples of the find_all method follow (the find method works the same way) ---
# Specify an attribute
ids = soup.find_all(id='sample')

# Specify an attribute (class)
clss = soup.find_all(class_='sample')

# Specify a tag name and an attribute
divs = soup.find_all('div', class_='sample')

# Multiple tags
tags = soup.find_all(['a', 'b', 'c'])
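
The calls above can be tried without a network request. The following is a minimal sketch, assuming a small inline HTML string (hypothetical, not from the original article) in place of a fetched page. It also shows the main difference between the two methods: find_all returns every match, while find returns only the first match, or None when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML, used so the example runs without a network request.
html = """
<div class="sample" id="first">one</div>
<div class="sample">two</div>
<p id="sample">three</p>
<a href="#">link</a>
"""

soup = BeautifulSoup(html, 'html.parser')

# find_all returns every match as a list-like ResultSet.
divs = soup.find_all('div', class_='sample')
div_texts = [d.text for d in divs]

# find takes the same arguments but returns only the first match.
first_div = soup.find('div', class_='sample')

# With no match, find_all gives an empty result and find gives None.
no_spans = soup.find_all('span')
no_span = soup.find('span')

# Attribute-only lookup and multi-tag lookup (results come back in document order).
by_id = soup.find_all(id='sample')
mixed = soup.find_all(['a', 'p'])
mixed_names = [t.name for t in mixed]

print(div_texts, first_div.text, len(no_spans), no_span, mixed_names)
```

Because find can return None, code that reads `.text` from its result should check for None first.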

Sample code 2 (specifying files)


from glob import glob
from bs4 import BeautifulSoup

# Target the .htm files in the current directory
files = glob('*.htm')

for file in files:
    # Use a with block so each file is closed after reading
    with open(file, 'r', encoding='utf-8') as f:
        ff = f.read()
    soup = BeautifulSoup(ff, 'html.parser')

    # Print the text of each p tag
    tag_p = soup.find_all('p')
    for p in tag_p:
        print(p.text)
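
To try the file loop without preparing .htm files by hand, the sketch below writes two hypothetical sample files to a temporary directory and collects the p-tag text from each. The helper name `collect_p_texts` and the file contents are assumptions made for illustration and are not part of the original article.

```python
import os
import tempfile
from glob import glob
from bs4 import BeautifulSoup


def collect_p_texts(directory):
    """Return {file name: [text of each p tag]} for every .htm file in directory."""
    results = {}
    # sorted() gives a deterministic order; glob itself does not guarantee one.
    for path in sorted(glob(os.path.join(directory, '*.htm'))):
        with open(path, 'r', encoding='utf-8') as f:
            soup = BeautifulSoup(f.read(), 'html.parser')
        results[os.path.basename(path)] = [p.text for p in soup.find_all('p')]
    return results


# Hypothetical sample files, written to a temporary directory that is removed afterwards.
samples = [
    ('a.htm', '<p>alpha</p><p>beta</p>'),
    ('b.htm', '<div>no paragraphs</div>'),
]
with tempfile.TemporaryDirectory() as tmp:
    for name, body in samples:
        with open(os.path.join(tmp, name), 'w', encoding='utf-8') as f:
            f.write(body)
    texts = collect_p_texts(tmp)

print(texts)
```

Returning a dictionary keyed by file name, rather than printing inside the loop, keeps track of which file each piece of text came from.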
