
How to scrape the Yahoo news list with Python

Posted at 2022-02-02, last updated at 2022-02-02

#1. Environment
Python 3
macOS
The requests and bs4 modules must already be installed.

#2. If the modules are not installed
On a Mac, running the following commands in the terminal is enough:

pip install requests
pip install bs4
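
To confirm that the installation worked before running the scraper, a small check like the following may help. This is only a sketch: the helper name `missing_modules` and the `REQUIRED` list are made up for illustration, and it relies on the standard library's `importlib.util.find_spec` to look each module up without importing it.

```python
import importlib.util

# Import names to look for; "bs4" is the import name of the BeautifulSoup package.
REQUIRED = ["requests", "bs4"]


def missing_modules(names):
    """Return the names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

If anything is reported as not installed, rerun the matching `pip install` command above.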

#3. What to scrape

The area outlined in red.

Once the modules are installed, run the following code.

sample.py

import requests
from bs4 import BeautifulSoup

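# URL of the news site
URL = "https://news.yahoo.co.jp/"
# Fetch the HTML
res = requests.get(URL)
# Stop here if the server returned an error status
res.raise_for_status()
# Parse the HTML so it can be handled with BeautifulSoup
soup = BeautifulSoup(res.text, 'html.parser')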

# Find every element with the class used for the news entries and loop over the matches.
# The class name was valid when this article was written; it may change when the site is updated.
news_items = soup.find_all(class_="sc-hmzhuo dxywt")
for num, item in enumerate(news_items, start=1):

    # Get the news title
    news_title = item.get_text()
    # Get the news URL (None if the element contains no link)
    link = item.find('a')
    news_href = link.get("href") if link else None

    # Print the collected data
    print(f"No. {num}")
    print("--------------------------")
    print(news_title)
    print(news_href)


# Release the HTTP connection and clear the parsed tree
res.close()
soup.clear()

The output looks like this.
Example:
No. 1
--------------------------
News title
News article URL
No. 2
--------------------------
News title
News article URL
.
.
.
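
The extraction logic can also be tried without touching the live site. The sketch below runs the same `find_all` / `get_text` / `get("href")` steps against a small inline HTML string. Everything in it is an assumption for illustration: the sample HTML, the class name `news-item`, the `example.com` URLs, and the function name `extract_news` are made up and do not reflect the real Yahoo page.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page; the class name "news-item"
# and the example.com URLs are invented for this sketch.
SAMPLE_HTML = """
<ul>
  <li class="news-item"><a href="https://example.com/articles/1">First headline</a></li>
  <li class="news-item"><a href="https://example.com/articles/2">Second headline</a></li>
  <li class="news-item">Headline without a link</li>
</ul>
"""


def extract_news(html, class_name):
    """Return a list of (title, href) tuples for elements with the given class."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for item in soup.find_all(class_=class_name):
        # Text of the element with surrounding whitespace removed
        title = item.get_text(strip=True)
        # First <a> inside the element, or None when there is no link
        link = item.find("a")
        href = link.get("href") if link is not None else None
        results.append((title, href))
    return results


if __name__ == "__main__":
    news = extract_news(SAMPLE_HTML, "news-item")
    for num, (title, href) in enumerate(news, start=1):
        print(f"No. {num}")
        print("--------------------------")
        print(title)
        print(href)
```

Keeping the parsing in a function that takes an HTML string makes it easy to check the selector logic on saved or hand-written HTML before pointing it at a real response such as `res.text`.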
