Scraping past weather data from the Japan Meteorological Agency site with Python × BeautifulSoup

Posted at 2025-01-19 (last updated at 2025-01-19)

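1. Introduction

When picking travel dates, I wanted to pull the typical precipitation and temperature for a given date across past years in one go, so I wrote this script.
It should come in handy for choosing future travel dates.

For the scraping itself, I fetched the HTML with requests and parsed it with BeautifulSoup.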

GitHub

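2. Site used

3. Location surveyed

Ome, Tokyo

How to get the location numbers

・Select a location in the left-hand pane of the site above, and its numbers appear in the URL
・This article uses the values shown when Ome City was selected:
"prec_no=44, block_no=1001"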
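As a small illustration of the step above, the two numbers can also be read out of a copied URL programmatically. This is a minimal sketch using only the standard library; the helper name `extract_location` is hypothetical, and the sample URL is the one used in the tests later in this article.

```python
from urllib.parse import urlparse, parse_qs


def extract_location(url):
    """Return (prec_no, block_no) as integers from a copied URL (hypothetical helper)."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; take the first of each
    return int(query['prec_no'][0]), int(query['block_no'][0])


sample_url = ('https://www.data.jma.go.jp/stats/etrn/view/daily_a1.php'
              '?prec_no=44&block_no=1001&year=2024&month=3&day=15&view=')
prec_no, block_no = extract_location(sample_url)
print(prec_no, block_no)
```

A URL without these parameters would raise a KeyError here, which is probably the desired behavior for a copy-paste mistake.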

4. Testing what information the site returns [1][2]

Reference sites

Setup

PowerShell

pip install requests

Test 1

WeatherHistorySearch_html.py

import requests

url = 'https://www.data.jma.go.jp/stats/etrn/view/daily_a1.php?prec_no=44&block_no=1001&year=2024&month=3&day=15&view='
response = requests.get(url)
# Use the encoding detected from the response body so the Japanese text decodes correctly
response.encoding = response.apparent_encoding

# Save the response as an HTML file
with open('WeatherHistorySearch.html', 'w', encoding='utf-8') as file:
    file.write(response.text)

Output: WeatherHistorySearch.html
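The URL in Test 1 carries the location and the date as query parameters, so it can be assembled from those values instead of being typed out by hand. The following is a minimal sketch of that idea; `BASE_URL` and `build_daily_url` are names introduced here for illustration, and the empty `view` parameter is simply copied from the URL above.

```python
from urllib.parse import urlencode

BASE_URL = 'https://www.data.jma.go.jp/stats/etrn/view/daily_a1.php'


def build_daily_url(prec_no, block_no, year, month, day):
    """Assemble the daily-data URL from its query parameters (hypothetical helper)."""
    params = {
        'prec_no': prec_no,
        'block_no': block_no,
        'year': year,
        'month': month,
        'day': day,
        'view': '',  # left empty, as in the URL used in Test 1
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_daily_url(44, 1001, 2024, 3, 15))
```

With the values for Ome on 2024-03-15, this reproduces the URL string used in Test 1.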

5. Extracting specific values from the HTML [3]

Reference sites

Setup

pip install beautifulsoup4

Get the values for a specific day (the 15th)

Test 2

from bs4 import BeautifulSoup

# Load the HTML file saved in Test 1
html_file = 'WeatherHistorySearch.html'
with open(html_file, 'r', encoding='utf-8') as file:
    html_data = file.read()

# Build a BeautifulSoup object from the saved HTML
soup = BeautifulSoup(html_data, 'html.parser')

# Day of the month to look up
target_day = '15'

# Find the link whose text is the target day, then take its enclosing table row.
# find() returns None when nothing matches, so guard before calling find_parent().
day_link = soup.find('a', string=target_day)
target_row = day_link.find_parent('tr') if day_link else None

if target_row:
    values = [td.text.strip() for td in target_row.find_all('td')]

    # Save to a text file
    with open('WeatherHistorySearchSoup.txt', 'w', encoding='utf-8') as file:
        file.write(str(values))

Output

['15', '0.0', '0.0', '0.0', '9.5', '16.9', '1.0', '///', '///', '1.8', '5.0', '東南東', '9.7', '東南東', '西北西', '11.3', '///', '///']
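The list is easier to work with once the columns of interest have names. The sketch below picks out the positions this article uses (index 1 for precipitation; 4, 5, and 6 for mean, maximum, and minimum temperature) and converts them to numbers. The helper names are hypothetical, and treating any non-numeric cell such as '///' as a missing value is an assumption, not something confirmed against the site's documentation.

```python
def to_float(cell):
    """Convert a cell to float, or return None when it is not numeric (e.g. '///')."""
    try:
        return float(cell)
    except ValueError:
        return None


def summarize_row(values):
    """Name the columns used in this article (hypothetical helper)."""
    return {
        'day': int(values[0]),
        'precipitation_mm': to_float(values[1]),
        'mean_temp_c': to_float(values[4]),
        'max_temp_c': to_float(values[5]),
        'min_temp_c': to_float(values[6]),
    }


# The row printed by Test 2
row = ['15', '0.0', '0.0', '0.0', '9.5', '16.9', '1.0', '///', '///',
       '1.8', '5.0', '東南東', '9.7', '東南東', '西北西', '11.3', '///', '///']
print(summarize_row(row))
print(to_float(row[7]))
```

Returning None rather than 0.0 for unavailable cells keeps a missing observation from being mistaken for a real zero.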

Test 3

WeatherHistorySearch_BeautifulSoup.py

from bs4 import BeautifulSoup

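# Load the HTML file saved in Test 1
html_file = 'WeatherHistorySearch.html'
with open(html_file, 'r', encoding='utf-8') as file:
    html_data = file.read()

# Build a BeautifulSoup object from the saved HTML
soup = BeautifulSoup(html_data, 'html.parser')

# Day of the month to look up
target_day = '15'

# Find the link whose text is the target day, then take its enclosing table row.
# find() returns None when nothing matches, so guard before calling find_parent().
day_link = soup.find('a', string=target_day)
target_row = day_link.find_parent('tr') if day_link else None

if target_row:
    values = [td.text.strip() for td in target_row.find_all('td')]

    # Columns used: [0] day, [1] precipitation (mm), [4] mean, [5] max, [6] min temperature (℃)
    result = f"{values[0]}日, {values[1]}mm, {values[4]}℃, {values[5]}℃, {values[6]}℃"

    print(result)

    # Save to a text file
    with open('WeatherHistorySearchSoup.txt', 'w', encoding='utf-8') as file:
        file.write(str(result))

Output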

WeatherHistorySearchSoup.txt

15日, 0.0mm, 9.5℃, 16.9℃, 1.0℃

6. Putting it together

Get the date, precipitation, and temperatures directly from the specified URL

Test 4

WeatherHistorySearch.py

import requests
from bs4 import BeautifulSoup

url = 'https://www.data.jma.go.jp/stats/etrn/view/daily_a1.php?prec_no=44&block_no=1001&year=2024&month=3&day=15&view='
response = requests.get(url)
response.encoding = response.apparent_encoding 

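# Build a BeautifulSoup object from the response HTML
soup = BeautifulSoup(response.text, 'html.parser')

# Day of the month to look up
target_day = '15'

# Find the link whose text is the target day, then take its enclosing table row.
# find() returns None when nothing matches, so guard before calling find_parent().
day_link = soup.find('a', string=target_day)
target_row = day_link.find_parent('tr') if day_link else None

if target_row:
    values = [td.text.strip() for td in target_row.find_all('td')]

    # Columns used: [0] day, [1] precipitation (mm), [4] mean, [5] max, [6] min temperature (℃)
    result = f"{values[0]}日, {values[1]}mm, {values[4]}℃, {values[5]}℃, {values[6]}℃"

    print(result)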
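Separating the parsing step from the network request makes it possible to check the row lookup on a small inline sample without hitting the site. The following is a minimal sketch under that idea; `extract_day_row` and the sample HTML are made up for illustration, and the sample is far simpler than the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical, heavily simplified stand-in for the real table
SAMPLE_HTML = """
<table>
  <tr><td><a href="#">14</a></td><td>2.5</td><td>8.0</td></tr>
  <tr><td><a href="#">15</a></td><td>0.0</td><td>9.5</td></tr>
</table>
"""


def extract_day_row(html, day):
    """Return the cell texts of the row whose link text equals `day`, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    day_link = soup.find('a', string=str(day))
    if day_link is None:
        return None  # no link with that day on the page
    row = day_link.find_parent('tr')
    if row is None:
        return None  # the link is not inside a table row
    return [td.text.strip() for td in row.find_all('td')]


print(extract_day_row(SAMPLE_HTML, 15))
print(extract_day_row(SAMPLE_HTML, 31))
```

The same function could be given `response.text` from the real request, so the fetch and the lookup can be exercised independently.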

7. Fetching multiple dates at once

Added handling to fetch several dates and collect the results in a single text file

WeatherHistorySearch.py

import requests
from bs4 import BeautifulSoup

# Dates to fetch
dates = [
    {'year': 2024, 'month': 3, 'day': 15},
    {'year': 2023, 'month': 3, 'day': 15},
]

# Location
prec_no = 44
block_no = 1001

# First line is the CSV header: date, precipitation, mean, max, and min temperature
results = ["日付,降水量,平均気温,最高気温,最低気温\n"]

for date in dates:
    # Build the URL for this date
    url = f"https://www.data.jma.go.jp/stats/etrn/view/daily_a1.php?prec_no={prec_no}&block_no={block_no}&year={date['year']}&month={date['month']}&day={date['day']}&view="

    response = requests.get(url)
    response.encoding = response.apparent_encoding

    # Build a BeautifulSoup object
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find the link whose text is the day, then take its enclosing table row
    day_link = soup.find('a', string=str(date['day']))
    target_row = day_link.find_parent('tr') if day_link else None

    # Dates whose row cannot be found are skipped
    if target_row:
        values = [td.text.strip() for td in target_row.find_all('td')]

        result = f"{date['year']}/{date['month']}/{date['day']},{values[1]},{values[4]},{values[5]},{values[6]}\n"
        results.append(result)

# Save to a text file
with open('WeatherHistorySearch.txt', 'w', encoding='utf-8') as file:
    file.writelines(results)

# Print the results
print("".join(results))

Output

WeatherHistorySearch.txt

日付,降水量,平均気温,最高気温,最低気温
2024/3/15,0.0,9.5,16.9,1.0
2023/3/15,0.0,11.1,18.1,5.3
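Because the output is comma-separated with a header row, it can be read back with the standard csv module and summarized across years. The sketch below averages each numeric column over the two rows shown above; the text is inlined here for illustration, the function name `column_means` is hypothetical, and the code assumes every cell after the date is numeric.

```python
import csv
import io

# Inlined copy of the output above; in practice this would be read from the file
CSV_TEXT = """日付,降水量,平均気温,最高気温,最低気温
2024/3/15,0.0,9.5,16.9,1.0
2023/3/15,0.0,11.1,18.1,5.3
"""


def column_means(csv_text):
    """Average every column except the first (date) column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    means = {}
    for name in reader.fieldnames[1:]:
        numbers = [float(row[name]) for row in rows]
        means[name] = sum(numbers) / len(numbers)
    return means


for name, mean in column_means(CSV_TEXT).items():
    print(f"{name}: {mean:.2f}")
```

Adding more years to the `dates` list would make such averages more representative of a typical year.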

8. Summary

Running the program above now gives me past precipitation and temperature data with little effort. I plan to use it when planning future trips.

9. References

[1]

[2]

[3]
