
A Practical Introduction to Web Scraping with Python

Posted at 2020-10-05

This is a Python 3 rewrite of the code from the article "Python Webスクレイピング実践入門" (A Practical Introduction to Web Scraping with Python).

Setup

Checking the Python version

$ python3 -V
Python 3.8.6

Fetching the Nikkei Stock Average

Installing packages

$ pip3 install beautifulsoup4

Scraping

from urllib.request import urlopen
from bs4 import BeautifulSoup

url = "http://www.nikkei.com/markets/kabu/"

html = urlopen(url).read()
soup = BeautifulSoup(html, "html.parser")

nikkei_heikin = soup.find("span", class_="mkc-stock_prices").string
print(nikkei_heikin)
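
The value printed above is text, not a number, and `.string` returns None when the tag has more than one child. If you want to do arithmetic on the result, something like the following minimal sketch may help. It assumes the text looks like "23,185.12" (comma as thousands separator); that format and the helper name `parse_price` are illustrative assumptions, not taken from the original article.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_price(text: Optional[str]) -> Optional[Decimal]:
    """Convert text such as "23,185.12" to a Decimal.

    Returns None when the input is None or is not a plain number.
    The comma-separated format is an assumption about the page.
    """
    if text is None:
        return None
    # Drop surrounding whitespace and thousands separators.
    cleaned = text.strip().replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


# Made-up sample value, used only to show the conversion.
print(parse_price("23,185.12"))
```

Decimal is used here rather than float so that the digits shown on the page are kept exactly; float would also work if exactness does not matter.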

Installing packages

$ pip3 install beautifulsoup4
$ pip3 install apscheduler
$ pip3 install requests

import csv
import datetime

import requests
from apscheduler.schedulers.blocking import BlockingScheduler
from bs4 import BeautifulSoup

sched = BlockingScheduler()

# Run every hour (interval trigger)
# @sched.scheduled_job('interval', hours=1)


# Run at minute 0 of every hour (cron trigger)
@sched.scheduled_job("cron", minute=0, hour="*/1")
def scheduled_job():
    # Fetch the HTML of the Nikkei Stock Average page on the Nikkei (日本経済新聞) site
    r = requests.get("http://www.nikkei.com/markets/kabu/")
    r.raise_for_status()

    # Extract the Nikkei Stock Average with BeautifulSoup
    soup = BeautifulSoup(r.text, "html.parser")
    nikkei_heikin = soup.select_one(
        "#CONTENTS_MARROW > div.mk-top_stock_average.cmn-clearfix > div.cmn-clearfix > div.mkc-guidepost > div.mkc-prices > span.mkc-stock_prices"
    ).get_text(strip=True)

    # Format the current time as a string
    now = datetime.datetime.now().strftime("%Y/%m/%d %H:%M:%S")

    print(f"{now} {nikkei_heikin}")

    # Append the timestamp and the Nikkei Stock Average to the CSV file
    # (newline="" is what the csv module recommends for files it writes)
    with open("nikkei_heikin.csv", "a", newline="") as fw:
        writer = csv.writer(fw, dialect="excel", lineterminator="\n")
        writer.writerow([now, nikkei_heikin])


sched.start()
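
Once the job has run a few times, nikkei_heikin.csv holds one row per run: a timestamp and the scraped value. A minimal sketch for reading those rows back, assuming the timestamp format used above. `read_rows` is a hypothetical helper, and the demo writes a made-up row to a temporary file so the real CSV is not touched.

```python
import csv
import datetime
import os
import tempfile


def read_rows(path):
    """Return a list of (datetime, price_text) tuples read from the CSV file."""
    rows = []
    with open(path, newline="", encoding="utf-8") as fr:
        for record in csv.reader(fr):
            if len(record) != 2:
                continue  # skip blank or malformed lines
            timestamp = datetime.datetime.strptime(record[0], "%Y/%m/%d %H:%M:%S")
            rows.append((timestamp, record[1]))
    return rows


# Demo: write one made-up row to a temporary file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", newline="", encoding="utf-8") as fw:
        csv.writer(fw, lineterminator="\n").writerow(["2020/10/05 09:00:00", "23,185.12"])
    demo_rows = read_rows(demo_path)

print(demo_rows)
```

The price is kept as text here because the scraper stores it exactly as it appears on the page; convert it to a number only at the point where you need arithmetic.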
