Automatically fetching stock price data with Selenium

Posted at 2021-01-26, last updated at 2021-02-03

Introduction

The data source is the time-series data download, a member benefit of Yahoo! Finance VIP Club (Yahooファイナンス VIP倶楽部, 1,980 yen per month).

Install Selenium

$ pip install selenium

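Download the WebDriver

Download the latest version of ChromeDriver.
Update Chrome to the latest version as well, so that the browser and driver versions match.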
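A ChromeDriver build targets a specific Chrome major version, which is why both are updated together. The sketch below shows one way to compare the two major versions. The helper names `major_version` and `versions_match` are hypothetical, and the version strings in the example are only illustrative samples of what `chromedriver --version` and Chrome's own version output tend to look like.

```python
import re


def major_version(version_output: str) -> int:
    """Return the major version from text such as 'ChromeDriver 88.0.4324.96 (...)'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError("no version number found in: {!r}".format(version_output))
    return int(match.group(1))


def versions_match(chrome_output: str, driver_output: str) -> bool:
    """True when Chrome and ChromeDriver report the same major version."""
    return major_version(chrome_output) == major_version(driver_output)


if __name__ == "__main__":
    # Illustrative strings, not captured from a real machine
    print(versions_match("Google Chrome 88.0.4324.150", "ChromeDriver 88.0.4324.96"))
```

In practice the two strings could be collected by running each binary with `--version`, for example through `subprocess.run`.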

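Switch Yahoo login to password authentication

Under "Login and Security" in your profile, change the login method to password.
Note: password login is not the recommended method, so take care.

Directory structure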

yahoo-finance-download/  
|-- download  
|   |-- 0000.T.csv
|   `-- 9999.T.csv
|-- chromedriver
|-- yahoo_download.csv  
|-- yahoo_download.py  
`-- requirements.txt

Yahoo login

Settings

Configure the driver, then use get_download_list to load the stocks to fetch.
List the stocks you want to fetch in yahoo_download.csv.

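import os
import time
from pathlib import Path

import pandas as pd
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait

# login() uses LOGIN and PASSWORD, which the original article does not define.
# Reading them from environment variables is an assumption; the variable names are hypothetical.
LOGIN = os.environ.get('YAHOO_LOGIN', '')
PASSWORD = os.environ.get('YAHOO_PASSWORD', '')


class YahooDownload:

    def __init__(self):
        dldir_name = 'download'  # name of the destination folder
        dldir_path = Path(dldir_name)
        dldir_path.mkdir(exist_ok=True)
        self.download_dir = str(dldir_path.resolve())

        # Make Chrome save downloads into the folder above
        options = webdriver.ChromeOptions()
        options.add_experimental_option("prefs", {
            "download.default_directory": self.download_dir,
        })

        # Selenium 4 style: the driver path goes through Service. The executable_path
        # argument used with Selenium 3 was deprecated in Selenium 4 and later removed.
        self.driver = webdriver.Chrome(service=Service('./chromedriver'), options=options)
        self.driver.implicitly_wait(10)  # implicit wait for element lookups (seconds)
        self.driver.set_page_load_timeout(10)  # page load timeout (seconds)

        # download() iterates over self.download_list, so load it here
        self.download_list = self.get_download_list()

    def get_download_list(self):
        # Read the stock codes to fetch from yahoo_download.csv in the current directory
        download_list = pd.read_csv(os.getcwd() + "/yahoo_download.csv", thousands=',', engine='python', encoding="UTF-8")
        return download_list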

yahoo_download.csv

stock_no
1407
2053
...
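To make the link between this file and the download step concrete, here is a minimal sketch that turns the `stock_no` column into the file names and URLs the script works with. It uses the standard `csv` module instead of pandas so that it runs on its own; the helper names `read_stock_numbers` and `download_targets` are hypothetical, while the URL pattern and the `{stock_no}.T.csv` naming come from the article's code.

```python
import csv
import io

# URL pattern used by the download step in this article
BASE_URL = "https://download.finance.yahoo.co.jp/common/history/{}.T.csv"


def read_stock_numbers(csv_text):
    """Return the non-empty values of the stock_no column, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["stock_no"].strip() for row in reader if row["stock_no"].strip()]


def download_targets(stock_numbers):
    """Pair each stock code with its expected file name and download URL."""
    return [("{}.T.csv".format(no), BASE_URL.format(no)) for no in stock_numbers]


if __name__ == "__main__":
    sample = "stock_no\n1407\n2053\n"
    for file_name, url in download_targets(read_stock_numbers(sample)):
        print(file_name, url)
```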

Login process

...
    def login(self):
        # Log in to Yahoo! JAPAN
        self.driver.get('https://login.yahoo.co.jp/config/login')
        time.sleep(1)

        # find_element_by_name was removed in Selenium 4.3; find_element('name', ...)
        # is the equivalent call ('name' is the value of By.NAME)
        self.driver.find_element('name', 'login').send_keys(LOGIN)
        self.driver.find_element('name', 'btnNext').click()
        time.sleep(1)

        self.driver.find_element('name', 'passwd').send_keys(PASSWORD)
        self.driver.find_element('name', 'btnSubmit').click()
        time.sleep(3)

CSV download

...
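    def download(self):
        for _, v in self.download_list.iterrows():

            stock_no = v["stock_no"]

            # Skip stocks whose CSV has already been downloaded
            if os.path.isfile(self.download_dir + "/{}.T.csv".format(stock_no)):
                continue

            # Request the time-series CSV for this stock; Chrome saves it to download_dir
            self.driver.get('https://download.finance.yahoo.co.jp/common/history/{}.T.csv'.format(stock_no))

            # Give the download time to finish
            time.sleep(3)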
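A fixed three-second sleep may be too short on a slow connection and longer than needed on a fast one. One alternative, sketched below under assumptions, is to poll the download folder until the expected file exists. The helper name `wait_for_download` is hypothetical, and the check relies on Chrome typically keeping in-progress downloads under a `.crdownload` name until they finish.

```python
import time
from pathlib import Path


def wait_for_download(download_dir, file_name, timeout=30.0, poll_interval=0.5):
    """Return True once file_name exists and no partial download remains, else False."""
    folder = Path(download_dir)
    target = folder / file_name
    deadline = time.monotonic() + timeout

    while time.monotonic() < deadline:
        # Assumption: unfinished Chrome downloads carry a .crdownload suffix
        in_progress = any(folder.glob("*.crdownload"))
        if target.is_file() and not in_progress:
            return True
        time.sleep(poll_interval)

    return False
```

In the loop above, `time.sleep(3)` could then become `wait_for_download(self.download_dir, "{}.T.csv".format(stock_no))`, with the return value used to log stocks whose download did not complete in time.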

Running it all

...
    def main(self):
        try:
            self.login()
            self.download()
        except TimeoutException as ex:
            # A page did not finish loading within the page load timeout
            print(ex)
        except Exception as e:
            print(e)
        finally:
            time.sleep(5)
            # quit() closes every window and ends the driver session
            self.driver.quit()


if __name__ == "__main__":
    obj = YahooDownload()
    obj.main()

Running the script launches Chrome through Selenium and downloads the CSV files under download/.

$ python yahoo_download.py

Next time: loading the fetched data into BigQuery.
After that: data processing.
If you found this useful, please give it an LGTM.

[PR] We run an event called Weekend Hackathon! → https://weekend-hackathon.toyscreation.jp/about/
