Extracting Loto 6 winning numbers from the Mizuho Bank website

Posted at 2018-04-12

Summary

Using F# as a scripting language, extract the Loto 6 winning numbers from the Mizuho Bank website.

Environment

If mono and paket are not installed, install them with brew install.

If Firefox is not installed, install it with brew cask install.

$ sw_vers 
ProductName:	Mac OS X
ProductVersion:	10.13.4
BuildVersion:	17E199

$ mono --version
Mono JIT compiler version 5.4.1.6 (tarball Mon Dec 11 14:59:42 GMT 2017)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
	TLS:           normal
	SIGSEGV:       altstack
	Notification:  kqueue
	Architecture:  amd64
	Disabled:      none
	Misc:          softdebug 
	LLVM:          supported, not enabled.
	GC:            sgen (concurrent by default)

$ paket --version
Paket version 5.155.0

$ brew cask info firefox
firefox: 59.0.2
https://www.mozilla.org/firefox/
/usr/local/Caskroom/firefox/59.0.2 (64B)
From: https://github.com/caskroom/homebrew-cask/blob/master/Casks/firefox.rb
==> Name
Mozilla Firefox
==> Languages
cs, de, en-GB, en, es-AR, es-CL, es-ES, fi, fr, gl, in, it, ja, ko, nl, pl, pt, pt-BR, ru, tr, uk, zh-TW, zh
==> Artifacts
Firefox.app (App)

Site structure

This month's winning numbers are on index; past winning numbers are linked from backnumber.

index                https://www.mizuhobank.co.jp/retail/takarakuji/loto/loto6/index.html
└── backnumber       https://www.mizuhobank.co.jp/retail/takarakuji/loto/backnumber/index.html
    ├── ChartA       https://www.mizuhobank.co.jp/retail/takarakuji/loto/loto6/index.html?year=2018&month=2
    ├── ChartB_new   https://www.mizuhobank.co.jp/retail/takarakuji/loto/backnumber/detail.html?fromto=641_660&type=loto6
    └── ChartB_old   https://www.mizuhobank.co.jp/retail/takarakuji/loto/backnumber/loto60001.html

The CSS selectors that identify the winning numbers differ from page to page, so check them first.

| Page | Draw no. | Draw date | Winning numbers |
| --- | --- | --- | --- |
| index | table.typeTK > thead > tr > th.alnCenter.bgf7f7f7 | table.typeTK > tbody > tr > td[colspan='6'].alnCenter | table.typeTK > tbody > tr > td.alnCenter.extension > strong |
| ChartA | table.typeTK > thead > tr > th.alnCenter.bgf7f7f7 | table.typeTK > tbody > tr > td[colspan='6'].alnCenter | table.typeTK > tbody > tr > td.alnCenter.extension |
| ChartB_new | div.spTableScroll > table.typeTK > tbody > tr > th.bgf7f7f7 | div.spTableScroll > table.typeTK > tbody > tr > td.alnRight | div.spTableScroll > table.typeTK > tbody > tr > td[class=''] |
| ChartB_old | div.spTableScroll > table.typeTK > tbody > tr > th.bgf7f7f7 | div.spTableScroll > table.typeTK > tbody > tr > td.alnRight | div.spTableScroll > table.typeTK > tbody > tr > td:not(.alnRight) |
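
The selectors above can also be kept as data, so that one scraping routine could serve all four page types. The following is only a sketch: the `PageSelectors` record and the `selectorsFor` function are names made up for illustration and do not appear in the scripts in this article.

```fsharp
// Selectors for one page layout: draw number, draw date, winning numbers.
type PageSelectors =
    { DrawNo  : string
      Date    : string
      Numbers : string }

// Hypothetical lookup from the page names used in this article to their selectors.
let selectorsFor (page: string) : PageSelectors option =
    // index and ChartA share the draw-number and date selectors.
    let monthly numbers =
        { DrawNo  = "table.typeTK > thead > tr > th.alnCenter.bgf7f7f7"
          Date    = "table.typeTK > tbody > tr > td[colspan='6'].alnCenter"
          Numbers = numbers }
    // ChartB_new and ChartB_old share the draw-number and date selectors.
    let archive numbers =
        { DrawNo  = "div.spTableScroll > table.typeTK > tbody > tr > th.bgf7f7f7"
          Date    = "div.spTableScroll > table.typeTK > tbody > tr > td.alnRight"
          Numbers = numbers }
    match page with
    | "index"      -> Some (monthly "table.typeTK > tbody > tr > td.alnCenter.extension > strong")
    | "ChartA"     -> Some (monthly "table.typeTK > tbody > tr > td.alnCenter.extension")
    | "ChartB_new" -> Some (archive "div.spTableScroll > table.typeTK > tbody > tr > td[class='']")
    | "ChartB_old" -> Some (archive "div.spTableScroll > table.typeTK > tbody > tr > td:not(.alnRight)")
    | _            -> None
```

Only the winning-number selector differs within each pair of pages, which is why the two local helpers take it as their single argument.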

Downloading the HTML

The Loto 6 winning numbers do not appear in the HTML until JavaScript has run, so the HTML is downloaded through a browser.

Downloading the libraries

// Create the foo folder
$ mkdir foo
$ cd foo/

// Download the libraries
$ paket init
$ vim paket.dependencies

    source https://www.nuget.org/api/v2
    nuget fsharp.data == 3.0.0-beta3
    nuget Selenium.webdriver
    nuget Selenium.Support

$ paket install

Writing the code (no error handling for now)

// File name is foo.fsx


#r "./packages/FSharp.Data/lib/net45/FSharp.Data.dll"
open FSharp.Data

#r "./packages/Selenium.WebDriver/lib/net45/WebDriver.dll"
#r "./packages/Selenium.Support/lib/net45/WebDriver.Support.dll"
open OpenQA.Selenium
open OpenQA.Selenium.Firefox
open OpenQA.Selenium.Support.UI

open System

let url          = @"https://www.mizuhobank.co.jp/retail/takarakuji/loto/loto6/index.html?year=2018&month=2"
let kaisuuCSS    = @"table.typeTK > thead > tr > th.alnCenter.bgf7f7f7"
let kaisaibiCSS  = @"table.typeTK > tbody > tr > td[colspan='6'].alnCenter"
let hitNumberCSS = @"table.typeTK > tbody > tr > td.alnCenter.extension"

type Fox () =

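    // Keep the browser window hidden
    let opt = new FirefoxOptions()
    do  opt.AddArgument("--headless")
    let driver = new FirefoxDriver( opt )
    // Set the wait timeout to 10 seconds for now
    let wait = WebDriverWait(driver, TimeSpan.FromSeconds(10.))

    // Fetch the HTML after waiting for the content to appear
    member this.HtmlWithJS(url) =

        // Navigate to the page at the given address
        driver.Url <- url

        // Wait (up to 10 seconds) until the elements matched by the selectors have non-empty text
        // If the text does not appear in time, an error is raised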
        wait.Until( fun (driver:IWebDriver) ->
            [
                driver.FindElements( By.CssSelector( kaisuuCSS    ))
                driver.FindElements( By.CssSelector( kaisaibiCSS  ))
                driver.FindElements( By.CssSelector( hitNumberCSS ))
            ]
            |> Seq.concat
            |> Seq.forall ( fun (x:IWebElement) -> x.Text <> String.Empty )
            ) |> ignore

        // Return the HTML
        driver.PageSource

    member this.Quit() =
        driver.Quit()


// Start the Firefox browser
let f = Fox()

// Download the HTML of the Mizuho Bank page that lists the Loto 6 winning numbers
f.HtmlWithJS url

// Extract the Loto 6 winning numbers from the HTML
|> HtmlDocument.Parse
|> fun doc ->
    let index   = doc.CssSelect( kaisuuCSS    ) |> List.map ( fun n -> n.InnerText() |> fun s -> String.filter Char.IsDigit s )
    let date    = doc.CssSelect( kaisaibiCSS  ) |> List.map ( fun n -> n.InnerText() |> fun s -> s.Replace("年","/").Replace("月","/").Replace("日",""))
    let numbers = doc.CssSelect( hitNumberCSS ) |> List.map ( fun n -> n.InnerText() ) |> List.chunkBySize 7 |> List.map ( List.truncate 6 )
    (index, date, numbers)
    |||> List.map3 ( fun a b c -> [a] @ [b] @ c )
|> List.iter( fun l -> printfn "%A" l )

f.Quit()
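
One detail of the wait condition is worth keeping in mind: Seq.forall returns true for an empty sequence, so if none of the selectors match anything yet, the condition passes on the first poll. A stricter check could look like the sketch below. The helper name `allPresentAndNonEmpty` is hypothetical, and it works on plain strings rather than IWebElement so that it runs without Selenium.

```fsharp
open System

// Hypothetical helper: true only when at least one text exists and none is empty.
let allPresentAndNonEmpty (texts: seq<string>) =
    let xs = List.ofSeq texts
    not (List.isEmpty xs) && xs |> List.forall (fun s -> s <> String.Empty)

// Seq.forall alone is vacuously true on an empty sequence.
let vacuous = Seq.empty<string> |> Seq.forall (fun s -> s <> String.Empty)
let strict  = allPresentAndNonEmpty Seq.empty

printfn "%b %b" vacuous strict   // prints "true false"
```

Inside wait.Until, the element texts (x.Text for each IWebElement) could be passed to such a helper in place of the bare Seq.forall.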

Running it

$ fsharpi foo.fsx

// Firefox prints various log output here...

["1255"; "2018/2/26"; "02"; "14"; "15"; "27"; "40"; "43"]
["1254"; "2018/2/22"; "07"; "08"; "11"; "15"; "36"; "39"]
["1253"; "2018/2/19"; "01"; "12"; "14"; "24"; "33"; "37"]
["1252"; "2018/2/15"; "04"; "15"; "18"; "19"; "22"; "29"]
["1251"; "2018/2/12"; "21"; "28"; "31"; "33"; "34"; "40"]
["1250"; "2018/2/8"; "03"; "10"; "18"; "22"; "23"; "40"]
["1249"; "2018/2/5"; "05"; "08"; "15"; "20"; "25"; "27"]
["1248"; "2018/2/1"; "02"; "06"; "14"; "28"; "34"; "37"]
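
To see how the parsing step turns the scraped cells into rows like these, here is the same pipeline applied to hard-coded sample values, with no browser involved. It is a sketch under assumptions: the draw labels and the seventh entry of each group ("B1", "B2", presumably the bonus number that List.truncate 6 drops) are placeholders, not text copied from the page.

```fsharp
open System

// Sample values shaped like the scraped cells (placeholders, see above).
let drawLabels = [ "第1255回"; "第1254回" ]
let drawDates  = [ "2018年2月26日"; "2018年2月22日" ]
let cells =
    [ "02"; "14"; "15"; "27"; "40"; "43"; "B1"
      "07"; "08"; "11"; "15"; "36"; "39"; "B2" ]

// Keep only the digits of the draw label.
let index = drawLabels |> List.map (String.filter Char.IsDigit)

// Turn the Japanese date into a slash-separated one.
let date =
    drawDates
    |> List.map (fun s -> s.Replace("年", "/").Replace("月", "/").Replace("日", ""))

// Group the cells seven at a time and keep the first six of each group.
let numbers = cells |> List.chunkBySize 7 |> List.map (List.truncate 6)

// Zip the three lists into one row per draw.
let rows = List.map3 (fun a b c -> a :: b :: c) index date numbers
rows |> List.iter (printfn "%A")
```

List.map3 requires the three lists to have the same length, so a page where one selector matches a different number of elements would raise an exception at this step.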

Adding error handling

Quit Firefox when a timeout or another error occurs.

"Stale Element Reference Exception" occurs frequently, so it needs handling as well.

For a code example, see:

about Stale Element Reference Exception
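
The two points above can be sketched without Selenium as follows. This is an illustration, not the code from the linked example: `StaleElement` is a stand-in for OpenQA.Selenium.StaleElementReferenceException, and `retryOnStale`, `readText` and `quit` are hypothetical names.

```fsharp
// Stand-in for Selenium's StaleElementReferenceException so the sketch runs on its own.
exception StaleElement of string

// Retry `action` when it raises StaleElement, making at most `attempts` calls.
let rec retryOnStale attempts (action: unit -> 'a) : 'a =
    try
        action ()
    with StaleElement _ when attempts > 1 ->
        retryOnStale (attempts - 1) action

// Simulated read that fails twice before succeeding.
let mutable calls = 0
let readText () =
    calls <- calls + 1
    if calls < 3 then raise (StaleElement "element went stale")
    "1255"

// Simulated cleanup; with Selenium this would be driver.Quit().
let mutable quitCalled = false
let quit () = quitCalled <- true

let result =
    try
        retryOnStale 5 readText
    finally
        quit ()   // runs on success, timeout, or any other exception

printfn "%s after %d calls, quit=%b" result calls quitCalled   // prints "1255 after 3 calls, quit=true"
```

The try/finally guarantees the browser is closed on every path. For the stale-element case, WebDriverWait also has an IgnoreExceptionTypes method, which may be a simpler option when the exception is raised inside the wait condition itself.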

A look at how wait.Until behaves

Create an HTML file

<html>
    <body>
        <table>
            <tr>
                <td>foo</td>
                <td></td>
                <td class=abc>baz</td>
            </tr>
        </table>
    </body>
</html>

<style>
table {
	border-collapse: collapse;
}
td {
	border: solid 1px;
	padding: 0.5em;
}
</style>

Write the code

// File name is bar.fsx

#r "./packages/Selenium.WebDriver/lib/net45/WebDriver.dll"
#r "./packages/Selenium.Support/lib/net45/WebDriver.Support.dll"
open OpenQA.Selenium
open OpenQA.Selenium.Firefox
open OpenQA.Selenium.Support.UI
open System


type Fox () =

    let opt = new FirefoxOptions()
    do  opt.AddArgument("--headless")
    let driver = new FirefoxDriver( opt )
    let wait = WebDriverWait(driver, TimeSpan.FromSeconds(10.))
    let mutable cnt = 0

    member this.Html(url) =
        try
            driver.Url <- url
            wait.Until( fun (driver:IWebDriver) ->
                stdout.WriteLine(cnt)
                cnt <- cnt + 1
                [
                    driver.FindElements( By.CssSelector( "html > body > table > tbody > tr > td:not(.abc)" ))
                    driver.FindElements( By.CssSelector( "html > body > table > tbody > tr > td.abc" ))
                ]
                |> Seq.concat
                |> Seq.forall ( fun (x:IWebElement) -> x.Text <> String.Empty )
                ) |> ignore
            driver.PageSource
        with e ->
            driver.Quit()
            stdout.WriteLine(e.Message)
            String.Empty

    member this.Quit() =
        driver.Quit()


let f = Fox()
@"file:///Users/callmekohei/Desktop/foo.html"
|> f.Html
stdout.WriteLine("foo bar baz")
f.Quit()

Result

The output confirms that polling happens about 20 times during the 10-second wait.

// Firefox prints various output here

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
Timed out after 10 seconds
foo bar baz
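
Twenty polls in 10 seconds is consistent with a polling interval of about 500 ms, which as far as I know is the default for WebDriverWait. The loop below is a simplified model of that behaviour, not Selenium's actual implementation; `pollUntil` is a hypothetical name.

```fsharp
open System
open System.Threading

// Simplified model of a polling wait: call `condition` every `interval`
// until it returns true or `timeout` has elapsed.
// Returns whether the condition succeeded and how many times it was called.
let pollUntil (timeout: TimeSpan) (interval: TimeSpan) (condition: unit -> bool) =
    let deadline = DateTime.UtcNow + timeout
    let rec loop polls =
        if condition () then (true, polls + 1)
        elif DateTime.UtcNow >= deadline then (false, polls + 1)
        else
            Thread.Sleep(interval)
            loop (polls + 1)
    loop 0

// A condition that becomes true on its third call.
let mutable seen = 0
let ok, polls =
    pollUntil (TimeSpan.FromSeconds 1.0) (TimeSpan.FromMilliseconds 10.0) (fun () ->
        seen <- seen + 1
        seen >= 3)

printfn "%b %d" ok polls   // prints "true 3"
```

With a condition that never becomes true, as with the empty td in foo.html, the number of calls is roughly the timeout divided by the interval.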

:not in CSS selectors

The CSS selector support in FSharp.Data cannot handle :not (as of 2018/4/11).

To get the effect of :not, write the selector as follows.

// css selector in Selenium
@"div.spTableScroll > table.typeTK > tbody > tr > td:not(.alnRight)"

// css selector in FSharp.Data
@"div.spTableScroll > table.typeTK > tbody > tr > td.''"
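
Another way to approximate :not in FSharp.Data is to select all the td cells and filter out the unwanted class afterwards. The sketch below assumes the same packages folder as foo.fsx; the `hasClass` helper and the HTML fragment are made up for illustration and only imitate the structure of the backnumber table.

```fsharp
#r "./packages/FSharp.Data/lib/net45/FSharp.Data.dll"
open System
open FSharp.Data

// Hypothetical helper: does the node's class attribute contain `name`?
// AttributeValue returns an empty string when the attribute is missing.
let hasClass (name: string) (node: HtmlNode) =
    node.AttributeValue("class").Split([| ' ' |], StringSplitOptions.RemoveEmptyEntries)
    |> Array.contains name

// Made-up fragment with one date cell (class alnRight) and two number cells.
let html = """<div class="spTableScroll"><table class="typeTK"><tbody><tr><th class="bgf7f7f7">no</th><td class="alnRight">date</td><td>02</td><td>08</td></tr></tbody></table></div>"""

// Select every td, then drop the ones that carry the alnRight class.
let cells =
    HtmlDocument.Parse(html).CssSelect("div.spTableScroll > table.typeTK > tbody > tr > td")
    |> List.filter (hasClass "alnRight" >> not)
    |> List.map (fun n -> n.InnerText())

printfn "%A" cells
```

Filtering in F# keeps the selector itself free of anything the library might not parse, at the cost of one extra pass over the matched nodes.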

References

see left column : WebDriverWait Class: https://seleniumhq.github.io/selenium/docs/api/dotnet/
