Let's scrape past winning numbers from Rakuten Takarakuji!

Last updated at 2023-04-30, posted at 2023-04-30

Install the libraries

!pip install beautifulsoup4
!pip install requests

Fetch the HTML from the URL

import requests
from bs4 import BeautifulSoup

url = "https://takarakuji.rakuten.co.jp/backnumber/numbers3/202304/"

response = requests.get(url)
response.raise_for_status()

soup = BeautifulSoup(response.content, "html.parser")
print(soup)

Result

<!DOCTYPE html>

<!--[if IE 8 ]><html class="ie ie8" lang="ja" prefix="og: http://ogp.me/ns# fb: http://www.facebook.com/2008/fbml"><![endif]-->
<!--[if IE 9 ]><html class="ie ie9" lang="ja" prefix="og: http://ogp.me/ns# fb: http://www.facebook.com/2008/fbml"><![endif]-->
<!--[if !(IE)]><!-->
<html lang="ja" prefix="og: http://ogp.me/ns# fb: http://www.facebook.com/2008/fbml">
<!--<![endif]-->
<head>
<meta charset="utf-8"/>

(rest omitted)
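The URL above ends in `202304`, which looks like a year and month. As a minimal sketch, assuming that path segment really is `YYYYMM` (inferred from this one URL only, not confirmed for other months), a small helper could build the URL for any month. `BASE_URL` and `build_month_url` are hypothetical names introduced here for illustration.

```python
# Assumption: the backnumber path ends in YYYYMM, as in the single URL used in this article.
BASE_URL = "https://takarakuji.rakuten.co.jp/backnumber/numbers3/"


def build_month_url(year, month):
    """Return the backnumber URL for the given year and month (hypothetical helper)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # Zero-pad the month so April becomes "04"
    return f"{BASE_URL}{year:04d}{month:02d}/"


print(build_month_url(2023, 4))
```

The result can be passed to `requests.get` in the same way as the hard-coded `url` above.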

Get the winning numbers

winning_numbers = soup.find("td", colspan="2")
print(winning_numbers)

Result

<td colspan="2">2023/04/03</td>

To get multiple elements, use find_all

winning_numbers = soup.find_all("td", colspan="2")
print(winning_numbers)

Result

[<td colspan="2">2023/04/03</td>,
 <td colspan="2">009</td>,
 <td colspan="2">2023/04/04</td>,
 <td colspan="2">911</td>,
 <td colspan="2">2023/04/05</td>,

(rest omitted)

Remove the tags

tmp = str(winning_numbers[0]).replace('</td>', '')
tmp = tmp.replace('<td colspan="2">', '')
print(tmp)

Result

`2023/04/03`
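As an alternative to string replacement, Beautiful Soup's `get_text()` returns the text inside a tag directly, so the result does not depend on the exact attribute string. The sketch below is not the approach used in this article; it runs against a small inline HTML sample (an assumption that mirrors the cells shown above) so it needs no network access.

```python
from bs4 import BeautifulSoup

# Inline sample mirroring the <td colspan="2"> cells shown above (illustrative only)
html = (
    '<table>'
    '<tr><td colspan="2">2023/04/03</td></tr>'
    '<tr><td colspan="2">009</td></tr>'
    '</table>'
)
soup = BeautifulSoup(html, "html.parser")

# get_text(strip=True) returns the cell's text without the surrounding tag
cell = soup.find("td", colspan="2")
text = cell.get_text(strip=True)
print(text)

# The same call works for every cell returned by find_all
texts = [td.get_text(strip=True) for td in soup.find_all("td", colspan="2")]
print(texts)
```

Because the values stay strings, a winning number such as `009` keeps its leading zeros.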

Strip the tags from every element in the list, replacing each element in place

# Replace each Tag in the list with its text, stripping the surrounding <td> markup
for i in range(len(winning_numbers)):
    tmp = str(winning_numbers[i]).replace('</td>', '')
    tmp = tmp.replace('<td colspan="2">', '')
    winning_numbers[i] = tmp
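After the loop, the list holds plain strings that, judging by the `find_all` output above, alternate between a draw date and a winning number. A minimal sketch of grouping them into pairs follows, assuming that alternation holds for the whole page. `pair_results` is a hypothetical helper, and the sample values are taken from the output shown earlier.

```python
# Sample values in the shape the loop above produces (taken from the output shown earlier)
winning_numbers = ["2023/04/03", "009", "2023/04/04", "911"]


def pair_results(values):
    """Group an alternating [date, number, date, number, ...] list into (date, number) tuples."""
    if len(values) % 2 != 0:
        # An odd length means the date/number alternation assumption is broken
        raise ValueError("expected an even number of items")
    # Even indexes are dates, odd indexes are numbers
    return list(zip(values[0::2], values[1::2]))


pairs = pair_results(winning_numbers)
for date, number in pairs:
    print(date, number)
```

Raising on an odd-length list is a deliberate choice: it surfaces a change in the page layout instead of silently mispairing dates and numbers.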
