Laravel: Adding a Web Scraping Feature (Part 2)

Last updated at 2025-03-27 / Posted at 2025-03-26

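Following up on the previous article, which scraped win rates from a LoL (League of Legends) statistics site, I added two features: one that dynamically displays a champion's win rate when you search for that champion, and one that uses DeepSeek to generate a description of the searched champion.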

The finished result looks like this.

As usual, I'll walk through the code.

1. View code

<!-- Styled search form -->
    <div class="max-w-7xl mx-auto px-6 mt-4">
        <form id="championSearchForm" class="flex gap-2 mb-6">
            <input type="text" id="championInput" name="query" placeholder="チャンピオン名を入力"
                   class="w-full px-4 py-2 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-blue-500">
            <button type="submit" class="px-4 py-2 bg-blue-500 text-white font-semibold rounded-lg hover:bg-blue-600">
                検索
            </button>
        </form>
    </div>

This search form lets the user enter a champion name and search for it.
With some JavaScript added, the displayed result can be updated dynamically based on what was entered.

<script>
document.addEventListener('DOMContentLoaded', function() {
    const scrapedDataElement = document.getElementById('scrapedData_lola_blonze');
    const championNameElement = document.getElementById('championName');
    const championSearchForm = document.getElementById('championSearchForm');
    const championInput = document.getElementById('championInput');
    
    // Default champion name
    let currentChampion = 'garen';
    
    // Function that fetches the champion data
    function fetchChampionData(champion) {
        // Show a loading indicator
        scrapedDataElement.innerText = "Loading...";
        
        fetch("{{ route('scrape.data_lola_blonze') }}?champion=" + encodeURIComponent(champion))
            .then(response => {
                if (!response.ok) {
                    throw new Error('サーバーからのレスポンスが正常ではありません: ' + response.status);
                }
                return response.json();
            })
            .then(data => {
                if (data.error) {
                    throw new Error(data.error);
                }
                
                // Update page title with champion name
                if (data.champion) {
                    const capitalizedChampion = data.champion.charAt(0).toUpperCase() + data.champion.slice(1);
                    championNameElement.innerText = capitalizedChampion + 'の勝率';
                }
                
                // The fetched data
                const text = `${data.title} ${data.description}`;

                // Use a regular expression to get the first `number + %`
                const match = text.match(/(\d+(\.\d+)?)%/);

                if (match) {
                    const percentage = parseFloat(match[1]); // Convert to a number
                    scrapedDataElement.innerText = match[0];

                    // Blue at 48.5% or below, red at 51.5% or above
                    if (percentage <= 48.5) {
                        scrapedDataElement.style.color = 'blue';
                    } else if (percentage >= 51.5) {
                        scrapedDataElement.style.color = 'red';
                    } else {
                        scrapedDataElement.style.color = 'black'; // Black otherwise
                    }
                } else {
                    scrapedDataElement.innerText = "データなし";
                    scrapedDataElement.style.color = 'black';
                }
            })
            .catch(error => {
                console.error('Error fetching data:', error);
                scrapedDataElement.innerText = 'データ取得エラー: ' + error.message;
                scrapedDataElement.style.color = 'black';
            });
    }
    
    // Submit event of the search form
    championSearchForm.addEventListener('submit', function(event) {
        event.preventDefault(); // Prevent page navigation
        
        const champion = championInput.value.trim().toLowerCase();
        
        if (champion) {
            currentChampion = champion;
            fetchChampionData(champion);
        }
    });
    
    // Initial data fetch
    fetchChampionData(currentChampion);
});
</script>

Explanation follows.

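document.addEventListener('DOMContentLoaded', function() {...})
This means "run the code once the page's HTML has been loaded and parsed."
A script can run before the HTML elements it refers to exist, so the DOMContentLoaded event is used to delay the script until all HTML elements have been parsed. The event does not wait for images to finish loading.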
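One edge case of this pattern: if the script runs after the document has already been parsed, DOMContentLoaded has already fired and the listener never runs. The sketch below guards against that by checking readyState first. The helper name onReady is hypothetical and not part of the original code; the document object is passed in as a parameter so the logic can be tried outside a browser.

```javascript
// Hypothetical helper (not in the original article): run `callback`
// once the given document has finished parsing.
function onReady(doc, callback) {
    if (doc.readyState === 'loading') {
        // Still parsing: wait for DOMContentLoaded.
        doc.addEventListener('DOMContentLoaded', callback, { once: true });
    } else {
        // Already parsed ('interactive' or 'complete'): run right away.
        callback();
    }
}
```

In the page itself this would be called as onReady(document, function() { ... }) in place of the direct addEventListener call.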

const scrapedDataElement = document.getElementById('scrapedData_opgg_iron');
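This code retrieves an HTML element.
document.getElementById('id') returns the element that has the given ID.

Here it retrieves the element that displays the win rate (for example <div id="scrapedData_opgg_iron"></div>). Each script uses its own ID; the listing above uses scrapedData_lola_blonze.

The following elements are retrieved in the same way: championNameElement (the element that shows the champion name), championSearchForm (the search form), and championInput (the text input).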

let currentChampion = 'garen';
The variable currentChampion holds the default champion name, "garen".
It is used to fetch Garen's data first when the page is opened.

fetch("{{ route('scrape.data') }}?champion=" + encodeURIComponent(champion))
This code accesses a Laravel route (URL) to fetch the data.

{{ route('scrape.data') }} → outputs the URL of the Laravel route named scrape.data

+ encodeURIComponent(champion) → appends the champion name to the URL (encoding special characters such as spaces)

For example, when champion = "garen",
a request like fetch("http://example.com/scrape/data?champion=garen") is executed.
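To make the effect of encodeURIComponent concrete, here is a minimal sketch of the URL construction on its own. The function name buildScrapeUrl is hypothetical, and the base URL is the same illustrative http://example.com address used in the example above, not the real route.

```javascript
// Hypothetical helper: build the request URL the same way the fetch() call does.
function buildScrapeUrl(baseUrl, champion) {
    // encodeURIComponent escapes characters that are not safe in a query value.
    return baseUrl + '?champion=' + encodeURIComponent(champion);
}

console.log(buildScrapeUrl('http://example.com/scrape/data', 'garen'));
// http://example.com/scrape/data?champion=garen
console.log(buildScrapeUrl('http://example.com/scrape/data', 'lee sin'));
// http://example.com/scrape/data?champion=lee%20sin
```

Without the encoding, a space or an & in the input would produce a malformed or misinterpreted query string.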

.then(response => {...})
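This part processes the response returned by fetch().

① if (!response.ok) {...}
If the server returns an error status (e.g. 404 Not Found), an Error is thrown, and the .catch() handler displays its message.

② return response.json();
Parses the body as JSON and passes the result to the next .then(data => {...}).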
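The explicit response.ok check is needed because fetch() only rejects on network failures; an HTTP error such as 404 or 500 still resolves normally. response.ok is true when the status code is in the 200 to 299 range. The sketch below reproduces that rule with a hypothetical isOkStatus function, purely for illustration.

```javascript
// Hypothetical helper mirroring the rule behind Response.ok:
// true for status codes 200 through 299, false otherwise.
function isOkStatus(status) {
    return status >= 200 && status <= 299;
}

for (const status of [200, 204, 404, 500]) {
    console.log(status, isOkStatus(status));
}
// 200 true
// 204 true
// 404 false
// 500 false
```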

.then(data => {...})
This is where the parsed data is processed.

if (data.champion) {
    const capitalizedChampion = data.champion.charAt(0).toUpperCase() + data.champion.slice(1);
    championNameElement.innerText = capitalizedChampion + 'の勝率';
}

This capitalizes the first letter of the champion name and changes the page heading to "<Champion>の勝率" ("<Champion>'s win rate").

For example:

garen → Garenの勝率

zed → Zedの勝率
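The same capitalization step as a standalone function, as a small sketch. The name capitalize is hypothetical; the empty-string guard is an addition that the original snippet does not need, because it only runs when data.champion is truthy.

```javascript
// Hypothetical helper: upper-case the first character, keep the rest as is.
function capitalize(name) {
    if (name.length === 0) {
        return name; // nothing to capitalize
    }
    return name.charAt(0).toUpperCase() + name.slice(1);
}

console.log(capitalize('garen') + 'の勝率'); // Garenの勝率
console.log(capitalize('zed') + 'の勝率');   // Zedの勝率
```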

const text = `${data.title} ${data.description}`;
const match = text.match(/(\d+(\.\d+)?)%/);

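Here text.match(/(\d+(\.\d+)?)%/) searches the text for the first value in "number + %" form.

For example, if the data is
"勝率 50.3% - 人気度 8.5%" ("win rate 50.3% - popularity 8.5%"),
then 50.3% is returned. (Explained in the previous article.)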
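A runnable sketch of this extraction, using the same regular expression and the sample string from the example above. The function name extractFirstPercentage and the shape of its return value are assumptions made for illustration.

```javascript
// Hypothetical helper: return the first "number + %" found in the text,
// both as the matched string and as a number, or null if there is none.
function extractFirstPercentage(text) {
    const match = text.match(/(\d+(\.\d+)?)%/);
    if (!match) {
        return null;
    }
    return { raw: match[0], value: parseFloat(match[1]) };
}

console.log(extractFirstPercentage('勝率 50.3% - 人気度 8.5%'));
// { raw: '50.3%', value: 50.3 }
console.log(extractFirstPercentage('no numbers here'));
// null
```

Because match() without the g flag stops at the first hit, the later 8.5% is ignored.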

if (match) {
    const percentage = parseFloat(match[1]); // Convert to a number
    scrapedDataElement.innerText = match[0]; // Displayed like 50.3%

    // Color by win rate
    if (percentage <= 48.5) {
        scrapedDataElement.style.color = 'blue'; // Low win rate → blue
    } else if (percentage >= 51.5) {
        scrapedDataElement.style.color = 'red'; // High win rate → red
    } else {
        scrapedDataElement.style.color = 'black'; // Otherwise → black
    }
}

This colors the value according to the win rate.
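The thresholds can also be pulled out into a pure function, which makes the boundary behavior easy to check. This is only a sketch: the name colorForWinRate and the low/high parameters are hypothetical, with defaults taken from the 48.5 and 51.5 used above.

```javascript
// Hypothetical helper: map a win rate (in percent) to a CSS color name.
function colorForWinRate(percentage, low = 48.5, high = 51.5) {
    if (percentage <= low) {
        return 'blue';  // low win rate
    }
    if (percentage >= high) {
        return 'red';   // high win rate
    }
    return 'black';     // in between
}

console.log(colorForWinRate(44.84)); // blue
console.log(colorForWinRate(50.3));  // black
console.log(colorForWinRate(51.5));  // red
```

The view code would then reduce to scrapedDataElement.style.color = colorForWinRate(percentage).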

Creating the routes

use Illuminate\Support\Facades\Route;
use App\Http\Controllers\DeepSeekController;
use App\Http\Controllers\ChampionController;
use App\Http\Controllers\ScraperController;

Route::get('/champion-guide', [ChampionController::class, 'getGuide']);
//Route::get('/scrape', [ScraperController::class, 'getScrapedData'])->name('scraper.getScrapedData');
//Route::get('/scrape-data', [ScraperController::class, 'getScrapedData'])->name('scrape.data');
Route::get('/deepseek/champion-guide', [DeepSeekController::class, 'getChampionGuide'])->name('deepseek.champion-guide');
Route::post('/deepseek/guide', [DeepSeekController::class, 'getGuide'])->name('deepseek.guide');

// Apparently each route needs its own distinct name
Route::get('/scrape', [ScraperController::class, 'getScrapedData'])->name('scrape.data');
Route::get('/scrape/blonze', [ScraperController::class, 'getScrapedData_blonze'])->name('scrape.data_blonze');
Route::get('/scrape/silver', [ScraperController::class, 'getScrapedData_silver'])->name('scrape.data_silver');
Route::get('/scrape/gold', [ScraperController::class, 'getScrapedData_gold'])->name('scrape.data_gold');

Route::get('/scrape/lola/iron', [ScraperController::class, 'getScrapedData_lola_iron'])->name('scrape.data_lola_iron');
Route::get('/scrape/lola/blonze', [ScraperController::class, 'getScrapedData_lola_blonze'])->name('scrape.data_lola_blonze');
Route::get('/scrape/lola/silver', [ScraperController::class, 'getScrapedData_lola_silver'])->name('scrape.data_lola_silver');
Route::get('/scrape/lola/gold', [ScraperController::class, 'getScrapedData_lola_gold'])->name('scrape.data_lola_gold');

// Added to routes/web.php
Route::get('/scrape/data-ugg-iron', [ScraperController::class, 'scrapedData_ugg_iron'])->name('scrape.data_ugg_iron');

// Home route
Route::get('/', function () {
    return view('home');
})->name('home');

// Scraping route
Route::get('/scrape-data', [ScraperController::class, 'getScrapedData'])->name('scrape.data');

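Some of these routes may be redundant. In particular, the name scrape.data is registered twice (for /scrape and for /scrape-data); route names should be unique, so one of the two should be removed or renamed.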

Creating the controller

<?php

namespace App\Http\Controllers;

use Illuminate\Http\Request;
use Symfony\Component\HttpClient\HttpClient;
use Symfony\Component\DomCrawler\Crawler;
use Illuminate\Support\Facades\Log;

class ScraperController extends Controller
{
    public function getScrapedData(Request $request)
    {
        try {
            // Get champion name from request, default to 'garen' if not provided
            $champion = $request->query('champion', 'garen');
            
            // Build the URL with the champion name
            $url = "https://www.op.gg/champions/{$champion}/build?hl=ja_JP&tier=iron";
            
            // Log the URL (for debugging)
            Log::info('Scraping URL: ' . $url);
            
            // Create the HTTP client
            $client = HttpClient::create([
                'headers' => ['User-Agent' => 'Mozilla/5.0'],
                'verify_peer' => false
            ]);
            
            // Fetch the page
            $response = $client->request('GET', $url);
            $html = $response->getContent();
            
            // Log the first 500 bytes of the HTML (for debugging)
            Log::info('Scraped HTML: ' . substr($html, 0, 500));
            
            // Parse the DOM
            $crawler = new Crawler($html);
            
            // Get the text of the h1 tag
            $title = $crawler->filter('h1')->count() > 0
                ? $crawler->filter('h1')->text()
                : 'タイトルが見つかりません';
            
            // Get the first p element
            $description = $crawler->filter('p')->count() > 0
                ? $crawler->filter('p')->first()->text()
                : '説明文が見つかりません';
            
            return response()->json([
                'title' => $title,
                'description' => $description,
                'champion' => $champion, // Return the champion name for reference
                'tier' => 'iron', // Return the tier for reference
            ]);
        } catch (\Exception $e) {
            Log::error('Scraping error: ' . $e->getMessage());
            return response()->json(['error' => $e->getMessage()], 500);
        }
    }

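This is one of the controller's methods; the others are omitted.

Explanation follows.

This controller (ScraperController) scrapes the OP.GG champion page,
retrieves the title and description text for the specified champion, and returns them in JSON format.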

public function getScrapedData(Request $request)
{
    try {

The try block wraps the code that may throw; the error handling (what to do when an error occurs) goes in the matching catch block.

$champion = $request->query('champion', 'garen');

query('champion', 'garen') reads the value of the champion query parameter from the request URL.

If champion is not specified, it defaults to 'garen'.
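For readers more at home on the JavaScript side, the same "read a query parameter, fall back to a default when it is absent" idea can be sketched with the standard URL API. This is a JavaScript analogue for illustration only, not Laravel code; the function name championFromUrl and the example URLs are hypothetical.

```javascript
// Hypothetical analogue of $request->query('champion', 'garen'):
// return the `champion` query parameter, or `fallback` when it is absent.
function championFromUrl(requestUrl, fallback = 'garen') {
    const value = new URL(requestUrl).searchParams.get('champion');
    return value === null ? fallback : value;
}

console.log(championFromUrl('http://example.com/scrape?champion=zed')); // zed
console.log(championFromUrl('http://example.com/scrape'));              // garen
```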

$url = "https://www.op.gg/champions/{$champion}/build?hl=ja_JP&tier=iron";

The variable is interpolated at {$champion} to build the URL of the champion's build page.

$client = HttpClient::create([
    'headers' => ['User-Agent' => 'Mozilla/5.0'],
    'verify_peer' => false
]);

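HttpClient::create() creates the HTTP client used to access the web page.

The User-Agent header makes the request look like it comes from a browser (some servers block requests that have no User-Agent).

'verify_peer' => false disables SSL certificate verification (for example, to avoid errors on sites with self-signed certificates). Because this also removes protection against man-in-the-middle attacks, it is best limited to local debugging.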

$crawler = new Crawler($html);

The Crawler class is used to parse the HTML.

$title = $crawler->filter('h1')->count() > 0
    ? $crawler->filter('h1')->text()
    : 'タイトルが見つかりません';

filter('h1') looks for h1 tags, and count() checks whether at least one exists.

If there is an h1 tag, text() returns its text as the title.

Otherwise 'タイトルが見つかりません' ("title not found") is used.

$description = $crawler->filter('p')->count() > 0
    ? $crawler->filter('p')->first()->text()
    : '説明文が見つかりません';

Looks for p tags and takes the text() of the first one.

If none is found, '説明文が見つかりません' ("description not found") is used.

return response()->json([
    'title' => $title,
    'description' => $description,
    'champion' => $champion,
    'tier' => 'iron',
]);

Laravel's response()->json() returns the data in JSON format.

An example:

{
    "title": "ガレンのビルド",
    "description": "ガレンのおすすめアイテムとルーン",
    "champion": "garen",
    "tier": "iron"
}

What response()->json([...]) means
response() is the Laravel helper that creates a response.

Calling ->json([...]) on it returns the given array in JSON format.
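On the receiving side, the JSON body is just text until it is parsed. The sketch below parses the example payload shown above and also handles the { "error": ... } shape that the controller's catch block returns. The function name unwrapPayload is hypothetical.

```javascript
// Hypothetical helper: parse the controller's JSON body and surface
// the `error` field (sent by the catch block) as an exception.
function unwrapPayload(jsonText) {
    const data = JSON.parse(jsonText);
    if (data.error) {
        throw new Error(data.error);
    }
    return data;
}

const body = '{"title": "ガレンのビルド", "description": "ガレンのおすすめアイテムとルーン", "champion": "garen", "tier": "iron"}';
const data = unwrapPayload(body);
console.log(data.champion); // garen
console.log(data.tier);     // iron
```

In the view code, response.json() performs the parsing step, and the if (data.error) check plays the role of the guard.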

Summary

First, the following code runs inside each script (one per scraping element in the view, such as scrapedData_opgg_iron).

// Submit event of the search form
    championSearchForm.addEventListener('submit', function(event) {
        event.preventDefault(); // Prevent page navigation
        
        const champion = championInput.value.trim().toLowerCase();
        
        if (champion) {
            currentChampion = champion;
            fetchChampionData(champion);
        }
    });

Looking at it in detail:

championSearchForm.addEventListener('submit', function(event) {

championSearchForm is the search form element.

.addEventListener('submit', function(event) { ... })

It listens for the submit event (fired when the form is submitted).

When the form is submitted, the given function (function(event) { ... }) is executed.

event carries information about the event (here, the form submission).
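The input handling inside the submit handler (trim, lower-case, ignore empty input) can be isolated as a small function. This is a sketch under the assumption that an empty result should simply be skipped, as the if (champion) check does; the name normalizeChampionInput is hypothetical.

```javascript
// Hypothetical helper: normalize what the user typed, or return null
// when there is nothing to search for.
function normalizeChampionInput(raw) {
    const champion = raw.trim().toLowerCase();
    return champion === '' ? null : champion;
}

console.log(normalizeChampionInput('  Garen ')); // garen
console.log(normalizeChampionInput('   '));      // null
```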

The handler then calls fetchChampionData, where

fetch("{{ route('scrape.data') }}?champion=" + encodeURIComponent(champion))

passes the champion name as a query parameter and invokes the controller behind route('scrape.data').
Inside the controller,

$champion = $request->query('champion', 'garen');
$url = "https://www.leagueofgraphs.com/ja/champions/stats/{$champion}/iron";

builds the URL of the statistics site's page for the given champion.

$response = $client->request('GET', $url);
$html = $response->getContent();

This fetches the HTML of that URL.

// Parse the DOM
$crawler = new Crawler($html);

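This step matters: the Crawler parses the HTML string into a DOM tree that can be queried with CSS selectors. It does not convert the HTML to JSON.

Reference site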

// Get the text of the h1 tag
            $title = $crawler->filter('h1')->count() > 0
                ? $crawler->filter('h1')->text()
                : 'タイトルが見つかりません';
            
            // Get the first p element
            $description = $crawler->filter('p')->count() > 0
                ? $crawler->filter('p')->first()->text()
                : '説明文が見つかりません';

The elements selected from the parsed DOM with the CSS selectors (h1 and p) supply the text that is assigned to $title and $description.

return response()->json([
                'title' => $title,
                'description' => $description,
                'champion' => $champion, // Return the champion name for reference
                'tier' => 'iron', // Return the tier for reference
            ]);

Here the title and description defined above, the champion name, and so on are returned in JSON format.

Back in the original JS code:

// Function that fetches the champion data
    function fetchChampionData(champion) {
        // Show a loading indicator
        scrapedDataElement.innerText = "Loading...";
        
        fetch("{{ route('scrape.data') }}?champion=" + encodeURIComponent(champion))
            .then(response => {
                if (!response.ok) {
                    throw new Error('サーバーからのレスポンスが正常ではありません: ' + response.status);
                }
                return response.json();
            })
            .then(data => {
                if (data.error) {
                    throw new Error(data.error);
                }

The data returned by the server in response to the fetch request (JSON) is parsed by response.json() in the first .then(), and the result is passed to the next .then() as the variable data.
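The same flow can also be written with async/await, which some readers find easier to follow than chained .then() calls. This is a sketch, not the article's code: the function name loadChampionData is hypothetical, and the fetch function is passed in as a parameter (fetchImpl) so the logic can be exercised with a stand-in instead of a real server.

```javascript
// Hypothetical async/await version of the fetch chain.
// `fetchImpl` is any function with the same contract as fetch().
async function loadChampionData(fetchImpl, url) {
    const response = await fetchImpl(url);
    if (!response.ok) {
        throw new Error('Unexpected response status: ' + response.status);
    }
    const data = await response.json(); // parse the JSON body
    if (data.error) {
        throw new Error(data.error);    // error reported by the controller
    }
    return data;
}
```

In the browser it would be called as loadChampionData(fetch, url), with the caller using try/catch (or .catch()) for the error display.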

// The fetched data
const text = `${data.title} ${data.description}`; // concatenation

// Use a regular expression to get the first `number + %`
const match = text.match(/(\d+(\.\d+)?)%/);

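text is the title and description concatenated into a single string (a plain string, not JSON). match is the result of the regular expression, which finds a value of the form NN.NN%. Concretely, the match array contains values like the following: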

0: "44.84%"
1: "44.84"
2: ".84"
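Index 2 exists only because the optional decimal part is wrapped in a capturing group. As an alternative sketch, a named group plus a non-capturing group keeps just the number and avoids relying on positional indexes. The function name parsePercentage and the sample string are hypothetical.

```javascript
// The original pattern: three entries (whole match, number, decimal part).
const m = '勝率 44.84%'.match(/(\d+(\.\d+)?)%/);
console.log(Array.from(m)); // [ '44.84%', '44.84', '.84' ]

// Hypothetical variant: (?<number>...) names the group we want,
// (?:...) groups the decimal part without capturing it.
function parsePercentage(text) {
    const match = text.match(/(?<number>\d+(?:\.\d+)?)%/);
    return match ? parseFloat(match.groups.number) : null;
}

console.log(parsePercentage('勝率 44.84%')); // 44.84
```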

if (match) {
                    const percentage = parseFloat(match[1]); // Convert to a number
                    scrapedDataElement.innerText = match[0];

                    // Blue at 48.5% or below, red at 51.5% or above
                    if (percentage <= 48.5) {
                        scrapedDataElement.style.color = 'blue';
                    } else if (percentage >= 51.5) {
                        scrapedDataElement.style.color = 'red';
                    } else {
                        scrapedDataElement.style.color = 'black'; // Black otherwise
                    }
                } else {
                    scrapedDataElement.innerText = "データなし";
                    scrapedDataElement.style.color = 'black';
                }
            })

scrapedDataElement.innerText = match[0]; is the line that actually rewrites the content shown on the page. (Important!)
