
[Q&A] A question about web scraping in Python with the requests module

@shiracamus posted at 2022-08-30

Please post source code in a Markdown code block.
The indentation has been lost, so I cannot tell at which indentation level each statement runs.

The following code appears to retrieve every page.

import requests
from bs4 import BeautifulSoup

url = "https://rtrp.jp/locations/332/categories/291/?order=retrip_score&page={}"

for i in range(1,6):
    target_url = url.format(i)
    print(target_url)

    res = requests.get(target_url)
    soup = BeautifulSoup(res.text, 'html.parser')
    for spot in soup.find_all('h3', class_='spotName'):
        print(spot.text.strip())
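
As a possible hardening of the loop above, the sketch below separates URL generation from fetching and adds a timeout and an HTTP status check. The helper names `build_page_urls` and `fetch_html`, the page range 1 to 5, and the 10-second timeout are assumptions made for illustration; they are not part of the original answer.

```python
import requests

URL_TEMPLATE = "https://rtrp.jp/locations/332/categories/291/?order=retrip_score&page={}"


def build_page_urls(first_page=1, last_page=5):
    """Return the listing URLs for first_page..last_page (inclusive)."""
    return [URL_TEMPLATE.format(page) for page in range(first_page, last_page + 1)]


def fetch_html(url, timeout=10):
    """Download one page; raise an exception on a 4xx/5xx response.

    The timeout value is an arbitrary choice for this sketch.
    """
    res = requests.get(url, timeout=timeout)
    res.raise_for_status()
    return res.text


# Building the URLs needs no network access.
page_urls = build_page_urls()
print(page_urls[0])
print(len(page_urls))
```

Calling `fetch_html(page_urls[0])` would then return the HTML text that is passed to `BeautifulSoup` in the answer's code.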

Comments

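@shiracamus
Please edit your question instead of writing in an answer.
@shiracamus
I cannot comment on your answer posts, so please write your comments here instead.
Please look up how to use soup on your own.
find_all will likely return multiple elements, so you need to call text.strip() on each of them.
I have updated the code in my answer accordingly.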
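
The difference between find and find_all described here can be seen in a small offline sketch. The HTML string below is made up for illustration and only assumes the `h3` / `spotName` structure used in the answer; the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
html = """
<h3 class="spotName"> Spot A </h3>
<h3 class="spotName"> Spot B </h3>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns only the first matching element (or None if nothing matches).
first = soup.find("h3", class_="spotName")
first_name = first.text.strip()

# find_all() returns a list-like ResultSet, so .text must be read per element.
names = [spot.text.strip() for spot in soup.find_all("h3", class_="spotName")]

print(first_name)  # Spot A
print(names)       # ['Spot A', 'Spot B']
```

Reading `.text` directly on the result of `find_all` raises an `AttributeError`, because the attribute belongs to each element and not to the collection. That is one common cause of an error when switching from find to find_all, though the thread does not say which error the questioner actually saw.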
@shiracamus
What do you mean by "all URLs"?
The rest is probably a matter of how to use BeautifulSoup, so why not study BeautifulSoup?
Please try things yourself first, then ask about the parts you do not understand.

@takeshi4 posted at 2022-08-30

My apologies.

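import requests
from bs4 import BeautifulSoup

url = "https://rtrp.jp/locations/332/categories/291/?order=retrip_score&page={}"

for i in range(1, 6):
    # Build and show the URL for page i
    target_url = url.format(i)
    print(target_url)

    # Download the page and parse its HTML
    res = requests.get(target_url)
    soup = BeautifulSoup(res.text, 'html.parser')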

That is all.

Comments

@takeshi4
Questioner
Thank you for your answer.

Regarding the last line of the code:

The soup.find method outputs only the first item.

However, find_all raises an error.

Is there a way to output all of the data from every page?

@takeshi4 posted at 2022-08-30

Thank you. I was able to get all the titles.
Is it possible to get all the URLs as well?
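
One possible reading of "all the URLs" is the link attached to each spot. The sketch below works under that assumption, and under the further assumption that each `h3.spotName` element contains an `a` tag with an `href`. The sample HTML, the paths in it, and the helper name `extract_spot_links` are hypothetical; the real page structure should be checked in the browser's developer tools first.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://rtrp.jp/"

# Hypothetical markup: assumes the spot name is wrapped in a link.
html = """
<h3 class="spotName"><a href="/spots/example-1/">Spot A</a></h3>
<h3 class="spotName"><a href="https://example.com/spots/2/">Spot B</a></h3>
<h3 class="spotName">Spot C (no link)</h3>
"""


def extract_spot_links(html, base_url=BASE_URL):
    """Return (name, absolute_url) pairs for every spot that has a link."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for spot in soup.find_all("h3", class_="spotName"):
        anchor = spot.find("a", href=True)
        if anchor is None:
            continue  # skip spots without a link instead of failing
        # urljoin turns a relative href into an absolute URL.
        links.append((spot.text.strip(), urljoin(base_url, anchor["href"])))
    return links


spot_links = extract_spot_links(html)
for name, link in spot_links:
    print(name, link)
```

Inside the page loop, the same function could be applied to `res.text` for each page, collecting the pairs into one list.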

@takeshi4 posted at 2022-08-30

Thank you very much.
