Automated search using a JSON file of saved URLs

Posted at 2025-03-22

How to write the JSON file

{
  "url": [
    "https://www.example1.com",
    "https://www.example2.com",
    "https://www.example3.com",
    "https://www.example4.com"
  ]
}
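
Before handing the file to a browser automation script, it can help to confirm that it really has the shape shown above. The following is a minimal sketch, assuming the file is an object with a single "url" key holding a list of strings; the helper name `load_urls` is hypothetical and not part of the original article.

```python
import json


def load_urls(json_text):
    """Return the list stored under "url"; raise ValueError if the shape is wrong."""
    data = json.loads(json_text)
    # Assumption: the top level is an object like {"url": ["...", ...]}
    urls = data.get("url") if isinstance(data, dict) else None
    if not isinstance(urls, list) or not all(isinstance(u, str) for u in urls):
        raise ValueError('expected an object of the form {"url": ["...", ...]}')
    return urls


sample = '{"url": ["https://www.example1.com", "https://www.example2.com"]}'
print(load_urls(sample))
```

Failing early with a clear message is usually easier to debug than an error raised halfway through a browser session.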

import json
import time
from urllib.parse import quote_plus

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service

# 1. Load the JSON file
json_file_path = 'data.json'
with open(json_file_path, 'r', encoding='utf-8') as file:
    data = json.load(file)

# 2. Configure Firefox options (headless mode, etc.)
options = Options()
# The browser window is shown by default. To run headless, uncomment:
# options.add_argument("-headless")

# 3. Start Firefox through Selenium, pointing at the GeckoDriver binary
geckodriver_path = "/path/to/geckodriver"  # Set this to your GeckoDriver path
service = Service(geckodriver_path)

driver = webdriver.Firefox(service=service, options=options)

try:
    # 4. Search DuckDuckGo for each entry stored under the "url" key
    for url in data["url"]:
        # Percent-encode the query so characters such as ':' and '/' are safe
        search_url = f"https://duckduckgo.com/?q={quote_plus(url)}"

        driver.get(search_url)
        time.sleep(2)  # Wait for the page to load

        # 5. Parse the page content with BeautifulSoup
        soup = BeautifulSoup(driver.page_source, 'html.parser')

        # For example, collect the search result links.
        # 'result__a' is the class used in the original article;
        # DuckDuckGo's current markup may differ.
        results = soup.find_all('a', {'class': 'result__a'})
        for link in results:
            print(link.get_text(strip=True), link.get('href'))
finally:
    # 6. Quit the browser, even if a search fails
    driver.quit()
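
The search URL in the script is assembled from a query value, and values such as full URLs contain characters (`:`, `/`, `&`, `#`) that have a special meaning inside a query string. As an illustration of what a percent-encoded search URL looks like, here is a minimal sketch using only the standard library. The helper name `build_search_url` is hypothetical, and the `https://duckduckgo.com/?q=` form is taken from the script above.

```python
from urllib.parse import urlencode


def build_search_url(query):
    """Return a DuckDuckGo search URL with the query percent-encoded."""
    # urlencode escapes reserved characters and turns spaces into '+'
    return "https://duckduckgo.com/?" + urlencode({"q": query})


print(build_search_url("https://www.example1.com"))
print(build_search_url("a b&c"))
```

Without encoding, a `&` inside the query would be read as the start of a new parameter and the rest of the search term would be dropped.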
