Python Scraping

Posted at 2020-03-14

Goal

  • Site
    • eBay.com
  • Search keyword
    • Dragon Ball
  • Search condition
    • Sold items
  • Data to collect
    • Item name
    • Condition
    • Price
    • Shipping cost
    • First page only
  • Output file
    • eBayScraping_(date).csv

Source code


from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import pandas as pd
import datetime


# Selenium 4 style: the chromedriver path is passed through a Service object
driver = webdriver.Chrome(service=Service("/Users/name/home/work/eBay/Python/chromedriver"))

# Search for "dragon ball", restricted to sold and completed listings
url = 'https://www.ebay.com/sch/i.html?_from=R40&_nkw=dragon+ball&_sacat=0&rt=nc&LH_Sold=1&LH_Complete=1'
driver.get(url)
# Every element lookup waits up to 10 seconds before giving up
driver.implicitly_wait(10)

names = []
statuses = []
prices = []
shippings = []

# The item container carries two classes, so match it with a CSS selector
items = driver.find_elements(By.CSS_SELECTOR, '.s-item__info.clearfix')

for item in items:
    # get Name
    name = item.find_element(By.CLASS_NAME, 's-item__title').text
    name = name.replace("NEW LISTING", "")
    names.append(name)

    # get Status (not every item has one)
    try:
        status = item.find_element(By.CLASS_NAME, 'SECONDARY_INFO').text
        statuses.append(status)
    except NoSuchElementException:
        statuses.append(" ")

    # get Price
    price = item.find_element(By.CLASS_NAME, 's-item__price').text
    price = price.replace("JPY ", "")
    prices.append(price)

    # get ShippingCost (not every item has one)
    try:
        shipping = item.find_element(By.CLASS_NAME, 's-item__logisticsCost').text
        shipping = shipping.replace("+JPY ", "").replace(" shipping", "").replace("Free International Shipping", "0")
        shippings.append(shipping)
    except NoSuchElementException:
        shippings.append(" ")

# Collect the four columns into one table
df = pd.DataFrame()
df['name'] = names
df['status'] = statuses
df['price'] = prices
df['shippingcost'] = shippings

# Output file name: eBayScraping_YYYYMMDD.csv
csv_date = datetime.datetime.today().strftime("%Y%m%d")
csv_file_name = "eBayScraping_" + csv_date + ".csv"

df.to_csv(csv_file_name, index=False)

driver.quit()

Output

(Image: Screen Shot 2020-03-14 at 14.49.17.png)

Problems

  • The following two values cannot be clearly distinguished
    • Free Shipping
    • Free International Shipping
  • Retrieving the following field takes a considerable amount of time
    • Shipping cost
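
One possible way to keep the two free-shipping values apart is to classify the raw label text before any string replacement. The following is only a minimal sketch: the function name `classify_shipping`, the category names, and the assumed label formats ("Free shipping", "Free International Shipping", "+JPY 1,234 shipping") are illustrative assumptions, not something confirmed against eBay's current markup.

```python
import re

# Assumed label formats (hypothetical, for illustration only):
#   "Free shipping", "Free International Shipping", "+JPY 1,234 shipping"
_AMOUNT = re.compile(r"\d[\d,]*(?:\.\d+)?")


def classify_shipping(label):
    """Return (kind, amount) for a shipping label.

    kind is "free", "free_international", "paid" or "unknown".
    amount is a float, or None when no amount can be read.
    """
    text = label.strip().lower()
    if not text:
        return ("unknown", None)
    if "free" in text:
        # Keep the two free variants as separate categories
        kind = "free_international" if "international" in text else "free"
        return (kind, 0.0)
    match = _AMOUNT.search(text)
    if match:
        # Drop thousands separators before converting
        return ("paid", float(match.group(0).replace(",", "")))
    return ("unknown", None)
```

With this approach the CSV could hold two columns, one for the category and one for the numeric amount, instead of a single mixed text column.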

Improvements and next steps

  • Revisit how the two shipping values are classified
  • Collect data from multiple pages
  • Extract frequently occurring words
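
The last two items could be approached roughly as follows. This is a sketch under assumptions: `build_page_urls` and `top_words` are hypothetical helper names, the `_pgn` query parameter is assumed to select the result page, and the stopword list is an arbitrary example. Each URL would be passed to `driver.get()` in turn, and the collected item names would be passed to `top_words`.

```python
import re
from collections import Counter
from urllib.parse import urlencode

BASE_URL = "https://www.ebay.com/sch/i.html"


def build_page_urls(keyword, pages):
    """Build sold-listing search URLs for result pages 1..pages."""
    urls = []
    for page in range(1, pages + 1):
        params = {
            "_nkw": keyword,
            "LH_Sold": 1,
            "LH_Complete": 1,
            "_pgn": page,  # assumption: page-number parameter
        }
        urls.append(BASE_URL + "?" + urlencode(params))
    return urls


def top_words(names, n=10, stopwords=("the", "a", "of", "and", "for")):
    """Count words across item names and return the n most common."""
    counts = Counter()
    for name in names:
        # Lowercase and split on anything that is not a letter or digit
        words = re.findall(r"[a-z0-9]+", name.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts.most_common(n)
```

For example, `build_page_urls("dragon ball", 3)` returns three URLs that differ only in `_pgn`, and `top_words(names, n=20)` returns a list of `(word, count)` pairs that can be written to a second CSV.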