
Python Scraping

Goal

  • Site
    • eBay.com
  • Search term
    • Dragon Ball
  • Search filter
    • Sold listings
  • Data to collect
    • Item name
    • Condition
    • Price
    • Shipping cost
    • First page of results only
  • Output
    • eBayScraping_(date).csv

Source Code

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
import pandas as pd
import datetime


driver = webdriver.Chrome(executable_path="/Users/name/home/work/eBay/Python/chromedriver")

url = ('https://www.ebay.com/sch/i.html?_from=R40&_nkw=dragon+ball&_sacat=0&rt=nc&LH_Sold=1&LH_Complete=1')
driver.get(url)
driver.implicitly_wait(10)

names = []
statuses = []
prices = []
shippings = []

items = driver.find_elements_by_class_name('s-item__info.clearfix')

for item in items:
    # get Name
    name = item.find_element_by_class_name('s-item__title').text
    name = name.replace("NEW LISTING", "")
    names.append(name)
    # get Status (condition); not every listing shows one
    try:
        status = item.find_element_by_class_name('SECONDARY_INFO').text
        statuses.append(status)
    except NoSuchElementException:
        statuses.append(" ")
    # get Price
    price = item.find_element_by_class_name('s-item__price').text
    price = price.replace("JPY ", "")
    prices.append(price)

    # get ShippingCost; missing on some listings
    try:
        shipping = item.find_element_by_class_name('s-item__logisticsCost').text
        shipping = shipping.replace("+JPY ", "").replace(" shipping", "").replace("Free International Shipping", "0")
        shippings.append(shipping)
    except NoSuchElementException:
        shippings.append(" ")

df = pd.DataFrame()
df['name'] = names
df['status'] = statuses
df['price'] = prices
df['shippingcost'] = shippings

csv_date = datetime.datetime.today().strftime("%Y%m%d")
csv_file_name = "eBayScraping_" + csv_date + ".csv"

df.to_csv(csv_file_name, index=False)

driver.quit()

Output

(Screenshot of the generated CSV: Screen Shot 2020-03-14 at 14.49.17.png)

Issues

  • The following two values cannot be clearly distinguished
    • Free Shipping
    • Free International Shipping
  • Retrieving the following field is quite slow
    • Shipping cost
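The ambiguity comes from the script above mapping "Free International Shipping" to "0", which then looks identical to an ordinary free-shipping listing. One way to keep both pieces of information is to parse the raw shipping text into a (cost, label) pair instead of overwriting it. This is a minimal sketch; the helper name and the exact label strings are assumptions based on the text seen in the scraped results, not an exhaustive list of what eBay may render.

```python
def parse_shipping(raw):
    """Split eBay's shipping text into (cost_text, label).

    Hypothetical helper: the label strings below cover only
    the variants observed in the results above.
    """
    text = raw.strip()
    if "Free International Shipping" in text:
        return "0", "free_international"
    if "Free" in text:  # e.g. "Free shipping"
        return "0", "free_domestic"
    # e.g. "+JPY 1,200 shipping" -> "1,200"
    cost = text.replace("+JPY ", "").replace(" shipping", "").strip()
    return cost, "paid"

print(parse_shipping("Free International Shipping"))  # ('0', 'free_international')
print(parse_shipping("+JPY 1,200 shipping"))          # ('1,200', 'paid')
```

On the slowness: the 10-second implicit wait fires in full for every listing that has no shipping element, so each `except` branch costs the whole timeout. Lowering the implicit wait (e.g. `driver.implicitly_wait(0)`) before the per-item lookups, once the page has loaded, should avoid paying that penalty repeatedly.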

Improvements and Next Steps

  • Rethink how the two free-shipping values are classified
  • Scrape multiple pages of results
  • Extract frequently occurring words
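For the frequent-word idea, the collected titles can be tokenized and tallied with `collections.Counter` from the standard library. A minimal sketch over a hypothetical sample of titles (in the script above, this would be the `names` list):

```python
from collections import Counter
import re

# Hypothetical sample of scraped titles.
names = [
    "Dragon Ball Z Son Goku Figure",
    "Dragon Ball Super Card Game Booster",
    "Dragon Ball Z Vegeta Figure",
]

# Lowercase each title, split on non-letter runs, and
# drop very short tokens like "z".
words = []
for title in names:
    words += [w for w in re.split(r"[^a-z]+", title.lower()) if len(w) > 2]

counts = Counter(words)
print(counts.most_common(3))  # [('dragon', 3), ('ball', 3), ('figure', 2)]
```

Since the search term itself ("dragon", "ball") will always dominate, filtering those tokens out first would make the remaining counts more informative.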