Collecting images for machine learning

Posted at 2018-11-21

hellock/icrawler: A multi-thread crawler framework with many builtin image crawlers provided.

I could have written my own crawler, but this time I used a library.

Installing pip

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py --user
export PATH="$HOME/Library/Python/2.7/bin:$PATH"
pip install icrawler --user

Verifying the install

$ pip -V
pip 18.1

$ pip install matplotlib

Settings in ~/.matplotlib/matplotlibrc

backend : TkAgg
plot.py
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 8]

plt.plot( x, y )
plt.show()
$ python plot.py
[Screenshot 2018-11-21 13.34.22.png]

Usage

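# coding:utf-8

from icrawler.builtin import GoogleImageCrawler

# Save downloaded images under ./images
crawler = GoogleImageCrawler(storage={"root_dir": "images"})
# Put the search term in keyword; at most 100 images are downloaded
crawler.crawl(keyword="", max_num=100)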

Running it starts the download, with log output like the following.

2018-11-21 13:43:26,450 - INFO - icrawler.crawler - start crawling...
2018-11-21 13:43:26,450 - INFO - icrawler.crawler - starting 1 feeder threads...
2018-11-21 13:43:26,451 - INFO - feeder - thread feeder-001 exit
2018-11-21 13:43:26,451 - INFO - icrawler.crawler - starting 1 parser threads...
2018-11-21 13:43:26,452 - INFO - icrawler.crawler - starting 1 downloader threads...
...
...
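
The log above shows one feeder thread, one parser thread, and one downloader thread. As a rough sketch, assuming the `feeder_threads`, `parser_threads`, and `downloader_threads` constructor arguments described in the icrawler documentation, the number of downloader threads could be raised as follows. The helper names `crawler_options` and `build_crawler` and the default of 4 are illustrative choices, not part of the original script.

```python
def crawler_options(root_dir, downloader_threads=4):
    """Return constructor keyword arguments for GoogleImageCrawler.

    The thread counts are illustrative values, not tuned settings.
    """
    return {
        "feeder_threads": 1,
        "parser_threads": 1,
        "downloader_threads": downloader_threads,
        "storage": {"root_dir": root_dir},
    }


def build_crawler(root_dir, downloader_threads=4):
    """Create a crawler with the options above.

    icrawler is imported lazily so that crawler_options() can be used
    (and inspected) without the library installed.
    """
    from icrawler.builtin import GoogleImageCrawler

    return GoogleImageCrawler(**crawler_options(root_dir, downloader_threads))
```

Calling `build_crawler("images").crawl(keyword=..., max_num=100)` would then behave like the script above, only with more downloader threads. Whether that actually speeds things up depends on the network and the remote site.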

Manual

Tweaked it a little

nogizaka.py
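# coding:utf-8

import os
import sys

from icrawler.builtin import GoogleImageCrawler

# Usage: python nogizaka.py <output_dir> <keyword>
argv = sys.argv

# Create the output directory if it does not exist yet
if not os.path.isdir(argv[1]):
    os.makedirs(argv[1])

# Download up to 1000 images for the keyword into the output directory
crawler = GoogleImageCrawler(storage={"root_dir": argv[1]})
crawler.crawl(keyword=argv[2], max_num=1000)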
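
Because the script reads `sys.argv` directly, running it without both arguments fails with an `IndexError`. A minimal sketch of the same script using the standard `argparse` module might look like this. The function names `parse_args` and `run` and the `--max-num` option are my own additions for illustration, not part of the original.

```python
import argparse


def parse_args(argv=None):
    """Parse <output_dir> <keyword> and an optional --max-num limit."""
    parser = argparse.ArgumentParser(description="Download images for a keyword.")
    parser.add_argument("output_dir", help="directory to store downloaded images")
    parser.add_argument("keyword", help="search keyword passed to the crawler")
    parser.add_argument("--max-num", type=int, default=1000,
                        help="maximum number of images to download")
    return parser.parse_args(argv)


def run(args):
    """Crawl with the parsed arguments (needs icrawler and network access)."""
    from icrawler.builtin import GoogleImageCrawler

    crawler = GoogleImageCrawler(storage={"root_dir": args.output_dir})
    crawler.crawl(keyword=args.keyword, max_num=args.max_num)
```

Ending the script with `run(parse_args())` keeps the same command line as before, and a missing argument now produces a usage message instead of a traceback.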

Collected Nogizaka images

$ python nogizaka.py image/nogizaka 乃木坂46
[Screenshot 2018-11-21 16.52.04.png]

Tried other keywords too

[Screenshot 2018-11-21 16.53.35.png]
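
After crawling several keywords into separate directories, it helps to check how many images actually arrived in each one. The following is a small sketch using only the standard library. The function name `count_images` and the extension list are assumptions of mine, so adjust them to whatever the crawler saved.

```python
import os

# Assumed set of image extensions; extend as needed.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".bmp")


def count_images(root_dir):
    """Return {directory relative to root_dir: number of image files}.

    Directories that contain no image files are left out.
    """
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        n = sum(1 for name in filenames
                if name.lower().endswith(IMAGE_EXTENSIONS))
        if n:
            counts[os.path.relpath(dirpath, root_dir)] = n
    return counts
```

With the directory layout used earlier, `count_images("image")` would return one entry per keyword directory such as `nogizaka`.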

Next time

It looks like the collected images could be used to train a deep learning model.
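
As one possible first step toward that, the files in a keyword directory could be split into training and validation lists. This is only a sketch under my own assumptions: the function name `split_dataset`, the 20% validation ratio, and the fixed seed are illustrative, and nothing here comes from the original article or from icrawler.

```python
import os
import random


def split_dataset(image_dir, val_ratio=0.2, seed=0):
    """Split the files in image_dir into (train_paths, val_paths).

    Files are only listed, not moved or copied.
    """
    paths = sorted(
        os.path.join(image_dir, name)
        for name in os.listdir(image_dir)
        if os.path.isfile(os.path.join(image_dir, name))
    )
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(paths)
    n_val = int(len(paths) * val_ratio)
    return paths[n_val:], paths[:n_val]
```

For example, `train, val = split_dataset("image/nogizaka")` gives two disjoint lists of file paths that a training script could read from.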
