Edited at

pipでWebスクレイピングライブラリScrapyのインストール(Python3)

More than 1 year has passed since last update.


環境

$ python -V 

Python 3.6.1

$ pip -V
pip 9.0.1


Scrapyとは?

これ


https://scrapy.org/



pipでScrapyをインストールする

$ pip install Scrapy 

Collecting scrapy
Using cached Scrapy-1.4.0-py2.py3-none-any.whl
Collecting PyDispatcher>=2.0.5 (from scrapy)
Using cached PyDispatcher-2.0.5.tar.gz
Collecting queuelib (from scrapy)
Using cached queuelib-1.4.2-py2.py3-none-any.whl
Collecting w3lib>=1.17.0 (from scrapy)
Downloading w3lib-1.17.0-py2.py3-none-any.whl
Collecting parsel>=1.1 (from scrapy)
Using cached parsel-1.2.0-py2.py3-none-any.whl
Collecting service-identity (from scrapy)
Using cached service_identity-17.0.0-py2.py3-none-any.whl
Collecting pyOpenSSL (from scrapy)
Using cached pyOpenSSL-17.1.0-py2.py3-none-any.whl
Collecting lxml (from scrapy)
Downloading lxml-3.8.0-cp36-cp36m-manylinux1_x86_64.whl (7.3MB)
100% |████████████████████████████████| 7.3MB 79kB/s
Collecting Twisted>=13.1.0 (from scrapy)
Could not find a version that satisfies the requirement Twisted>=13.1.0 (from scrapy) (from versions: )
No matching distribution found for Twisted>=13.1.0 (from scrapy)

インストールに失敗した。どうやらTwisted >=13.1.0 が必要とのことなのでTwistedをインストールする。

$ pip install Twisted

Collecting Twisted
Could not find a version that satisfies the requirement Twisted (from versions: )
No matching distribution found for Twisted

が、Twistedのインストールもダメ。仕方がないので、ソースからインストールする。


https://twistedmatrix.com/trac/wiki/Downloads


Twisted 17.5.0 をダウンロードしてインストール。

$ wget https://twistedmatrix.com/Releases/Twisted/17.5/Twisted-17.5.0.tar.bz2

$ tar -jxvf Twisted-17.5.0.tar.bz2
$ cd Twisted-17.5.0.tar.bz2
$ python setup.py install
...
Finished processing dependencies for Twisted==17.5.0

Twistedのインストール完了。再度pipでScrapyのインストールにトライする。

$ pip install Scrapy

...
Successfully installed PyDispatcher-2.0.5 Scrapy-1.4.0 pyasn1-0.2.3 pyasn1-modules-0.0.9 queuelib-1.4.2 service-identity-17.0.0

無事にインストールできた。

$ scrapy

Scrapy 1.4.0 - no active project

Usage:
scrapy <command> [options] [args]

Available commands:
bench Run quick benchmark test
fetch Fetch a URL using the Scrapy downloader
genspider Generate new spider using pre-defined templates
runspider Run a self-contained spider (without creating a project)
settings Get settings values
shell Interactive scraping console
startproject Create new project
version Print Scrapy version
view Open URL in browser, as seen by Scrapy

[ more ] More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command

この記事は自身のブログからの転載です。

http://tic40.hatenablog.com/entry/2017/07/12/220000