Introduction to Scrapy (4)


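Introduction

Introduction to Scrapy (1)
Introduction to Scrapy (2)
Introduction to Scrapy (3)

In the previous article, we used Scrapy to download files. This time, we look at how to pass arguments to a Spider from the command line.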

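Creating the Spider

This time we create a Spider that fetches the contents of a category page of the Qiita Advent Calendar. The Spider receives the target category as a command-line argument, builds the target URL dynamically, and downloads the page.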

qiita_spider.py
# -*- coding:utf-8 -*-


import scrapy


class QiitaSpider(scrapy.Spider):
    name = 'qiita_spider'

    def start_requests(self):

        # Base URL to crawl
        url = 'http://qiita.com/advent-calendar/2016/categories/'

        # Arguments passed on the command line are available as attributes of the Spider by default.
        # In this case, the value of the argument can be read from self.categories.
        categories = getattr(self, 'categories', None)
        if categories:
            url = '{0}{1}'.format(url, categories)

        yield scrapy.Request(url, self.parse)

    def parse(self, response):
        for title in response.css('.adventCalendarList_calendarTitle > a::text').extract():
            yield {'title': title}
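
To illustrate why `getattr(self, 'categories', None)` works, here is a minimal sketch that does not depend on Scrapy. `MiniSpider` and `build_url` are hypothetical names used only for this illustration. The class merely mimics the behavior described in the comments above, where arguments end up as attributes of the Spider, and it should not be read as Scrapy's actual implementation.

```python
BASE_URL = 'http://qiita.com/advent-calendar/2016/categories/'


class MiniSpider(object):
    """Hypothetical stand-in for a Spider: stores keyword arguments as attributes."""

    def __init__(self, **kwargs):
        # Each keyword argument becomes an instance attribute,
        # so MiniSpider(categories='x').categories == 'x'.
        self.__dict__.update(kwargs)

    def build_url(self):
        # Fall back to the base URL when no category was passed.
        category = getattr(self, 'categories', None)
        if category:
            return '{0}{1}'.format(BASE_URL, category)
        return BASE_URL


if __name__ == '__main__':
    print(MiniSpider(categories='programming_languages').build_url())
    print(MiniSpider().build_url())
```

Using `getattr` with a default of `None` keeps the Spider usable even when the argument is omitted, in which case the base URL is requested as is.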

How to pass arguments

Scrapy's command-line tool (for example, scrapy crawl) accepts Spider arguments through the -a option.

-a categories=programming_languages

To pass multiple arguments, repeat the -a option as follows.

-a offset=0 -a limit=100
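
Values passed with -a reach the Spider as strings, so numeric arguments such as offset and limit need to be converted before use. The following is a small sketch of that idea in plain Python. `parse_spider_args` and `paging_from_args` are hypothetical helpers written for this illustration, not part of Scrapy, and the default values are assumptions.

```python
def parse_spider_args(pairs):
    """Turn ['offset=0', 'limit=100'] into {'offset': '0', 'limit': '100'}."""
    result = {}
    for pair in pairs:
        # Split on the first '=' only, so a value may itself contain '='.
        name, value = pair.split('=', 1)
        result[name] = value
    return result


def paging_from_args(args, default_offset=0, default_limit=100):
    """Convert the string values to integers, using defaults when missing."""
    offset = int(args.get('offset', default_offset))
    limit = int(args.get('limit', default_limit))
    return offset, limit


if __name__ == '__main__':
    args = parse_spider_args(['offset=0', 'limit=100'])
    print(args)
    print(paging_from_args(args))
```

Inside a real Spider, the same conversion would apply to the attribute values, for example `int(getattr(self, 'limit', 100))`.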

Running the command

scrapy crawl qiita_spider -a categories=programming_languages -o categories.json

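When you run the command, the log is printed to the console. As explained last time, the log shows the URL being fetched, the response status, the byte counts, a summary of the run, and so on.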

2016-12-13 00:53:13 [scrapy] INFO: Scrapy 1.2.0 started (bot: scrapybot)
2016-12-13 00:53:13 [scrapy] INFO: Overridden settings: {'FEED_URI': 'categories.json', 'SPIDER_MODULES': ['crawler.main.spiders'], 'COOKIES_ENABLED': False, 'TELNETCONSOLE_ENABLED': False, 'FEED_FORMAT': 'json', 'DOWNLOAD_DELAY': 1}
2016-12-13 00:53:13 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.feedexport.FeedExporter',
 'scrapy.extensions.logstats.LogStats',
 'scrapy.extensions.corestats.CoreStats']
2016-12-13 00:53:13 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2016-12-13 00:53:13 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2016-12-13 00:53:13 [scrapy] INFO: Enabled item pipelines:
[]
2016-12-13 00:53:13 [scrapy] INFO: Spider opened
2016-12-13 00:53:13 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-12-13 00:53:15 [scrapy] DEBUG: Crawled (200) <GET http://qiita.com/advent-calendar/2016/categories/programming_languages> (referer: None)
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'BEFOOL'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Crystal'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'C\u8a00\u8a9e'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Dart'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Delphi'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'D\u8a00\u8a9e\u304f\u3093'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Elixir (\u305d\u306e2)\u3068Phoenix'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Elm'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Erlang'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Esolang(\u96e3\u89e3\u30d7\u30ed\u30b0\u30e9\u30df\u30f3\u30b0\u8a00\u8a9e)'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'F#'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Faust(\u591a\u5206\u3072\u3068\u308a)'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Formal Method'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'G*Advent Calendar(Groovy,Grails,Gradle,Spock...)'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'GPU \u3067\u6696\u3092\u53d6\u308a\u305f\u3044\u4eba\u306e\u305f\u3081\u306e GLSL'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Haskell Performance'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Hot Soup Processor'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'IchigoJam'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Java Puzzlers '}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Julia'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'LabVIEW'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Lint'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Mathematica'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Maya-Python'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'ML'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Nim'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'null\u5b89\u5168'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Opal'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Perl 5'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Perl 6'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'PHP\u3067\u4f55\u304b\u3092\u4f5c\u308b'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'PHP\u3067\u5b66\u3076\u30c7\u30b6\u30a4\u30f3\u30d1\u30bf\u30fc\u30f3'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'PlayFramework(Java)'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'PlayFramework(Scala)'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Processing'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Prolog'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Python\u3067\u30de\u30a4\u30af\u30e9\u98a8FPS AOS\u3092\u904a\u3073\u5012\u3059'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Rails'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'RubyMine'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Smalltalk'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Swift\u611b\u597d\u4f1a'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Theorem Prover'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'TypeScript'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Visual Basic'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'vvvv'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'XAML'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'C#'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'C# \u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb \u5168\u90e8\u4ffa'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Clojure'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'D\u8a00\u8a9e'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Elixir'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Erlang \u4e00\u4eba'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Go'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Go (\u305d\u306e2)'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Go (\u305d\u306e3)'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Haskell'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Java'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'JavaScript'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Kotlin'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Lisp'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'mruby'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Node.js'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'PHP'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Python'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Python'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Qt'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'R'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Ruby'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Rust'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Rust \u305d\u306e2'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Scala'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Shell Script'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Stan '}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Swift'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'Swift \u305d\u306e2'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'\u30ec\u30c8\u30ed Lisp'}
2016-12-13 00:53:15 [scrapy] DEBUG: Scraped from <200 http://qiita.com/advent-calendar/2016/categories/programming_languages>
{'title': u'\u4e00\u4ebaPHP\u7dcf\u5fa9\u7fd2'}
2016-12-13 00:53:15 [scrapy] INFO: Closing spider (finished)
2016-12-13 00:53:15 [scrapy] INFO: Stored json feed (77 items) in: categories.json
2016-12-13 00:53:15 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 260,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 8609,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2016, 12, 12, 15, 53, 15, 675600),
 'item_scraped_count': 77,
 'log_count/DEBUG': 78,
 'log_count/INFO': 8,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2016, 12, 12, 15, 53, 13, 509011)}
2016-12-13 00:53:15 [scrapy] INFO: Spider closed (finished)

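The JSON file created by the Spider is shown below. Using a command-line argument, we downloaded the target page, scraped the list of titles of calendars that are open for participants, and saved it.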

[
{"title": "BEFOOL"},
{"title": "Crystal"},
{"title": "C\u8a00\u8a9e"},
{"title": "Dart"},
{"title": "Delphi"},
{"title": "D\u8a00\u8a9e\u304f\u3093"},
{"title": "Elixir (\u305d\u306e2)\u3068Phoenix"},
{"title": "Elm"},
{"title": "Erlang"},
{"title": "Esolang(\u96e3\u89e3\u30d7\u30ed\u30b0\u30e9\u30df\u30f3\u30b0\u8a00\u8a9e)"},
{"title": "F#"},
{"title": "Faust(\u591a\u5206\u3072\u3068\u308a)"},
{"title": "Formal Method"},
{"title": "G*Advent Calendar(Groovy,Grails,Gradle,Spock...)"},
{"title": "GPU \u3067\u6696\u3092\u53d6\u308a\u305f\u3044\u4eba\u306e\u305f\u3081\u306e GLSL"},
{"title": "Haskell Performance"},
{"title": "Hot Soup Processor"},
{"title": "IchigoJam"},
{"title": "Java Puzzlers "},
{"title": "Julia"},
{"title": "LabVIEW"},
{"title": "Lint"},
{"title": "Mathematica"},
{"title": "Maya-Python"},
{"title": "ML"},
{"title": "Nim"},
{"title": "null\u5b89\u5168"},
{"title": "Opal"},
{"title": "Perl 5"},
{"title": "Perl 6"},
{"title": "PHP\u3067\u4f55\u304b\u3092\u4f5c\u308b"},
{"title": "PHP\u3067\u5b66\u3076\u30c7\u30b6\u30a4\u30f3\u30d1\u30bf\u30fc\u30f3"},
{"title": "PlayFramework(Java)"},
{"title": "PlayFramework(Scala)"},
{"title": "Processing"},
{"title": "Prolog"},
{"title": "Python\u3067\u30de\u30a4\u30af\u30e9\u98a8FPS AOS\u3092\u904a\u3073\u5012\u3059"},
{"title": "Rails"},
{"title": "RubyMine"},
{"title": "Smalltalk"},
{"title": "Swift\u611b\u597d\u4f1a"},
{"title": "Theorem Prover"},
{"title": "TypeScript"},
{"title": "Visual Basic"},
{"title": "vvvv"},
{"title": "XAML"},
{"title": "C#"},
{"title": "C# \u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb \u5168\u90e8\u4ffa"},
{"title": "Clojure"},
{"title": "D\u8a00\u8a9e"},
{"title": "Elixir"},
{"title": "Erlang \u4e00\u4eba"},
{"title": "Go"},
{"title": "Go (\u305d\u306e2)"},
{"title": "Go (\u305d\u306e3)"},
{"title": "Haskell"},
{"title": "Java"},
{"title": "JavaScript"},
{"title": "Kotlin"},
{"title": "Lisp"},
{"title": "mruby"},
{"title": "Node.js"},
{"title": "PHP"},
{"title": "Python"},
{"title": "Python"},
{"title": "Qt"},
{"title": "R"},
{"title": "Ruby"},
{"title": "Rust"},
{"title": "Rust \u305d\u306e2"},
{"title": "Scala"},
{"title": "Shell Script"},
{"title": "Stan "},
{"title": "Swift"},
{"title": "Swift \u305d\u306e2"},
{"title": "\u30ec\u30c8\u30ed Lisp"},
{"title": "\u4e00\u4ebaPHP\u7dcf\u5fa9\u7fd2"}
]
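
Non-ASCII titles appear in the file as `\uXXXX` escapes, which is valid JSON and decodes back to the original text when loaded. As a quick check, the sketch below parses a two-item sample copied from the output above with Python's standard `json` module. `load_titles` is a hypothetical helper name, and the code assumes Python 3.

```python
import json


def load_titles(json_text):
    """Return the list of 'title' values from a JSON feed given as text."""
    items = json.loads(json_text)
    return [item['title'] for item in items]


if __name__ == '__main__':
    # Two entries copied from the feed; the raw string keeps the \u escapes intact.
    sample = r'[{"title": "C\u8a00\u8a9e"}, {"title": "Dart"}]'
    for title in load_titles(sample):
        print(title)
```

To read the real file instead of the sample, the text of categories.json can be passed to the same function.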

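Conclusion

This time we looked at how to pass command-line arguments to a Spider and use them. Passing data to a Spider from outside makes it possible to customize its behavior flexibly. We will continue to introduce useful Scrapy features in upcoming articles. Stay tuned!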