
Automatically Fetching Deep Learning Framework Trends from GitHub Repositories

Last updated at 2018-05-23, posted at 2018-01-18

I wanted to check the latest trends among deep learning frameworks, so I wrote a Python script that automatically fetches the current figures from their GitHub repositories.

Information collected

The script fetches the following for each repository.

Number of stars
Number of forks
Number of open issues

Target repositories

The script fetches trends for the following frameworks.

TensorFlow
Chainer
Caffe

Environment

Ubuntu 16.04 LTS
Python 3.6.0
jq command (sudo apt-get install jq)
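
Because the script shells out to both curl and jq, it can help to confirm that the two commands are on PATH before running it. The following is only a minimal sketch of such a check; the `missing_tools` helper is a hypothetical name and is not part of the original script.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


# The script in this article calls curl and pipes its output to jq.
missing = missing_tools(["curl", "jq"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("curl and jq are both available")
```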

Script

github_trend.py


import subprocess

def res_cmd(cmd):
    # Run a shell command and return its standard output as bytes.
    return subprocess.Popen(
        cmd, stdout=subprocess.PIPE,
        shell=True).communicate()[0]

# targets (jq filters applied to the repository JSON)
target_star = '.stargazers_count'
target_fork = '.forks'
target_issue = '.open_issues'  # this count also includes open pull requests

# repositories
tensorflow = '/tensorflow/tensorflow'
chainer = '/chainer/chainer'
caffe = '/BVLC/caffe'

def repo_crawler(repo):
    # Return (repo, stars, forks, issues) for a path such as '/chainer/chainer'.

    def crawler_cmd(repo, target):
        # Each call sends one API request and extracts a single field with jq.
        cmd = "curl --silent 'https://api.github.com/repos{0}' -H 'Accept: application/vnd.github.preview' | jq '{1}'".format(repo, target)
        return int(res_cmd(cmd))

    star = crawler_cmd(repo, target_star)
    fork = crawler_cmd(repo, target_fork)
    issue = crawler_cmd(repo, target_issue)

    return repo, star, fork, issue

header = ('Repository', 'Stars', 'Forks', 'Issues')
ret_tensorflow = repo_crawler(tensorflow)
ret_chainer = repo_crawler(chainer)
ret_caffe = repo_crawler(caffe)

print(header)
print(ret_tensorflow)
print(ret_chainer)
print(ret_caffe)
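
As a possible variation, and not the approach used in the script above, the three values can be read from a single response per repository using only the Python standard library, which avoids curl and jq and cuts the number of requests to one per repository. This is a sketch under assumptions: the names `fetch_repo` and `parse_repo`, the `opener` parameter, the `Accept` value, and the `User-Agent` string are all choices made for this illustration.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com/repos"


def parse_repo(body):
    """Extract (stars, forks, open issues) from a repository JSON document."""
    data = json.loads(body)
    return data["stargazers_count"], data["forks_count"], data["open_issues_count"]


def fetch_repo(repo, opener=urllib.request.urlopen):
    """Fetch one repository with a single request.

    `repo` uses the same form as in the script, e.g. '/chainer/chainer'.
    `opener` is injectable so the function can be exercised without network access.
    """
    request = urllib.request.Request(
        API_ROOT + repo,
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "github-trend-sketch",  # arbitrary label for this sketch
        },
    )
    with opener(request) as response:
        return (repo,) + parse_repo(response.read())
```

Calling `fetch_repo('/tensorflow/tensorflow')` would return a tuple in the same shape as `repo_crawler`, so the printing code at the end of the script could stay as it is.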

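Running it too often hits the GitHub API rate limit

The GitHub API allows only 60 unauthenticated requests per hour. Each run sends nine requests (three per repository), so the script fails with an error after about six runs within an hour.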
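
GitHub reports the remaining quota in the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers, so one way to avoid the error is to look at them before sending more requests. The sketch below only shows the arithmetic on already-received header values; `seconds_until_reset` is a hypothetical helper, the sample values are made up, and a plain dict is used here even though real header objects are usually case-insensitive.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return 0 if requests remain, otherwise the seconds left until the limit resets."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0
    reset_at = int(headers.get("X-RateLimit-Reset", "0"))  # UTC epoch seconds
    now = time.time() if now is None else now
    return max(0, int(reset_at - now))


# Made-up header values for illustration: the quota is used up and resets at t=1600.
exhausted = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1600"}
print(seconds_until_reset(exhausted, now=1000))  # prints 600
```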
