Trying the Qiita API from Python

Last updated at 2019-01-07. Posted at 2018-12-31.

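These are a Python beginner's notes on calling the Qiita API as a way to practice Python.
The goal is to fetch the view, like, and stock counts of my own articles in one go. Qiita has no page that shows all of these together, so a tool like this is handy.

Qiita API v2 specification

Reference links

Others have done similar things, and I used their work as a reference.

Finished result

The finished tool looks like this.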

$ QIITA_TOKEN=hogehoge
$ qiitacheck --help
usage: qiitacheck.py [-h] [-o {text,csv,json}] [-f FILENAME]
                     [--sort-by {views,likes,stocks}] [--reverse]

Qiitaのビュー数、いいね数、ストック数を取得します。

optional arguments:
  -h, --help            show this help message and exit
  -o {text,csv,json}, --output {text,csv,json}
                        出力形式を指定します
  -f FILENAME, --filename FILENAME
                        出力先のファイル名を指定します
  --sort-by {views,likes,stocks}
                        結果を指定のキーでソートします
  --reverse             ソートを降順にします

環境変数QIITA_TOKENにアクセストークンをセットしてから実行してください。
$ qiitacheck --sort-by likes --reverse
+---------------------------------------------------------------------------------------+-------+-------+--------+----------------------+
| Title                                                                                 | Views | Likes | Stocks | Id                   |
+---------------------------------------------------------------------------------------+-------+-------+--------+----------------------+
| ArduinoをBluemixに接続するチュートリアルのTips                                        |  1822 |    18 |     18 | 5f9f475a8051ed9f0f52 |
| Windows 7でlocalhost宛のパケットをキャプチャする                                      | 16136 |    16 |     10 | 4e9b0c855c620a53e6fa |
(omitted)
| MacからターミナルでAIXに接続する                                                      |   959 |     0 |      4 | 050371e525f5aa6f0592 |
| JavaScriptで日付をJDBCタイムスタンプエスケープ形式に変換する                          |   977 |     0 |      1 | dc9af6532759ed633ad0 |
+---------------------------------------------------------------------------------------+-------+-------+--------+----------------------+
$

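The access token is passed through an environment variable. -o csv produces CSV output and -o json produces JSON output. To write the result to a file, specify -f <filename>. Note that the tool makes two API calls per article, so running it repeatedly will hit the Qiita API rate limit (1000 requests/hour when authenticated, 60 requests/hour when not) and fail with an error.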
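
To get a feel for how quickly the limit is reached, the number of requests per run can be estimated roughly as below. This is only a sketch: estimate_requests is a hypothetical helper that is not part of the tool, and it assumes the default page size of 20 plus exactly two extra requests per article, as described above.

```python
import math


def estimate_requests(article_count, per_page=20):
    """Estimate the API requests made by one run (hypothetical helper)."""
    # One request per page of the article list (at least one, even if empty).
    list_requests = max(1, math.ceil(article_count / per_page))
    # Two extra requests per article: one for views, one for stockers.
    return list_requests + 2 * article_count


requests_per_run = estimate_requests(62)
runs_per_hour = 1000 // requests_per_run
print(requests_per_run, runs_per_hour)
```

For 62 articles this gives 128 requests per run, so roughly 7 runs fit within the authenticated limit of 1000 requests per hour.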

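Preparation

Issue a Qiita API access token from the link below.

Checking the API with curl

First, call the API with curl to see what it returns.

Like count

The like count is available from the API below, which lists the authenticated user's articles. It returns a list of articles, so read the likes_count field of each one.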

GET /api/v2/authenticated_user/items

Command

QIITA_TOKEN=hogehoge
curl -s -H "Authorization: Bearer ${QIITA_TOKEN}" https://qiita.com/api/v2/authenticated_user/items | jq -r '.[] | .title + ": " + ( .likes_count | tostring )'

Example run

$ curl -s -H "Authorization: Bearer ${QIITA_TOKEN}" https://qiita.com/api/v2/authenticated_user/items | jq -r '.[] | .title + ": " + ( .likes_count | tostring )'
hostPathとlocalのPersistentVolumeの違い: 0
LibertyコンテナのJVM引数にPod名を渡す: 0
VagrantでkubeadmでKubernetesを起動する: 0
Certified Kubernetes Application Developer (CKAD) 受験ログ: 2
(omitted)
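
The same extraction as the jq filter can be sketched in Python. The sample body below is hand-written to resemble the API response and contains only the two fields used here; format_likes is a hypothetical helper name.

```python
import json


def format_likes(items):
    """Return one 'title: likes' line per article, like the jq filter."""
    return ['{}: {}'.format(item['title'], item['likes_count'])
            for item in items]


# Hand-written sample shaped like the API response (only the fields used here).
sample_body = json.dumps([
    {'title': 'First article', 'likes_count': 0},
    {'title': 'Second article', 'likes_count': 2},
])

for line in format_likes(json.loads(sample_body)):
    print(line)
```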

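Note that this API limits how many articles are returned per page. Parameters such as ?page=1&per_page=20 can be specified; the page size defaults to 20 and is capped at 100. The HTTP response headers include a link to the next page, so you have to paginate to fetch every page.

To look at only the headers with curl, curl -D - -s -o /dev/null is convenient.

Command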

curl -D - -s -o /dev/null -H "Authorization: Bearer ${QIITA_TOKEN}" https://qiita.com/api/v2/authenticated_user/items

The link header contains the link to the next page. The last page has no rel="next" link.

Example run

$ curl -D - -s -o /dev/null -H "Authorization: Bearer ${QIITA_TOKEN}" https://qiita.com/api/v2/authenticated_user/items
HTTP/2 200
date: Fri, 21 Dec 2018 08:41:09 GMT
content-type: application/json; charset=utf-8
(omitted)
link: <https://qiita.com/api/v2/authenticated_user/items?page=1>; rel="first", <https://qiita.com/api/v2/authenticated_user/items?page=2>; rel="next", <https://qiita.com/api/v2/authenticated_user/items?page=4>; rel="last"
total-count: 62
(omitted)
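
As an illustration of how such a link header can be followed, here is a minimal standard-library sketch. parse_link_header and iter_pages are hypothetical names, and fetch is an assumed callable that returns a page body together with the raw link header value; it is not the code used in the tool.

```python
import re

# Matches one entry of the form: <url>; rel="name"
LINK_PATTERN = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def parse_link_header(value):
    """Return a dict mapping each rel name to its URL (empty if no header)."""
    if not value:
        return {}
    return {rel: url for url, rel in LINK_PATTERN.findall(value)}


def iter_pages(first_url, fetch):
    """Yield each page body, following rel="next" until it disappears.

    `fetch` is a hypothetical callable: url -> (body, link_header_value).
    """
    url = first_url
    while url:
        body, link = fetch(url)
        yield body
        url = parse_link_header(link).get('next')


header = (
    '<https://qiita.com/api/v2/authenticated_user/items?page=1>; rel="first", '
    '<https://qiita.com/api/v2/authenticated_user/items?page=2>; rel="next", '
    '<https://qiita.com/api/v2/authenticated_user/items?page=4>; rel="last"'
)
print(parse_link_header(header)['next'])
```

With the header from the example run, this prints the page=2 URL. When the requests library is used, response.links provides a parsed view of the same header.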

The pagination handling in Python follows the answer linked below (in fact, it is copied almost verbatim).

How to get data from all pages in Github API with Python?

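View count

It looks as though the view count should be available from the page_views_count field of each article in the authenticated user's article list, but for some reason that field comes back as null, so it cannot be read there.

Command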

curl -s -H "Authorization: Bearer ${QIITA_TOKEN}" https://qiita.com/api/v2/authenticated_user/items | jq -r '.[] | .title + ": " + ( .page_views_count | tostring )'

Example run

$ curl -s -H "Authorization: Bearer ${QIITA_TOKEN}" https://qiita.com/api/v2/authenticated_user/items | jq -r '.[] | .title + ": " + ( .page_views_count | tostring )'
hostPathとlocalのPersistentVolumeの違い: null
LibertyコンテナのJVM引数にPod名を渡す: null
VagrantでkubeadmでKubernetesを起動する: null
Certified Kubernetes Application Developer (CKAD) 受験ログ: null
(omitted)

So the value has to be fetched from the single-article API instead.

GET /api/v2/items/:item_id

The page_views_count field of the single-article API does contain a value.

Command

ITEM_ID=fugafuga
curl -s -H "Authorization: Bearer ${QIITA_TOKEN}" https://qiita.com/api/v2/items/${ITEM_ID} | jq -r '.page_views_count'

Example run

$ curl -s -H "Authorization: Bearer ${QIITA_TOKEN}" https://qiita.com/api/v2/items/${ITEM_ID} | jq -r '.page_views_count'
384
$
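
The same lookup can be sketched in Python with only the standard library. The opener parameter is an assumption added so the function can be exercised offline; fake_opener stands in for the network and returns a canned body, so the values shown are illustrative only.

```python
import io
import json
import urllib.request

API_BASE = 'https://qiita.com/api/v2'


def get_view_count(item_id, token, opener=urllib.request.urlopen):
    """Return page_views_count for one article.

    `opener` is injectable (an assumption for offline use); it defaults to
    urllib.request.urlopen, which performs the real HTTP request.
    """
    request = urllib.request.Request(
        '{}/items/{}'.format(API_BASE, item_id),
        headers={'Authorization': 'Bearer {}'.format(token)})
    with opener(request) as response:
        return json.load(response)['page_views_count']


def fake_opener(request):
    # Stand-in for the network: returns a canned response body.
    return io.BytesIO(b'{"page_views_count": 384}')


print(get_view_count('fugafuga', 'hogehoge', opener=fake_opener))
```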

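Stock count

The stock count is not available from either the authenticated user's article list API or the single-article API; it has to be derived from the API that lists the users who stocked an article.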

GET /api/v2/items/:item_id/stockers

This API returns a list of users; to count the entries of a list in jq, use length.

Command

curl -s -H "Authorization: Bearer ${QIITA_TOKEN}" https://qiita.com/api/v2/items/${ITEM_ID}/stockers | jq length

Example run

$ curl -s -H "Authorization: Bearer ${QIITA_TOKEN}" https://qiita.com/api/v2/items/${ITEM_ID}/stockers | jq length
4
$
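
One caveat worth checking: if this endpoint is paginated like the article list (20 entries per page by default), counting a single response would undercount articles with more stockers than one page holds. The sketch below prefers a total-count header when present and falls back to the list length. It assumes the stockers endpoint sends the same total-count header seen on the article list, which this article does not verify; count_stockers is a hypothetical helper.

```python
def count_stockers(headers, users):
    """Prefer the total-count header; fall back to the length of this page.

    Assumption: the stockers endpoint sends total-count like the items list.
    """
    # Header names are case-insensitive, so normalise before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    total = lowered.get('total-count')
    if total is not None:
        return int(total)
    return len(users)


# 20 users on the first page, but the header reports 25 in total.
print(count_stockers({'Total-Count': '25'}, [{'id': 'u'}] * 20))
# No header available: count what was returned.
print(count_stockers({}, [{'id': 'a'}, {'id': 'b'}]))
```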

Python

The script is as follows. →GitHub

qiitacheck.py

#!/usr/bin/env python3

import argparse
import csv
import json
import logging
import os
import sys

import prettytable
import requests


formatter = '%(asctime)s %(name)-12s %(levelname)-8s %(message)s'
logging.basicConfig(level=logging.WARNING, format=formatter)
logger = logging.getLogger(__name__)


def get_next_url(response):
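    """Extract the URL of the next page from the link header.

    When there is a next page, the header contains an entry with 'rel="next"'.
    Returns None when there is no such entry.

    link: <https://qiita.com/api/v2/authenticated_user/items?page=1>;
    rel="first", <https://qiita.com/api/v2/authenticated_user/items?page=2>;
    rel="next", <https://qiita.com/api/v2/authenticated_user/items?page=4>;
    rel="last"

    :param response:
    :return: URL of the next page, or None
    """
    # Use get() so that a missing link header yields None instead of KeyError
    link = response.headers.get('link')
    if link is None:
        return None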

    links = link.split(',')

    for link in links:

        if 'rel="next"' in link:
            return link[link.find('<') + 1:link.find('>')]
    return None


def get_items(token):
    """Fetch all articles by following the pagination, then add the stock and
    view counts, which the list itself does not provide.

    :param token:
    :return: list of articles
    """

    url = 'https://qiita.com/api/v2/authenticated_user/items'
    headers = {'Authorization': 'Bearer {}'.format(token)}

    items = []
    while True:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        items.extend(json.loads(response.text))
        logger.info('GET {}'.format(url))
        # Check whether there is a next URL
        url = get_next_url(response)
        if url is None:
            break

    # Fetch the view and stock counts for each article and add them
    # The list API has a page_views_count field too, but it comes back as null
    for item in items:

        # View count
        url = 'https://qiita.com/api/v2/items/{}'.format(item['id'])
        logger.info('GET {}'.format(url))
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        item['page_views_count'] = json.loads(response.text)['page_views_count']

        # Stock count
        url = 'https://qiita.com/api/v2/items/{}/stockers'.format(item['id'])
        logger.info('GET {}'.format(url))
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        users = json.loads(response.text)
        for user in users:
            logger.info({
                'id': user['id'],
                'name': user['name']
                })
        item['stocks_count'] = len(users)

    return items


def sort_items(items, sort_by, reverse):
    """Sort the list in place.

    :param items:
    :param sort_by:
    :param reverse:
    :return:
    """

    if sort_by == 'views':
        if reverse:
            items.sort(key=lambda x: -x['page_views_count'])
        else:
            items.sort(key=lambda x: x['page_views_count'])
    elif sort_by == 'likes':
        if reverse:
            items.sort(key=lambda x: -x['likes_count'])
        else:
            items.sort(key=lambda x: x['likes_count'])
    elif sort_by == 'stocks':
        if reverse:
            items.sort(key=lambda x: -x['stocks_count'])
        else:
            items.sort(key=lambda x: x['stocks_count'])


def output_text(items, filepath):
    """Format the items as a text table and print it to standard output.
    If a file name is given, write to that file instead.

    :param items:
    :param filepath:
    :return:
    """

    table = prettytable.PrettyTable()
    table.field_names = ['Title', 'Views', 'Likes', 'Stocks', 'Id']
    table.align['Title'] = 'l'
    table.align['Views'] = 'r'
    table.align['Likes'] = 'r'
    table.align['Stocks'] = 'r'
    table.align['Id'] = 'l'
    for item in items:
        table.add_row([item['title'],
                       item['page_views_count'],
                       item['likes_count'],
                       item['stocks_count'],
                       item['id']])

    if filepath:
        with open(filepath, 'w') as text_file:
            text_file.write(table.get_string())
    else:
        print(table)


def output_csv(items, filepath):
    """Format the items as CSV and print it to standard output.
    If a file name is given, write to that file instead.

    :param items:
    :param filepath:
    :return:
    """

    def write_rows(writer, items):
        for item in items:
            writer.writerow({
                'Title': item['title'],
                'Views': item['page_views_count'],
                'Likes': item['likes_count'],
                'Stocks': item['stocks_count'],
                'Id': item['id']
            })

    fieldnames = ['Title', 'Views', 'Likes', 'Stocks', 'Id']
    if filepath:
        # newline='' lets the csv module control line endings itself
        with open(filepath, 'w', newline='') as csv_file:
            writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
            writer.writeheader()
            write_rows(writer, items)
    else:
        writer = csv.DictWriter(sys.stdout, fieldnames=fieldnames)
        writer.writeheader()
        write_rows(writer, items)


def output_json(items, filepath):
    """Format the items as JSON and print it to standard output.
    If a file name is given, write to that file instead.

    :param items:
    :param filepath:
    :return:
    """

    my_list = []
    for item in items:
        my_list.append({
            'Title': item['title'],
            'Views': item['page_views_count'],
            'Likes': item['likes_count'],
            'Stocks': item['stocks_count'],
            'Id': item['id']
        })

    if filepath:
        with open(filepath, 'w') as json_file:
            json.dump(my_list, json_file, ensure_ascii=False, indent=4)
    else:
        print(json.dumps(my_list, ensure_ascii=False, indent=4))


def main():

    # Parse the command-line arguments
    parser = argparse.ArgumentParser(description='Qiitaのビュー数、いいね数、ストック数を取得します。',
                                     epilog='環境変数QIITA_TOKENにアクセストークンをセットしてから実行してください。')
    parser.add_argument('-o', '--output',
                        default='text',
                        action='store',
                        type=str,
                        choices=['text', 'csv', 'json'],
                        help='出力形式を指定します')
    parser.add_argument('-f', '--filename',
                        action='store',
                        type=str,
                        help='出力先のファイル名を指定します')
    parser.add_argument('--sort-by',
                        action='store',
                        type=str,
                        choices=['views', 'likes', 'stocks'],
                        help='結果を指定のキーでソートします')
    parser.add_argument('--reverse',
                        action='store_true',
                        help='ソートを降順にします')
    args = parser.parse_args()

    # Fetch the data from the API
    token = os.environ['QIITA_TOKEN']
    items = get_items(token)
    # items = [
    #     {'title': 'aaa',
    #      'page_views_count': 11,
    #      'likes_count': 22,
    #      'stocks_count': 33,
    #      'id': 'hogehoge'},
    #     {'title': 'bbb',
    #      'page_views_count': 44,
    #      'likes_count': 55,
    #      'stocks_count': 66,
    #      'id': 'fugafuga'}
    # ]

    # Sort the list
    sort_items(items, args.sort_by, args.reverse)

    # Decide the output file path
    if args.filename:
        # When running in Docker, write the file under /tmp
        try:
            os.environ['IS_DOCKER']
            # If a full path was given, keep only the file name
            filename = os.path.basename(args.filename)
            filepath = os.path.join('/tmp', filename)
        except KeyError:
            filepath = args.filename
    else:
        filepath = None

    # Output the result
    if args.output == 'csv':
        output_csv(items, filepath)
    elif args.output == 'json':
        output_json(items, filepath)
    else:
        output_text(items, filepath)


if __name__ == '__main__':
    main()

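Text output is formatted with the prettytable module
Command-line arguments are handled with the argparse module
All articles are fetched by paginating, even when there are more than 20
Results can be sorted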
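
As a side note on the sorting, the three nearly identical branches could be expressed more compactly with a lookup table and the reverse argument of sorted. This is an alternative sketch, not the code in the script: SORT_FIELDS and sorted_items are hypothetical names, and unlike sort_items it returns a new list instead of sorting in place.

```python
SORT_FIELDS = {
    'views': 'page_views_count',
    'likes': 'likes_count',
    'stocks': 'stocks_count',
}


def sorted_items(items, sort_by=None, reverse=False):
    """Return a new sorted list; keep the order when sort_by is None."""
    if sort_by is None:
        return list(items)
    field = SORT_FIELDS[sort_by]
    return sorted(items, key=lambda item: item[field], reverse=reverse)


# Dummy data in the same shape as the items built by get_items().
sample = [
    {'title': 'aaa', 'page_views_count': 11, 'likes_count': 22, 'stocks_count': 33},
    {'title': 'bbb', 'page_views_count': 44, 'likes_count': 55, 'stocks_count': 66},
]
print([item['title'] for item in sorted_items(sample, 'likes', reverse=True)])
```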

Running it requires the requests and prettytable modules.

pip3 install requests prettytable
export QIITA_TOKEN=hogehoge
./qiitacheck.py --help

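Running with Docker

To avoid worrying about installing pip packages and to make the tool easy to use anywhere, package it as a Docker image and use it like a CLI.

Building the image

Create requirements.txt and a Dockerfile, following the description of the python image on Docker Hub.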

requirements.txt

requests
prettytable

Dockerfile

FROM python:3-alpine

WORKDIR /usr/src/app

COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

COPY qiitacheck.py ./

ENV IS_DOCKER=TRUE
ENTRYPOINT [ "python", "./qiitacheck.py" ]

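To use it like a CLI, the python command is specified with ENTRYPOINT rather than CMD
When running in Docker, the output file path has to be changed to the mounted directory, so the IS_DOCKER environment variable is set to let the script detect that it is running in Docker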
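
The path handling described here can be isolated into a small function, which makes it easy to check without a container. This is a sketch of the same logic as in main(); resolve_output_path and its environ and mount_dir parameters are hypothetical and exist only for illustration.

```python
import os


def resolve_output_path(filename, environ=os.environ, mount_dir='/tmp'):
    """Return where to write the output, or None for standard output."""
    if not filename:
        return None
    if 'IS_DOCKER' in environ:
        # Inside the container only the mounted directory is shared with the
        # host, so drop any directory part and write under the mount point.
        return os.path.join(mount_dir, os.path.basename(filename))
    return filename


print(resolve_output_path('/home/user/test.csv', environ={'IS_DOCKER': 'TRUE'}))
print(resolve_output_path('out/test.csv', environ={}))
```

On a POSIX system the first call resolves to /tmp/test.csv, while the second returns the path unchanged.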

Build the image and push it.

docker build -t sotoiwa540/qiita-checker:1.0 .
docker push sotoiwa540/qiita-checker:1.0

Running

Create an alias.

alias qiitacheck='docker run --rm -it -e QIITA_TOKEN=${QIITA_TOKEN} -v ${PWD}:/tmp sotoiwa540/qiita-checker:1.0'

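The token environment variable is also passed to the container as an environment variable
The current directory is mounted so that file output works

Export the token as an environment variable.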

export QIITA_TOKEN=hogehoge

Run it. Below is an example with CSV output.

Example run

$ qiitacheck -o csv -f test.csv --sort-by likes --reverse
$ cat test.csv
Title,Views,Likes,Stocks,Id
ArduinoをBluemixに接続するチュートリアルのTips,1822,18,18,5f9f475a8051ed9f0f52
Windows 7でlocalhost宛のパケットをキャプチャする,16136,16,10,4e9b0c855c620a53e6fa
SwiftでHTTPリクエストを再帰的に複数並列で行い、全てのリクエストが完了してから次の処理,2445,14,17,8dc15a224d2cf6c6be91
Certified Kubernetes Administrator (CKA) 受験ログ,386,10,4,3509ca3d18d1bed00dee
(omitted)
