Using pixivpy as simply as possible

Posted at 2025-01-09

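1. Introduction

I looked at the HTML source of a pixiv user's page, hoping to extract the links and download the images.
For some reason, the links were nowhere to be found.
While digging around, I came across pixivpy, a library for Python.
I looked up how to use it, but it seemed complicated, so I tried writing the simplest code I could.

That is how this article came about. I am publishing code written in Python, a language I am not very good at, as a personal memo. Exception handling and other user-friendly parts are largely missing, so I recommend fixing it up before using it.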

2. Reference links

I referred to the following Qiita article. This article does not cover how to install pixivpy, how to log in, and so on.

3. Source

pixiv_dl_user.py

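import sys
import time

from pixivpy3 import AppPixivAPI

# Usage: python pixiv_dl_user.py [user ID] [refresh token]
args = sys.argv
user_id = int(args[1])
r_token = args[2]

# Authenticate with the refresh token
api = AppPixivAPI()
api.auth(refresh_token=r_token)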

def check_dl_page(lts_illust_pages, page):
    """Return True if the illustration ID was not recorded by the previous run."""
    return page not in lts_illust_pages

def dl_user_page(api, user_id, latest_illustrations):
    """Return the IDs of illustrations posted since the previous run."""
    # Load the illustration IDs recorded by the previous run
    lts_illust_pages = []
    with open(latest_illustrations) as f:
        for line in f:
            if line.strip():  # skip blank lines
                lts_illust_pages.append(int(line))
    # Check the latest 120 illustrations (up to four result pages)
    illust_pages = []
    for page in [30, 60, 90, 120]:
        if page == 30:
            json_result = api.user_illusts(user_id)
        else:
            next_qs = api.parse_qs(json_result.next_url)
            if next_qs is None:  # no further result page
                break
            json_result = api.user_illusts(**next_qs)
        for i in json_result.illusts:
            illust_pages.append(i.id)
    # Pick the IDs that still need downloading, and rewrite the log file
    should_dl_pages = []
    with open(latest_illustrations, "w") as fo:
        for page in illust_pages:
            if check_dl_page(lts_illust_pages, page):
                should_dl_pages.append(page)
            print(page, file=fo)
    return should_dl_pages

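def download_image(api, img):
    """Download one image URL, then wait before the next request."""
    print("# Download image: %s" % img)
    api.download(img)
    time.sleep(10)  # pause between downloads to limit the request rate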

def download_in_list(api, illust_ids):
    """Download the original-size images of every illustration in illust_ids."""
    for illust_id in illust_ids:
        json_result = api.illust_detail(illust_id)
        illust = json_result.illust
        if illust.meta_single_page:
            # Single-image post
            download_image(api, illust.meta_single_page['original_image_url'])
        else:
            # Multi-image post: one entry per image
            for s in illust.meta_pages:
                download_image(api, s.image_urls['original'])

latest_illustrations = "latest_illustrations.txt"
illust_ids = dl_user_page(api, user_id, latest_illustrations)
print("# Download page list:")
print(illust_ids)
time.sleep(10)

download_in_list(api, illust_ids)

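python pixiv_dl_user.py [user ID] [refresh token]

Run the script as shown above. Before the first run, create an empty file named "latest_illustrations.txt".
When executed, the script downloads the 120 most recent illustrations posted by the specified user. The IDs of those illustrations are recorded in latest_illustrations.txt, and from the second run onward only newly posted illustrations are downloaded.
Known limitations: works other than illustrations (novels, manga, etc.) are not supported, and if more than 120 illustrations have been posted since the previous run, the ones beyond the latest 120 are ignored.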
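The bookkeeping described above (record the IDs seen in one run, then download only unseen IDs in the next) can be sketched without the pixiv API. This is only an illustrative sketch, not the script's actual code: the helper names `load_seen_ids`, `find_new_ids`, and `save_ids` are hypothetical, and treating a missing log file as empty is an assumption that would remove the need to create the empty file by hand.

```python
from pathlib import Path


def load_seen_ids(path):
    """Read the IDs recorded by the previous run (hypothetical helper).

    A missing file is treated as "nothing seen yet", which is an
    assumption; the original script expects the file to exist.
    """
    log = Path(path)
    if not log.exists():
        return []
    return [int(line) for line in log.read_text().splitlines() if line.strip()]


def find_new_ids(fetched_ids, seen_ids):
    """Return the fetched IDs that were not seen before, keeping their order."""
    seen = set(seen_ids)  # set lookup instead of scanning a list each time
    return [i for i in fetched_ids if i not in seen]


def save_ids(path, ids):
    """Overwrite the log file with one ID per line."""
    Path(path).write_text("".join(f"{i}\n" for i in ids))
```

With these helpers, a run would call `load_seen_ids`, pass the freshly fetched IDs to `find_new_ids`, and finish with `save_ids`. Using a set keeps the membership test cheap, although with only 120 IDs the difference is negligible.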

4. Conclusion

As I wrote at the beginning, the code is deliberately very simple, so I recommend adapting it to your own preferences before using it.
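As one example of such an adaptation, the command-line arguments could be validated with the standard `argparse` module instead of being read straight from `sys.argv`. The sketch below is a suggestion rather than part of the original script; the function name `parse_args` and the help texts are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the user ID and refresh token (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Download the recent illustrations of a pixiv user."
    )
    # type=int makes argparse reject a non-numeric user ID with a usage message
    parser.add_argument("user_id", type=int, help="numeric pixiv user ID")
    parser.add_argument("refresh_token", help="refresh token used for login")
    return parser.parse_args(argv)
```

Calling `parse_args()` with no argument reads from the command line. A missing or non-numeric argument then produces a usage message and exits, rather than raising an `IndexError` or `ValueError`.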
