More than 1 year has passed since last update.

@makimaki913

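Fetching a large number of follower records with Tweepy

Background

For fun, I needed to fetch a large amount of follower information (about 200,000 records) for a certain Twitter account.
I ran into some trouble along the way, so I am recording the method here as a reminder for the next time I do something similar.

Environment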

  • Windows 10 Pro
  • Python 3.6.6
  • Tweepy 3.6.0

Preparation

Source code

At first, I ran the following code, based on an article I found on Qiita.

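get_followers1.py
import tweepy

# Authenticate with the consumer key/secret and the access token/secret.
auth = tweepy.OAuthHandler('XXXXX...', 'XXXXX...')
auth.set_access_token('XXXXX...', 'XXXXX...')
# wait_on_rate_limit=True makes Tweepy sleep until the rate limit resets.
api = tweepy.API(auth, wait_on_rate_limit=True)
# Iterate over every follower ID of the target account.
followers_ids = tweepy.Cursor(api.followers_ids, id='XXX', cursor=-1).items()
for follower_id in followers_ids:
    try:
        # One request per follower to look up the profile details.
        user = api.get_user(follower_id)
        user_info = [user.id_str, user.screen_name, user.name, user.created_at]
        print(user_info)
    except tweepy.error.TweepError as e:
        print(e.reason)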
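
To get a feel for how long a loop like this has to keep running, here is a rough back-of-the-envelope estimate. It is only a sketch: the function name `estimate_hours` is hypothetical, and the figure of 900 `get_user` calls per 15-minute window is an assumption about the rate limit, not something stated in this article. Substitute the limit that applies to your own credentials.

```python
import math


def estimate_hours(n_users, calls_per_window, window_minutes=15):
    """Rough run-time estimate when every user costs one API call."""
    # Number of rate-limit windows needed, counting a partial window as a full one.
    windows = math.ceil(n_users / calls_per_window)
    return windows * window_minutes / 60


# Assumed figures: about 200,000 followers, 900 calls per 15-minute window.
print(estimate_hours(200_000, 900))
```

Under these assumed numbers the script needs more than two days of waiting on rate limits, so it has to survive dropped connections over a very long run.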

However, a few hours later, the following error occurred.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
~(omitted)~
    raise ConnectionError(err, request=request)
tweepy.error.TweepError: Failed to send request: ('Connection aborted.', OSError("(10054, 'WSAECONNRESET')",))

Process finished with exit code 1

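Apparently the connection had been kept open for too long after authenticating.
So I changed the code to keep track of the cursor and re-authenticate periodically (once per page of follower IDs).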

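get_followers2.py
import tweepy

cursor = -1  # -1 requests the first page; 0 means there are no more pages
while cursor != 0:
    # Re-authenticate and build a fresh API object for every page.
    auth = tweepy.OAuthHandler('XXXXX...', 'XXXXX...')
    auth.set_access_token('XXXXX...', 'XXXXX...')
    api = tweepy.API(auth, wait_on_rate_limit=True)
    # Resume from the saved cursor and fetch one page of follower IDs.
    itr = tweepy.Cursor(api.followers_ids, id='XXX', cursor=cursor).pages()
    try:
        for follower_id in itr.next():
            try:
                user = api.get_user(follower_id)
                user_info = [user.id_str, user.screen_name, user.name, user.created_at]
                print(user_info)
            except tweepy.error.TweepError as e:
                print(e.reason)
    except ConnectionError as e:
        # The built-in ConnectionError has no .reason, so print the exception itself.
        print(e)
    # Save the cursor for the next iteration.
    cursor = itr.next_cursor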

This time it ran successfully to the end.
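
The pattern behind get_followers2.py, remembering the cursor and retrying from it, can be tried without touching the network. The sketch below is not the script above: `fetch_all`, `flaky_fetch`, `PAGES` and `max_retries` are hypothetical names, and a dictionary stands in for Twitter's paged responses. It assumes, as the loop above does, that -1 requests the first page and 0 marks the end.

```python
def fetch_all(fetch_page, max_retries=3):
    """Collect every item, retrying from the last good cursor after a failure."""
    cursor = -1      # -1 asks for the first page, 0 marks the end
    items = []
    failures = 0     # consecutive failures on the current cursor
    while cursor != 0:
        try:
            page, next_cursor = fetch_page(cursor)
        except ConnectionError:
            failures += 1
            if failures > max_retries:
                raise
            continue  # retry the same cursor; no page was skipped
        failures = 0
        items.extend(page)
        cursor = next_cursor  # advance only after the page succeeded
    return items


# Hypothetical stand-in for the API: three pages and one dropped connection.
PAGES = {-1: ([1, 2], 10), 10: ([3, 4], 20), 20: ([5], 0)}
calls = {"n": 0}


def flaky_fetch(cursor):
    calls["n"] += 1
    if calls["n"] == 2:  # simulate a connection reset on the second request
        raise ConnectionError("simulated connection reset")
    return PAGES[cursor]


result = fetch_all(flaky_fetch)
print(result)
```

The retry cap is an addition of this sketch, so that a permanent failure stops the loop instead of retrying forever.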
