4
8

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

TwitterデータをCSVで抽出

Last updated at Posted at 2017-05-15

Found a good library "Tweepy (# https://github.com/tweepy/tweepy )"

SampleコードをPython3に対応しました。

Step 1: Install Tweepy

pip install tweepy

Collecting tweepy
  Downloading tweepy-3.5.0-py2.py3-none-any.whl
Requirement already satisfied: requests-oauthlib>=0.4.1 in /Users/aws/Documents/Anaconda/anaconda/lib/python3.6/site-packages (from tweepy)
Requirement already satisfied: requests>=2.4.3 in /Users/aws/Documents/Anaconda/anaconda/lib/python3.6/site-packages (from tweepy)
Requirement already satisfied: six>=1.7.3 in /Users/aws/Documents/Anaconda/anaconda/lib/python3.6/site-packages (from tweepy)
Requirement already satisfied: oauthlib>=0.6.2 in /Users/aws/Documents/Anaconda/anaconda/lib/python3.6/site-packages (from requests-oauthlib>=0.4.1->tweepy)
Installing collected packages: tweepy
Successfully installed tweepy-3.5.0
# !/usr/bin/env python
# encoding: utf-8

import tweepy  
import csv

# Twitter API credentials
consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""


def get_all_tweets(screen_name):
    # Twitter only allows access to a users most recent 3240 tweets with this method

    # authorize twitter, initialize tweepy
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)

    # initialize a list to hold all the tweepy Tweets
    alltweets = []

    # make initial request for most recent tweets (200 is the maximum allowed count)
    new_tweets = api.user_timeline(screen_name=screen_name, count=200)

    # save most recent tweets
    alltweets.extend(new_tweets)

    # save the id of the oldest tweet less one
    oldest = alltweets[-1].id - 1

    # keep grabbing tweets until there are no tweets left to grab
    while len(new_tweets) > 0:
        print("getting tweets before %s" % (oldest))

        # all subsiquent requests use the max_id param to prevent duplicates
        new_tweets = api.user_timeline(screen_name=screen_name, count=200, max_id=oldest)

        # save most recent tweets
        alltweets.extend(new_tweets)

        # update the id of the oldest tweet less one
        oldest = alltweets[-1].id - 1

        print("...%s tweets downloaded so far" % (len(alltweets)))

    # transform the tweepy tweets into a 2D array that will populate the csv
    outtweets = [[tweet.id_str, tweet.created_at, tweet.text.encode("utf-8")] for tweet in alltweets]

    # write the csv
    with open('%s_tweets.csv' % screen_name, 'w') as f:
        writer = csv.writer(f)
        writer.writerow(["id", "created_at", "text"])
        writer.writerows(outtweets)

    pass


if __name__ == '__main__':
    get_all_tweets("twitter Username")

参考: https://gist.github.com/yanofsky/5436496

4
8
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
4
8

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?