
Accumulating Twitter search results, running morphological analysis, and tweeting sentences generated with a Markov chain

Posted at 2015-04-30, last updated at 2020-02-14

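The script saves Twitter search results as plain text.
Results from various searches accumulate in a text file, which is then read back and run through morphological analysis.
From there, a sentence is generated with a Markov chain and posted as a tweet.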

twmarkov.py


#!/usr/bin/env python
# -*- coding: utf-8 -*-
from requests_oauthlib import OAuth1Session
import json
import sys
import MeCab
import random
import re

while True:
	search_words = raw_input(u"words: ")
	
	C_KEY = "*************************************"
	C_SECRET = "*************************************"
	A_KEY = "*************************************"
	A_SECRET = "*************************************"

	def Search_words():
		url = "https://api.twitter.com/1.1/search/tweets.json?"
		params = {
				"q": unicode(search_words, "utf-8"),
				"lang": "ja",
				"result_type": "recent",
				"count": "100"
				}
		tw = OAuth1Session(C_KEY,C_SECRET,A_KEY,A_SECRET)
		req = tw.get(url, params = params)
		tweets = json.loads(req.text)
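		# Open the corpus file once in append mode instead of once per tweet,
		# and let the with-block close it even if a write fails.
		with open("tweet.txt", "a") as f:
			for tweet in tweets["statuses"]:
				text = tweet["text"].encode("utf-8")
				# Keep only the part before the first URL, mention, or RT marker.
				# split() returns the whole string when the marker is absent.
				text = text.split("http", 1)[0]
				text = text.split("@")[0]
				text = text.split("RT")[0]
				f.write(text)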

		
	def Mecab_file():	
		f = open("tweet.txt","rb")
		data = f.read()
		f.close()

		mt = MeCab.Tagger("-Owakati")

		wordlist = mt.parse(data)
		wordlist = wordlist.rstrip(" \n").split(" ")
 
		markov = {}
		w = ""
	
		for x in wordlist:
			if w:
				if markov.has_key(w):  # does not work in Python 3
					new_list = markov[w]
				else:
					new_list =[]
			
				new_list.append(x)
				markov[w] = new_list
			w = x
		
		choice_words = wordlist[0]
		sentence = ""
		count = 0
	
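		while count < 90:
			sentence += choice_words
			# Stop when the current word has no recorded successor;
			# random.choice(markov[...]) would otherwise raise a KeyError.
			if choice_words not in markov:
				break
			choice_words = random.choice(markov[choice_words])
			count += 1

		# Strip every printable ASCII character (letters, digits, punctuation)
		# so that only the Japanese text remains.
		words = re.sub("[!-~]", "", sentence)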
		twits = words + " 【tweet from twmarkov】"
		
		url = "https://api.twitter.com/1.1/statuses/update.json"
		params = {"status": twits,"lang": "ja"}
		tw = OAuth1Session(C_KEY,C_SECRET,A_KEY,A_SECRET)
		req = tw.post(url, params = params)
		if req.status_code == 200:
			print "Success! Your Tweet"
		else:
			print req.status_code
	
	
	if search_words:
		Search_words()
		Mecab_file()
	else:
		break

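Tweets from each search keep accumulating in tweet.txt, which is stored as an external file.
One shortcoming is that the generated tweets stop being interesting when the corpus grows too large, and also when it is too small.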
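One way to keep the corpus from growing without bound is to trim tweet.txt to its newest characters before each run. The following is only a sketch of that idea in Python 3, not something the original script does: `trim_corpus` and the `max_chars` default are hypothetical, and the right size would have to be found by experiment.

```python
def trim_corpus(path, max_chars=20000):
    """Keep only the newest max_chars characters of the corpus file.

    Hypothetical helper: the original script never trims tweet.txt.
    Returns the number of characters left in the file.
    """
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()

    # Nothing to do while the corpus is still small enough.
    if len(text) <= max_chars:
        return len(text)

    # New tweets are appended at the end, so the tail is the newest part.
    trimmed = text[-max_chars:]
    with open(path, "w", encoding="utf-8") as f:
        f.write(trimmed)
    return len(trimmed)
```

Cutting by character count can split the oldest remaining tweet in the middle, which is usually harmless for a word-level Markov chain.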

Running the script from cron should be a simple way to turn it into a bot.
In that case, it may be better to pull tweets from home_timeline or user_timeline rather than from the search API.
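Pulling from a timeline instead of the search API could look roughly like the sketch below, written in Python 3. It assumes the v1.1 `statuses/user_timeline.json` endpoint is available to your credentials. `fetch_timeline_texts` is a hypothetical name, and `session` is assumed to be an `OAuth1Session` built the same way as in twmarkov.py (any object with a compatible `get` method works).

```python
USER_TIMELINE_URL = "https://api.twitter.com/1.1/statuses/user_timeline.json"


def fetch_timeline_texts(session, screen_name, count=100):
    """Return the text of the most recent tweets of one user.

    session: an authenticated session, e.g. requests_oauthlib.OAuth1Session
             (assumption: created with the same four keys as in the script).
    """
    params = {"screen_name": screen_name, "count": count}
    resp = session.get(USER_TIMELINE_URL, params=params)
    if resp.status_code != 200:
        # Give the caller an empty corpus rather than raising.
        return []
    # Unlike the search endpoint, the timeline response is a bare JSON
    # array of tweet objects, with no "statuses" wrapper.
    return [tweet["text"] for tweet in resp.json()]
```

A cron-driven run has no terminal, so the `raw_input` prompt in the script would also need to be replaced by a fixed query or a command-line argument.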

Run as a web application, it looks like this.
(The program has since been removed from the server, so it is no longer available.)

Notes for Python 3

The check

if markov.has_key(w):

has to be replaced with

if w in markov:

because dict.has_key() was removed in Python 3.
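On Python 3, the chain-building and generation steps can also be written so that no membership check is needed at all. This is a minimal sketch rather than the author's code: `build_chain` and `generate` are hypothetical names, and the word list is assumed to be already tokenized (for example by MeCab in `-Owakati` mode).

```python
import random


def build_chain(words):
    """Map each word to the list of words that followed it in the corpus."""
    chain = {}
    for prev, nxt in zip(words, words[1:]):
        # setdefault creates the list on first sight of prev,
        # which replaces the has_key / "in" branch entirely.
        chain.setdefault(prev, []).append(nxt)
    return chain


def generate(chain, start, max_words=90):
    """Walk the chain from start, joining up to max_words words."""
    out = [start]
    current = start
    for _ in range(max_words - 1):
        followers = chain.get(current)
        if not followers:
            # The last word of the corpus may have no successor.
            break
        current = random.choice(followers)
        out.append(current)
    # Japanese text is joined without spaces between words.
    return "".join(out)
```

Because repeated successors are stored repeatedly, `random.choice` picks each next word in proportion to how often it followed the current one, the same weighting the original list-based dictionary gives.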
