
Are VTubers Positive or Negative?

Posted at 2020-02-16

Introduction

Are VTubers mostly positive people, or mostly negative?
In this post I run sentiment analysis on VTubers' tweets to measure how positive they are.

The result: Natori Sana (名取さな) turned out to be slightly positive.

[('Neutral', 31), ('Positive', 12), ('Negative', 7)]
1.7142857142857142
ちょっとポジティブです ("Slightly positive")

I also analyzed tweets from DWU and Machinery Tomoko (マシーナリーとも子).

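What I Used

About COTOHA API

I use COTOHA API for the sentiment analysis.
COTOHA API is developed by NTT Communications and draws on 40 years of research into Japanese language technology.
You only have to send it text, and the otherwise tedious work of sentiment analysis is done for you.
Besides the sentiment analysis API used here, it also offers APIs for parsing, keyword extraction, and more.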

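Steps for Measuring Positivity

1. Get API keys for Twitter
2. Get API keys for COTOHA API
3. Fetch the target user's tweets with tweepy
4. Cleanse the tweets
5. Run sentiment analysis with COTOHA API
6. Judge positivity from the ratio of positive tweets to negative tweets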
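
The steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: `measure_sentiments`, `fake_clean`, and `fake_classify` are hypothetical names that do not appear in the original notebook, and the two `fake_*` functions are stand-ins for the real cleansing step and the COTOHA API call so that the sketch runs without any API keys.

```python
from collections import Counter
from typing import Callable, Iterable, List


def measure_sentiments(
    raw_tweets: Iterable[str],
    clean: Callable[[str], str],
    classify: Callable[[str], str],
    limit: int = 50,
) -> Counter:
    """Clean tweets, classify up to `limit` non-empty ones, and count the labels."""
    labels: List[str] = []
    for text in raw_tweets:
        cleaned = clean(text)
        if not cleaned:
            continue  # nothing left after cleansing, so skip this tweet
        labels.append(classify(cleaned))
        if len(labels) >= limit:
            break
    return Counter(labels)


# Hypothetical stand-in for the cleansing step: drop everything from a URL onward.
def fake_clean(text: str) -> str:
    return text.split("http")[0].strip()


# Hypothetical stand-in for the sentiment API call (not how COTOHA decides).
def fake_classify(text: str) -> str:
    if "!" in text:
        return "Positive"
    if "..." in text:
        return "Negative"
    return "Neutral"


sample = ["Great stream today!", "so tired...", "http://example.com", "good morning"]
counts = measure_sentiments(sample, fake_clean, fake_classify)
print(counts)
```

Passing the cleansing and classification steps in as functions keeps the counting logic testable offline; in a real run they would be replaced by the regex cleansing and the API request.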

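Judging Positive versus Negative

The basic approach is to run sentiment analysis on about 50 tweets.
If positive tweets outnumber negative tweets among those 50, the account is judged positive;
if the reverse is true, it is judged negative.
Beyond that, I set the thresholds for grades such as "very positive" and "slightly positive" by feel.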
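
As a rough sketch of this rule, the function below maps a positive/negative ratio to a grade. The thresholds (4, 2, 1, 0.5, 0.25) are the ones used in this article's implementation; the function names, the English labels, the "balanced" label for a ratio of exactly 1, and the handling of zero negative tweets are assumptions added here for illustration.

```python
import math
from typing import Optional


def positivity_ratio(positive: int, negative: int) -> Optional[float]:
    """Return positive/negative, inf when only positives exist, None when neither exists."""
    if positive == 0 and negative == 0:
        return None
    if negative == 0:
        return math.inf  # assumption: no negative tweets counts as maximally positive
    return positive / negative


def positivity_label(ratio: Optional[float]) -> str:
    """Map a ratio to a grade using the article's thresholds."""
    if ratio is None:
        return "no emotional tweets"
    if ratio > 4:
        return "very positive"
    if ratio > 2:
        return "positive"
    if ratio > 1:
        return "slightly positive"
    if ratio == 1:
        return "balanced"  # assumption: the original has no label for this case
    if ratio < 0.25:
        return "very negative"
    if ratio < 0.5:
        return "negative"
    return "slightly negative"


print(positivity_label(positivity_ratio(12, 7)))
```

With 12 positive and 7 negative tweets the ratio is about 1.71, which falls between 1 and 2 and is therefore graded "slightly positive".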

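Implementation

I implemented this in Google Colaboratory.
See here for the full implementation.
Replace the Twitter API keys and the COTOHA API keys with your own, and you can measure the positivity of any user you like.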
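
If you would rather not paste keys into the notebook itself, one option is to read them from environment variables. The sketch below is an assumption, not part of the original notebook: `load_credentials` and the variable names `COTOHA_CLIENT_ID` and `COTOHA_CLIENT_SECRET` are hypothetical, and the same pattern would apply to whichever Twitter keys you use.

```python
import os
from typing import Dict, Mapping, Sequence

# Hypothetical environment variable names, chosen for this example only.
COTOHA_VARS = ("COTOHA_CLIENT_ID", "COTOHA_CLIENT_SECRET")


def load_credentials(
    names: Sequence[str] = COTOHA_VARS,
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Return the requested variables, raising KeyError if any is missing or empty."""
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Demonstration with an in-memory mapping instead of the real environment.
demo_env = {"COTOHA_CLIENT_ID": "hogehoge", "COTOHA_CLIENT_SECRET": "hogehoge"}
creds = load_credentials(env=demo_env)
print(sorted(creds))
```

The returned values could then be assigned to `CLIENT_ID` and `CLIENT_SECRET` in place of the hard-coded placeholders.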

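import json
import re
from collections import Counter

import requests

# Base URL of the COTOHA NLP API.
BASE_URL = "https://api.ce-cotoha.com/api/dev/nlp/"
# Placeholders: replace with your own COTOHA client ID and secret.
CLIENT_ID = "hogehoge"
CLIENT_SECRET = "hogehoge"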

def auth(client_id, client_secret):
    """Request an access token from COTOHA using the client credentials."""
    token_url = "https://api.ce-cotoha.com/v1/oauth/accesstokens"

    headers = {
        "Content-Type": "application/json",
        "charset": "UTF-8"
    }

    data = {
        "grantType": "client_credentials",
        "clientId": client_id,
        "clientSecret": client_secret
    }
    r = requests.post(token_url,
                      headers=headers,
                      data=json.dumps(data))
    # Fail early on an HTTP error instead of with a KeyError on the next line.
    r.raise_for_status()
    return r.json()["access_token"]

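def sentiment(sentence, access_token):
    """Send one sentence to the sentiment endpoint and return the raw response."""
    base_url = BASE_URL

    headers = {
        "Content-Type": "application/json",
        "charset": "UTF-8",
        "Authorization": "Bearer {}".format(access_token)
    }

    data = {
        "sentence": sentence,
        # "kuzure" selects the mode intended for informal text such as social media posts.
        "type": "kuzure"
    }

    r = requests.post(base_url + "v1/sentiment",
                      headers=headers,
                      data=json.dumps(data))
    # The caller is responsible for decoding the body, e.g. with r.json().
    return r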

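def cleansing_tweet(tweet):
    """Return up to 50 cleaned tweet texts from an iterable of tweet objects."""
    clean_tweets = []
    for line in tweet:
        temp = line.text
        # Each pattern deletes from its match to the end of the line, so only
        # the text before the first retweet marker, mention, URL, line break,
        # or full-width space survives.
        temp = re.sub('RT .*', '', temp)
        temp = re.sub('@.*', '', temp)
        temp = re.sub('http.*', '', temp)
        temp = re.sub('\n.*', '', temp)
        temp = re.sub('\u3000.*', '', temp)
        # Skip tweets that are empty after cleansing.
        if len(temp) != 0:
            clean_tweets.append(temp)
        # Stop once 50 usable tweets have been collected.
        if len(clean_tweets) >= 50:
            break
    return clean_tweets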

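def check_positive(result):
    """Print the label counts, the Positive/Negative ratio, and a verdict."""
    counter = Counter(result)
    print(counter.most_common())
    positive, negative = counter["Positive"], counter["Negative"]
    if positive == 0 and negative == 0:
        # No emotional tweets at all; checked first to avoid dividing by zero.
        print("感情を失っています")  # "Has lost all emotion"
        return
    # With no negative tweets, use an infinite ratio instead of raising ZeroDivisionError.
    positive_negative = positive / negative if negative else float("inf")
    print(positive_negative)
    if positive_negative > 4:
        print("とてもポジティブです")  # "Very positive"
    elif positive_negative > 2:
        print("ポジティブです")  # "Positive"
    elif positive_negative > 1:
        print("ちょっとポジティブです")  # "Slightly positive"
    elif positive_negative < 0.25:
        print("とてもネガティブです")  # "Very negative"
    elif positive_negative < 0.5:
        print("ネガティブです")  # "Negative"
    elif positive_negative < 1:
        print("ちょっとネガティブです")  # "Slightly negative"
    # A ratio of exactly 1 prints no verdict, as in the original logic.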
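
`sentiment` returns the raw HTTP response, while `check_positive` expects a list of labels such as "Positive". The sketch below shows one way to bridge the two. It assumes the decoded response body carries the label under `result.sentiment`; that shape, the sample payloads, and the helper name `extract_label` are assumptions for illustration, so check them against the actual API response before relying on them.

```python
from collections import Counter
from typing import Any, Dict, Optional


def extract_label(payload: Dict[str, Any]) -> Optional[str]:
    """Return the sentiment label from a decoded response body, or None if absent."""
    result = payload.get("result") or {}
    return result.get("sentiment")


# Hand-written sample payloads (assumed shape, not real API output).
sample_payloads = [
    {"result": {"sentiment": "Positive", "score": 0.4}, "status": 0, "message": "OK"},
    {"result": {"sentiment": "Neutral", "score": 0.3}, "status": 0, "message": "OK"},
    {"status": 99, "message": "error"},  # hypothetical failed call with no result
]

labels = [
    label
    for label in (extract_label(payload) for payload in sample_payloads)
    if label is not None
]
print(Counter(labels).most_common())
```

In a real run, each payload would come from `sentiment(text, access_token).json()`, and the resulting `labels` list is what gets passed to `check_positive`.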

Results

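To check that the positivity judgment works as intended, I first analyzed the tweets of Sunshine Ikezaki (サンシャイン池崎).

Results for Sunshine Ikezaki

[('Neutral', 32), ('Positive', 16), ('Negative', 2)]
8.0
とてもポジティブです ("Very positive")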

The judgment looks right.
Now let's analyze the VTubers.

Results for Natori Sana

[('Neutral', 31), ('Positive', 12), ('Negative', 7)]
1.7142857142857142
ちょっとポジティブです ("Slightly positive")

Results for DWU

[('Neutral', 28), ('Negative', 15), ('Positive', 7)]
0.4666666666666667
ネガティブです ("Negative")

Results for Machinery Tomoko

[('Neutral', 27), ('Positive', 16), ('Negative', 7)]
2.2857142857142856
ポジティブです ("Positive")
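
As a quick check on the numbers above, the reported ratios can be recomputed from the label counts. The counts are copied from the results in this article; the romanized dictionary keys and variable names are just for this example.

```python
# Label counts as reported in the results above.
results = {
    "Sunshine Ikezaki": {"Neutral": 32, "Positive": 16, "Negative": 2},
    "Natori Sana": {"Neutral": 31, "Positive": 12, "Negative": 7},
    "DWU": {"Neutral": 28, "Positive": 7, "Negative": 15},
    "Machinery Tomoko": {"Neutral": 27, "Positive": 16, "Negative": 7},
}

ratios = {}
for name, counts in results.items():
    total = sum(counts.values())
    ratios[name] = counts["Positive"] / counts["Negative"]
    neutral_share = counts["Neutral"] / total
    print(f"{name}: ratio={ratios[name]:.2f}, neutral share={neutral_share:.0%}")
```

Each sample contains 50 tweets, and in every case more than half of them are classified as Neutral, so each verdict rests on only about 18 to 23 non-neutral tweets.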

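Summary

In this post I fetched tweets, ran sentiment analysis on them, and judged how positive or negative each account is.
As a next step, it would be interesting to base the judgment not only on the overall sentiment of each sentence but also on the emotional words it contains.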

Reference Links

https://qiita.com/tf0101/items/099fbfce3ad7992e1baa
