
Trying Azure Text Analytics from Python

Posted at 2018-02-21

The Text Analytics API in Azure Cognitive Services can detect sentiment (roughly, how negative or positive a text is), key phrases, and the language of a text.

SDKs are provided, so the API can be used from a variety of languages.
This time I call it from Python.
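
As a rough illustration, each of the three operations mentioned above has its own REST path under the v2.0 API. The sketch below builds those URLs; the region host is simply the one used in the script later in this post, and the `ENDPOINTS` table and `endpoint_url` helper are names made up for this example.

```python
# Base URL taken from the script in this post; your region may differ.
BASE_URL = "https://westcentralus.api.cognitive.microsoft.com/text/analytics/v2.0"

# Illustrative mapping from a friendly name to the v2.0 path segment.
ENDPOINTS = {
    "sentiment": "sentiment",
    "key_phrases": "keyPhrases",
    "language": "languages",
}


def endpoint_url(operation, base_url=BASE_URL):
    """Return the full URL for one of the operations in ENDPOINTS."""
    return f"{base_url.rstrip('/')}/{ENDPOINTS[operation]}"


if __name__ == "__main__":
    for name in ENDPOINTS:
        print(name, endpoint_url(name))
```

Only the sentiment URL is used in the rest of this post.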

Prerequisites

Create an Azure account
Get a subscription key from "Cognitive Services try experience | Microsoft Azure"
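
Once you have a key, one option is to keep it out of the source file. The following is a minimal sketch, assuming the key is stored in an environment variable; the variable name `TEXT_ANALYTICS_KEY` and both helper functions are hypothetical and not part of the original script.

```python
import os

# Hypothetical variable name chosen for this sketch.
ENV_VAR = "TEXT_ANALYTICS_KEY"


def load_subscription_key(env=os.environ, name=ENV_VAR):
    """Read the subscription key from a mapping, failing loudly if it is missing."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable to your subscription key")
    return key


def build_headers(subscription_key):
    """Build the request header the API expects for key-based authentication."""
    return {"Ocp-Apim-Subscription-Key": subscription_key}
```

The resulting dictionary can be passed as `headers` to `requests.post` in the same way as in the script below.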

Script

I followed "Python Quickstart for Azure Cognitive Services, Text Analytics API | Microsoft Docs".

For sentiment analysis, the code looks like this.


import requests
from pprint import pprint

# Set the subscription key and the API URL
subscription_key="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
text_analytics_url = "https://westcentralus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"

documents = {'documents' : [
  {'id': '1', 'language': 'en', 'text': 'I had a wonderful experience! The rooms were wonderful and the staff was helpful.'},
  {'id': '2', 'language': 'es', 'text': '¡Tuve una experiencia maravillosa! Las habitaciones eran maravillosas y el personal fue servicial.'},  
  {'id': '3', 'language': 'zh-Hans', 'text': '我有一个美好的经历！房间很棒，工作人员很有帮助。.'},  
  {'id': '4', 'language': 'ja', 'text': '私はすばらしい経験をしました！部屋は素晴らしかったしスタッフも助かりました。'}
]}

headers   = {"Ocp-Apim-Subscription-Key": subscription_key}
response  = requests.post(text_analytics_url, headers=headers, json=documents)
sentiments = response.json()
pprint(sentiments)

The result looks like this. A sentiment value comes back as score for each of the English, Spanish, Chinese, and Japanese texts.

{'documents': [{'id': '1', 'score': 0.9552919864654541},
               {'id': '2', 'score': 0.6087863445281982},
               {'id': '3', 'score': 0.7710065841674805},
               {'id': '4', 'score': 0.43333786725997925}],
 'errors': []}
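
To make the scores easier to read, they can be mapped to labels. Scores closer to 1 indicate positive sentiment and scores closer to 0 indicate negative sentiment. The sketch below uses the response shown above; the `label` and `label_response` helpers and the 0.4 / 0.6 cut-offs are arbitrary choices for illustration, not values defined by the API.

```python
def label(score, low=0.4, high=0.6):
    """Map a 0..1 sentiment score to a coarse label (thresholds are arbitrary)."""
    if score >= high:
        return "positive"
    if score <= low:
        return "negative"
    return "neutral"


def label_response(response):
    """Return {document id: label} for every scored document in a response."""
    return {doc["id"]: label(doc["score"]) for doc in response["documents"]}


# The response printed above, copied as-is.
sentiments = {
    "documents": [
        {"id": "1", "score": 0.9552919864654541},
        {"id": "2", "score": 0.6087863445281982},
        {"id": "3", "score": 0.7710065841674805},
        {"id": "4", "score": 0.43333786725997925},
    ],
    "errors": [],
}

if __name__ == "__main__":
    print(label_response(sentiments))
```

With these cut-offs, the Japanese sentence (id 4) lands in the neutral band even though the same sentence in English is scored as clearly positive.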

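Thoughts

I tried a variety of Japanese sentences, but as of now (February 2018) the scores rarely match the negative/positive values I would expect. Translating the Japanese into English first, for example with Google Translate, and then sending it to the Text Analytics API would probably give reasonable results.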
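
The translate-first idea could be sketched roughly as follows. This is untested against any real translation service: `translate` is a placeholder for whatever function you use to turn Japanese into English, and `build_english_documents` is a hypothetical helper that only prepares the request body.

```python
def build_english_documents(texts, translate):
    """Translate each text and wrap the results in the request body format.

    `translate` is any callable taking a source string and returning English
    text; it stands in for a real translation service call.
    """
    documents = []
    for i, text in enumerate(texts, start=1):
        documents.append({"id": str(i), "language": "en", "text": translate(text)})
    return {"documents": documents}


def fake_translate(text):
    """Stand-in translator used only to show the data flow."""
    table = {"素晴らしい経験でした": "It was a wonderful experience"}
    return table.get(text, text)


if __name__ == "__main__":
    print(build_english_documents(["素晴らしい経験でした"], fake_translate))
```

The resulting dictionary can then be posted to the sentiment URL with `requests.post(..., json=...)` exactly as in the script above.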
