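The press release announcing the Japanese analysis APIs is here ↓
Japanese analysis APIs released on "goo Labs": morphological analysis, hiragana conversion, and more
0. Register for the API and get an app_id
The registration procedure for the new goo Labs APIs is here ↓
All you need is a GitHub account, and you can start using the APIs right away.
Registering gives you an app_id. Save it, because every request below sends it.
1. Morphological analysis API: splits a Japanese string into words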
curl -v -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '{"app_id":"{YOUR_APP_ID}","request_id":"record001","sentence":"“自分で自分をデザインする人工知能Webサイト(AI Websites That Design Themselves)”をスローガンとするThe Gridは、ユーザが提供するコンテンツを人工知能が判断して自動的に新たなサイトを作るWebサイトビルダーだ。同社は今年の初めに、ソフトウェア製品としては珍しく、クラウドファンディングキャンペーンで大成功したが、このほどさらに大きな資金調達に踏み切った。同社の今日の発表によると、The GridはシリーズAで460万ドルのラウンドを、Jerry YangのAME Cloud Venturesのリードの下(もと)に終了した。そのほかの投資家は、Disney Interactiveの元社長John Pleasants、元Facebookのプロダクト担当VP Greg Badros、そしてElegant ThemesのファウンダNick Roachだ。RoachはThe Gridの310万ドルのシードラウンドをリードした。Yangは今日の発表声明で次のように述べている: “The Gridのクラウド上の人工知能は、Web開発の定型的で時間のかかる部分を人間の手から取り去ることによって、Webデザインという仕事を抜本的に変えた。それによって今では、エレガントなWebサイトをわずかな時間とわずかな費用で作れるようになった。これからは、現代的なWebデザインを誰もができるようになるだろう"}' https://labs.goo.ne.jp/api/morph
=> {"request_id":"record001","word_list":[[["“","括弧","$"],["自分","名詞","ジブン"],["で","格助詞","デ"],["自分","名詞","ジブン"],["を","格助詞","ヲ"],["デザイン","名詞","デザイン"],["する","動詞接尾辞","スル"],["人工","名詞","ジンコウ"],["知能","名詞","チノウ"],["Web","名詞","ウェブ"],["サイト","名詞","サイト"],["(","括弧","$"],["AI","名詞","エーアイ"],[" ","空白","$"],["Websites","Alphabet","ダブリュイービーエスアイティーイーエス"],[" ","空白","$"],["That","名詞","ザット"],[" ","空白","$"],["Design","名詞","デザイン"],[" ","空白","$"],["Themselves","名詞","ゼムセルブズ"],[")","括弧","$"],["”","括弧","$"],["を","格助詞","ヲ"],["スローガン","名詞","スローガン"],["と","判定詞","ト"],["する","動詞語幹","スル"],["The","名詞","ザ"],[" ","空白","$"],["Grid","名詞","グリッド"],["は","連用助詞","ハ"],["、","読点","$"],["ユーザ","名詞","ユーザ"],["が","格助詞","ガ"],["提供","名詞","テイキョウ"],["する","動詞接尾辞","スル"],["コンテンツ","名詞","コンテンツ"],["を","格助詞","ヲ"],["人工","名詞","ジンコウ"],["知能","名詞","チノウ"],["が","格助詞","ガ"],["判断","名詞","ハンダン"],["し","動詞活用語尾","シ"],["て","動詞接尾辞","テ"],["自動的","名詞","ジドウテキ"],["に","格助詞","ニ"],["新たな","連体詞","アラタナ"],["サイト","名詞","サイト"],["を","格助詞","ヲ"],["作","動詞語幹","ツク"],["る","動詞接尾辞","ル"],["Web","名詞","ウェブ"],["サイト","名詞","サイト"],["ビルダー","名詞","ビルダー"],["だ","判定詞","ダ"],["。","句点","$"]],[["同社","名詞","ドウシャ"],["は","連用助詞","ハ"],["今","名詞","イマ"],["年の初め","名詞","トシノハジメ"],["に","格助詞","ニ"],["、","読点","$"],["ソフトウェア","名詞","ソフトウェア"],["製品","名詞接尾辞","セイヒン"],["として","格助詞","トシテ"],["は","連用助詞","ハ"],["珍し","形容詞語幹","メズラシ"],["く","形容詞接尾辞","ク"],["、","読点","$"],["クラウドファンディングキャンペーン","Katakana","クラウドファンディングキャンペーン"],["で","格助詞","デ"],["大","冠名詞","ダイ"],["成功","名詞","セイコウ"],["し","動詞活用語尾","シ"],["た","動詞接尾辞","タ"],["が","接続接尾辞","ガ"],["、","読点","$"],["このほど","名詞","コノホド"],["さらに","連用詞","サラニ"],["大きな","連体詞","オオキナ"],["資金","名詞","シキン"],["調達","名詞","チョウタツ"],["に","格助詞","ニ"],["踏み切","動詞語幹","フミキ"],["っ","動詞活用語尾","ッ"],["た","動詞接尾辞","タ"],["。","句点","$"]],[["同社","名詞","ドウシャ"],["の","格助詞","ノ"],["今日","名詞","キョウ"],["の","格助詞","ノ"],["発表","名詞","ハッピョウ"],["によると","格助詞","ニヨルト"],["、","読点","$"],["The","名詞","ザ"],[" ","空白","$"],["Grid","名詞","グリッド"],["は","連用助詞","ハ"],["シリーズ","名詞","シリーズ"],["A","Alphabet","エー"],["で","格助詞","デ"],["460万","Number","ヨンヒャクロクジューマン"],["ドル","助数詞","ドル"],["の","格助詞","ノ"],["ラウンド","名詞","ラウンド"],["を","格助詞","ヲ"],["、","読点","$"],["Jerry","Alphabet","ジェーイーアールアールワイ"],[" ","空白","$"]],[["Yang","Alphabet","ワイエーエヌジー"],["の","格助詞","ノ"],["AME","Alphabet","エーエムイー"],[" ","空白","$"],["Cloud","名詞","クラウド"],[" ","空白","$"],["Ventures","Alphabet","ブイイーエヌティーユーアールイーエス"],["の","格助詞","ノ"],["リード","名詞","リード"],["の","格助詞","ノ"],["下","名詞","シタ"],["(","括弧","$"],["もと","名詞","モト"],[")","括弧","$"],["に","格助詞","ニ"],["終了","名詞","シュウリョウ"],["し","動詞活用語尾","シ"],["た","動詞接尾辞","タ"],["。","句点","$"]],[["そのほか","名詞","ソノホカ"],["の","格助詞","ノ"],["投資家","名詞","トウシカ"],["は","連用助詞","ハ"],["、","読点","$"],["Disney","名詞","ディズニー"],[" ","空白","$"],["Interactive","Alphabet","アイエヌティーイーアールエーシーティーアイブイイー"],["の","格助詞","ノ"],["元","冠名詞","モト"],["社長","名詞","シャチョウ"],["John","Roman","ジョーン"],[" ","空白","$"],["Pleasants","Alphabet","ピーエルイーエーエスエーエヌティーエス"],["、","読点","$"],["元","冠名詞","モト"],["Facebook","名詞","フェイスブック"],["の","格助詞","ノ"],["プロダクト","名詞","プロダクト"],["担当","名詞","タントウ"],["VP","Alphabet","ブイピー"],[" ","空白","$"]],[["Greg","Alphabet","ジーアールイージー"],[" ","空白","$"],["Badros","Alphabet","ビーエーディーアールオーエス"],["、","読点","$"],["そして","接続詞","ソシテ"],["Elegant","名詞","エレガント"],[" ","空白","$"],["Themes","Alphabet","ティーエッチイーエムイーエス"],["の","格助詞","ノ"],["ファウンダ","名詞","ファウンダ"],["Nick","Alphabet","エヌアイシーケー"],[" ","空白","$"],["Roach","Alphabet","アールオーエーシーエッチ"],["だ","判定詞","ダ"],["。","句点","$"]],[["Roach","Alphabet","アールオーエーシーエッチ"],["は","連用助詞","ハ"],["The","名詞","ザ"],[" ","空白","$"],["Grid","名詞","グリッド"],["の","格助詞","ノ"],["310万","Number","サンビャクジューマン"],["ドル","助数詞","ドル"],["の","格助詞","ノ"],["シード","名詞","シード"],["ラウンド","名詞","ラウンド"],["を","格助詞","ヲ"],["リード","名詞","リード"],["し","動詞活用語尾","シ"],["た","動詞接尾辞","タ"],["。","句点","$"]],[["Yang","Alphabet","ワイエーエヌジー"],["は","連用助詞","ハ"],["今日","名詞","キョウ"],["の","格助詞","ノ"],["発表","名詞","ハッピョウ"],["声明","名詞","セイメイ"],["で","格助詞","デ"],["次のように","連用詞","ツギノヨウニ"],["述べ","動詞語幹","ノベ"],["て","動詞接尾辞","テ"],["い","動詞語幹","イ"],["る","動詞接尾辞","ル"],[":","Symbol","$"],[" ","空白","$"],["“","括弧","$"],["The","名詞","ザ"],[" ","空白","$"],["Grid","名詞","グリッド"],["の","格助詞","ノ"],["クラウド","名詞","クラウド"],["上","名詞接尾辞","ジョウ"],["の","格助詞","ノ"],["人工","名詞","ジンコウ"],["知能","名詞","チノウ"],["は","連用助詞","ハ"],["、","読点","$"],["Web","名詞","ウェブ"],["開発","名詞接尾辞","カイハツ"],["の","格助詞","ノ"],["定型","名詞","テイケイ"],["的","名詞接尾辞","テキ"],["で","判定詞","デ"],["時間","名詞","ジカン"],["の","格助詞","ノ"],["かか","動詞語幹","カカ"],["る","動詞接尾辞","ル"],["部分","名詞","ブブン"],["を","格助詞","ヲ"],["人間","名詞","ニンゲン"],["の","格助詞","ノ"],["手","名詞","テ"],["から","格助詞","カラ"],["取り去","動詞語幹","トリサ"],["る","動詞接尾辞","ル"],["こと","補助名詞","コト"],["によって","格助詞","ニヨッテ"],["、","読点","$"],["Web","名詞","ウェブ"],["デザイン","名詞","デザイン"],["と","格助詞","ト"],["い","動詞語幹","イ"],["う","動詞接尾辞","ウ"],["仕事","名詞","シゴト"],["を","格助詞","ヲ"],["抜本","名詞","バッポン"],["的","名詞接尾辞","テキ"],["に","格助詞","ニ"],["変え","動詞語幹","カエ"],["た","動詞接尾辞","タ"],["。","句点","$"]],[["それによって","連用詞","ソレニヨッテ"],["今","名詞","イマ"],["では","判定詞","デハ"],["、","読点","$"],["エレガント","名詞","エレガント"],["な","判定詞","ナ"],["Web","名詞","ウェブ"],["サイト","名詞","サイト"],["を","格助詞","ヲ"],["わずかな","連体詞","ワズカナ"],["時間","名詞","ジカン"],["と","格助詞","ト"],["わずかな","連体詞","ワズカナ"],["費用","名詞","ヒヨウ"],["で","格助詞","デ"],["作れ","動詞語幹","ツクレ"],["る","動詞接尾辞","ル"],["よう","補助名詞","ヨウ"],["に","判定詞","ニ"],["な","動詞語幹","ナ"],["っ","動詞活用語尾","ッ"],["た","動詞接尾辞","タ"],["。","句点","$"]],[["これ","名詞","コレ"],["から","格助詞","カラ"],["は","連用助詞","ハ"],["、","読点","$"],["現代的","名詞","ゲンダイテキ"],["な","判定詞","ナ"],["Web","名詞","ウェブ"],["デザイン","名詞","デザイン"],["を","格助詞","ヲ"],["誰","名詞","ダレ"],["も","連用助詞","モ"],["が","格助詞","ガ"],["でき","動詞語幹","デキ"],["る","動詞接尾辞","ル"],["よう","補助名詞","ヨウ"],["に","判定詞","ニ"],["な","動詞語幹","ナ"],["る","動詞接尾辞","ル"],["だろう","接続接尾辞","ダロウ"]]]}
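※1 Adding "info_filter":"form|pos|read" to the request JSON lets you restrict which of three fields are returned: form (surface form), pos (part of speech), and read (reading).
※2 Adding something like "pos_filter":"名詞|動詞" to the request JSON restricts the output to the listed parts of speech.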
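The request options in ※1 and ※2 can also be assembled in code. The following is only a minimal sketch using the Python standard library, not an official client. The helper names `build_morph_payload`, `post_json`, and `surface_forms` are made up for this illustration, the endpoint is assumed to be `https://labs.goo.ne.jp/api/morph`, and the response shape is assumed to match the `word_list` output shown above.

```python
import json
import urllib.request

# Assumed endpoint for the morphological analysis API.
MORPH_URL = "https://labs.goo.ne.jp/api/morph"


def build_morph_payload(app_id, sentence, info_filter=None, pos_filter=None):
    """Build the JSON body; the filters are lists joined with '|' as in notes 1 and 2."""
    payload = {"app_id": app_id, "sentence": sentence}
    if info_filter:
        payload["info_filter"] = "|".join(info_filter)  # e.g. "form|read"
    if pos_filter:
        payload["pos_filter"] = "|".join(pos_filter)  # e.g. "名詞|動詞語幹"
    return payload


def post_json(url, payload, timeout=10):
    """POST a JSON body and decode the JSON response (this performs a network call)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def surface_forms(response, pos):
    """Collect surface forms with the given part of speech.

    Assumes an unfiltered response, where each morpheme is [form, pos, read].
    """
    return [
        form
        for sentence in response["word_list"]
        for form, morpheme_pos, _reading in sentence
        if morpheme_pos == pos
    ]


# A small excerpt of the response shown above, used here instead of a live call.
sample = {"word_list": [[["自分", "名詞", "ジブン"], ["で", "格助詞", "デ"],
                         ["自分", "名詞", "ジブン"], ["を", "格助詞", "ヲ"]]]}
nouns = surface_forms(sample, "名詞")  # ["自分", "自分"]
```

With a real app_id, `post_json(MORPH_URL, build_morph_payload(app_id, text))` would send the same request as the curl command. Note that `surface_forms` would need adjusting if `info_filter` drops any of the three fields.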
2. Named entity extraction API: extracts person names, place names, and the like from a string
curl -v -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '{"app_id":"{YOUR_APP_ID}","request_id":"record002","sentence":"“自分で自分をデザインする人工知能Webサイト(AI Websites That Design Themselves)”をスローガンとするThe Gridは、ユーザが提供するコンテンツを人工知能が判断して自動的に新たなサイトを作るWebサイトビルダーだ。同社は今年の初めに、ソフトウェア製品としては珍しく、クラウドファンディングキャンペーンで大成功したが、このほどさらに大きな資金調達に踏み切った。同社の今日の発表によると、The GridはシリーズAで460万ドルのラウンドを、Jerry YangのAME Cloud Venturesのリードの下(もと)に終了した。そのほかの投資家は、Disney Interactiveの元社長John Pleasants、元Facebookのプロダクト担当VP Greg Badros、そしてElegant ThemesのファウンダNick Roachだ。RoachはThe Gridの310万ドルのシードラウンドをリードした。Yangは今日の発表声明で次のように述べている: “The Gridのクラウド上の人工知能は、Web開発の定型的で時間のかかる部分を人間の手から取り去ることによって、Webデザインという仕事を抜本的に変えた。それによって今では、エレガントなWebサイトをわずかな時間とわずかな費用で作れるようになった。これからは、現代的なWebデザインを誰もができるようになるだろう”。"}' https://labs.goo.ne.jp/api/entity
=> {"request_id":"record002","ne_list":[["AI","PSN"],["今日","DAT"],["The Grid","ART"],["Yang","ORG"],["AME Cloud Ventures","ORG"],["Disney Interactive","PSN"],["John Pleasants","PSN"],["Facebook","ORG"],["Elegant Themes","ORG"],["Roach","ORG"],["The Grid","ORG"],["Yang","ORG"],["今日","DAT"]]}
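※3 Adding something like "class_filter":"ART|ORG|PSN|LOC|DAT|TIM" to the request JSON restricts the output to the listed classes, chosen from six: ART (artifact names), ORG (organization names), PSN (person names), LOC (place names), DAT (date expressions), and TIM (time expressions).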
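The `ne_list` response is a flat list of [text, class] pairs, so grouping it by class makes it easier to read. Here is a minimal sketch under that assumption. The helper names `build_entity_payload` and `group_entities` are hypothetical, and the set of class codes is taken only from note ※3.

```python
from collections import defaultdict

# Class codes as listed in note 3 of this article.
KNOWN_CLASSES = ("ART", "ORG", "PSN", "LOC", "DAT", "TIM")


def build_entity_payload(app_id, sentence, class_filter=None):
    """Build the JSON body; class_filter is a list joined with '|'."""
    payload = {"app_id": app_id, "sentence": sentence}
    if class_filter:
        unknown = [code for code in class_filter if code not in KNOWN_CLASSES]
        if unknown:
            raise ValueError(f"unknown entity classes: {unknown}")
        payload["class_filter"] = "|".join(class_filter)
    return payload


def group_entities(response):
    """Group entity strings by class, dropping repeats and keeping first-seen order."""
    grouped = defaultdict(list)
    for text, ne_class in response["ne_list"]:
        if text not in grouped[ne_class]:
            grouped[ne_class].append(text)
    return dict(grouped)


# An excerpt of the response shown above, used here instead of a live call.
sample = {"ne_list": [["AI", "PSN"], ["今日", "DAT"], ["The Grid", "ART"],
                      ["Yang", "ORG"], ["The Grid", "ORG"], ["Yang", "ORG"],
                      ["今日", "DAT"]]}
grouped = group_entities(sample)
# {"PSN": ["AI"], "DAT": ["今日"], "ART": ["The Grid"], "ORG": ["Yang", "The Grid"]}
```

Grouping also makes it easy to spot cases like "The Grid" above, which the API returned under both ART and ORG.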
3. Word similarity API: scores how likely two words are spelling variants of each other
curl -v -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '{"app_id":"{YOUR_APP_ID}","request_id":"record003","query_pair":["atrae","アトラエ"]}' https://labs.goo.ne.jp/api/similarity
=> {"request_id":"record003","score":0.6529354884753005}
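※4 Rather than measuring similarity between words in general, the score seems to be designed to detect when two strings are spelling variants of the same word.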
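If the score is used to flag spelling variants, some cutoff has to be chosen. The sketch below shows one way to do that. The helper names and the 0.6 threshold are assumptions made for illustration, not values taken from the goo Labs documentation, so the cutoff should be tuned against your own word pairs.

```python
def build_similarity_payload(app_id, first, second):
    """Build the JSON body with the two words to compare."""
    return {"app_id": app_id, "query_pair": [first, second]}


def looks_like_variant(response, threshold=0.6):
    """Return True when the score reaches the threshold (0.6 is an arbitrary guess)."""
    return response["score"] >= threshold


# The response shown above for the pair "atrae" / "アトラエ".
sample = {"request_id": "record003", "score": 0.6529354884753005}
is_variant = looks_like_variant(sample)  # True with the assumed 0.6 threshold
```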
4. Hiragana conversion API: converts Japanese text to hiragana or katakana
curl -v -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '{"app_id":"{YOUR_APP_ID}","request_id":"record004","sentence":"自分で自分をデザインする人工知能Webサイト(AI Websites That Design Themselves)をスローガンとするThe Gridは、ユーザが提供するコンテンツを人工知能が判断して自動的に新たなサイトを作るWebサイトビルダーだ。","output_type":"hiragana"}' https://labs.goo.ne.jp/api/hiragana
=> {"request_id":"record004","output_type":"hiragana","converted":"じぶんで じぶんを でざいんする じんこうちのううぇぶさいと (えーあい だぶりゅいーびーえすあいてぃーいーえす ざっと でざいん ぜむせるぶず)を すろーがんと する ざ ぐりっどは、 ゆーざが ていきょうする こんてんつを じんこうちのうが はんだんして じどうてきに あらたな さいとを つくる うぇぶさいとびるだーだ。"}
※5 Setting "output_type":"katakana" produces katakana output instead.
※6 This could be a great tool when you want to convert a site for children!
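For the conversion request, the only thing that varies is `output_type`, so validating it up front catches typos before a request is sent. This is a minimal sketch. The helper names `build_hiragana_payload` and `converted_words` are hypothetical, and the two accepted values are the ones mentioned in this article.

```python
# The two output types mentioned in this article.
VALID_OUTPUT_TYPES = ("hiragana", "katakana")


def build_hiragana_payload(app_id, sentence, output_type="hiragana"):
    """Build the JSON body, rejecting output types other than hiragana and katakana."""
    if output_type not in VALID_OUTPUT_TYPES:
        raise ValueError(f"output_type must be one of {VALID_OUTPUT_TYPES}")
    return {"app_id": app_id, "sentence": sentence, "output_type": output_type}


def converted_words(response):
    """Split the converted text on the spaces the API inserts between phrases."""
    return response["converted"].split()


# An excerpt of the response shown above, used here instead of a live call.
sample = {"output_type": "hiragana", "converted": "じぶんで じぶんを でざいんする"}
words = converted_words(sample)  # ["じぶんで", "じぶんを", "でざいんする"]
```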