More than 3 years have passed since last update.

Let's All Become Kane Kosugi!!!

#1. Introduction
Does everyone know Kane Kosugi's romaji diary?
It's the one that became a talking point because it makes you feel like you can read English!
(It doesn't seem to be updated anymore…)

ズルいと話題! ケイン・コスギの爆笑ローマ字日記。しかし、その本当の理由は・・・
KANE & STAFF DIARY - Kane Kosugi

Kane is active both in sports and as an actor.
It's hard to believe that is the body of a 45-year-old!
For me, "Kinniku Banzuke" and Ninja Black from "Ninja Sentai Kakuranger" left the strongest impression.

Here, we automatically turn Japanese text into a Kane-style romaji diary!!!

#2. Environment
・MacBook Pro
・Python 3.7.4
・COTOHA API
・Google Translation API
・Visual Studio Code

#3. What I want to do
[Figure: input and output]
This is implemented with three APIs and a dictionary.

##3.1 COTOHA named entity extraction
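The selection rule for this step can be sketched as a small predicate. This is a minimal sketch rather than the script's exact code: `should_translate` and the sample entities are hypothetical, and only the `class` and `extended_class` fields and the class names that appear in section 4.1 are assumed.

```python
# Classes that the script in section 4.1 sends to translation.
TARGET_CLASSES = {"ORG", "PSN", "LOC", "ART"}
TARGET_EXTENDED_CLASSES = {"Date", "Nature_Color", "Compound"}

def should_translate(entity):
    """True when a named entity belongs to one of the target classes."""
    return (entity.get("class") in TARGET_CLASSES
            or entity.get("extended_class") in TARGET_EXTENDED_CLASSES)

# Hypothetical sample shaped like the entries of the `result` list.
entities = [
    {"form": "ケインコスギ", "class": "PSN", "extended_class": ""},
    {"form": "45歳", "class": "NUM", "extended_class": "Age"},
]
selected = [e["form"] for e in entities if should_translate(e)]
print(selected)  # ['ケインコスギ']
```

Keeping the class names in sets makes it easy to add or drop a class without touching the condition itself.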

##3.2 Google Translation API

##3.3 COTOHA syntactic parsing

##3.4 Dictionary
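The dictionary step amounts to a lookup with a fallback. The following is a minimal sketch under stated assumptions: `to_kane_word`, `WORDS`, and the sample tokens are hypothetical, and the tokens only assume the `form` and `kana` fields that section 4.1 reads from the parse result.

```python
# Hypothetical mini dictionary: Japanese surface form -> English word.
WORDS = {"筋肉": "muscle", "番付": "ranking"}

def to_kane_word(token, words=WORDS):
    """Return the English word if the form is registered, else its kana reading."""
    return words.get(token["form"], token["kana"])

# Hypothetical tokens shaped like the parse result.
tokens = [
    {"form": "筋肉", "kana": "キンニク"},
    {"form": "優勝", "kana": "ユウショウ"},
]
converted = [to_kane_word(t) for t in tokens]
print(converted)  # ['muscle', 'ユウショウ']
```

Words that fall back to their kana reading are turned into romaji later, in step 3.7.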

##3.5 Everything else

##3.6 Put the pieces back in their original positions

##3.7 Convert everything to romaji

##3.8 Capitalize the start of each sentence
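Capitalizing sentence starts can be sketched with a single regular expression. This is an illustrative variant, not the script's method: `capitalize_sentences` is a hypothetical helper, and unlike the script in section 4.1 (which splits the text into lines and capitalizes each line) it also capitalizes after `!` and `?`.

```python
import re

def capitalize_sentences(text):
    """Upper-case the first letter of the text and of each sentence after '.', '!' or '?'."""
    return re.sub(
        r"(^|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )

print(capitalize_sentences("kyou Kara CM satsuei desu. fight! ippatsu!"))
# Kyou Kara CM satsuei desu. Fight! Ippatsu!
```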

#4. Code

##4.1 kane_kosugi.py

import requests
import json
import sys
import dictionary
import re
from pykakasi import kakasi
from google.cloud import translate
import mojimoji

BASE_URL = "https://api.ce-cotoha.com/api/dev/nlp/"
CLIENT_ID = "Enter your COTOHA CLIENT_ID here"
CLIENT_SECRET = "Enter your COTOHA CLIENT_SECRET here"
project_id = "Enter your GCP project_id here"

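# Convert hiragana (H), katakana (K) and kanji (J) to ASCII romaji (a).
# A separate name avoids shadowing the imported kakasi class.
kks = kakasi()
kks.setMode('H', 'a')
kks.setMode('K', 'a')
kks.setMode('J', 'a')
conv = kks.getConverter()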

# COTOHA authentication (returns an access token)
def auth(client_id, client_secret):
    token_url = "https://api.ce-cotoha.com/v1/oauth/accesstokens"
    headers = {
        "Content-Type": "application/json",
        "charset": "UTF-8"
    }

    data = {
        "grantType": "client_credentials",
        "clientId": client_id,
        "clientSecret": client_secret
    }
    r = requests.post(token_url,
                      headers=headers,
                      data=json.dumps(data))
    return r.json()["access_token"]



# Syntactic parsing
def parse(sentence, access_token):
    base_url = BASE_URL
    headers = {
        "Content-Type": "application/json",
        "charset": "UTF-8",
        "Authorization": "Bearer {}".format(access_token)
    }
    data = {
        "sentence": sentence,
        "type": "default"
    }
    r = requests.post(base_url + "v1/parse",
                      headers=headers,
                      data=json.dumps(data))
    return r.json()



# Named entity extraction
def ne(sentence, access_token):
    base_url = BASE_URL
    headers = {
        "Content-Type": "application/json",
        "charset": "UTF-8",
        "Authorization": "Bearer {}".format(access_token)
    }
    data = {
        "sentence": sentence,
        "type": "default"
    }
    r = requests.post(base_url + "v1/ne",
                      headers=headers,
                      data=json.dumps(data))
    return r.json()



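# Google translation (Japanese -> English)
def translate_text(text, project_id):

    client = translate.TranslationServiceClient()
    # Build the resource name by hand; the location_path() helper is not
    # available in every version of google-cloud-translate.
    parent = "projects/{}/locations/global".format(project_id)

    response = client.translate_text(
        parent=parent,
        contents=[text],
        mime_type="text/plain",
        source_language_code="ja",
        target_language_code="en",
    )

    # Only one text is sent, so return the first translation.
    for translation in response.translations:
        return translation.translated_text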

# Analysis: reads the module-level ne_document, parse_document and project_id
def parsing(target):
    dic = dictionary.dictionay()

    # Named entities in these classes are translated into English.
    ne_classes = ('ORG', 'PSN', 'LOC', 'ART')
    ne_extended_classes = ('Date', 'Nature_Color', 'Compound')
    for entity in ne_document['result']:
        print(entity)
        if entity['class'] in ne_classes or entity['extended_class'] in ne_extended_classes:
            # No leading space when the entity starts the text.
            prefix = '' if entity['begin_pos'] == 0 else ' '
            target = target.replace(entity['form'], prefix + translate_text(entity['form'].upper(), project_id))

    for chunk in parse_document['result']:
        for token in chunk["tokens"]:
            print(token)
            form = token['form']
            # Start of sentence: no leading space
            if token["id"] == 0:
                if token["pos"] != "Number":
                    # Dictionary word if registered, otherwise the kana reading
                    target = target.replace(form, dic.get(form, token['kana']))

            # Punctuation such as 、。!
            elif token["pos"] == "句点" or token["pos"] == "読点":
                break

            elif token["pos"] == "Number" or token["pos"] == "助数詞":
                # Keep the form when it has no hiragana or kanji; otherwise use its reading
                if re.search("[ぁ-ん一-龥]", form):
                    target = target.replace(form, ' ' + token['kana'])
                else:
                    target = target.replace(form, ' ' + form)

            elif token["pos"].endswith("接尾辞"):
                target = target.replace(form, token['kana'])

            elif token["pos"] == "Symbol":
                target = target.replace(form, ' ' + form[0].upper() + form[1:])

            # Everything else
            else:
                target = target.replace(form, ' ' + dic.get(form, token['kana']))
    # Convert the remaining kana and kanji to romaji
    target = conv.do(target)
    return target
       
                     
if __name__ == "__main__":
    
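    document = '今日からセンチュリー21のCM撮影です。ファイトー!!いっぱーつ!'
    # Use the first command-line argument as the input text when given.
    args = sys.argv
    if len(args) >= 2:
        document = str(args[1])
    # Normalize after choosing the input, so the argument is processed too:
    # lower-case, then full-width -> half-width (kana left untouched).
    document = document.lower()
    target = mojimoji.zen_to_han(document, kana=False)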

    access_token = auth(CLIENT_ID, CLIENT_SECRET)
    parse_document = parse(document, access_token)
    ne_document = ne(document,access_token)    
    text = parsing(target)
    text = text.replace('','\n').splitlines()
    
    for kane_text in text:
        kane_text = kane_text.replace('','-')
        kane_text = kane_text.replace('','.')
        kane_text = kane_text.replace('',',')
        kane_text = kane_text[0].upper() + kane_text[1:]
        print(kane_text)   

##4.2 dictionary.py

# Usage
# 1. List entries with more characters first.
# 2. Register alphabetic keys in half-width lowercase, e.g. 'cm': 'CM'

def dictionay():
    words = {
        # 10 characters or more
        'ありがとうございます': 'Thank you',
        # 9 characters
        # 8 characters
        'シチュエーション': 'situation',
        # 7 characters
        # 6 characters
        'アンバサダー': 'Ambassador',
        'キャラクター': 'character',
        'クランクイン': 'crank in',
        'トーナメント': 'tournament',
        'トレーニング': 'training',
        'ファンクラブ': 'fan club',
        'プロジェクト': 'project',
        # 5 characters
        'ありがとう': 'Thanks',
        'アクション': 'action',
        'バースデー': 'birthday',
        'プレミアム': 'premium',
        'メッセージ': 'message',
        # 4 characters
        'イベント': 'event',
        'サポート': 'support',
        'シーズン': 'season',
        'スタート': 'start',
        'スポーツ': 'sports',
        'デザイン': 'design',
        'ハッピー': 'happy',
        'ファイト': 'Fight',
        # 3 characters
        'ゲーム': 'game',
        '誕生日': 'birthday',
        '皆さん': 'everyone',
        'メール': 'mail',
        'シーン': 'scene',
        'チーム': 'team',
        'ハード': 'hard',
        'パワー': 'power',
        'ホテル': 'hotel',
        'ボトル': 'bottle',
        'ラスト': 'last',
        # 2 characters
        'から': 'Kara',
        'cm': 'CM',
        '休日': 'holiday',
        '筋肉': 'muscle',
        '日記': 'diary',
        '番付': 'ranking',
        # 1 character
        # End marker (must stay last)
        'n9ez6ia2dfgQ': '',
    }
    return words

I picked out words that Kane writes in English in his diary,
plus words he seems likely to use.
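The first usage rule in dictionary.py asks for longer entries to be listed first. One way to avoid maintaining that order by hand is to sort the entries by key length. This is only a sketch: `sort_longest_first` is a hypothetical helper that is not part of the article's code, and it relies on dicts preserving insertion order (Python 3.7 or later).

```python
def sort_longest_first(words):
    """Return a new dict whose keys are ordered from longest to shortest."""
    return dict(sorted(words.items(), key=lambda kv: len(kv[0]), reverse=True))

words = {"cm": "CM", "ありがとう": "Thanks", "ありがとうございます": "Thank you"}
ordered = sort_longest_first(words)
print(list(ordered))  # ['ありがとうございます', 'ありがとう', 'cm']
```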

#5. Let's try it!

$ python kane_kosugi.py "今日からセンチュリー21のCM撮影です。"
Kyou Kara Century 21 no CM satsuei desu.
$ python kane_kosugi.py "スーパー変化、ドロンチェンジャー!ニンジャブラック、ジライヤ!人に隠れて悪を斬る。忍者戦隊!カクレンジャー見参!"
Suupaa henka, Dron Changer! Ninja Black, Jiraiya! hito ni kakurete aku wo kiru.
Ninja Sentai! Kakuranger kenzan!
$ python kane_kosugi.py "タウリン1,000mg配合リポビタンD!"
Taurin 1,000 mg haigou Lipovitan D!
$ python kane_kosugi.py "筋肉番付で総合優勝しました。"
Muscle ranking de sougou yuushou shima shita.
$ python kane_kosugi.py "デスティニープロダクションズ所属のケインコスギです。45歳、身長181cm、出身はアメリカのロサンゼルスです。"
Destiny Productions shozoku no Kane Kosugi desu.
45 sai, shinchou 181 cm, shusshin ha America no Los Angeles desu.

Long sentences are OK too!
In the second example, the word in the transformation scene came out as "henka".
I wanted it to be "henge"...

#6. Summary
Now anyone can become Kane.
All that's left is to train your body.
(Training is the hardest part, though...)

#7. References
COTOHA API Portal
Qiita「募ってはいるが、募集はしていない」 人たちへ
Qiita オレ プログラム ウゴカス オマエ ゲンシジン ナル
Cloud Translation API の基礎を学びます。 - Google Cloud
