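This script picks out the Japanese words in a dictionary that can be written as numbers and prints them.
It is written in Python.
It turned out to be surprisingly easy to write.
Unlike C, Python makes string processing easy.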
Examples are 4649: よろしく and 4101040: よいおとしを.
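To make the mapping concrete, the two examples can be broken down one kana at a time. This is only an illustrative sketch: `readings` is a hand-picked subset of the readings in the nw.py table, and both `readings` and `to_digits` are hypothetical names that do not appear in the script.

```python
# Hypothetical, hand-picked subset of readings (kana -> number string).
readings = {"よ": "4", "ろ": "6", "し": "4", "く": "9",
            "い": "1", "お": "0", "と": "10", "を": "0"}

def to_digits(word):
    # Every reading in this subset is a single kana, so map character by character.
    return "".join(readings[ch] for ch in word)

print(to_digits("よろしく"))      # 4649
print(to_digits("よいおとしを"))  # 4101040
```

Note that と contributes the two digits "10", which is why よいおとしを comes out as seven digits for six kana.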
The Japanese dictionary is jd.txt in the current directory. It is a list of hiragana words, one word per line, each terminated by \n.
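If the word list might contain stray entries, a loader along the following lines could filter them out. This is a sketch under assumptions, not part of nw.py: the helper name `load_words` is hypothetical, the file is assumed to be UTF-8, and the long-vowel mark ー is allowed because readings such as つー use it.

```python
import re

# A line made only of hiragana (U+3041..U+3096) plus the long-vowel mark "ー".
HIRAGANA = re.compile(r"[\u3041-\u3096ー]+")

def load_words(path="jd.txt"):
    # Assumes UTF-8. Keeps one word per line, dropping blank lines
    # and any line that is not pure hiragana.
    with open(path, encoding="utf-8") as f:
        stripped = (line.strip() for line in f)
        return [w for w in stripped if HIRAGANA.fullmatch(w)]
```

Calling `load_words()` with no argument reads jd.txt from the current directory, matching the layout described above.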
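There are also special cases such as 40010: しまんと and 50010: ごまんと. Because this is a quick script, forms like these are handled by adding them to the internal table. I do not expect there to be many of them.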
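Among the readings for a number, longer strings are placed first, so that they are tried before the shorter readings that are their prefixes.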
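One caveat: `cw` in nw.py walks the table row by row, so a short reading in an earlier row (for example し under 4) is tried before a longer reading in a later row (しまんと under 40010). One possible way to let longer readings win across rows is to flatten the table and sort it by reading length. The sketch below illustrates that idea and is not what the script does: `table` is a small subset of `nw`, and `build_pairs` and `match_longest` are hypothetical names.

```python
# Small subset of the nw table, for illustration only.
table = [["4", "よつ", "よん", "よ", "し"],
         ["10", "とう", "とお", "と", "じゅう"],
         ["10000", "まん", "ばん"],
         ["40010", "しまんと"]]

def build_pairs(rows):
    # Flatten to (reading, number) pairs, longest reading first.
    pairs = [(k, row[0]) for row in rows for k in row[1:]]
    return sorted(pairs, key=lambda p: len(p[0]), reverse=True)

def match_longest(word, pairs):
    # Greedily consume the longest reading at each position; "" if stuck.
    out, idx = "", 0
    while idx < len(word):
        for k, s in pairs:
            if word.startswith(k, idx):
                out, idx = out + s, idx + len(k)
                break
        else:
            return ""
    return out

pairs = build_pairs(table)
print(match_longest("しまんと", pairs))  # 40010
print(match_longest("よし", pairs))      # 44
```

This stays greedy and does not backtrack, so a word can still be rejected when a long reading consumes kana that a different split would have needed.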
nw.py
#!/usr/bin/python3
nw=[ [ "0","えん","まる","ま","れい","れ","おお","を","ぜろ","おう","お"],
[ "1","いっ","いち","い","ひと","かず","わん"],
[ "2","に","ふた","じ","ふ","つー","つう","つ"],
[ "3","さん","ざん","さ","ざ","み","すりー","すり"],
[ "4","よつ","よん","よ","し","ふぉ","ふぉー","ふぉう" ],
[ "5","ご","いつ","こ","ふぁいぶ"],
[ "6","むつ","む","ろく","ろ","る","しっくす"],
[ "7","なな","な","しち","せぶん" ],
[ "8","や","はち","は","ぱち","えいと"],
[ "9","ここの","く","きゅう","きゅ","ないん" ],
[ "10","とう","とお","と","じゅう"],
[ "100","ひゃく","もも"],
[ "1000","いっせん","せん","ち"],
[ "10000","まん","ばん"],
[ "40010","しまんと"],
[ "50000","ごまん"],
[ "50010","ごまんと"],
[ "100000000","おく"],
[ ":","たい"]
]
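def cw(w,idx):
    # Try every reading in table order at position idx.
    # Returns (new index, number string), or (idx, "") if nothing matches.
    for i in nw:
        s=i[0]
        j=i[1:]
        for k in j:
            if w[idx:idx+len(k)]==k:
                idx+=len(k)
                return idx,s
    return idx,""

def check(w):
    # Convert the whole word to a digit string.
    # Returns "" as soon as some part of the word has no reading.
    n=""
    idx=0
    while idx<len(w):
        idx,t=cw(w,idx)
        if not t:
            return ""
        n=n+t
    return n

def main():
    # jd.txt holds one hiragana word per line.
    with open("jd.txt","r") as f:
        a=f.read().split()
    for w in a:
        num=check(w)
        if num:
            print(num,":",w)

if __name__=="__main__":
    main()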
Output when run:
1 : い
11 : いい
1109 : いいおく
11093 : いいおくさん
115 : いいこ
111000019 : いいせんをいく
11104 : いいとし
11291 : いいにくい
111 : いいひと
118 : いいや
...
(and so on)