言語処理100本ノック 2015 (NLP 100 Exercise 2015)
52. Stemming
http://www.cl.ecei.tohoku.ac.jp/nlp100/
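"Take the output of 51 as input, apply Porter's stemming algorithm, and output each word and its stem in tab-separated format. In Python, the stemming module is a convenient implementation of Porter's stemming algorithm."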
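As a rough illustration of the required output format, the sketch below pairs each word with its stem and joins the two with a single tab. The helper name word_stem_lines is hypothetical. The fallback stem in the except branch is only a crude stand-in for environments where the stemming package is missing; it is not the Porter algorithm.

```python
try:
    # Third-party package named in the exercise (pip install stemming).
    from stemming.porter2 import stem
except ImportError:
    def stem(word):
        """Crude stand-in, NOT the Porter algorithm: strip one common suffix."""
        for suffix in ("ing", "ed", "s"):
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                return word[:-len(suffix)]
        return word


def word_stem_lines(words):
    """Yield one 'word<TAB>stem' line per input word (hypothetical helper)."""
    for word in words:
        yield "{}\t{}".format(word, stem(word))


if __name__ == "__main__":
    for line in word_stem_lines(["natural", "language", "processing"]):
        print(line)
```

Building the line with an explicit "\t" keeps the two fields separated by exactly one tab, which is what tools such as cut or a TSV reader expect.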
An Amateur's NLP 100 Exercise: 52 (素人の言語処理100本ノック:52)
https://qiita.com/segavvy/items/d47fa799883ed16eddc2
# ./p52.py
Traceback (most recent call last):
  File "./p52.py", line 6, in <module>
    from stemming.porter2 import stem
ModuleNotFoundError: No module named 'stemming'
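This error means that the Python interpreter running ./p52.py cannot find a package named stemming on its import path. As a small sketch (the helper name is_installed is hypothetical), the standard library's importlib.util.find_spec can check for a top-level package without importing it:

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Show which interpreter is doing the lookup, then the result.
    print("interpreter:", sys.executable)
    print("stemming available:", is_installed("stemming"))
```

Printing sys.executable alongside the result helps when more than one Python installation is present on the machine.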
The source is below (I added the first line because I wanted to run it as a command).
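#!/usr/bin/env python
# coding: utf-8

import codecs
import re
from stemming.porter2 import stem

fin = codecs.open('nlp.txt', 'r', 'utf_8')
punctuation = ""  # sliding window of the last three characters
src = []          # list of sentences, each a list of words
string = []       # words of the sentence being built
word = ""         # characters of the word being built

if __name__ == "__main__":
    n = 0  # number of words collected so far
    for line in fin:
        for x in line:
            if n == 100:  # stop after the first 100 words
                break
            punctuation = punctuation + x
            if len(punctuation) > 3:
                punctuation = punctuation[1:4]
            # Sentence boundary: . ; : ? or ! then whitespace then a capital letter
            if re.search(r'\.\s[A-Z]|\;\s[A-Z]|\:\s[A-Z]|\?\s[A-Z]|\!\s[A-Z]', punctuation):
                src.append(string)
                string = []
                word = ""
            if x == " " or x == "\n":
                # A space or newline ends the current word
                if word != "":
                    string.append(word)
                    word = ""
                    n = n + 1
            elif x == "." or x == ",":
                pass  # drop periods and commas
            else:
                word = word + x
    fin.close()
    if word != "":
        # Keep the last word when the file does not end with a newline
        string.append(word)
    src.append(string)
    for stringx in src:
        for wordx in stringx:
            # Word and stem separated by a single tab
            print(wordx, stem(wordx), sep="\t")
        print("")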
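The character loop above keeps a three-character window and starts a new sentence whenever it sees one of . ; : ? ! followed by a whitespace character and a capital letter. As an alternative sketch (not the code used in this post, and without the 100-word limit), the same boundary rule can be written with re.split and lookarounds. The names BOUNDARY, split_sentences, and tokenize are hypothetical.

```python
import re

# Same boundary rule as the loop: one of . ; : ? ! then one whitespace
# character, then an uppercase ASCII letter that starts the next sentence.
BOUNDARY = re.compile(r'(?<=[.;:?!])\s(?=[A-Z])')


def split_sentences(text):
    """Split text at sentence boundaries, keeping the punctuation."""
    return BOUNDARY.split(text)


def tokenize(sentence):
    """Drop periods and commas, then split on whitespace."""
    return sentence.replace(".", "").replace(",", "").split()


if __name__ == "__main__":
    sample = "This is one. Here is two, ok? Yes."
    for sentence in split_sentences(sample):
        print(tokenize(sentence))
```

The lookbehind and lookahead leave the punctuation and the capital letter in place, so only the whitespace between two sentences is consumed by the split.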
Install stemming with pip
# pip install stemming
Collecting stemming
Downloading https://files.pythonhosted.org/packages/d1/eb/fd53fb51b83a4e3b8e98cfec2fa9e4b99401fce5177ec346e4a5c61df71e/stemming-1.0.1.tar.gz
Building wheels for collected packages: stemming
Running setup.py bdist_wheel for stemming ... done
Stored in directory: /root/.cache/pip/wheels/e8/05/2e/2ddeb64d4464b854b48323f9676528c17560da7d153db7b0e2
Successfully built stemming
Installing collected packages: stemming
Successfully installed stemming-1.0.1
You are using pip version 18.1, however version 19.0 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
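The plain pip command above installs into whichever Python installation that pip belongs to. If several interpreters are present, one way to make sure the package lands in the interpreter that runs the script is to invoke pip as a module through sys.executable. A minimal sketch, with hypothetical helper names pip_install_command and install:

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    """Run the installation; raises CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(package))


if __name__ == "__main__":
    # Only print the command here; call install("stemming") to run it.
    print(" ".join(pip_install_command("stemming")))
```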
# pip install --upgrade pip
Collecting pip
Downloading https://files.pythonhosted.org/packages/60/64/73b729587b6b0d13e690a7c3acd2231ee561e8dd28a58ae1b0409a5a2b20/pip-19.0-py2.py3-none-any.whl (1.4MB)
100% |████████████████████████████████| 1.4MB 2.0MB/s
Installing collected packages: pip
Found existing installation: pip 18.1
Uninstalling pip-18.1:
Successfully uninstalled pip-18.1
Successfully installed pip-19.0
# ./p52.py
Solved.
Thank you very much for reading to the end.
Please press the like icon 💚 and follow me.