Trying out SudachiPy

Posted at 2022-03-23
id text
1 メロスは激怒した。
2 必ず、かの邪智暴虐の王を除かなければならぬと決意した。
3 メロスには政治がわからぬ。
4 メロスは、村の牧人である。
5 笛を吹き、羊と遊んで暮して来た。
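The tokenizing code reads its input from a DataFrame named df, but the article does not show how that DataFrame is created. A minimal sketch, assuming the table above is held in a pandas DataFrame with the columns id and text:

```python
import pandas as pd

# Assumption: the article never shows how df is built. This simply
# reproduces the table above using the same column names (id, text).
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4, 5],
        "text": [
            "メロスは激怒した。",
            "必ず、かの邪智暴虐の王を除かなければならぬと決意した。",
            "メロスには政治がわからぬ。",
            "メロスは、村の牧人である。",
            "笛を吹き、羊と遊んで暮して来た。",
        ],
    }
)

print(df.shape)
```

Any other way of loading the sentences (for example from a CSV file) would work equally well, as long as the text column holds one sentence per row.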
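import pandas as pd
from sudachipy import dictionary, tokenizer

# dict="full" needs the full dictionary package (sudachidict_full) installed
tokenizer_obj = dictionary.Dictionary(dict="full").create()
mode = tokenizer.Tokenizer.SplitMode.C  # C yields the longest split units

# df is the table shown above (columns: id, text)
# Keep only nouns (名詞) and verbs (動詞), in normalized form, for each sentence
doc = []
for text in df["text"]:
    t = tokenizer_obj.tokenize(text, mode)
    d = [m.normalized_form() for m in t if m.part_of_speech()[0] in ["名詞", "動詞"]]
    doc.append(d)
docs = pd.array([" ".join(d) for d in doc])
print(docs)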

#<StringArray>
#['メロス 激怒 為る',
#'邪知 暴虐 王 除く 成る 決意 為る',
#'メロス 政治 分かる',
#'メロス 村 牧人 有る',
#'笛 吹く 羊 遊ぶ 暮らす 来る']
#Length: 5, dtype: string