
MeCab: Overview / Code

Posted at 2022-07-07

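MeCab (morphological analysis engine)

  1. Installing the dependencies
  2. Installing a dictionary
  3. Running MeCab
  4. Extracting only specific parts of speech

Installing the dependencies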

   sudo apt install mecab
   sudo apt install libmecab-dev libmecab2 swig
   sudo apt install mecab-ipadic mecab-ipadic-utf8
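
A quick way to confirm the installs above took effect is to look for the commands on PATH. The following is a minimal sketch, assuming the `mecab` package provides the `mecab` command, `libmecab-dev` provides `mecab-config`, and `swig` provides `swig`; the helper name `missing_commands` is hypothetical.

```python
import shutil

# Commands expected on PATH after the apt installs (see the assumptions above).
REQUIRED_COMMANDS = ("mecab", "mecab-config", "swig")


def missing_commands(commands=REQUIRED_COMMANDS, which=shutil.which):
    """Return the commands that cannot be found on PATH."""
    return [cmd for cmd in commands if which(cmd) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All expected commands were found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.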

Installing a dictionary

  • Install the mecab-ipadic-NEologd dictionary (recommended because it is updated frequently)
    git clone --depth 1 https://github.com/neologd/mecab-ipadic-neologd.git
    cd mecab-ipadic-neologd
    sudo ./bin/install-mecab-ipadic-neologd -n

    • Usage (prints the path of the installed dictionary):
            echo `mecab-config --dicdir`"/mecab-ipadic-neologd"


  • Install the IPA dictionary (IPADIC) (updates have stalled, but it is easy to use)
    sudo update-alternatives --config mecab-dictionary

fugashi and mecab-python3 bring MeCab itself along automatically when they are installed, but not a dictionary; the dictionary has to be installed separately.
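
The dictionary path printed by the `mecab-config --dicdir` command above can also be resolved from Python, which avoids hard-coding it in scripts. This is a minimal sketch, not part of the original setup; the function names `neologd_dir` and `mecab_dicdir` are hypothetical, and it assumes `mecab-config` is on PATH when MeCab's development package is installed.

```python
import os
import shutil
import subprocess


def neologd_dir(dicdir):
    """Join MeCab's dictionary directory with the NEologd folder name."""
    return os.path.join(dicdir, "mecab-ipadic-neologd")


def mecab_dicdir():
    """Return the output of `mecab-config --dicdir`, or None if unavailable."""
    if shutil.which("mecab-config") is None:
        return None
    result = subprocess.run(
        ["mecab-config", "--dicdir"], capture_output=True, text=True
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    dicdir = mecab_dicdir()
    if dicdir is None:
        print("mecab-config is not available; install libmecab-dev first.")
    else:
        path = neologd_dir(dicdir)
        status = "installed" if os.path.isdir(path) else "not installed"
        print(f"{path} ({status})")
```

The resulting path is the value to pass to `-d` on the command line or in the `MeCab.Tagger` argument string.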

Running MeCab

  • Basic usage (default output)
   mecab -d /usr/lib/x86_64-linux-gnu/mecab/dic/mecab-ipadic-neologd
  • Simplified version of the above output
   mecab -Osimple -d /usr/lib/x86_64-linux-gnu/mecab/dic/mecab-ipadic-neologd
  • wakati (words separated by spaces)
   mecab -Owakati -d /usr/lib/x86_64-linux-gnu/mecab/dic/mecab-ipadic-neologd
  • ChaSen-compatible output
   mecab -Ochasen -d /usr/lib/x86_64-linux-gnu/mecab/dic/mecab-ipadic-neologd
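
To make the difference between the output modes concrete: the default mode prints one token per line as the surface form, a tab, and a comma-separated feature list, followed by a final `EOS` line, while `-Owakati` prints only the surface forms separated by spaces. The sketch below parses the default format; the sample text is hand-written in the IPADIC feature layout for illustration (it is not captured MeCab output), and `parse_default_output` is a hypothetical helper.

```python
def parse_default_output(text):
    """Parse MeCab's default output into (surface, features) pairs."""
    tokens = []
    for line in text.splitlines():
        # Skip blank lines and the end-of-sentence marker.
        if not line or line == "EOS":
            continue
        surface, _, feature_str = line.partition("\t")
        tokens.append((surface, feature_str.split(",")))
    return tokens


# Hand-written sample in the IPADIC feature layout (illustrative only).
SAMPLE = (
    "夏\t名詞,一般,*,*,*,*,夏,ナツ,ナツ\n"
    "は\t助詞,係助詞,*,*,*,*,は,ハ,ワ\n"
    "EOS\n"
)

tokens = parse_default_output(SAMPLE)
for surface, features in tokens:
    # features[0] is the top-level part of speech.
    print(surface, features[0])

# Joining the surfaces with spaces mimics what -Owakati prints.
print(" ".join(surface for surface, _ in tokens))
```

For simplicity the sketch treats every line that is exactly `EOS` as the sentence terminator.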

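Extracting only specific parts of speech

  • Dictionary: install unidic-lite
   pip install unidic-lite
  • Look up the ID of the item to extract, e.g. a part of speech (nouns in this example)
   less pos-id.def
  • Create a Python script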
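import MeCab

# Load the NEologd dictionary installed earlier.
mecab = MeCab.Tagger('-Osimple -d /usr/lib/x86_64-linux-gnu/mecab/dic/mecab-ipadic-neologd')

sentence = '夏ってこんなに暑かったっけ'
node = mecab.parseToNode(sentence)

# Walk the node list and print only tokens whose part-of-speech ID
# falls in the noun range (36-67) looked up in pos-id.def.
while node:
    if 36 <= node.posid <= 67:
        print(node.surface)
    node = node.next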
  • Run it
   python3 scriptName.py
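
Instead of hard-coding the noun range, the IDs can be read from pos-id.def, where each line holds a feature pattern followed by its numeric ID. The sketch below shows the idea under stated assumptions: the sample text is an illustrative excerpt assumed to follow the layout of IPADIC's pos-id.def, the stand-in nodes replace real MeCab nodes so the example runs without MeCab, and `pos_ids` and `surfaces_with_pos` are hypothetical helper names.

```python
from types import SimpleNamespace


def pos_ids(pos_id_def_text, pos):
    """Return the sorted IDs whose top-level part of speech equals `pos`."""
    ids = []
    for line in pos_id_def_text.splitlines():
        parts = line.rsplit(None, 1)  # "<features> <id>"
        if len(parts) != 2:
            continue
        features, id_str = parts
        if features.split(",")[0] == pos:
            ids.append(int(id_str))
    return sorted(ids)


def surfaces_with_pos(node, ids):
    """Walk a MeCab-style node chain and collect matching surface forms."""
    wanted = set(ids)
    found = []
    while node:
        if node.posid in wanted:
            found.append(node.surface)
        node = node.next
    return found


# Illustrative excerpt, assumed to follow the pos-id.def layout.
SAMPLE_POS_ID_DEF = """\
助詞,係助詞,*,* 16
名詞,サ変接続,*,* 36
名詞,一般,*,* 38
連体詞,*,*,* 68
"""

# Stand-in nodes (surface, posid, next) so the sketch runs without MeCab.
tail = SimpleNamespace(surface="は", posid=16, next=None)
head = SimpleNamespace(surface="夏", posid=38, next=tail)

noun_ids = pos_ids(SAMPLE_POS_ID_DEF, "名詞")
print(noun_ids)
print(surfaces_with_pos(head, noun_ids))
```

With MeCab installed, the node returned by `parseToNode` in the script above could be passed to `surfaces_with_pos` in place of the stand-in nodes, and the contents of the real pos-id.def file in place of the sample text.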