More than 5 years have passed since last update.

A NEET's 100 Knocks of Natural Language Processing: 6

"""
05. n-gram
Write a function that builds n-grams from a given sequence (a string, a list, etc.). Use it to obtain the word bi-grams and character bi-grams of the sentence "I am an NLPer".
""" 
text = "I am an NLPer"


def make_str_bigram(text, n=2):

    # Turn the string into a list of characters
    # Remove the spaces from the list with a lambda
    # Assign the result back to text

    text = list(filter(lambda s: s != ' ', list(text)))
    # Zip n shifted copies of the list so that only adjacent items are grouped
    iterr = zip(*(text[i:] for i in range(n)))
    for i in iterr:
        print(i)


def make_word_bigram(text, n=2):

    text = text.split(' ')  # If we split outside the function, there'd be no need to write make_word_bigram at all
    iterr = zip(*(text[i:] for i in range(n)))
    for i in iterr:
        print(i)



make_str_bigram(text)
make_word_bigram(text)
('I', 'a')
('a', 'm')
('m', 'a')
('a', 'n')
('n', 'N')
('N', 'L')
('L', 'P')
('P', 'e')
('e', 'r')
('I', 'am')
('am', 'an')
('an', 'NLPer')
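
The exercise asks for one function that works on any sequence, so here is a minimal sketch of that idea. The name `ngram` is a hypothetical helper of my own, not something from the code above, and unlike `make_str_bigram` it assumes spaces should be kept when working on characters. It returns a list instead of printing, so the result can be reused.

```python
def ngram(seq, n=2):
    """Return the contiguous n-grams of seq (a string or a list) as a list of tuples."""
    # zip over n shifted views of the sequence; zip stops at the shortest view,
    # so only complete windows of length n are produced
    return list(zip(*(seq[i:] for i in range(n))))


text = "I am an NLPer"

word_bigrams = ngram(text.split(), 2)  # word bi-grams
char_bigrams = ngram(text, 2)          # character bi-grams, spaces included

print(word_bigrams)
print(char_bigrams)
```

With this, `ngram(text.split(), 2)` gives `[('I', 'am'), ('am', 'an'), ('an', 'NLPer')]`, and passing the raw string gives the character version. Whether spaces belong in character bi-grams is a judgment call, so filter them out first if you don't want them.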