
NLP 100 Exercise (言語処理100本ノック): Chapter 1

Posted at 2018-02-23

I worked through Chapter 1: Warm-up (準備運動) of NLP 100 Exercise 2015 (言語処理100本ノック 2015),
a well-known tutorial on natural language processing (NLP).
See the linked page for the problem statements.

Note:
I confirmed that all of the code ran as of the first draft,
but I may revise the answers later as I see fit.

###Environment

$ python
Python 3.6.2 |Anaconda custom (64-bit)| (default, Jul 20 2017, 13:14:59)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

###Answers

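00: Reversing a string

p00.py
# reversed() yields the characters back to front; join() glues them together
string = ''.join(reversed('stressed'))
Alternative
# A slice with step -1 walks the string backwards
string = 'stressed'[::-1]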

01: "パタトクカシーー"

p01.py
# The 1st, 3rd, 5th, and 7th characters sit at 0-based indexes 0, 2, 4, 6
string = ''.join(c for idx, c in enumerate('パタトクカシーー') if idx % 2 == 0)
Alternative
string = 'パタトクカシーー'[::2]
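
As a quick sanity check, here is a minimal sketch that compares the index-filter approach with the slice. The variable names `text`, `by_filter`, and `by_slice` are mine, not from the original answer. The problem counts characters from 1, so the odd-numbered characters are the even 0-based indexes.

```python
text = 'パタトクカシーー'

# 1st, 3rd, 5th, 7th characters == 0-based indexes 0, 2, 4, 6
by_filter = ''.join(c for idx, c in enumerate(text) if idx % 2 == 0)
by_slice = text[::2]

print(by_filter)  # パトカー
print(by_slice)   # パトカー
```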

02: "パトカー" + "タクシー" = "パタトクカシーー"

p02.py
string = ''.join(c1 + c2 for c1, c2 in zip('パトカー', 'タクシー'))

03: Pi

p03.py
import re

words = 'Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics.'.split(' ')
# Strip punctuation (non-word characters) before counting the letters of each word
lst = [len(re.sub(r'\W', '', word)) for word in words]
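
For reference, here is a variant sketch that pulls out the words with `re.findall` instead of splitting and then stripping punctuation. This is just another way to get the same list, not the approach used in the answer above, and the names `sentence` and `lengths` are mine. The word lengths spell out the leading digits of pi.

```python
import re

sentence = ('Now I need a drink, alcoholic of course, after the heavy '
            'lectures involving quantum mechanics.')

# \w+ matches each run of word characters, so commas and periods are skipped
lengths = [len(word) for word in re.findall(r'\w+', sentence)]

print(lengths)  # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]
```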

04: Element symbols

p04.py
string = 'Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations Might Also Sign Peace Security Clause. Arthur King Can.'
str_lst = string.split()

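# 1-based positions of the words that contribute only their first letter
one_lst = [1, 5, 6, 7, 8, 9, 15, 16, 19]
lst = [word[:1] if pos in one_lst else word[:2] for pos, word in enumerate(str_lst, start=1)]

# Map each extracted symbol to the 1-based position of its word
dct = {symbol: pos for pos, symbol in enumerate(lst, start=1)}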

05: n-gram

p05.py
def char_ngram(string, n):
    """ Character-based N-Gram """
    string = string.replace(' ', '')
    return [string[i:i+n] for i in range(len(string)-n+1)]

def word_ngram(string, n):
    """ Word-based N-Gram """
    string = string.split(' ')
    return [string[i:i+n] for i in range(len(string)-n+1)]
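
A short usage sketch follows. The sentence "I am an NLPer" is the one from the original problem statement, which this article does not quote, so treat it as my assumption about the intended input. The two functions are repeated so that the snippet runs on its own.

```python
def char_ngram(string, n):
    """Character-based n-gram (spaces are removed before slicing)."""
    string = string.replace(' ', '')
    return [string[i:i+n] for i in range(len(string)-n+1)]

def word_ngram(string, n):
    """Word-based n-gram (each n-gram is a list of words)."""
    words = string.split(' ')
    return [words[i:i+n] for i in range(len(words)-n+1)]

print(char_ngram('I am an NLPer', 2))
# ['Ia', 'am', 'ma', 'an', 'nN', 'NL', 'LP', 'Pe', 'er']
print(word_ngram('I am an NLPer', 2))
# [['I', 'am'], ['am', 'an'], ['an', 'NLPer']]
```

Note that because spaces are dropped first, character bi-grams such as 'Ia' span word boundaries.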

06: Sets

p06.py
from p05 import char_ngram

X = set(char_ngram('paraparaparadise', 2))
Y = set(char_ngram('paragraph', 2))

U = X.union(Y)         # union
I = X.intersection(Y)  # intersection
D = X.difference(Y)    # difference

# Check whether the bi-gram 'se' is contained in X and in Y
for st in [X, Y]:
    print('se' in st)
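
For comparison, the same set operations can be written with the operators `|`, `&`, and `-`. The sketch below is self-contained: `bigrams` is a hypothetical inline helper of mine that stands in for `char_ngram(..., 2)` so the snippet does not need `p05.py`.

```python
def bigrams(text):
    """Hypothetical stand-in for char_ngram(text, 2)."""
    return {text[i:i+2] for i in range(len(text) - 1)}

X = bigrams('paraparaparadise')
Y = bigrams('paragraph')

print(sorted(X | Y))  # union
# ['ad', 'ag', 'ap', 'ar', 'di', 'gr', 'is', 'pa', 'ph', 'ra', 'se']
print(sorted(X & Y))  # intersection
# ['ap', 'ar', 'pa', 'ra']
print(sorted(X - Y))  # difference
# ['ad', 'di', 'is', 'se']
print('se' in X, 'se' in Y)  # True False
```

Sets are unordered, so `sorted()` is used here only to make the printed output stable.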

07: Sentence generation from a template

p07.py
def gen_sentence(x, y, z):
    # Template from the problem statement: "x時のyはz"
    return f'{x}時の{y}は{z}'

print(gen_sentence(12, '気温', 22.4))

08: Cipher text

p08.py
def cipher(string):
    return ''.join(chr(219-ord(c)) if c.islower() else c for c in string)

string = "I couldn't believe that I could actually understand what I was reading"
print(string)
print(cipher(string))         # encoded
print(cipher(cipher(string))) # decoded
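
A note on why the same function both encodes and decodes: `ord('a') + ord('z')` is 219, so `219 - ord(c)` mirrors each lowercase ASCII letter within the alphabet (a to z, b to y, and so on), and applying the mirror twice gives back the original. The sketch below repeats `cipher` so that it runs on its own; the sample input 'abc xyz' is mine.

```python
def cipher(string):
    # Mirror lowercase letters; leave every other character untouched
    return ''.join(chr(219 - ord(c)) if c.islower() else c for c in string)

print(ord('a') + ord('z'))        # 219
print(cipher('abc xyz'))          # zyx cba
print(cipher(cipher('abc xyz')))  # abc xyz
```

One caveat: `str.islower()` is also true for non-ASCII lowercase letters, so this version assumes ASCII input like the sentence used here.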

09: Typoglycemia

p09.py
import random

def char_shuffle(word):
    """Shuffle the interior characters, keeping the first and last in place."""
    middle = list(word[1:-1])
    random.shuffle(middle)
    return word[0] + ''.join(middle) + word[-1]

string = "I couldn't believe that I could actually understand what I was reading : the phenomenal power of the human mind ."
str_lst = string.split(' ')
# Only words longer than four characters are shuffled
string = ' '.join(char_shuffle(word) if len(word) > 4 else word for word in str_lst)
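
The problem asks that the first and last characters of each word stay fixed and that only words longer than four characters are scrambled. Since the output is random, one way to test an answer is to check those properties rather than a fixed string. The following is a minimal sketch of such a check; `shuffle_interior` is a hypothetical helper name of mine.

```python
import random
from collections import Counter

def shuffle_interior(word):
    """Hypothetical helper: shuffle all but the first and last character."""
    if len(word) <= 4:
        return word
    middle = list(word[1:-1])
    random.shuffle(middle)
    return word[0] + ''.join(middle) + word[-1]

sentence = "I couldn't believe that I could actually understand what I was reading"
for original in sentence.split(' '):
    scrambled = shuffle_interior(original)
    # The first and last characters must not move
    assert scrambled[0] == original[0] and scrambled[-1] == original[-1]
    # The same characters must be present, only reordered
    assert Counter(scrambled) == Counter(original)
    print(original, '->', scrambled)
```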
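
09-EX: Typoglycemia (rejected)

Here I lay to rest a rejected answer that came from misreading the problem.
This code randomly rearranges the order of the words that have five or more characters.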

p09-EX.py
import random

string = "I couldn't believe that I could actually understand what I was reading : the phenomenal power of the human mind ."
str_lst = string.split(' ')

idx_lst = [idx for idx, word in enumerate(str_lst) if len(word) > 4]
random.shuffle(idx_lst)

# Put the index of each short word back at its original position
for idx in range(len(str_lst)):
    if idx not in idx_lst:
        idx_lst.insert(idx, idx)

string = ' '.join(str_lst[i] for i in idx_lst)

Comments

06: Uses char_ngram(string, n) defined in 05.
