言語処理100本ノック 2015
42.
係り元と係り先の文節の表示
http://www.cl.ecei.tohoku.ac.jp/nlp100/
「係り元の文節と係り先の文節のテキストをタブ区切り形式ですべて抽出せよ.ただし,句読点などの記号は出力しないようにせよ..」
素人の言語処理100本ノック:42
https://qiita.com/segavvy/items/58894f76dba367b2925b
# cp ../chap03/neko.txt .
# ./p42.py
(前略)
Traceback (most recent call last):
File "./p42.py", line 90, in neco_lines
raise StopIteration
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./p42.py", line 94, in <module>
for chunks in neco_lines():
RuntimeError: generator raised StopIteration
ソースは下記(コマンドとして実行したく1行目追記)
#!/usr/bin/env python
# coding: utf-8
import CaboCha
import re
fname = 'neko.txt'
fname_parsed = 'neko.txt.cabocha'
def parse_neko():
with open(fname) as data_file, \
open(fname_parsed, mode='w') as out_file:
cabocha = CaboCha.Parser()
for line in data_file:
out_file.write(cabocha.parse(line).toString(CaboCha.FORMAT_LATTICE))
class Morph:
def __init__(self, surface, base, pos, pos1):
self.surface = surface
self.base = base
self.pos = pos
self.pos1 = pos1
def __str__(self):
return 'surface[{}]\tbase[{}]\tpos[{}]\tpos1[{}]'\
.format(self.surface, self.base, self.pos, self.pos1)
class Chunk:
def __init__(self):
self.morphs = []
self.srcs = []
self.dst = -1
def __str__(self):
surface = ''
for morph in self.morphs:
surface += morph.surface
return '{}\tsrcs{}\tdst[{}]'.format(surface, self.srcs, self.dst)
def normalized_surface(self):
result = ''
for morph in self.morphs:
if morph.pos != '記号':
result += morph.surface
return result
def neco_lines():
with open(fname_parsed) as file_parsed:
chunks = dict()
idx = -1
for line in file_parsed:
if line == 'EOS\n':
if len(chunks) > 0:
sorted_tuple = sorted(chunks.items(), key=lambda x: x[0])
yield list(zip(*sorted_tuple))[1]
chunks.clear()
else:
yield []
elif line[0] == '*':
cols = line.split(' ')
idx = int(cols[1])
dst = int(re.search(r'(.*?)D', cols[2]).group(1))
if idx not in chunks:
chunks[idx] = Chunk()
chunks[idx].dst = dst
if dst != -1:
if dst not in chunks:
chunks[dst] = Chunk()
chunks[dst].srcs.append(idx)
else:
cols = line.split('\t')
res_cols = cols[1].split(',')
chunks[idx].morphs.append(
Morph(
cols[0],
res_cols[6],
res_cols[0],
res_cols[1]
)
)
raise StopIteration
parse_neko()
for chunks in neco_lines():
for chunk in chunks:
if chunk.dst != -1:
src = chunk.normalized_surface()
dst = chunks[chunk.dst].normalized_surface()
if src != '' and dst != '':
print('{}\t{}'.format(src, dst))
43
Traceback (most recent call last):
File "./p43.py", line 94, in neco_lines
raise StopIteration
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./p43.py", line 98, in <module>
for chunks in neco_lines():
RuntimeError: generator raised StopIteration
45
Traceback (most recent call last):
File "./p45.py", line 100, in neco_lines
raise StopIteration
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./p45.py", line 105, in <module>
for chunks in neco_lines():
RuntimeError: generator raised StopIteration
46
Traceback (most recent call last):
File "./p46.py", line 111, in neco_lines
raise StopIteration
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./p46.py", line 116, in <module>
for chunks in neco_lines():
RuntimeError: generator raised StopIteration
47
Traceback (most recent call last):
File "./p47.py", line 120, in neco_lines
raise StopIteration
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./p47.py", line 125, in <module>
for chunks in neco_lines():
RuntimeError: generator raised StopIteration
48
Traceback (most recent call last):
File "./p48.py", line 120, in neco_lines
raise StopIteration
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./p48.py", line 125, in <module>
for chunks in neco_lines():
RuntimeError: generator raised StopIteration
49
Traceback (most recent call last):
File "./p49.py", line 133, in neco_lines
raise StopIteration
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./p49.py", line 138, in <module>
for chunks in neco_lines():
RuntimeError: generator raised StopIteration
一覧
物理記事 上位100
https://qiita.com/kaizen_nagoya/items/66e90fe31fbe3facc6ff
量子(0) 計算機, 量子力学
https://qiita.com/kaizen_nagoya/items/1cd954cb0eed92879fd4
数学関連記事100
https://qiita.com/kaizen_nagoya/items/d8dadb49a6397e854c6d
言語・文学記事 100
https://qiita.com/kaizen_nagoya/items/42d58d5ef7fb53c407d6
医工連携関連記事一覧
https://qiita.com/kaizen_nagoya/items/6ab51c12ba51bc260a82
自動車 記事 100
https://qiita.com/kaizen_nagoya/items/f7f0b9ab36569ad409c5
通信記事100
https://qiita.com/kaizen_nagoya/items/1d67de5e1cd207b05ef7
日本語(0)一欄
https://qiita.com/kaizen_nagoya/items/7498dcfa3a9ba7fd1e68
英語(0) 一覧
https://qiita.com/kaizen_nagoya/items/680e3f5cbf9430486c7d
転職(0)一覧
https://qiita.com/kaizen_nagoya/items/f77520d378d33451d6fe
仮説(0)一覧(目標100現在40)
https://qiita.com/kaizen_nagoya/items/f000506fe1837b3590df
Qiita(0)Qiita関連記事一覧(自分)
https://qiita.com/kaizen_nagoya/items/58db5fbf036b28e9dfa6
鉄道(0)鉄道のシステム考察はてっちゃんがてつだってくれる
https://qiita.com/kaizen_nagoya/items/26bda595f341a27901a0
安全(0)安全工学シンポジウムに向けて: 21
https://qiita.com/kaizen_nagoya/items/c5d78f3def8195cb2409
一覧の一覧( The directory of directories of mine.) Qiita(100)
https://qiita.com/kaizen_nagoya/items/7eb0e006543886138f39
Ethernet 記事一覧 Ethernet(0)
https://qiita.com/kaizen_nagoya/items/88d35e99f74aefc98794
Wireshark 一覧 wireshark(0)、Ethernet(48)
https://qiita.com/kaizen_nagoya/items/fbed841f61875c4731d0
線網(Wi-Fi)空中線(antenna)(0) 記事一覧(118/300目標)
https://qiita.com/kaizen_nagoya/items/5e5464ac2b24bd4cd001
OSEK OS設計の基礎 OSEK(100)
https://qiita.com/kaizen_nagoya/items/7528a22a14242d2d58a3
Error一覧 error(0)
https://qiita.com/kaizen_nagoya/items/48b6cbc8d68eae2c42b8
++ Support(0)
https://qiita.com/kaizen_nagoya/items/8720d26f762369a80514
Coding(0) Rules, C, Secure, MISRA and so on
https://qiita.com/kaizen_nagoya/items/400725644a8a0e90fbb0
プログラマによる、プログラマのための、統計(0)と確率のプログラミングとその後
https://qiita.com/kaizen_nagoya/items/6e9897eb641268766909
なぜdockerで機械学習するか 書籍・ソース一覧作成中 (目標100)
https://qiita.com/kaizen_nagoya/items/ddd12477544bf5ba85e2
言語処理100本ノックをdockerで。python覚えるのに最適。:10+12
https://qiita.com/kaizen_nagoya/items/7e7eb7c543e0c18438c4
プログラムちょい替え(0)一覧:4件
https://qiita.com/kaizen_nagoya/items/296d87ef4bfd516bc394
官公庁・学校・公的団体(NPOを含む)システムの課題、官(0)
https://qiita.com/kaizen_nagoya/items/04ee6eaf7ec13d3af4c3
「はじめての」シリーズ ベクタージャパン
https://qiita.com/kaizen_nagoya/items/2e41634f6e21a3cf74eb
AUTOSAR(0)Qiita記事一覧, OSEK(75)
https://qiita.com/kaizen_nagoya/items/89c07961b59a8754c869
プログラマが知っていると良い「公序良俗」
https://qiita.com/kaizen_nagoya/items/9fe7c0dfac2fbd77a945
LaTeX(0) 一覧
https://qiita.com/kaizen_nagoya/items/e3f7dafacab58c499792
自動制御、制御工学一覧(0)
https://qiita.com/kaizen_nagoya/items/7767a4e19a6ae1479e6b
Rust(0) 一覧
https://qiita.com/kaizen_nagoya/items/5e8bb080ba6ca0281927
小川清最終講義、最終講義(再)計画, Ethernet(100) 英語(100) 安全(100)
https://qiita.com/kaizen_nagoya/items/e2df642e3951e35e6a53
<この記事は個人の過去の経験に基づく個人の感想です。現在所属する組織、業務とは関係がありません。>
文書履歴(document history)
ver. 0.01 20230526
最後までおよみいただきありがとうございました。
いいね 💚、フォローをお願いします。
Thank you very much for reading to the last sentence.
Please press the like icon 💚 and follow me for your happy life.