NLP 100 Exercises (言語処理100本ノック), Chapter 2: My Solutions

Posted at 2024-08-03

Introduction

10. Counting the number of lines

Python
# dirpath is assumed to hold the directory that contains popular-names.txt
with open(f"{dirpath}/popular-names.txt", "r") as f:
  lines = f.readlines()
  print(len(lines))
UNIX
wc -l popular-names.txt
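
As a side note, the line count can also be obtained without building a list of all lines, by iterating over the file object. The following is only a minimal sketch: `count_lines` is a hypothetical helper name, and the input is an in-memory sample rather than popular-names.txt.

```python
import io

def count_lines(f):
  """Count lines by iterating over a file object one line at a time."""
  return sum(1 for _ in f)

# Made-up, tab-separated sample standing in for the real file
sample = io.StringIO("a\tF\t1\nb\tM\t2\nc\tF\t3\n")
print(count_lines(sample))  # 3
```

With a real file, the same function could be called as `count_lines(open(path))` inside a `with` block.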

11. Replacing tabs with spaces

Python
with open(f"{dirpath}/popular-names.txt", "r") as f:
  lines = f.read().splitlines()
  for line in lines:
    print(line.replace("\t", " "))
UNIX
sed 's/\t/ /g' popular-names.txt

12. Saving the first column to col1.txt and the second column to col2.txt

Python
with open(f"{dirpath}/popular-names.txt", "r") as f:
  lines = f.read().splitlines()
with open(f"{dirpath}/col1.txt", "w") as f:
  for line in lines:
    col1 = line.split("\t")[0]
    f.write(f"{col1}\n")
with open(f"{dirpath}/col2.txt", "w") as f:
  for line in lines:
    col2 = line.split("\t")[1]
    f.write(f"{col2}\n")
UNIX
cut -f 1 popular-names.txt > col1.txt
cut -f 2 popular-names.txt > col2.txt

13. Merging col1.txt and col2.txt

Python
with open(f"{dirpath}/merge.txt", "w") as f0:
  with open(f"{dirpath}/col1.txt", "r") as f1:
    with open(f"{dirpath}/col2.txt", "r") as f2:
      lines1 = f1.read().splitlines()
      lines2 = f2.read().splitlines()
      for c1, c2 in zip(lines1, lines2):
        f0.write(f"{c1}\t{c2}\n")
UNIX
paste col1.txt col2.txt > merge.txt

14. Printing the first N lines

Python
n = int(input())

with open(f"{dirpath}/popular-names.txt", "r") as f:
  lines = f.read().splitlines()
for line in lines[:n]:
  print(f"{line}")
UNIX
read num
head -n "${num}" popular-names.txt

15. Printing the last N lines

Python
n = int(input())

with open(f"{dirpath}/popular-names.txt", "r") as f:
  lines = f.read().splitlines()
# Clamp the start index at 0 so that n larger than the file length prints every line
for line in lines[max(len(lines) - n, 0):]:
  print(f"{line}")
UNIX
read num
tail -n "${num}" popular-names.txt
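
For reference, the last N lines can also be collected while keeping at most N lines in memory, using `collections.deque` with `maxlen`. This is a minimal sketch rather than the solution used above: `tail` is a hypothetical helper name and the input is a made-up in-memory sample.

```python
from collections import deque
import io

def tail(lines, n):
  """Return the last n items of an iterable, holding at most n items at a time."""
  return list(deque(lines, maxlen=n))

# Made-up sample standing in for the real file
sample = io.StringIO("a\nb\nc\nd\n")
for line in tail(sample, 2):
  print(line.rstrip("\n"))
```

When n exceeds the number of lines, this returns every line, and n = 0 returns an empty list.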

16. Splitting the file into N parts

Python
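import os

n = int(input())
with open(f"{dirpath}/popular-names.txt", "r") as f:
  lines = f.read().splitlines()
# Create the output directory if it does not exist yet
os.makedirs(f"{dirpath}/splited_files", exist_ok=True)
# Every file gets len(lines) // n lines; the first len(lines) % n files get one extra
lines_per_output = len(lines) // n
start, stop = 0, 0
for i in range(n):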
  with open(f"{dirpath}/splited_files/split{i}.txt", "w") as fout:
    start = stop
    if i < len(lines) % n:
      stop = start + lines_per_output + 1
    else:
      stop = start + lines_per_output
    for line in lines[start:stop]:
      fout.write(f"{line}\n")
UNIX
read num
split -n l/${num} -d --additional-suffix=.txt popular-names.txt ./splited_files/split
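
The Python solution above hands the remainder lines to the first few output files. The arithmetic can be checked on its own with a small sketch; `chunk_sizes` is a hypothetical helper name introduced only for illustration.

```python
def chunk_sizes(total_lines, n):
  """Return how many lines each of the n output files receives."""
  base, extra = divmod(total_lines, n)
  # The first `extra` files receive one additional line
  return [base + 1 if i < extra else base for i in range(n)]

print(chunk_sizes(10, 3))  # [4, 3, 3]
```

The sizes always sum to the total number of lines, and they differ from each other by at most one.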

17. Distinct strings in the first column

Python
with open(f"{dirpath}/popular-names.txt", "r") as f:
  lines = f.read().splitlines()
S = set()
for line in lines:
  S.add(line.split("\t")[0])
print(S)
print(len(S))
UNIX
cut -f 1 popular-names.txt | sort | uniq

18. Sorting lines in descending order of the numeric value in the third column

Python
with open(f"{dirpath}/popular-names.txt", "r") as f:
  lines = f.read().splitlines()
l = []
for line in lines:
  l.append(line.split("\t"))
# Column 3 is read as a string, so convert it to int for a numeric sort
result = sorted(l, key=lambda x: int(x[2]), reverse=True)
for line in result:
  print("\t".join(line))
UNIX
sort -n -r -k 3,3 popular-names.txt
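
Because the third column is read from the file as text, the sort key decides whether values are compared as strings or as numbers. The sketch below uses made-up rows (not taken from popular-names.txt) to show how the two orderings differ.

```python
# Made-up rows for illustration: name, sex, count, year
rows = [
  ["Alice", "F", "9", "2000"],
  ["Bob", "M", "100", "2000"],
  ["Carol", "F", "25", "2000"],
]

# String comparison orders "9" above "25" above "100"
by_string = sorted(rows, key=lambda r: r[2], reverse=True)
# Numeric comparison after converting the column to int
by_number = sorted(rows, key=lambda r: int(r[2]), reverse=True)

print([r[2] for r in by_string])  # ['9', '25', '100']
print([r[2] for r in by_number])  # ['100', '25', '9']
```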

19. Counting the frequency of each string in the first column and sorting by descending frequency

Python
import collections

with open(f"{dirpath}/popular-names.txt", "r") as f:
  lines = f.read().splitlines()
l = []
for line in lines:
  l.append(line.split("\t")[0])
c = collections.Counter(l)
result = sorted(c.items(), key=lambda x: x[1], reverse=True)
for count in result:
  print(count[0])
UNIX
cut -f 1 popular-names.txt | sort | uniq -c | sort -nr | awk '{print $2}'
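
As an alternative sketch, `collections.Counter.most_common()` returns the items already ordered from most to least frequent, so the counts can be printed next to each name. The names below are made-up sample data, not the contents of popular-names.txt.

```python
from collections import Counter

# Made-up sample of first-column values
names = ["Mary", "Anna", "Mary", "John", "Anna", "Mary"]
counts = Counter(names)

# most_common() yields (name, frequency) pairs in descending order of frequency
for name, freq in counts.most_common():
  print(f"{freq}\t{name}")
```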