
Study diary #1 with "つくってマスターPython"

Posted at 2022-03-04

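Since I am studying Python, I want to build something reasonably practical, so starting with this post I will study web scraping using the book "つくってマスターPython" as my textbook.

The book is written for beginners, so I will skip the basic material at the beginning and start from chapter 3.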

chapter3 Making use of libraries

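The chapter opens with an introduction to libraries. I will skip the explanation of what a library is and summarize the ones covered in the book.
Sections 3-1 through 3-3 cover the standard library, whose modules can be used without installing anything. Sections 3-4 and 3-5 cover libraries that must be installed.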

3-1 Basic values

math module (math.function_name(arguments))

Function	Description	Usage
ceil	Round up to the nearest integer	math.ceil(number)
floor	Round down to the nearest integer	math.floor(number)
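
As a small illustration of the two functions in the table, the sketch below rounds a positive and a negative value. The variable names are made up for this example; the point is that "up" and "down" refer to the number line, which matters for negative numbers.

```python
import math

value = 3.2  # sample value chosen for this sketch

# ceil rounds toward positive infinity, floor toward negative infinity.
rounded_up = math.ceil(value)        # 4
rounded_down = math.floor(value)     # 3

# For negative numbers, floor moves away from zero.
negative_up = math.ceil(-value)      # -3
negative_down = math.floor(-value)   # -4

print(rounded_up, rounded_down, negative_up, negative_down)
```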

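random module (random.function_name(arguments))

Function	Description	Usage
random	Returns a random float (0 or more, less than 1); takes no arguments	random.random()
randrange	Returns a random integer from 0 (or the lower bound) up to, but not including, the upper bound	random.randrange([lower,] upper)
choice	Picks one element at random from a list	random.choice(list)
shuffle	Shuffles a list in place (the list passed as the argument is modified)	random.shuffle(list)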
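
The functions in the table can be tried out as follows. This is only a sketch: the seed value and the variable names are assumptions made for the example, and random.randint (not listed in the table) is included to contrast its inclusive upper bound with randrange.

```python
import random

random.seed(0)  # assumption: a fixed seed, only so repeated runs behave the same

unit = random.random()                # float, 0.0 <= unit < 1.0
below_ten = random.randrange(10)      # int, 0 <= below_ten < 10
die = random.randrange(1, 7)          # int, 1..6 (the upper bound 7 is excluded)
die_inclusive = random.randint(1, 6)  # int, 1..6 (randint includes both ends)

cards = ['A', 'B', 'C', 'D']
picked = random.choice(cards)         # one element of the list
shuffle_result = random.shuffle(cards)  # shuffles in place and returns None

print(unit, below_ten, die, die_inclusive, picked, cards)
```

Because shuffle modifies the list it is given, copy the list first (for example with list(cards)) if the original order is still needed.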

Making a dice game
We make a dice game using the random module.

dice_game.py

import random

me = 0
you = 0
end_point = 20

while True:
    input('--push enter or return--')
    rnd = random.randint(1, 6)  # randint includes both ends, so this rolls 1 to 6
    you += rnd
    print('you:' + str(rnd) + ' total:' + str(you))

    if you >= end_point:  # leave the loop once the condition is met
        print('***you win !***')
        break

    rnd = random.randint(1, 6)
    me += rnd
    print('me:' + str(rnd) + ' total:' + str(me))

    if me >= end_point:
        print('***I win !***')
        break

print('---end---')

Random rolls are added to the totals of "you" and "me" in turn, and whoever reaches 20 points first wins.

3-2 Working with dates and times

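datetime module (datetime.ClassName.function_name(arguments))
The table below assumes 'from datetime import ClassName'.

Function	Description	Usage
date.today	Create a date for today	date.today()
date	Create a date from a year, month, and day	date(year, month, day)
time	A time-of-day value	time(hour, minute, second)
datetime	Combination of date and time	datetime(year, month, day, hour, minute, second, microsecond)
timedelta	A length of time (1 day, 1 hour, etc.)	timedelta(days, seconds, microseconds, milliseconds, minutes, hours, weeks)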
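
To show how these classes fit together, here is a minimal sketch with fixed sample values (the dates and variable names are chosen only for this example). It passes keyword arguments to timedelta, since the positional order shown in the table is easy to get wrong.

```python
from datetime import date, time, datetime, timedelta

d = date(2022, 3, 4)
t = time(9, 30, 0)
dt = datetime.combine(d, t)  # build a datetime from a date and a time

# Adding a timedelta to a datetime gives another datetime.
one_week_later = dt + timedelta(weeks=1)

# Subtracting two datetimes gives a timedelta.
gap = datetime(2022, 3, 5, 12, 0) - dt

print(one_week_later.isoformat())  # 2022-03-11T09:30:00
print(gap)                         # 1 day, 2:30:00
```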

Adding and subtracting dates

sum_date.py


from datetime import date, time, datetime, timedelta

today = date.today()
d1 = timedelta(days=1000)
result = today + d1
print(result.isoformat())

sub_date.py

from datetime import date, time, datetime, timedelta

today = date.today()
millennium = date(2001, 1, 1)
result = today - millennium
print(str(result.days) + ' days')

3-3 String processing

Python has a built-in str class, and "Hello" and str("Hello") give the same string.

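Function	Description	Usage
len	Get the number of characters (not a method of the str class)	len(string)
upper	Convert to uppercase	string.upper()
lower	Convert to lowercase	string.lower()
startswith	Whether the string starts with the given string	string.startswith(prefix[, start[, end]])
endswith	Whether the string ends with the given string	string.endswith(suffix[, start[, end]])
find	Search for a string (returns the index, or -1 if not found)	string.find(substring[, start[, end]])
replace	Replace a string	string.replace(old, new[, count])
split	Split a string into a list	string.split(separator)
join	Join a list into a string	separator.join(list)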
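
The sketch below runs each entry of the table once on a sample string. The sample text and variable names are assumptions made for this example.

```python
text = "Hello, Python World"  # sample string for this sketch

length = len(text)                          # 19
shouted = text.upper()                      # "HELLO, PYTHON WORLD"
starts = text.startswith("Hello")           # True
starts_at_7 = text.startswith("Python", 7)  # True: the check begins at index 7
ends = text.endswith("World")               # True
found = text.find("Python")                 # 7
missing = text.find("Java")                 # -1 when the substring is absent
replaced_once = text.replace("l", "L", 1)   # only the first "l" is replaced
words = text.split()                        # no argument: split on whitespace
rejoined = "-".join(words)                  # the separator is the string join is called on

print(length, shouted, starts, starts_at_7, ends)
print(found, missing, replaced_once, words, rejoined)
```

Note that join is called on the separator, not on the list, which is the reverse of what many beginners expect.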

Replacing strings

replace.py

s = "瓜売りが 瓜売りにきて 売り残し 売り売り帰る　瓜売りの声"
result = s.replace("瓜", "〇")
print(s)
print(result)

Replacement using find

find.py

s = """One Little, Two Little, Three Little, Four Little,"""

f = "little"
r = "BIG"
s2 = s.lower()
n = 0

while (s2.find(f, n) != -1):
    i = s2.find(f, n)
    s = s[:i] + r + s[(i + len(f)):]
    s2 = s2[:i] + r + s2[(i + len(f)):]
    n = i + len(r)
    
print(s)
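
For comparison, the same case-insensitive replacement can be written with the standard re module. This is a different technique from the find loop above, not the book's approach; it is shown only as a shorter alternative sketch.

```python
import re

s = "One Little, Two Little, Three Little, Four Little,"
f = "little"
r = "BIG"

# re.escape keeps the search text literal even if it contains regex metacharacters.
# re.IGNORECASE makes "little" match "Little" as well.
result = re.sub(re.escape(f), r, s, flags=re.IGNORECASE)

print(result)  # One BIG, Two BIG, Three BIG, Four BIG,
```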

3-4 Numerical computing with NumPy

NumPy provides a full set of features for processing vector data. For now, think of a vector as "something like a list".

Vector calculations

vec.py

import numpy as np

arr = np.array([10, 20, 30, 40, 50])
print(arr)
print(arr + 10)
print(arr * 2)

arr2 = np.array([5, 10, 15, 20, 25])

print(arr)
print(arr2)
# calculations between two vectors (element by element)
print(arr + arr2)
print(arr * arr2)
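
A vector is only "something like a list": the + and * operators behave differently on the two types. The following sketch, with made-up sample values, puts a plain list and a NumPy array side by side.

```python
import numpy as np

numbers = [10, 20, 30]
arr = np.array(numbers)

# On a list, * repeats the list and + concatenates.
list_doubled = numbers * 2          # [10, 20, 30, 10, 20, 30]
list_joined = numbers + [1, 2, 3]   # [10, 20, 30, 1, 2, 3]

# On a NumPy array, * and + work element by element.
arr_doubled = arr * 2                   # [20 40 60]
arr_added = arr + np.array([1, 2, 3])   # [11 22 33]

print(list_doubled, list_joined)
print(arr_doubled, arr_added)
```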

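The table below assumes import numpy as np.

Function	Description	Usage
zeros	Create a vector of all zeros	np.zeros(count)
ones	Create a vector of all ones	np.ones(count)
arange	Create a vector with a given step (the stop value is excluded)	np.arange(start, stop, step)
linspace	Create a vector with a given number of points (the stop value is included)	np.linspace(start, stop, count)
ravel	Flatten into one vector (joins vectors of the same length)	np.ravel([vector1, vector2, ...])
sum	Sum	np.sum(vector) / vector.sum()
min	Minimum	np.min(vector) / vector.min()
max	Maximum	np.max(vector) / vector.max()
mean	Mean	np.mean(vector) / vector.mean()
median	Median (there is no vector.median() method)	np.median(vector)
var	Variance	np.var(vector) / vector.var()
std	Standard deviation	np.std(vector) / vector.std()
random.randint	Vector of random integers (minimum included, maximum excluded)	np.random.randint(min, max, count)
sin	Sine of a value	np.sin(value)
cos	Cosine of a value	np.cos(value)
pi	π (3.14...)	np.pi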
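
The creation functions are easy to mix up, so here is a short sketch with small sample arguments (chosen for this example). It also shows np.concatenate, which is not in the table, as an assumption-free way to join vectors of different lengths, something np.ravel on a list of vectors is not suited for.

```python
import numpy as np

zeros = np.zeros(3)             # [0. 0. 0.]
ones = np.ones(3)               # [1. 1. 1.]
stepped = np.arange(0, 10, 2)   # [0 2 4 6 8]: the stop value 10 is excluded
divided = np.linspace(0, 1, 5)  # 5 evenly spaced points from 0 to 1, stop included

# ravel flattens; given two vectors of equal length it yields one joined vector.
flat = np.ravel([np.array([1, 2]), np.array([3, 4])])   # [1 2 3 4]

# concatenate joins vectors even when their lengths differ.
joined = np.concatenate([np.array([1, 2]), np.array([3, 4, 5])])  # [1 2 3 4 5]

print(zeros, ones, stepped, divided, flat, joined)
```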

Computing statistics on 100 random integers

calc.py

import numpy as np

arr = np.random.randint(0, 100, 100)

print(arr)
print("min:" + str(np.min(arr)))
print("max:" + str(np.max(arr)))
print("ave:" + str(np.mean(arr)))
print("med:" + str(np.median(arr)))
print("var:" + str(np.var(arr)))
print("std:" + str(np.std(arr)))

3-5 Creating graphs with matplotlib

Using matplotlib

matplot.py

import matplotlib.pyplot as plt

plt.plot([2, 3, 4], [0, 0, 0]) # create a graph from the given x and y data
plt.show() # draw the graph

The table below assumes import matplotlib.pyplot as plt.

Function	Description	Usage
plot	Plot data	plt.plot(x_data, y_data, label=legend_label)
show	Display the graph	plt.show()
title	Set the title	plt.title(title)
legend	Create a legend (from the label values)	plt.legend()
xlabel	Label for the x axis	plt.xlabel(label)
ylabel	Label for the y axis	plt.ylabel(label)
bar	Bar chart	plt.bar(x_data, y_data[, label=label])
pie	Pie chart	plt.pie(data[, labels=label_data])
grid	Show grid lines	plt.grid(which, axis, color, alpha, linestyle, linewidth)

You do not need to specify every argument of grid.

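grid argument	Description
which	Whether to draw major or minor grid lines. One of 'major', 'minor', 'both'
axis	Which axis to draw grid lines for. One of 'x', 'y', 'both'
color	Color, as a hexadecimal value or a color name
alpha	Transparency, a float from 0 to 1
linestyle	Line style, given as ':', '-', '--', '-.', etc.
linewidth	Line width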
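
As a minimal sketch of passing only some of the grid arguments, the example below sets just axis and linestyle. It assumes a non-interactive setup: the "Agg" backend and saving to an in-memory buffer are choices made for this example so that it runs without opening a window, and the data values are made up.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

plt.plot([1, 2, 3], [2, 4, 8], label="sample")

# Only two of the grid arguments are given; the rest keep their defaults.
plt.grid(axis="y", linestyle=":")
plt.legend()

# Save the figure as PNG into memory instead of calling plt.show().
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
plt.close()

png_bytes = buffer.getvalue()
print(len(png_bytes) > 0)
```

In an interactive session, plt.show() can be used in place of the savefig lines.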

For reference, here is the code for a line graph, a bar chart, and a pie chart.

graph.py

import numpy as np
import matplotlib.pyplot as plt

x = np.arange(-np.pi, np.pi, np.pi / 50)
sin_y = np.sin(x)
cos_y = np.cos(x)

plt.plot(x, sin_y, label = 'sin')
plt.plot(x, cos_y, label = 'cos')

plt.title('Sin/Cos Graph')

plt.xlabel('radian')  # x runs from -pi to pi, so the unit is radians, not degrees
plt.ylabel('value')

plt.grid(which='major', axis='x', color='gray', alpha=0.5, linestyle=':', linewidth=1)
plt.grid(which='major', axis='y', color='gray', alpha=0.5, linestyle=':', linewidth=1)


plt.legend()

plt.show()

plt.bar(x, sin_y, label='bar')
plt.show()

random_pie.py

import numpy as np
import matplotlib.pyplot as plt

x = np.random.randint(1, 100, 7)
x.sort()
y = list('ABCDEFG')

plt.pie(x[::-1], labels=y)
plt.title('Random Graph')
plt.legend()
plt.show()

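Since this was the first entry, it ended up being a list of features rather than actual processing.
That wraps up chapter 3, so from tomorrow I will move on to chapter 4, "Document processing".
Thank you for reading.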
