[Unofficial Errata] O'REILLY "Python Data Science Handbook" (Pythonデータサイエンスハンドブック, 1st edition, 5th printing)

Posted at 2024-02-27

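Introduction

It seems the 2nd edition was released just recently, but this article covers the first edition.
Deciding it was about time to study data science, I started running the sample code from the book in the title, but Errors and Warnings came up so often that fixing them took a lot of time.
The main cause seems to be that the book imports many modules whose specifications have changed between the time the book was written and now, so the code no longer matches.
So I put off understanding the material and first worked through all the sample programs until they ran without Errors or Warnings. These are my notes on that work.
Looking around, it seems everyone has struggled with this!
(There were as many fixes as the sticky notes in the photo.)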

Book

O'REILLY Python Data Science Handbook (Pythonデータサイエンスハンドブック), 1st edition, 5th printing, published July 1, 2022

Environment

Windows10
anaconda : 2.5.2
jupyterlab : 3.6.3
Python : 3.11.5
numpy : 1.26.3
pandas : 2.0.3
Matplotlib : 3.8.0
seaborn : 0.12.2
scikit-image : 0.20.0
scikit-learn : 1.2.2
Sample data source (as of 2024/1/10)
URL : https://github.com/jakevdp/PythonDataScienceHandbook
- There are two folders containing sample code; I verified against the "notebooks" folder
  ・notebooks
  ・notebooks_v1 (the previous version?)

Whole book

External files used in the book

Page	File name	Notebook file name	Folder
P61	president_heights.csv	02.04-xx	notebooks/data
P69	Seattle2014.csv	02.06-xx	notebooks_v1/data
P155	state-abbrevs.csv	03.07-xx	notebooks/data
P155	state-areas.csv	03.07-xx	notebooks/data
P155	state-population.csv	03.07-xx	notebooks/data
P185	recipeitems-latest.json	03.10-xx	download url
P256	california_cities.csv	04.06-xx / 04.13-xx	notebooks_v1/data
P273	births.csv	04.09-xx	notebooks/data
P312	gistemp250.nc	04.13-xx	download url
P325	marathon-data.csv	04.14-xx	download url
P403	SeattleWeather.csv	05.06-xx	download url
P403	FremontBridge.csv	05.06-xx	download url
P498	construct_grids (function)	05.13-xx	download url

Note: The "Notebook file name" column shows only the first few characters, because the actual file names are long.
Note: Adjust the file paths to suit your environment.
Note: The download URLs printed in the book appear to have moved, so I looked up new ones (as of 2024/1/20).

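seaborn style sheet names

In the book, seaborn is imported and initialized with seaborn.set().
That works without problems, but the sample code references the style name without importing seaborn,
which produces a Warning when run.
The cause is that the referenced style sheet name is outdated.
For example, changing 'seaborn-whitegrid' to 'seaborn-v0_8-whitegrid' fixes it.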

# Book
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn; seaborn.set()  # plot styling

# Sample code
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')

# Fixed as follows
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('seaborn-v0_8-whitegrid')

# The available style sheet names can be checked with the following command
import matplotlib.pyplot as plt
print(plt.style.available)

The same code appears on the following pages.

p70, p80, p88, p175, p226, p235, p241, p244, p248, p252, p259, p266, p272, p294,
p370, p386, p393, p408, p422, p434, p447, p464, p478, p492, p506
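
Since the style name has to be changed on so many pages, a small helper can keep the notebooks working across Matplotlib versions. This is only a sketch: `pick_style` is a hypothetical helper name of my own, not something from the book or from Matplotlib, and it simply returns the first candidate that appears in `plt.style.available`.

```python
import matplotlib.pyplot as plt


def pick_style(*candidates):
    """Return the first style name that this Matplotlib install provides.

    Hypothetical helper for illustration; falls back to 'default'.
    """
    for name in candidates:
        if name in plt.style.available:
            return name
    return 'default'  # Matplotlib's built-in default style


# Try the new name first, then the old name used by the sample notebooks
style = pick_style('seaborn-v0_8-whitegrid', 'seaborn-whitegrid')
plt.style.use(style)
print(style)
```

Putting this in the first cell of each notebook avoids editing the style name by hand on every page listed above.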

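Chapter 2: NumPy Basics

P69: The book and the sample code differ

* The book reads 'Seattle2014.csv', but the file is in the "notebooks_v1" folder.
* The sample code loads the data from 'vega_datasets', which requires installing
  the 'altair' module.
  Reference: Altairのすすめ！Pythonによるデータの可視化 (an introduction to data visualization with Altair)
  https://qiita.com/kitta65/items/b71bef31ba3a21868095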

# Book
import numpy as np
import pandas as pd

# Use pandas to extract rainfall inches as a NumPy array
rainfall = pd.read_csv('data/Seattle2014.csv')['PRCP'].values
inches = rainfall / 254  # 1/10mm -> inches
inches.shape

# Sample code
import numpy as np
from vega_datasets import data

# Use DataFrame operations to extract rainfall as a NumPy array
rainfall_mm = np.array(
    data.seattle_weather().set_index('date')['precipitation']['2015'])
len(rainfall_mm)

P84: The book's code raises an Error, but the sample code works

# Book
plt.plot(bins, counts, linestyle='steps');

# Sample code
plt.plot(bins, counts, drawstyle='steps');

Chapter 3: Data Manipulation with Pandas

P117: The book's code raises an Error, but the sample code works

# Book
area.index | population.index

# Sample code
area.index.union(population.index)
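
To see the working form in isolation, here is a minimal sketch. The two Series are small illustrative stand-ins that I assume resemble the book's `area` and `population` data; only the `Index.union` call is the point.

```python
import pandas as pd

# Illustrative stand-ins for the book's area and population Series
area = pd.Series({'Alaska': 1723337, 'Texas': 695662, 'California': 423967})
population = pd.Series({'California': 38332521, 'Texas': 26448193,
                        'New York': 19651127})

# Set union of the two indexes (sorted when the labels are comparable)
all_states = area.index.union(population.index)
print(list(all_states))
```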

P118: The book and the sample code differ, but both work

# Book
fill = A.stack().mean()
A.add(B, fill_value=fill)

# Sample code
A.add(B, fill_value=A.values.mean())

# Book
A = rng.randint(10, size=(3, 4))

# Sample code
A = rng.integers(10, size=(3, 4))

P119: The book and the sample code differ, but both work

# Book
df = pd.DataFrame(A, columns=list('QRST'))
df - df.iloc[0]

# Sample code
df = pd.DataFrame(A, columns=['Q', 'R', 'S', 'T'])
df - df.iloc[0]

P126: The book and the sample code differ, but both work

# Book
data = pd.Series([1, np.nan, 2, None, 3], index=list('abcde'))
# Result: dtype: float64

# Sample code
data = pd.Series([1, np.nan, 2, None, 3], index=list('abcde'), dtype='Int32')
# Result: dtype: Int32
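
The difference between the two versions is easier to see side by side. A minimal sketch: the capitalized 'Int32' is pandas' nullable integer dtype, which keeps the values as integers and stores missing entries as `pd.NA` instead of converting everything to float.

```python
import numpy as np
import pandas as pd

values = [1, np.nan, 2, None, 3]

# Default: missing values force a float64 Series
as_float = pd.Series(values, index=list('abcde'))
# Nullable integer dtype: integers are kept, missing values become pd.NA
as_int = pd.Series(values, index=list('abcde'), dtype='Int32')

print(as_float.dtype)  # float64
print(as_int.dtype)    # Int32
print(as_int.isna().sum())
```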

P132: The book's code raises an Error, but the sample code works

# Book
# pandas 2.0.3 has no parameter named "labels"
pd.MultiIndex(levels=[['a', 'b'], [1, 2]],
              labels=[[0, 0, 1, 1], [0, 1, 0, 1]])

# Sample code
pd.MultiIndex(levels=[['a', 'b'], [1, 2]],
              codes=[[0, 0, 1, 1], [0, 1, 0, 1]])
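
As a quick check of what `codes` means, the following sketch builds the same index and compares it with `MultiIndex.from_product`. Each entry in `codes` is a position into the corresponding `levels` list.

```python
import pandas as pd

# codes[i][k] is the position in levels[i] of the k-th entry
mi = pd.MultiIndex(levels=[['a', 'b'], [1, 2]],
                   codes=[[0, 0, 1, 1], [0, 1, 0, 1]])
print(list(mi))

# The same index built from the Cartesian product of the levels
same = pd.MultiIndex.from_product([['a', 'b'], [1, 2]])
print(mi.equals(same))
```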

P140: "3.6.5 Aggregation based on multi-indices": part of the sample code is missing

・Part of the sample code is missing
・It was present in "notebooks_v1/03.05-Hierarchical-Indexing.ipynb", so I copied that part
・Also, the book's code raises an Error, but the copied sample code works

# Book
data_mean = health_data.mean(level='year')

# Sample code
data_mean = health_data.groupby(level='year').mean()

# Book
data_mean.mean(axis=1, level='type')

# Sample code
data_mean.groupby(axis=1, level='type').mean()

P146: 3.7.2.3 The append method

・It is in the book, but pandas.DataFrame.append() has been removed in pandas 2.x
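
The usual replacement for `append()` is `pd.concat`. A minimal sketch, with two small DataFrames made up for illustration:

```python
import pandas as pd

# Illustrative data, not the book's helper-generated frames
df1 = pd.DataFrame({'A': ['A1', 'A2'], 'B': ['B1', 'B2']}, index=[1, 2])
df2 = pd.DataFrame({'A': ['A3', 'A4'], 'B': ['B3', 'B4']}, index=[3, 4])

# Formerly: df1.append(df2)
combined = pd.concat([df1, df2])
print(combined)
```

Like the old `append()`, `pd.concat` returns a new object and leaves `df1` and `df2` unchanged.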

P156: The book's code raises an Error, but the sample code works

# Book
merged = merged.drop('abbreviation', 1) # drop duplicate info

# Sample code
merged = merged.drop('abbreviation', axis=1) # drop duplicate info

P165: The table in the book and the output of the sample code do not match

# Book
planets.groupby('method')['year'].describe().unstack()

# Remove unstack() from the code above
planets.groupby('method')['year'].describe()

P197: Both the book's code and the sample code raise an Error

# Book
from pandas_datareader import data
goog = data.DataReader('GOOG', start='2004', end='2018', data_source='google')
goog.head()

# Sample code
from pandas_datareader import data
sp500 = data.DataReader('^GSPC', start='2018', end='2022', data_source='yahoo')
sp500.head()

# Fixed as follows (with help from ChatGPT)
import yfinance as yf

# Specify the ticker symbol, start date, and end date of the stock data to fetch
ticker_symbol = "GOOGL"  # to fetch Google's stock data
start_date = "2022-01-01"
end_date = "2024-01-01"

# The result keeps the sample code's variable name sp500
sp500 = yf.download(ticker_symbol, start=start_date, end=end_date)
sp500.head()

Chapter 4: Visualization with Matplotlib

P246: The book's code raises an Error, but the sample code works

# Book
plt.imshow(Z, extent=[0, 5, 0, 5], origin='lower', cmap='RdGy')
plt.colorbar();
plt.axis(aspect='image');  # <- this option does not exist
                           # <- deleting this line makes it run, but the image looks coarse

# Sample code
plt.imshow(Z, extent=[0, 5, 0, 5], origin='lower', cmap='RdGy',
           interpolation='gaussian', aspect='equal')
plt.colorbar();

P248: The book's code raises an Error, but the sample code works

# Book
plt.hist(data, bins=30, normed=True, alpha=0.5,
         histtype='stepfilled', color='steelblue',
         edgecolor='none');

# Sample code
plt.hist(data, bins=30, density=True, alpha=0.5,
         histtype='stepfilled', color='steelblue',
         edgecolor='none');

P249: The book's code raises an Error, but the sample code works

# Book
kwargs = dict(histtype='stepfilled', alpha=0.3, normed=True, bins=40)

# Sample code
kwargs = dict(histtype='stepfilled', alpha=0.3, density=True, bins=40)
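
For reference, `density=True` is the replacement for the removed `normed=True`: the bar heights are scaled so that the total area of the histogram is 1. The sketch below checks this on random data that I generated for illustration (the seed and sample size are arbitrary).

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)   # arbitrary seed, for illustration only
data = rng.normal(size=1000)

counts, edges, _ = plt.hist(data, bins=30, density=True, alpha=0.5,
                            histtype='stepfilled')

# With density=True, sum(height * bin width) is 1
area = np.sum(counts * np.diff(edges))
print(area)
```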

P256: The book's code raises an Error, but the sample code works

# Book
# ...
plt.axis(aspect='equal')
# ...
for area in [100, 300, 500]:
plt.scatter([], [], c='k', alpha=0.3, s=area,  # <- needs indentation

# Sample code
# ...
plt.axis('equal')
# ...
for area in [100, 300, 500]:
    plt.scatter([], [], c='k', alpha=0.3, s=area,

P261: Both the book's code and the sample code produce a Warning

# Book
# Sample code
    cmap = plt.cm.get_cmap(cmap)

# Fixed as follows
    cmap = plt.colormaps.get_cmap(cmap)

P264: Both the book's code and the sample code produce a Warning

# Book
# Sample code
plt.imshow(I, cmap=plt.cm.get_cmap('Blues', 6))

# Fixed as follows
plt.imshow(I, cmap=plt.get_cmap('Blues', 6))
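
The second argument of the old `plt.cm.get_cmap(name, n)` call asks for a colormap with only `n` discrete colors, so a replacement should keep that count. The sketch below compares the options, assuming Matplotlib 3.6 or later (where `plt.colormaps` and `Colormap.resampled` are available). Note that `plt.colormaps.get_cmap(name)` takes only a name and returns the full colormap without reducing the number of colors.

```python
import matplotlib.pyplot as plt

# Two ways to get a 6-color version of 'Blues' without plt.cm.get_cmap
blues6_a = plt.get_cmap('Blues', 6)
blues6_b = plt.colormaps['Blues'].resampled(6)

# The registry lookup alone keeps the full-resolution colormap
full = plt.colormaps.get_cmap('Blues')

print(blues6_a.N, blues6_b.N, full.N)
```

Either of the first two forms should give the stepped colorbar that the book's figures show.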

P265: Both the book's code and the sample code produce a Warning

# Book
# Sample code
plt.scatter(projection[:, 0], projection[:, 1], lw=0.1,
            c=digits.target, cmap=plt.cm.get_cmap('plasma', 6))

# Fixed as follows
plt.scatter(projection[:, 0], projection[:, 1], lw=0.1,
            c=digits.target, cmap=plt.get_cmap('plasma', 6))

P273: The book's code raises an Error, but the sample code works

# Book
births = pd.read_csv('births.csv')
# ...
births_by_date.index = [pd.datetime(2012, month, day)
                        for (month, day) in births_by_date.index]

# Sample code
births = pd.read_csv('data/births.csv')
# ...
births_by_date.index = [datetime(2012, month, day)
                        for (month, day) in births_by_date.index]
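
The sample code's bare `datetime(...)` only works if `datetime` has been imported from the standard library, since `pd.datetime` no longer exists. A minimal standalone sketch, where the (month, day) pairs are made up to stand in for `births_by_date.index`:

```python
from datetime import datetime

# Made-up (month, day) pairs standing in for births_by_date.index
month_day_pairs = [(1, 1), (2, 29), (12, 31)]

# 2012 is a leap year, so February 29 is a valid date
index = [datetime(2012, month, day) for (month, day) in month_day_pairs]
print(index)
```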

P302 - P314: A sample code file is missing

・"04.13-Geographic-Data-With-Basemap.ipynb" is missing from the "notebooks" folder.
・It was present in the "notebooks_v1" folder, so I copied it from there
・I have not looked into it in detail, but depending on the values of the Basemap arguments (lat_0, lon_0), warnings like the following are displayed.
 (A Matplotlib bug?)
  ・Clipping input data to the valid range for imshow with RGB data
   ([0..1] for floats or [0..255] for integers).
  ・The input coordinates to pcolormesh are interpreted as cell centers,
   but are not monotonically increasing or decreasing.
   This may lead to incorrectly calculated cell edges, in which case,
   please supply explicit cell edges to pcolormesh.

P315 - P316: "4.16.1 Seaborn versus Matplotlib": part of the sample code is missing

・It is in the book, but the sample code is missing from "notebooks/04.14-Visualization-With-Seaborn.ipynb"
・It was present in "notebooks_v1/04.14-Visualization-With-Seaborn.ipynb", so I copied that part

P317: The book's code raises an Error, but the sample code works

# Book
for col in 'xy':
    plt.hist(data[col], normed=True, alpha=0.5)

# Sample code
for col in 'xy':
    plt.hist(data[col], density=True, alpha=0.5)

P317: The book's code raises an Error, and the sample code produces a Warning

# Book
for col in 'xy':
    sns.kdeplot(data=data, shade=True);

# Sample code
sns.kdeplot(data=data, shade=True);

# Fixed as follows
sns.kdeplot(data=data, fill=True);

P318-P320: Part of the sample code is missing

・It is in the book, but part of the sample code is missing from "notebooks/04.14-Visualization-With-Seaborn.ipynb"
・It was present in "notebooks_v1/04.14-Visualization-With-Seaborn.ipynb", so I copied that part.

P318: Both the book's code and the sample code produce a Warning

# Book
# Sample code
sns.distplot(data['x'])
sns.distplot(data['y']);

# Fixed as follows
sns.histplot(data['x'], kde=True)
sns.histplot(data['y'], kde=True);

P318: Neither the book's code nor the sample code reproduces the figure in the book

# Book
# Sample code
sns.kdeplot(data)

# Fixed as follows
sns.kdeplot(data=data, x='x', y='y');

P319: Both the book's code and the sample code raise an Error

# Book
# Sample code
with sns.axes_style('white'):
    sns.jointplot("x", "y", data, kind='kde')

# Fixed as follows
with sns.axes_style('white'):
    sns.jointplot(x=data['x'], y=data['y'], kind='kde')

P320: Both the book's code and the sample code raise an Error

# Book
# Sample code
with sns.axes_style('white'):
    sns.jointplot("x", "y", data, kind='hex')

# Fixed as follows
with sns.axes_style('white'):
    sns.jointplot(x=data['x'], y=data['y'], kind='hex')

P327: Both the book's code and the sample code raise an Error

# Book
# Sample code
data['split_sec'] = data['split'].view(int) / 1E9
data['final_sec'] = data['final'].view(int) / 1E9

# Fixed as follows
data['split_sec'] = data['split'].view("int64") / 1E9
data['final_sec'] = data['final'].view("int64") / 1E9
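
The error presumably comes from `int` mapping to a 32-bit integer on Windows with NumPy 1.x, while timedelta values are stored as 64-bit nanoseconds. An alternative that avoids reinterpreting the raw values at all is `Series.dt.total_seconds()`. The sketch below uses a tiny made-up DataFrame in place of the marathon data; the column names follow the code above.

```python
import pandas as pd

# Tiny made-up stand-in for the marathon data (timedelta columns)
data = pd.DataFrame({
    'split': pd.to_timedelta(['01:05:38', '01:06:26']),
    'final': pd.to_timedelta(['02:08:51', '02:09:28']),
})

# Convert timedeltas to seconds without going through integer views
data['split_sec'] = data['split'].dt.total_seconds()
data['final_sec'] = data['final'].dt.total_seconds()
print(data)
```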

P329: Both the book's code and the sample code raise an Error

# Book
# Sample code
sns.kdeplot(data.split_frac[data.gender=='M'], label='men', shade=True)
sns.kdeplot(data.split_frac[data.gender=='W'], label='women', shade=True)

# Fixed as follows
sns.kdeplot(data.split_frac[data.gender=='M'], label='men', fill=True)
sns.kdeplot(data.split_frac[data.gender=='W'], label='women', fill=True)

Chapter 5: Machine Learning

P347: Both the book's code and the sample code produce a Warning

# Book
# Sample code
sns.pairplot(iris, hue='species', size=1.5);

# Fixed as follows
sns.pairplot(iris, hue='species', height=1.5);

P355: The book's code raises an Error, but the sample code works

# Book
sns.lmplot("PCA1", "PCA2", hue='species', data=iris, fit_reg=False);

# Sample code
sns.lmplot(x="PCA1", y="PCA2", hue='species', data=iris, fit_reg=False);

P356: The book's code raises an Error, but the sample code works

# Book
sns.lmplot("PCA1", "PCA2", data=iris, hue='species', col='cluster', fit_reg=False);

# Sample code
sns.lmplot(x="PCA1", y="PCA2", data=iris, hue='species', col='cluster', fit_reg=False);

P359: Both the book's code and the sample code produce a Warning

# Book
# Sample code
            cmap=plt.cm.get_cmap('viridis', 10))

# Fixed as follows (keeps the 10 discrete colors)
            cmap=plt.get_cmap('viridis', 10))

P379: The book's code raises an Error, but the sample code works

# Book
vec.get_feature_names()

# Sample code
vec.get_feature_names_out()

P380: The book's code raises an Error, but the sample code works

# Book
pd.DataFrame(X.toarray(), columns=vec.get_feature_names())

# Sample code
pd.DataFrame(X.toarray(), columns=vec.get_feature_names_out())

P384: The book's code raises an Error, but the sample code works

# Book
from sklearn.preprocessing import Imputer
imp = Imputer(strategy='mean')

# Sample code
from sklearn.impute import SimpleImputer
imp = SimpleImputer(strategy='mean')

P384: The book's code raises an Error, but the sample code works

# Book
model = make_pipeline(Imputer(strategy='mean'),

# Sample code
model = make_pipeline(SimpleImputer(strategy='mean'),
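
A minimal sketch of `SimpleImputer` on its own, using a small array with two missing values that I made up for illustration. With `strategy='mean'`, each NaN is replaced by the mean of the remaining values in its column.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Small illustrative array with two missing values
X = np.array([[np.nan, 0, 3],
              [3,      7, 9],
              [3,      5, 2],
              [4, np.nan, 6],
              [8,      8, 1]])

imp = SimpleImputer(strategy='mean')
X2 = imp.fit_transform(X)
print(X2)
```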

P403: Both the book's code and the sample code raise an Error

# Book
# Sample code
    days = (date - pd.datetime(2000, 12, 21)).days

# Fixed as follows
    days = (date - pd.Timestamp(2000, 12, 21)).days
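
When replacing `pd.datetime`, the reference date has to be passed either as separate year, month, and day integers or as a quoted string. A minimal sketch, where the `date` value is an arbitrary example of mine:

```python
from datetime import datetime

import pandas as pd

date = pd.Timestamp('2012-10-03')  # arbitrary example date

# Two equivalent ways to build the reference date
days = (date - pd.Timestamp(2000, 12, 21)).days
same_days = (date - pd.to_datetime('2000-12-21')).days

# Caution: 2000-12-21 without quotes is integer arithmetic
print(2000 - 12 - 21)  # 1967
print(days, same_days)
```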

P408: The book's code raises an Error, but the sample code works

# Book
from sklearn.datasets.samples_generator import make_blobs

# Sample code
from sklearn.datasets import make_blobs

P413: The book's code raises an Error, but the sample code works

# Book
from sklearn.datasets.samples_generator import make_circles

# Sample code
from sklearn.datasets import make_circles

P414: The book and the sample code differ, but both work

# Book
def plot_3D(elev=30, azim=30, X=X, y=y):
    ax = plt.subplot(projection='3d')
    ax.scatter3D(X[:, 0], X[:, 1], r, c=y, s=50, cmap='autumn')
    ax.view_init(elev=20, azim=30)
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_zlabel('r');

interact( plot_3D, elev=[30, 60], azip=(-180, 180),
        X=fixed(X), y=fixed(y));

# Sample code
from mpl_toolkits import mplot3d

ax = plt.subplot(projection='3d')
ax.scatter3D(X[:, 0], X[:, 1], r, c=y, s=50, cmap='autumn')
ax.view_init(elev=20, azim=30)
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('r');

P438: Both the book's code and the sample code produce a Warning

# Book
# Changing 'spectral' to 'Spectral' does make it run
            cmap=plt.cm.get_cmap('spectral', 10))

# Sample code
            cmap=plt.colormaps.get_cmap('rainbow'))

P448: Both the book's code and the sample code produce a Warning

# Book
# Sample code
colorize = dict(c=X[:, 0], cmap=plt.cm.get_cmap('rainbow', 5))

# Fixed as follows (keeps the 5 discrete colors)
colorize = dict(c=X[:, 0], cmap=plt.get_cmap('rainbow', 5))

P450: Both the book's code and the sample code produce a Warning

# Book
model = MDS(n_components=2, dissimilarity='precomputed', random_state=1)

# Sample code
model = MDS(n_components=2, dissimilarity='precomputed', random_state=1701)

# Fixed as follows
model = MDS(n_components=2, dissimilarity='precomputed', random_state=1701,
            normalized_stress='auto')

P452: Both the book's code and the sample code produce a Warning

# Book
model = MDS(n_components=2, random_state=1)

# Sample code
model = MDS(n_components=2, random_state=1701)

# Fixed as follows
model = MDS(n_components=2, random_state=1701, normalized_stress='auto')

P454: Both the book's code and the sample code produce a Warning

# Book
model = MDS(n_components=2, random_state=1)

# Sample code
model = MDS(n_components=2, random_state=1701)

# Fixed as follows
model = MDS(n_components=2, random_state=1701, normalized_stress='auto')

P461: The book's code raises an Error, and the sample code produces a Warning

# Book
from sklearn.datasets import fetch_mldata
mnist = fetch_mldata('MNIST original')

# Sample code
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784')

# Fixed as follows
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', parser='auto')

P462: Both the book's code and the sample code produce a Warning

# Book
# Sample code
plt.scatter(proj[:, 0], proj[:, 1], c=target, cmap=plt.cm.get_cmap('jet', 10))

# Fixed as follows (keeps the 10 discrete colors)
plt.scatter(proj[:, 0], proj[:, 1], c=target, cmap=plt.get_cmap('jet', 10))

P464: The book's code raises an Error, but the sample code works

# Book
from sklearn.datasets.samples_generator import make_blobs

# Sample code
from sklearn.datasets import make_blobs

P464: Both the book's code and the sample code produce a Warning

# Book
# Sample code
kmeans = KMeans(n_clusters=4)

# Fixed as follows
kmeans = KMeans(n_clusters=4, n_init='auto')
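
The Warning here is scikit-learn 1.2 announcing that the default of `n_init` will change, so any explicit value silences it. As far as I understand, `n_init='auto'` opts in to the new default (a single run with the default k-means++ initialization), while `n_init=10` keeps the behavior the book was written against. The sketch below uses the second option on blob data whose parameters are my own choice for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Illustrative blob data; the parameters are assumptions, not the book's exact cell
X, y_true = make_blobs(n_samples=300, centers=4,
                       cluster_std=0.60, random_state=0)

# An explicit n_init avoids the Warning; 10 matches the old default
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
print(kmeans.cluster_centers_)
```

The same choice applies to every KMeans call on the following pages.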

P468: Both the book's code and the sample code produce a Warning

# Book
# Sample code
labels = KMeans(6, random_state=0).fit_predict(X)

# Fixed as follows
labels = KMeans(6, random_state=0, n_init='auto').fit_predict(X)

P469: Both the book's code and the sample code produce a Warning

# Book
# Sample code
labels = KMeans(2, random_state=0).fit_predict(X)

# Fixed as follows
labels = KMeans(2, random_state=0, n_init='auto').fit_predict(X)

P470: Both the book's code and the sample code produce a Warning

# Book
# Sample code
model = SpectralClustering(n_clusters=2, affinity='nearest_neighbors', assign_labels='kmeans')

# Fixed as follows
model = SpectralClustering(n_clusters=2, eigen_solver='arpack', affinity='rbf', gamma=10.0)

P472: Both the book's code and the sample code produce a Warning

# Book
# Sample code
kmeans = KMeans(n_clusters=10, random_state=0)

# Fixed as follows
kmeans = KMeans(n_clusters=10, random_state=0, n_init='auto')

P474: Both the book's code and the sample code produce a Warning

# Book
# Sample code
kmeans = KMeans(n_clusters=10, random_state=0)

# Fixed as follows
kmeans = KMeans(n_clusters=10, random_state=0, n_init='auto')

P476: Both the book's code and the sample code produce a Warning

# Book
# Sample code
kmeans = MiniBatchKMeans(16)

# Fixed as follows
kmeans = MiniBatchKMeans(16, n_init='auto')

P478: Both the book's code and the sample code produce a Warning

# Book
# Sample code
kmeans = KMeans(4, random_state=0)

# Fixed as follows
kmeans = KMeans(4, random_state=0, n_init='auto')

P479: Both the book's code and the sample code produce a Warning

# Book
# Sample code
kmeans = KMeans(n_clusters=4, random_state=0)

# Fixed as follows
kmeans = KMeans(n_clusters=4, random_state=0, n_init='auto')

P480: Both the book's code and the sample code produce a Warning

# Book
# Sample code
kmeans = KMeans(n_clusters=4, random_state=0)

# Fixed as follows
kmeans = KMeans(n_clusters=4, random_state=0, n_init='auto')

P483: Both the book's code and the sample code raise an Error

# Book
# Sample code
        ax.add_patch(Ellipse(position, nsig * width, nsig * height, angle, **kwargs))

# Fixed as follows
        ax.add_patch(Ellipse(position, nsig * width, nsig * height, angle=angle, **kwargs))
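
In recent Matplotlib versions `angle` has to be passed to `Ellipse` by keyword. The sketch below shows the fixed call in a form that runs on its own; the position, size, and angle values are placeholders that I chose, not the values computed in the book's `draw_ellipse` function.

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse

fig, ax = plt.subplots()

# Placeholder geometry for illustration
position = (0.5, 0.5)
width, height, angle = 0.4, 0.2, 30.0

# Draw 1-, 2-, and 3-sigma ellipses; angle is passed by keyword
for nsig in range(1, 4):
    ax.add_patch(Ellipse(position, nsig * width, nsig * height,
                         angle=angle, alpha=0.2))

print(len(ax.patches))
```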

P492: The book's code raises an Error, but the sample code works

# Book
hist = plt.hist(x, bins=30, normed=True)

# Sample code
hist = plt.hist(x, bins=30, density=True)

P498 - P500: 5.13.3 "Example: KDE on a sphere": part of the sample code is missing

・It is in the book, but part of the sample code is missing from "notebooks/05.13-Kernel-Density-Estimation.ipynb"
・It was present in "notebooks_v1/05.13-Kernel-Density-Estimation.ipynb", so I copied that part
・def construct_grids(batch) was copied and pasted from the site (see "External files used in the book" in this article)

# Book
# Sample code
from sklearn.datasets.species_distributions import construct_grids

# Fixed as follows
from mpl_toolkits.basemap import Basemap

def construct_grids(batch):
    """Construct the map grid from the batch object

    Parameters
    ----------
    batch : Batch object
        The object returned by :func:`fetch_species_distributions`

    Returns
    -------
    (xgrid, ygrid) : 1-D arrays
        The grid corresponding to the values in batch.coverages
    """
    # x,y coordinates of the corner cells
    xmin = batch.x_left_lower_corner + batch.grid_size
    xmax = xmin + (batch.Nx * batch.grid_size)
    ymin = batch.y_left_lower_corner + batch.grid_size
    ymax = ymin + (batch.Ny * batch.grid_size)

    # x coordinates of the grid cells
    xgrid = np.arange(xmin, xmax, batch.grid_size)
    # y coordinates of the grid cells
    ygrid = np.arange(ymin, ymax, batch.grid_size)

    return (xgrid, ygrid)

P506: The book's code raises an Error, and the sample code produces a Warning

# Book
# Spelling mistake
hog_vec, hog_vis = feature.hog(image, visualise=True)
# -->
hog_vec, hog_vis = feature.hog(image, visualize=True)

ax[1].imshow(hog_vis, cmap='gray_r')

# Sample code
ax[1].imshow(hog_vis)

# Fixed as follows
ax[1].imshow(hog_vis, cmap='gray_r')

P510: Both the book's code and the sample code produce a Warning

# Book
# Sample code
grid = GridSearchCV(LinearSVC(), {'C': [1.0, 2.0, 4.0, 8.0]})

# Fixed as follows
grid = GridSearchCV(LinearSVC(dual=False), {'C': [1.0, 2.0, 4.0, 8.0]})

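Afterword

This book is packed with examples and its code listings are short, so I think it is a good book for grasping the overall picture of data science. If only I had known a second edition was coming... (grumble)

Now that the errata are done, actually understanding the content comes next.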