How to use scikit-learn (Part 1)

Last updated at 2023-06-30. Posted at 2020-09-03.

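I did the same thing as the following article, using scikit-learn 0.23.1.
"A machine learning library! What is scikit-learn? [For beginners]" (機械学習のライブラリ！scikit-learnとは【初心者向け】)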

Installing the required libraries on Arch Linux

sudo pacman -S python-scikit-learn
sudo pacman -S python-matplotlib

Versions checked

$ python
Python 3.11.3 (main, Jun  5 2023, 09:32:32) [GCC 13.1.1 20230429] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> print(sklearn.__version__)
1.2.2
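
The same check can also be done from a script instead of the interactive interpreter. The following is a minimal sketch, not part of the original article; the helper name installed_versions is hypothetical, and it assumes each library exposes a __version__ attribute (true for sklearn and matplotlib).

```python
import importlib
import sys


def installed_versions(module_names):
    """Return {module name: version string}, or 'not installed' if the import fails."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
        else:
            # Fall back to 'unknown' if the module has no __version__ attribute
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


versions = installed_versions(["sklearn", "matplotlib"])
print(f"Python {sys.version.split()[0]}")
for name, version in versions.items():
    print(f"{name} {version}")
```

Running this before the scripts below makes it easy to see which versions a given result was produced with.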

Checking the data

show_data.py

#! /usr/bin/python
#
#	show_data.py
#
#						Sep/03/2020
#
from sklearn import datasets
import matplotlib.pyplot as plt

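# Load the handwritten-digits dataset bundled with scikit-learn
digits = datasets.load_digits()

# Display the first sample as an image
plt.matshow(digits.images[0], cmap="Greys")
plt.show()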

Result (a window opens showing digits.images[0] as an 8x8 grayscale image)
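
Besides plotting one image, it can help to look at how the dataset is laid out, since the classifiers further down use digits.data rather than digits.images. The sketch below is an addition, not part of the original article; it only uses attributes of the object returned by load_digits().

```python
from sklearn import datasets

digits = datasets.load_digits()

# Each sample is stored twice: as an 8x8 image and as a flat 64-value vector
print(digits.images.shape)   # (n_samples, 8, 8)
print(digits.data.shape)     # (n_samples, 64)
print(digits.target[:10])    # labels of the first ten samples

# The flat vector is the image with its rows laid end to end
first_flat = digits.images[0].reshape(-1)
same = bool((first_flat == digits.data[0]).all())
print(same)
```

The flat form is what fit() and predict() expect: one row per sample, one column per pixel.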

SVM

svm01.py

#! /usr/bin/python
#
#	svm01.py
#
#						Sep/03/2020
#
from sklearn import datasets
from sklearn import svm
import sklearn.metrics as metrics

digits = datasets.load_digits()


X = digits.data
y = digits.target

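# Split by alternating rows: even-indexed samples for training, odd-indexed for testing
X_train, y_train = X[0::2], y[0::2]
X_test, y_test = X[1::2], y[1::2]

# SVM with the default RBF kernel and a fixed gamma
clf = svm.SVC(gamma=0.001)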

clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"Accuracy: {accuracy}")

predicted = clf.predict(X_test)

print("classification report")
print(metrics.classification_report(y_test, predicted))

Output

$ ./svm01.py
Accuracy: 0.9866369710467706
classification report
              precision    recall  f1-score   support

           0       1.00      0.99      0.99        88
           1       0.98      1.00      0.99        89
           2       1.00      1.00      1.00        91
           3       1.00      0.98      0.99        93
           4       0.99      1.00      0.99        88
           5       0.98      0.97      0.97        91
           6       0.99      1.00      0.99        90
           7       0.99      1.00      0.99        91
           8       0.97      0.97      0.97        86
           9       0.98      0.97      0.97        91

    accuracy                           0.99       898
   macro avg       0.99      0.99      0.99       898
weighted avg       0.99      0.99      0.99       898
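
The classification report gives per-class scores but does not show which digits get confused with which. A confusion matrix fills that gap. The following is a sketch added for illustration, reusing the same split and classifier as svm01.py; it is not part of the original article.

```python
from sklearn import datasets, metrics, svm

digits = datasets.load_digits()
X, y = digits.data, digits.target

# Same alternating split as svm01.py
X_train, y_train = X[0::2], y[0::2]
X_test, y_test = X[1::2], y[1::2]

clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)
predicted = clf.predict(X_test)

# Rows are true labels, columns are predicted labels
cm = metrics.confusion_matrix(y_test, predicted)
print(cm)

# Everything off the diagonal is a misclassified sample
n_errors = int(cm.sum() - cm.trace())
print(f"misclassified: {n_errors} of {len(y_test)}")
```

A nonzero entry in row i, column j means that a sample of digit i was predicted as digit j.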

Logistic regression

logistic01.py

#! /usr/bin/python
#
#	logistic01.py
#
#						Sep/03/2020
#
from sklearn import datasets
import sklearn.metrics as metrics

from sklearn.linear_model import LogisticRegression

digits = datasets.load_digits()

X = digits.data
y = digits.target

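# Same alternating split as in svm01.py
X_train, y_train = X[0::2], y[0::2]
X_test, y_test = X[1::2], y[1::2]

# Raise max_iter above the default of 100 to give the solver more iterations
clf = LogisticRegression(max_iter=2000)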

clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"Accuracy: {accuracy}")

predicted = clf.predict(X_test)

print("classification report")
print(metrics.classification_report(y_test, predicted))

Output

$ ./logistic01.py
Accuracy: 0.9532293986636972
classification report
              precision    recall  f1-score   support

           0       1.00      0.98      0.99        88
           1       0.87      0.98      0.92        89
           2       0.97      1.00      0.98        91
           3       0.98      0.92      0.95        93
           4       0.93      0.98      0.96        88
           5       0.96      0.95      0.95        91
           6       0.97      0.99      0.98        90
           7       0.99      0.97      0.98        91
           8       0.95      0.88      0.92        86
           9       0.93      0.89      0.91        91

    accuracy                           0.95       898
   macro avg       0.95      0.95      0.95       898
weighted avg       0.95      0.95      0.95       898
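
Both scores above come from a single train/test split, so the gap between the two classifiers may partly depend on how the rows happened to be divided. One way to check is k-fold cross-validation. The sketch below is an addition, not the original article's method; the choice of cv=5 is an assumption made for illustration.

```python
from sklearn import datasets, svm
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

digits = datasets.load_digits()
X, y = digits.data, digits.target

# The two classifiers from svm01.py and logistic01.py, with the same settings
models = {
    "SVC(gamma=0.001)": svm.SVC(gamma=0.001),
    "LogisticRegression(max_iter=2000)": LogisticRegression(max_iter=2000),
}

results = {}
for name, model in models.items():
    # One accuracy value per fold; each fold is held out for testing exactly once
    scores = cross_val_score(model, X, y, cv=5)
    results[name] = scores
    print(f"{name}: mean={scores.mean():.3f} std={scores.std():.3f}")
```

If the ranking of the two models stays the same across folds, the difference seen with the alternating split is less likely to be an accident of that particular split.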

Related articles

How to use scikit-learn (Part 2)
How to use scikit-learn (Part 3)
How to use scikit-learn (Part 4)
