Installing the library

pip install scikit-learn numpy pillow

Code

#Create the testing data and training data

#Split arrays or matrices into random train and test subsets
x_train,x_test,y_train,y_test=train_test_split(X,Y)
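#bundle the four arrays into a single object array;
#their shapes differ, so passing the plain tuple to np.save fails on NumPy 1.24+
xy=np.empty(4,dtype=object)
for i,arr in enumerate((x_train,x_test,y_train,y_test)):
    xy[i]=arr

#save to a binary .npy file (read it back with np.load(path,allow_pickle=True))
np.save('./animal.npy',xy)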

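The function used here is sklearn.model_selection.train_test_split.
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
It randomly splits the data into training and test subsets. When neither test_size nor train_size is given, test_size defaults to 0.25.
In other words, 75% of the data becomes training data and 25% becomes test data.
Finally, numpy's save() writes the data to a binary .npy file.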
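The default split ratio can be checked with a small experiment. The sketch below uses made-up stand-in data (100 blank 50x50 RGB arrays and cyclic labels, both hypothetical) rather than real photos, and passes random_state only to make the shuffle repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 100 blank 50x50 RGB "images" and 100 labels.
X = np.zeros((100, 50, 50, 3), dtype=np.uint8)
Y = np.arange(100) % 3

# No test_size or train_size given, so 25% of the samples go to the test subset.
x_train, x_test, y_train, y_test = train_test_split(X, Y, random_state=0)

print(x_train.shape, x_test.shape)  # (75, 50, 50, 3) (25, 50, 50, 3)
print(y_train.shape, y_test.shape)  # (75,) (25,)
```

To use a different ratio, pass test_size explicitly, for example test_size=0.2.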

Full script

from PIL import Image
import os,glob
import numpy as np
from sklearn.model_selection import train_test_split

#class names; each one is also the name of the directory holding its images
classes=['clsA','clsB','clsC']
num_classes=len(classes)

#resize every image to 50x50 pixels
image_size=50

#load the images

#image data
X=[]

#label data
Y=[]

for index,clss in enumerate(classes):
    photos_dir='./'+clss
    #collect every file in the class directory that matches the pattern
    #(here, .jpg files)
    files=glob.glob(photos_dir+'/*.jpg')
    for i,file in enumerate(files):
        #use at most 200 images per class
        if i >=200:
            break

        image=Image.open(file)
        image=image.convert('RGB')
        image=image.resize((image_size,image_size))
        data=np.asarray(image)

        X.append(data)
        Y.append(index)

#change to numpy array
X=np.array(X)
Y=np.array(Y)

#Create the testing data and training data

#Split arrays or matrices into random train and test subsets
x_train,x_test,y_train,y_test=train_test_split(X,Y)
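#bundle the four arrays into a single object array;
#their shapes differ, so passing the plain tuple to np.save fails on NumPy 1.24+
xy=np.empty(4,dtype=object)
for i,arr in enumerate((x_train,x_test,y_train,y_test)):
    xy[i]=arr

#save to a binary .npy file (read it back with np.load(path,allow_pickle=True))
np.save('./cls.npy',xy)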
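Reading the saved file back might look like the sketch below. It is a round trip on small stand-in arrays written to a temporary directory (the array contents and the temporary path are assumptions for illustration, not part of the script above). Because the four arrays are stored as one object array, NumPy pickles them, and np.load has to be told to allow that.

```python
import os
import tempfile
import numpy as np

# Hypothetical stand-ins for the four arrays returned by train_test_split.
x_train = np.zeros((6, 50, 50, 3), dtype=np.uint8)
x_test = np.zeros((2, 50, 50, 3), dtype=np.uint8)
y_train = np.array([0, 1, 2, 0, 1, 2])
y_test = np.array([0, 1])

# Pack the differently shaped arrays into one 1-D object array.
xy = np.empty(4, dtype=object)
for i, arr in enumerate((x_train, x_test, y_train, y_test)):
    xy[i] = arr

# Write to a throwaway location instead of the working directory.
path = os.path.join(tempfile.mkdtemp(), "cls.npy")
np.save(path, xy)

# Object arrays are pickled on disk, so loading must opt in explicitly.
loaded = np.load(path, allow_pickle=True)
x_train2, x_test2, y_train2, y_test2 = loaded

print(x_train2.shape, y_test2.tolist())  # (6, 50, 50, 3) [0, 1]
```

Only load pickled files from sources you trust. If that is a concern, np.savez with one keyword argument per array stores each array separately in a .npz archive and avoids pickling.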
