Split Data into Training and Test Sets for Cross-Validation【Python】

Posted at 2022-04-04

Environment

  • Python3
  • Anaconda
  • Jupyter Notebook

Packages

$ pip install flicker
$ pip install pillow
$ pip install scikit-learn

Assumption

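You have already collected the image data: one folder of .jpg files per class (./apple, ./banana, ./grape), located next to the notebook.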
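Before generating the dataset, it can help to confirm that each class folder actually contains images. The following is a minimal sketch, not part of the original notebook; the helper name `count_images` is hypothetical, and the folder names are taken from the code in this article.

```python
import glob
import os


def count_images(base_dir, classes):
    """Return {class name: number of .jpg files} for each folder under base_dir."""
    counts = {}
    for name in classes:
        # A missing folder simply yields an empty list, so its count is 0.
        pattern = os.path.join(base_dir, name, "*.jpg")
        counts[name] = len(glob.glob(pattern))
    return counts


# Assumes the class folders sit in the current working directory.
print(count_images(".", ["apple", "banana", "grape"]))
```

A count of 0 for a class usually means the folder name or the file extension does not match what the loading code expects.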

↓↓↓

Coding

gen_data.ipynb

from PIL import Image
import os, glob
import numpy as np
from sklearn import model_selection

classes = ["apple", "banana", "grape"]
num_classes = len(classes)
image_size = 50

X = []
Y = []
for index, classlabel in enumerate(classes):
    photos_dir = "./" + classlabel
    files = glob.glob(photos_dir + "/*.jpg")
    for i, file in enumerate(files):
        # Use at most 200 images per class
        if i >= 200:
            break
        image = Image.open(file)
        image = image.convert("RGB")
        image = image.resize((image_size, image_size))
        data = np.asarray(image)
        X.append(data)
        # Label with the class index (0 = apple, 1 = banana, 2 = grape),
        # not with the per-file counter i
        Y.append(index)


X = np.array(X)
Y = np.array(Y)

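# Hold out 25% of the samples (the default) as test data
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(X, Y)
# The four arrays have different shapes, so pack them into an object array;
# recent NumPy versions refuse to build a ragged array from a plain tuple
xy = np.empty(4, dtype=object)
xy[0], xy[1], xy[2], xy[3] = X_train, X_test, Y_train, Y_test
# Load later with np.load("./fruit.npy", allow_pickle=True)
np.save("./fruit.npy", xy)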
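The call to train_test_split above uses the default settings, so the split differs on every run and the class balance of the test set is left to chance. The sketch below shows one way to make the split reproducible and stratified. It uses synthetic arrays in place of the fruit images, and `test_size=0.2` and `random_state=42` are illustrative assumptions, not values from this article.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the image data: 30 random 50x50 RGB "images",
# 10 per class, labelled 0, 1, 2.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(30, 50, 50, 3), dtype=np.uint8)
Y = np.repeat([0, 1, 2], 10)

# stratify=Y keeps the class proportions equal in both parts;
# random_state fixes the shuffle so the split is the same on every run.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, stratify=Y, random_state=42
)

train_counts = np.bincount(Y_train)  # samples per class in the training set
test_counts = np.bincount(Y_test)    # samples per class in the test set
print(X_train.shape, X_test.shape)
print(train_counts, test_counts)
```

With equal class sizes, as here, each class contributes the same number of samples to the test set.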

Check data

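# A notebook cell displays only its last expression, so print each value
print(len(X_train), len(X_test))  # number of training / test images
print(len(Y_train), len(Y_test))  # number of training / test labels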
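The saved file can also be checked by loading it back. The following is a small sketch of that round trip, assuming the four arrays are stored together as one object array. It writes small dummy arrays to a temporary file (`fruit_demo.npy`, a hypothetical name) rather than touching the real fruit.npy.

```python
import os
import tempfile

import numpy as np

# Dummy arrays with the same layout as the real data, only much smaller.
X_train = np.zeros((6, 50, 50, 3), dtype=np.uint8)
X_test = np.zeros((2, 50, 50, 3), dtype=np.uint8)
Y_train = np.array([0, 1, 2, 0, 1, 2])
Y_test = np.array([0, 1])

# Pack the differently shaped arrays into a 4-element object array.
xy = np.empty(4, dtype=object)
xy[0], xy[1], xy[2], xy[3] = X_train, X_test, Y_train, Y_test

path = os.path.join(tempfile.mkdtemp(), "fruit_demo.npy")
np.save(path, xy)

# Object arrays are pickled on disk, so loading needs allow_pickle=True.
loaded = np.load(path, allow_pickle=True)
X_train2, X_test2, Y_train2, Y_test2 = loaded
print(X_train2.shape, X_test2.shape, Y_train2.shape, Y_test2.shape)
```

Because loading with `allow_pickle=True` can execute arbitrary code from the file, it should only be used on files you created yourself.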