CNN (Convolutional Neural Network)
- Fundamental information
- Data Preprocessing
- Building the CNN
- Training the CNN model
- Making a single prediction
- MATH
- Reference
Fundamental information
-
Convolution
- Slide the Feature Detector (Filter) over the Input Image and multiply each overlapping value element-wise (e.g. top-left x top-left), then sum the products
- preserves the spatial relationships between pixels!!
-
Input Image x Feature Detector = Feature Map
https://medium.com/@bdhuma/6-basic-things-to-know-about-convolution-daef5e1bc411
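A minimal NumPy sketch of a single convolution step (the 3x3 patch and filter values below are made up for illustration): the overlapping values are multiplied element-wise and summed, producing one value of the Feature Map.
import numpy as np

patch = np.array([[0, 1, 1],
                  [0, 1, 0],
                  [1, 1, 0]])        # hypothetical 3x3 region of the Input Image
detector = np.array([[1, 0, 1],
                     [0, 1, 0],
                     [1, 0, 1]])     # hypothetical 3x3 Feature Detector (Filter)

feature_map_value = np.sum(patch * detector)   # element-wise multiply, then sum
print(feature_map_value)                       # -> 3 for these example values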
-
ReLU (Rectified Linear Unit) Layer
-
Pooling
-
Flattening
-
Full Connection
⬇️Build CNN⬇️
- binary classification problem (image classification)
- CNN (Convolutional Neural Network)
- 1st to full-connection layers: relu
- output layer: sigmoid
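As a preview, the whole network built step by step below could be written as one Sequential model (a compact sketch assembled from that code; same layers and sizes):
import tensorflow as tf

cnn = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu', input_shape=[64, 64, 3]),
    tf.keras.layers.MaxPool2D(pool_size=2, strides=2),
    tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=2, strides=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units=128, activation='relu'),    # hidden (fully connected) layer: relu
    tf.keras.layers.Dense(units=1, activation='sigmoid'),   # output layer: sigmoid (binary)
])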
Data Preprocessing
-
Import Libraries
- Import TensorFlow and the ImageDataGenerator class (used for data preprocessing in the next step)
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator
# library: keras / module: preprocessing / submodule: image / class: ImageDataGenerator
# used for data preprocessing (next step)

tf.__version__  # check the version of TensorFlow
-
Preprocess Training Set
- Mount Google Drive if using Google Colab
from google.colab import drive
drive.mount('/content/drive')
- generate the training data generator (train_datagen)
# using the Keras API
train_datagen = ImageDataGenerator(rescale = 1./255,   # feature scaling => applied to each and every pixel
                                                        # by dividing its value by 255 => range (0 - 1), i.e. normalization
                                   shear_range = 0.2,   # image augmentation (shear, zoom, flip) to reduce overfitting
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
- import the training image set (target_size, batch_size, ...)
training_set = train_datagen.flow_from_directory('dataset/training_set',   # call the flow_from_directory method of the train_datagen object (an instance of the ImageDataGenerator class)
                                                 target_size = (64, 64),
                                                 batch_size = 32,           # how many images we want in each batch (32 is the default)
                                                 class_mode = 'binary')     # binary classification (0 or 1)
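flow_from_directory infers the class labels from sub-folders, one per class; an assumed layout for this cats-vs-dogs dataset (folder and file names are illustrative):
# dataset/
#   training_set/
#     cats/   cat.1.jpg, cat.2.jpg, ...
#     dogs/   dog.1.jpg, dog.2.jpg, ...
#   test_set/
#     cats/   ...
#     dogs/   ...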
-
Preprocess Test Set
- generate the test data generator (test_datagen)
test_datagen = ImageDataGenerator(rescale = 1./255)   # only feature scaling - no augmentation on the test set
- import the test image set (target_size, batch_size, ...)
test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')
Building the CNN
1st LAYER (1. Convolution & 2. Pooling)
-
Convolution
- Initialize CNN-network
- Create cnn variable which represents the CNN-network (sequential)
cnn = tf.keras.models.Sequential()   # tf: tensorflow / keras: library / models: module / Sequential: class
- Apply Convolution: Input Image => [Feature Detector/Filter] => Feature Map => ReLU Activation Function
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu', input_shape=[64, 64, 3]))
# cnn: object / add: method / layers: module / Conv2D(): class
# filters: the number of feature detectors (filters)
# kernel_size: dimension of each filter (3x3 here)
# activation: rectifier (relu)
# input_shape: 3D, 64x64 RGB images // the images were resized to 64x64 during preprocessing this time
-
Pooling
- Apply Max Pooling: Feature Map => [Max Pooling] => Pooled Feature Map
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
# MaxPool2D: turns the feature map into a pooled feature map
# pool_size: size (width/height) of the pooling window, e.g. 2x2 in this case
# strides: by how many pixels the window is shifted to the right
2nd LAYER (repeat previous 2 layers)
-
Add additional Layers(Convolution/Pooling)
- Repeat the exact same code (Convolution/Pooling)
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'))   # !! remove input_shape !!
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
-
Flattening
- Flatten all of the pooled images into one long vector/column
cnn.add(tf.keras.layers.Flatten())   # Flatten(): class / no parameters needed
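A rough trace of the output shapes through the layers above (assuming Conv2D's default padding='valid' and the 64x64x3 input used here):
# Input image:        (64, 64, 3)
# Conv2D(32, 3):      (62, 62, 32)   # 64 - 3 + 1 = 62
# MaxPool2D(2, 2):    (31, 31, 32)
# Conv2D(32, 3):      (29, 29, 32)   # 31 - 3 + 1 = 29
# MaxPool2D(2, 2):    (14, 14, 32)
# Flatten():          (6272,)        # 14 * 14 * 32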
NEURAL NETWORK BEGINS here
-
Full Connection (dense)
cnn.add(tf.keras.layers.Dense(units=128, activation='relu'))
# Dense(): class / used only when FULLY connecting layers
# units: the number of hidden neurons
# activation: relu - !! still not the final layer !!
-
Output Layer (dense)
cnn.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
# units: the number of output neurons (only 1 output - binary classification)
# activation: sigmoid for binary / softmax for multi-class
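To double-check the architecture built above, Keras models provide a summary() method; minimal usage:
cnn.summary()   # prints each layer with its output shape and number of parameters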
Training the CNN model
-
Compile the CNN
- Connect the CNN to an Optimizer, a Loss Function, and some Metrics
cnn.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# compile: method
# optimizer, loss, metrics: parameter names
# adam: a stochastic gradient descent optimizer (updates the weights to reduce the loss error between predictions and targets)
# binary_crossentropy: loss for a binary classification task
# metrics: measure the performance of the classification model
-
Train CNN (Training set) & Evaluate (Test set)
- this time, do training and evaluation at the same time
cnn.fit(x = training_set, validation_data = test_set, epochs = 25)
# fit: method for training the CNN
# x, validation_data, epochs: parameter names
# training_set: from Data Preprocessing - Training Set
# test_set: from Data Preprocessing - Test Set
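fit() also returns a History object whose history dict stores the per-epoch metrics; a small sketch for inspecting them (the history variable name is just an example):
history = cnn.fit(x = training_set, validation_data = test_set, epochs = 25)
print(history.history['accuracy'])       # training accuracy per epoch
print(history.history['val_accuracy'])   # test (validation) accuracy per epoch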
Making a single prediction
- Import
import numpy as np
# np: shortcut
from keras.preprocessing import image
# preprocessing: module
# image: module
# NOTE: "from keras.preprocessing.image import ImageDataGenerator" <= specifically downloading ImageDataGenerator, so to use "image" module, we need to download it individually.
- Test image:
- Load it / define its size the same as the training/test set
- Convert from PIL format to a NumPy array
- Add an extra BATCH dimension
test_image = image.load_img('dataset/single_prediction/cat_or_dog_1.jpg', target_size = (64, 64))
#test_image: variable
# image: sub-module
# load_img('', target_size=): function / takes 2 arguments
# target_size: SAME SIZE as the training/test set
test_image = image.img_to_array(test_image)
# convert the image format - PIL => NumPy array
test_image = np.expand_dims(test_image, axis = 0)
# Update the test_image by adding extra DIMENSION corresponding to the BATCH
# np: NumPy (test_image is now a NumPy array)
# expand_dims(): add a fake dimension corresponding to the batch
# axis = 0: the batch is the FIRST dimension, i.e. where we want to add the extra dimension
result = cnn.predict(test_image)
# predict: method / returns the model's output for the batch (with sigmoid, a value between 0 and 1)
- Clarify output label
- class_indices
training_set.class_indices
# figure out which class (dog/cat) is 0 or 1 (call the class_indices attribute of the training_set object)
# class_indices: attribute
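For reference, Keras assigns the indices in alphabetical order of the class sub-folders, so an assumed output (with sub-folders named 'cats' and 'dogs') would be:
print(training_set.class_indices)   # e.g. {'cats': 0, 'dogs': 1}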
if result[0][0] > 0.5:   # > 0.5 rather than == 1, since the sigmoid output is a value between 0 and 1
    # result has a BATCH dimension, therefore [][]
    # [only one batch][1st and only element (the prediction/output)]
    prediction = 'dog'
else:
    prediction = 'cat'
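Note: the generators rescaled every pixel by 1/255, but img_to_array leaves the single image in the 0-255 range; a minimal adjustment (not part of the original flow) to keep the preprocessing consistent would be:
test_image = test_image / 255.0   # apply the same rescale = 1./255 as the training/test generators (do this before cnn.predict)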
MATH
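A brief sketch of the standard formulas behind the operations used above (stated here for reference):
- Convolution (one value of the Feature Map): FeatureMap(i, j) = sum_m sum_n Image(i+m, j+n) * Filter(m, n)
- ReLU: ReLU(x) = max(0, x)
- Max Pooling (2x2, stride 2): PooledMap(i, j) = max of the 2x2 window of the Feature Map
- Sigmoid (output layer): sigma(z) = 1 / (1 + e^(-z))
- Binary cross-entropy (loss): L = -[ y*log(y_hat) + (1 - y)*log(1 - y_hat) ]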
Reference
- Gradient-Based Learning Applied to Document Recognition (Yann LeCun et al.)
- Deep Learning A-Z™: Hands-On Artificial Neural Networks (Udemy)
- Data Science: Deep Learning and Neural Networks in Python (Udemy)
- Wikipedia
- https://keras.io/
- https://www.tensorflow.org/
- https://pytorch.org/
- https://medium.com/analytics-vidhya/beginners-guide-to-convolutional-neural-network-62c48043e262
- https://www.researchgate.net/figure/Overview-and-details-of-a-convolutional-neural-network-CNN-architecture-for-image_fig2_341576780