CNN (Convolutional Neural Network)
- Fundamental information
- Data Preprocessing
- Building the CNN
- Training the CNN model
- Making a single prediction
- MATH
- Reference
Fundamental information
-
Convolution
- Slide the Feature Detector (Filter) over the Input Image and multiply each overlapping value element-wise (e.g. top-left x top-left), then sum the products
- preserves the spatial relationships between pixels!!
-
Input Image x Feature Detector = Feature Map
https://medium.com/@bdhuma/6-basic-things-to-know-about-convolution-daef5e1bc411
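A minimal NumPy sketch of a single convolution step (the 3x3 patch and filter values below are made up for illustration): the overlapping values are multiplied element-wise and summed, producing one value of the Feature Map.
import numpy as np

patch = np.array([[0, 1, 1],
                  [0, 1, 0],
                  [1, 1, 0]])        # hypothetical 3x3 region of the Input Image
detector = np.array([[1, 0, 1],
                     [0, 1, 0],
                     [1, 0, 1]])     # hypothetical 3x3 Feature Detector (Filter)

feature_map_value = np.sum(patch * detector)   # element-wise multiply, then sum
print(feature_map_value)                       # -> 3 for these example values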
-
ReLU (Rectified Linear Unit) Layer
-
Pooling
-
Flattening
-
Full Connection
⬇️Build CNN⬇️
- binary classification problem (image classification)
- CNN (Convolutional Neural Network)
- 1st to full-connection layers: relu
- output layer: sigmoid
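As a preview, the whole network built step by step below could be written as one Sequential model (a compact sketch assembled from that code; same layers and sizes):
import tensorflow as tf

cnn = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu', input_shape=[64, 64, 3]),
    tf.keras.layers.MaxPool2D(pool_size=2, strides=2),
    tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=2, strides=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units=128, activation='relu'),    # hidden (fully connected) layer: relu
    tf.keras.layers.Dense(units=1, activation='sigmoid'),   # output layer: sigmoid (binary)
])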
Data Preprocessing
-
Import Libraries
- Import TensorFlow and the ImageDataGenerator class (used for data preprocessing in the next step)
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator
# library: keras / module: preprocessing / submodule: image / class: ImageDataGenerator
# used for data preprocessing (next step)

tf.__version__  # check the version of TensorFlow
-
Preprocess Training Set
- Mount Google Drive if using Google Colab
from google.colab import drive
drive.mount('/content/drive')
- generate the training data generator (train_datagen)
# using the Keras API
train_datagen = ImageDataGenerator(rescale = 1./255,   # feature scaling => applied to each and every pixel
                                                        # by dividing its value by 255 => range (0 - 1), i.e. normalization
                                   shear_range = 0.2,   # image augmentation (shear, zoom, flip) to reduce overfitting
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
- import the training image set (target_size, batch_size, ...)
training_set = train_datagen.flow_from_directory('dataset/training_set',   # call the flow_from_directory method of the train_datagen object (an instance of the ImageDataGenerator class)
                                                 target_size = (64, 64),
                                                 batch_size = 32,           # how many images we want in each batch (32 is the default)
                                                 class_mode = 'binary')     # binary classification (0 or 1)
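flow_from_directory infers the class labels from sub-folders, one per class; an assumed layout for this cats-vs-dogs dataset (folder and file names are illustrative):
# dataset/
#   training_set/
#     cats/   cat.1.jpg, cat.2.jpg, ...
#     dogs/   dog.1.jpg, dog.2.jpg, ...
#   test_set/
#     cats/   ...
#     dogs/   ...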
-
Preprocess Test Set
- generate the test data generator (test_datagen)
test_datagen = ImageDataGenerator(rescale = 1./255)   # only feature scaling - no augmentation on the test set
- import the test image set (target_size, batch_size, ...)
test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')
Building the CNN
1st LAYER (1. Convolution & 2. Pooling)
-
Convolution
- Initialize CNN-network
- Create cnn variable which represents the CNN-network (sequential)
cnn = tf.keras.models.Sequential()   # tf: tensorflow / keras: library / models: module / Sequential: class
- Apply Convolution: Input Image => [Feature Detector/Filter] => Feature Map => ReLU Activation Function
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu', input_shape=[64, 64, 3]))
# cnn: object / add: method / layers: module / Conv2D(): class
# filters: the number of feature detectors (filters)
# kernel_size: dimension of each filter (3x3 here)
# activation: rectifier (relu)
# input_shape: 3D, 64x64 RGB images // the images were resized to 64x64 during preprocessing this time
-
Pooling
- Apply Max Pooling: Feature Map => [Max Pooling] => Pooled Feature Map
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
# MaxPool2D: turns the feature map into a pooled feature map
# pool_size: size (width/height) of the pooling window, e.g. 2x2 in this case
# strides: by how many pixels the window is shifted to the right
2nd LAYER (repeat previous 2 layers)
-
Add additional Layers(Convolution/Pooling)
- Repeat the exact same code (Convolution/Pooling)
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'))   # !! remove input_shape !!
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
-
Flattening
- Flatten all of the pooled images into one long vector/column
cnn.add(tf.keras.layers.Flatten())   # Flatten(): class / no parameters needed
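A rough trace of the output shapes through the layers above (assuming Conv2D's default padding='valid' and the 64x64x3 input used here):
# Input image:        (64, 64, 3)
# Conv2D(32, 3):      (62, 62, 32)   # 64 - 3 + 1 = 62
# MaxPool2D(2, 2):    (31, 31, 32)
# Conv2D(32, 3):      (29, 29, 32)   # 31 - 3 + 1 = 29
# MaxPool2D(2, 2):    (14, 14, 32)
# Flatten():          (6272,)        # 14 * 14 * 32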
NEURAL NETWORK BEGINS here
-
Full Connection (dense)
cnn.add(tf.keras.layers.Dense(units=128, activation='relu'))
# Dense(): class / used only when FULLY connecting layers
# units: the number of hidden neurons
# activation: relu - !! still not the final layer !!
-
Output Layer (dense)
cnn.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
# units: the number of output neurons (only 1 output - binary classification)
# activation: sigmoid for binary / softmax for multi-class
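To double-check the architecture built above, Keras models provide a summary() method; minimal usage:
cnn.summary()   # prints each layer with its output shape and number of parameters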
Training the CNN model
-
Compile the CNN
- Connect the CNN to an Optimizer, a Loss Function, and some Metrics
cnn.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# compile: method
# optimizer, loss, metrics: parameter names
# adam: a stochastic gradient descent optimizer (updates the weights to reduce the loss error between predictions and targets)
# binary_crossentropy: loss for a binary classification task
# metrics: measure the performance of the classification model
-
Train CNN (Training set) & Evaluate (Test set)
- this time, do training and evaluation at the same time
cnn.fit(x = training_set, validation_data = test_set, epochs = 25)
# fit: method for training the CNN
# x, validation_data, epochs: parameter names
# training_set: from Data Preprocessing - Training Set
# test_set: from Data Preprocessing - Test Set
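fit() also returns a History object whose history dict stores the per-epoch metrics; a small sketch for inspecting them (the history variable name is just an example):
history = cnn.fit(x = training_set, validation_data = test_set, epochs = 25)
print(history.history['accuracy'])       # training accuracy per epoch
print(history.history['val_accuracy'])   # test (validation) accuracy per epoch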
Making a single prediction
- Import
import numpy as np
# np: shortcut
from keras.preprocessing import image
# preprocessing: module
# image: module
# NOTE: "from keras.preprocessing.image import ImageDataGenerator" <= specifically downloading ImageDataGenerator, so to use "image" module, we need to download it individually.
- Test image:
- Load it / define its size the same as the training/test set
- Convert from PIL format to a NumPy array
- Add an extra BATCH dimension
test_image = image.load_img('dataset/single_prediction/cat_or_dog_1.jpg', target_size = (64, 64))
#test_image: variable
# image: sub-module
# load_img('', target_size=): function / takes 2 arguments
# target_size: SAME SIZE as the training/test set
test_image = image.img_to_array(test_image)
# convert the image format - PIL => NumPy array
test_image = np.expand_dims(test_image, axis = 0)
# Update the test_image by adding extra DIMENSION corresponding to the BATCH
# np: NumPy (test_image is now a NumPy array)
# expand_dims(): add a fake dimension corresponding to the batch
# axis = 0: the batch is the FIRST dimension, i.e. where we want to add the extra dimension
result = cnn.predict(test_image)
# predict: method / returns the model's output for the batch (with sigmoid, a value between 0 and 1)
- Clarify output label
- class_indices
training_set.class_indices
# figure out which class (dog/cat) is 0 or 1 (call the class_indices attribute of the training_set object)
# class_indices: attribute
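For reference, Keras assigns the indices in alphabetical order of the class sub-folders, so an assumed output (with sub-folders named 'cats' and 'dogs') would be:
print(training_set.class_indices)   # e.g. {'cats': 0, 'dogs': 1}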
if result[0][0] > 0.5:   # > 0.5 rather than == 1, since the sigmoid output is a value between 0 and 1
    # result has a BATCH dimension, therefore [][]
    # [only one batch][1st and only element (the prediction/output)]
    prediction = 'dog'
else:
    prediction = 'cat'
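Note: the generators rescaled every pixel by 1/255, but img_to_array leaves the single image in the 0-255 range; a minimal adjustment (not part of the original flow) to keep the preprocessing consistent would be:
test_image = test_image / 255.0   # apply the same rescale = 1./255 as the training/test generators (do this before cnn.predict)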
MATH
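A brief sketch of the standard formulas behind the operations used above (stated here for reference):
- Convolution (one value of the Feature Map): FeatureMap(i, j) = sum_m sum_n Image(i+m, j+n) * Filter(m, n)
- ReLU: ReLU(x) = max(0, x)
- Max Pooling (2x2, stride 2): PooledMap(i, j) = max of the 2x2 window of the Feature Map
- Sigmoid (output layer): sigma(z) = 1 / (1 + e^(-z))
- Binary cross-entropy (loss): L = -[ y*log(y_hat) + (1 - y)*log(1 - y_hat) ]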
Reference
- Gradient-Based Learning Applied to Document Recognition (Yann LeCun et al.)
- Deep Learning A-Z™: Hands-On Artificial Neural Networks (Udemy)
- Data Science: Deep Learning and Neural Networks in Python (Udemy)
- Wikipedia
- https://keras.io/
- https://www.tensorflow.org/
- https://pytorch.org/
- https://medium.com/analytics-vidhya/beginners-guide-to-convolutional-neural-network-62c48043e262
- https://www.researchgate.net/figure/Overview-and-details-of-a-convolutional-neural-network-CNN-architecture-for-image_fig2_341576780