Preface
I have read two research papers, and in this article I will summarise both of them, in the following order:
- Gradient-Based Learning Applied to Document Recognition
  Authors: Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner
- Recent Advances in Convolutional Neural Networks: https://qiita.com/Rowing0914/items/4627b4a15b22ae53760b
  Authors: Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Li Wang, Gang Wang, Jianfei Cai, Tsuhan Chen

In addition, I will cover:

- Implementation of CNN in Python with NumPy
- Math in CNN
Brief summary of Gradient-Based Learning Applied to Document Recognition
Abstract
In this paper, the authors propose a novel approach that combines Convolutional Neural Networks with Graph Transformer Networks (GTNs), and they verify that the technique outperforms all the other machine learning techniques they compare against. As a real-world application, they built a system that recognises handwritten characters.
Unfortunately, they do not elaborate on the mathematical details in this paper, so I will leave that part to my other articles listed above.
Nomenclature
GT: Graph Transformer
GTN: Graph Transformer Network
HMM: Hidden Markov Model
HOS: Heuristic Oversegmentation
K-NN: K-nearest Neighbour
NN: Neural Network
OCR: Optical Character Recognition
PCA: Principal Component Analysis
RBF: Radial Basis Function
RS-SVM: Reduced-set Support Vector Machine
SDNN: Space Displacement Neural Network
SVM: Support Vector Machine
TDNN: Time Delay Neural Network
V-SVM: Virtual Support Vector Machine
Contents
- Introduction
- Learning from Data
- Gradient-Based Learning
- Gradient Backpropagation
- Learning in Real Handwriting Recognition System
- Globally Trainable Systems
- Convolutional Neural Network for Isolated Character Recognition
- Convolutional Networks
- LeNet-5
- Loss Function
- Results and Comparison with Other Methods
- Database: the Modified NIST set
- Results
- Comparison with Other Classifiers
- Discussion
- Invariance and Noise Resistance
- Multi-Module Systems and Graph Transformation Networks
- An Object-Oriented Approach
- Special Modules
- Graph Transformer Networks
- Multiple Object Recognition: Heuristic Over-Segmentation
- Segmentation Graph
- Recognition Transformer and Viterbi Transformer
- Global Training for Graph Transformer Networks
- Viterbi Training
- Discriminative Viterbi Training
- Forward Scoring, and Forward Training
- Discriminative Forward Training
- Remarks on Discriminative Training
- Multiple Object Recognition: Space Displacement Neural Network
- Interpreting the Output of an SDNN with a GTN
- Experiments with SDNN
- Global Training of SDNN
- Object Detection and Spotting with SDNN
- Graph Transformer Networks and Transducers
- Previous Work
- Standard Transduction
- Generalised Transduction
- Notes on the Graph Structures
- GTN and Hidden Markov Models
- An On-Line Handwriting Recognition System
- Preprocessing
- Network Architecture
- Network Training
- Experimental Results
- Check Reading System
- GTN for Check Amount Recognition
- Gradient-Based Learning
- Rejecting Low Confidence Checks
- Results
- Conclusions
1. Introduction
Historically, the basic architecture for handwriting recognition has been separated into two modules: a hand-crafted feature extractor followed by a trainable classifier.
In practice, people mostly relied on commercial products such as OCR software.
By applying GTNs, the entire system can be unified and trained globally from end to end.
1. Learning from Data
Basic Notations
Input: $Z$
Desired output: $D$
Dataset: $\{(Z^1, D^1), (Z^2, D^2), \ldots , (Z^P, D^P)\}$
Parameters: $W$
Learned function: $F$
Prediction: $Y^p = F(Z^p, W)$
Loss function: $E^p = D(D^p, F(Z^p, W))$, where $D(\cdot, \cdot)$ measures the discrepancy between the desired output $D^p$ and the prediction
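To make the notation concrete, here is a minimal NumPy sketch. The linear model $F(Z, W) = WZ$ and the squared-error discrepancy are my own choices for illustration; the paper does not prescribe them.

```python
import numpy as np

# Toy instantiation of the notation above (assumed, not from the paper):
# a linear model F(Z, W) = W @ Z and a squared-error discrepancy D.
def F(Z, W):
    return W @ Z  # prediction Y^p = F(Z^p, W)

def discrepancy(D_p, Y_p):
    return 0.5 * np.sum((D_p - Y_p) ** 2)  # E^p = D(D^p, Y^p)

def E_train(pairs, W):
    # average loss over the dataset {(Z^1, D^1), ..., (Z^P, D^P)}
    return np.mean([discrepancy(D_p, F(Z_p, W)) for Z_p, D_p in pairs])
```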
2. Gradient-Based Learning
The general procedure for minimising the loss function is to iteratively adjust the parameters $W$ in the direction of the negative gradient (gradient-based learning):
$W_k = W_{k-1} - \epsilon \frac{\partial E(W)}{\partial W}$
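Here is a minimal sketch of this update rule, reusing the linear / squared-error setup assumed above, for which $\frac{\partial E^p}{\partial W} = (Y^p - D^p)(Z^p)^T$:

```python
import numpy as np

# One gradient descent step W_k = W_{k-1} - eps * dE/dW, with the gradient
# averaged over the dataset (linear / squared-error setup assumed above).
def gradient_step(pairs, W, eps=0.01):
    grad = np.zeros_like(W)
    for Z_p, D_p in pairs:
        Y_p = W @ Z_p                     # prediction
        grad += np.outer(Y_p - D_p, Z_p)  # dE^p/dW for the squared error
    return W - eps * grad / len(pairs)

# Usage with random data (hypothetical shapes):
rng = np.random.default_rng(0)
pairs = [(rng.standard_normal(4), rng.standard_normal(2)) for _ in range(100)]
W = rng.standard_normal((2, 4))
for k in range(200):
    W = gradient_step(pairs, W)
```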
3. Gradient Backpropagation
To compute the gradient of the loss with respect to every parameter in a multi-layer system efficiently, we use backpropagation: the error is propagated backwards from the output, layer by layer, via the chain rule.
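As a toy illustration (my own example, not the paper's architecture), here is backpropagation through a two-layer network with a tanh hidden layer and squared-error loss:

```python
import numpy as np

# Minimal two-layer network: h = tanh(W1 @ Z), Y = W2 @ h,
# with loss E = 0.5 * ||D - Y||^2.
def forward(Z, W1, W2):
    h = np.tanh(W1 @ Z)  # hidden activations
    Y = W2 @ h           # network output
    return h, Y

def backward(Z, D, W1, W2):
    h, Y = forward(Z, W1, W2)
    dY = Y - D                            # dE/dY at the output
    dW2 = np.outer(dY, h)                 # gradient of the output layer
    dh = W2.T @ dY                        # error propagated back through W2
    dW1 = np.outer(dh * (1 - h ** 2), Z)  # chain rule through tanh
    return dW1, dW2
```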
4. Learning in Real Handwriting Recognition System
One of the hardest problems in handwriting recognition is segmentation: each character must be separated from its neighbours within a word or sentence, which is difficult when characters touch or overlap.
5. Globally Trainable Systems
A Graph Transformer (GT) takes one or more graphs as input and produces a graph as output. Networks built from such modules are called Graph Transformer Networks (GTNs), and gradient-based learning is used to train all of their parameters globally, so that the whole system learns to recognise character patterns in images end to end.
2. Convolutional Neural Network for Isolated Character Recognition
As for trainable feature extraction, fully-connected neural networks have certainly achieved good performance in the past, but they have some problems:
- Images are large, typically several hundred variables (pixels), so a fully-connected first layer buries us in tons of parameters. For example, connecting a 32×32 image to just 100 hidden units already requires $32 \cdot 32 \cdot 100 = 102{,}400$ weights.
- A deficiency of the fully-connected design is that the topology of the input is entirely ignored: nearby pixels are strongly correlated, and this local 2D structure is thrown away.
1. Convolutional Networks
Three key features of CNNs (a minimal NumPy sketch follows this list):

- Local Receptive Fields: each neuron sees only a small patch of the image and can extract elementary visual features such as oriented edges, end-points, and corners. Once a feature has been detected, its exact position matters less, so the network can still spot a character even if it shifts or is slightly distorted. This is largely what gives convolutional layers their robustness.
- Shared Weights: as a local receptive field slides across the image to produce a feature map, the same set of weights is reused at every position. This drastically reduces the number of free parameters and makes the detected features shift-invariant.
- Sub-Sampling: locally averaging the feature map and sub-sampling it (in other words, average/max pooling) reduces its resolution and its sensitivity to small shifts and distortions.
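Here is a rough NumPy sketch of all three ideas together; the 3×3 kernel, the 28×28 input, and all names are my own illustrative choices:

```python
import numpy as np

# One shared 3x3 kernel slid over the image (local receptive fields +
# shared weights), followed by 2x2 average pooling (sub-sampling).
def conv2d(image, kernel):
    H, W = image.shape
    k = kernel.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # every position reuses the SAME kernel weights
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out

def avg_pool(fmap, s=2):
    H, W = fmap.shape
    return fmap[:H // s * s, :W // s * s].reshape(H // s, s, W // s, s).mean(axis=(1, 3))

image = np.random.rand(28, 28)       # toy input image
kernel = np.random.randn(3, 3)       # one shared weight set: only 9 parameters
feature_map = conv2d(image, kernel)  # 26x26 feature map
pooled = avg_pool(feature_map)       # 13x13 after sub-sampling
```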