
# Keras based Implementation of LeNet-5

# Introduction

**LeNet-5** is one of the earliest Convolutional Neural Network models, proposed by Yann LeCun et al. The architecture was described in the 1998 paper “**Gradient-Based Learning Applied to Document Recognition**.” The authors used this architecture to recognize handwritten digits. We won’t go into the details of the paper; rather, we will focus on implementing the architecture in Keras.

The LeNet-5 architecture, as described in the paper, is shown below.

# Architecture Overview

The input image fed to the network is **32 x 32**. It passes through the first convolutional layer, which has a filter size of **5 x 5** and **stride = 1**; **6** such filters (filters are also called **kernels**) are applied, generating the output **C1**. A kernel is nothing but a feature extractor, and its outputs are known as “**feature maps**.” Each feature map here is of size **28 x 28**, and **6** such feature maps are generated. Let’s understand how we obtained the dimensions **28 x 28**; this is one of the most important steps in understanding the architecture.

Let us assume the size of the image is n x n, the kernel size is k x k, and the stride is s.

Then the output size of each feature map is given by the formula ( [ (n − k) / s ] + 1 ) x ( [ (n − k) / s ] + 1 ).

In our case n = 32, k = 5, and s = 1, so the feature map size is (32 − 5 + 1) x (32 − 5 + 1) = 28 x 28.

Now that we have understood this, the rest of the architecture is based on the above calculation.
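The formula above can be written as a small helper (a sketch for checking the arithmetic; the function name `feature_map_size` is ours, not from the article):

```python
def feature_map_size(n, k, s=1):
    """Side length of a square output feature map for an n x n input,
    a k x k kernel, stride s, and no padding: floor((n - k) / s) + 1."""
    return (n - k) // s + 1

# The C1 example from the text: a 32 x 32 input with a 5 x 5 kernel, stride 1.
print(feature_map_size(32, 5))  # 28
```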

The second layer is the pooling, or subsampling, layer. The subsampling layer downsamples the feature map to reduce its size. This is done to reduce the overall computation, since neighbouring pixels store more or less similar information. The subsampling layer used here is of **size 2 x 2** with **stride = 2**. Its output is the set of feature maps **S2**, of size **14 x 14** (the calculation is the same as above).

The third layer is again a convolutional layer with filter **size 5 x 5** and **stride = 1**. The only difference is that the number of kernels increases to **16**. The output **C3** is of size **10 x 10**, with **16** feature maps.

> “Layer C3 is a convolutional layer with 16 feature maps. Each unit in each feature map is connected to several 5 x 5 neighbourhoods at identical locations in a subset of S2’s feature maps. The first six C3 feature maps take inputs from every contiguous subset of three feature maps in S2. The next six take input from every contiguous subset of four. The next three take input from some discontinuous subsets of four. Finally, the last one takes input from all S2 feature maps.”

The obtained **C3** consists of **16** feature maps, each of size **10 x 10**. As before, a subsampling layer of **size 2 x 2** with **stride = 2** is applied. The output is the feature map **S4**, with **16** maps of size **5 x 5**. **S4** is then convolved with **120** kernels, each of size **5 x 5**, followed by a flattening layer to obtain the **C5** layer. A fully connected layer **F6** with 84 units follows the **C5** layer. Finally, we have **10 output** units connected to the **F6** layer, as the target variable has 10 distinct values. Now let’s look at the implementation of the same architecture in Keras.
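Chaining the output-size formula through every layer reproduces each spatial dimension mentioned above (a quick sanity check; `feature_map_size` is simply the formula from earlier, not code from the article):

```python
def feature_map_size(n, k, s=1):
    # floor((n - k) / s) + 1, for a square input with no padding
    return (n - k) // s + 1

n = 32                             # input image
n = feature_map_size(n, 5)         # C1: 28
n = feature_map_size(n, 2, s=2)    # S2: 14 (2 x 2 pooling, stride 2)
n = feature_map_size(n, 5)         # C3: 10
n = feature_map_size(n, 2, s=2)    # S4: 5
n = feature_map_size(n, 5)         # C5: 1
print(n)  # 1
```

C5 collapsing to 1 x 1 is why flattening its 120 maps yields a 120-dimensional vector for the fully connected layers.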

# Implementation

Below is the implementation of the entire architecture in Keras using the functional API.
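Since the original code embed is not reproduced here, the following is a minimal sketch of the architecture with the Keras functional API. The `tanh` and `softmax` activations are modern, illustrative choices (the paper used a scaled tanh and an RBF output layer), and Keras’s `Conv2D` connects C3 to all S2 maps rather than using the paper’s partial connection scheme:

```python
from tensorflow.keras import Input, Model, layers

inputs = Input(shape=(32, 32, 1))                                    # 32 x 32 grayscale image
c1 = layers.Conv2D(6, kernel_size=5, activation="tanh")(inputs)      # C1: 28 x 28 x 6
s2 = layers.AveragePooling2D(pool_size=2, strides=2)(c1)             # S2: 14 x 14 x 6
c3 = layers.Conv2D(16, kernel_size=5, activation="tanh")(s2)         # C3: 10 x 10 x 16
s4 = layers.AveragePooling2D(pool_size=2, strides=2)(c3)             # S4: 5 x 5 x 16
c5 = layers.Conv2D(120, kernel_size=5, activation="tanh")(s4)        # C5: 1 x 1 x 120
f6 = layers.Dense(84, activation="tanh")(layers.Flatten()(c5))       # F6: 84 units
outputs = layers.Dense(10, activation="softmax")(f6)                 # 10 digit classes

model = Model(inputs, outputs, name="lenet5")
model.summary()
```

The model can then be compiled with, for example, `model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])` and trained on 32 x 32 digit images.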

The entire Python code, along with the Jupyter notebook, can be accessed here:
