The Basics of Neural Networks (Neural Network Series) — Part 1




Neural Networks

An Artificial Neural Network (ANN), or simply a Neural Network (NN), is a set of interconnected layers of small units called nodes that perform mathematical operations to detect patterns in data. NN algorithms are built in a way that loosely mimics how biological neurons work (we will cover the connection between the two in the last section of the article).

Definitions

Before we dive in, here are the key terms we will use when discussing Neural Networks (NNs).

  1. Neuron — This is the basic building block of a NN. It takes weighted values, performs a mathematical calculation, and produces an output. It is also called a unit, node, or perceptron.
  2. Input — This is the data/values passed to the neurons.
  3. Deep Neural Network (DNN) — This is an ANN with many hidden layers (layers between the input (first) layer and the output (last) layer).
  4. Weights — These values explain the strength (degree of importance) of the connection between any two neurons.
  5. Bias — This is a constant value added to the sum of the products of the input values and their respective weights. It shifts the point at which a given node activates, so the node activates earlier or later.
  6. Activation function — This is a function used to introduce non-linearity into the NN. This property allows the network to learn more complex patterns.

Note: Weights and biases are the trainable parameters of a NN; that is, the network learns patterns by tuning these parameters to get the best predictions.

Artificial Neuron — Mathematical Operation on one Neuron

An artificial neuron takes one or more input values, each with a weight assigned to it. Inside the node, the weighted inputs are summed up, and an activation function is applied to get the result. The output of the node is passed on to other nodes or, in the case of the last layer of the network, it is the overall output of the network.

Figure 1: An artificial neuron with n input values (Source: Author).

A single neuron, like the one shown above, performs the following mathematical operation,

Equation 1: output = g(w·x + b) = g(w_1 x_1 + w_2 x_2 + … + w_n x_n + b)

In Equation 1, four things are happening: each input is multiplied by its respective weight, the products are added together, the bias is added to the result, and then an activation function, g, is applied, so that the output of the neuron is g(w·x+b).
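The operation g(w·x+b) can be sketched in a few lines of Python. This is only an illustration: the names `neuron_output` and `identity` are hypothetical (they do not come from the article), and the activation g is passed in as an ordinary function so that any choice can be plugged in.

```python
def neuron_output(inputs, weights, bias, activation):
    """Compute g(w . x + b) for a single neuron (illustrative helper)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Multiply each input by its weight and add the products together.
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    # Add the bias, then apply the activation function g.
    return activation(weighted_sum + bias)


def identity(z):
    """A placeholder activation that returns its argument unchanged."""
    return z


# 0.5 * 1.0 + (-0.25) * 2.0 + 0.1 = 0.1
print(neuron_output([1.0, 2.0], [0.5, -0.25], 0.1, identity))
```

With the identity activation the neuron is purely linear; swapping in a non-linear function for `activation` is what gives the network the ability to learn more complex patterns.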

Neural Network Design

A Neural Network (NN) is made of several neurons stacked into layers. For an n-dimensional input, the first layer (also called the input layer) will have n nodes, and a t-dimensional final/output layer will have t neural units. All intermediate layers are called hidden layers, and the number of layers in a network determines the depth of the model. The figure below shows a 3–4–4–1 NN.

Figure 2: A neural network with 3 input features, two hidden layers with 4 nodes each, and a single-value output. The nodes are densely connected: each node is connected to all the neurons in the immediately preceding layer. Each connection has a weight expressing the strength of the connection between the two nodes. Every node performs the computation described in Equation 1, except the nodes in the input layer (Source: Author).
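To make the 3–4–4–1 layout concrete, here is a minimal sketch of a forward pass through such a network using only the Python standard library. Several things here are assumptions rather than part of the figure: the helper names `init_layer` and `forward`, the random initial weights, the zero biases, and the use of the sigmoid as the activation in every layer.

```python
import math
import random


def sigmoid(z):
    """Activation used in this sketch (an assumption, not fixed by the figure)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_inputs, n_nodes, rng):
    """Create one dense layer: a row of weights and one bias for every node."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_nodes)]
    biases = [0.0] * n_nodes
    return weights, biases


def forward(layers, x):
    """Pass the input vector x through every layer in turn."""
    activations = x
    for weights, biases in layers:
        # Every node sees all outputs of the previous layer (dense connection).
        activations = [
            sigmoid(sum(w * a for w, a in zip(node_weights, activations)) + b)
            for node_weights, b in zip(weights, biases)
        ]
    return activations


rng = random.Random(0)  # fixed seed so the sketch is repeatable
sizes = [3, 4, 4, 1]    # input layer, two hidden layers, output layer
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

output = forward(layers, [2.0, 1.0, -4.0])
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
print(len(output), n_params)  # prints: 1 41
```

The parameter count follows directly from the layout: (3·4 + 4) + (4·4 + 4) + (4·1 + 1) = 41 weights and biases. The input layer contributes none, since its nodes only pass the features along.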

Simplified Example

Let’s take a simple example of how a single neuron works. In this example, we assume 3 input values and a bias of 0.

Figure 3: Artificial neuron with 3 input values 2, 1, -4 and the weights 0.8, 0.12 and 0.3, respectively. The bias in this case is set to 0.

In this example, we will consider a commonly used activation function called sigmoid (we will discuss activation functions fully in a coming part of the series), which is defined as

f(x) = 1 / (1 + e^(-x)), with derivative f'(x) = f(x)(1 - f(x))

Sigmoid function with its derivative. The sigmoid f(x) squashes any real value x into the range (0,1). Don’t worry too much about the derivative for now; we will discuss it later.
This is a plot of the sigmoid function. Notice that f(x) approaches 0 for values of x less than -5 and approaches 1 for values of x greater than 5.
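The behaviour described in the caption can be checked numerically. The sketch below uses the standard logistic form of the sigmoid and its well-known derivative; the function names `sigmoid` and `sigmoid_derivative` are chosen here for illustration.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: maps any real x into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative of the sigmoid, written in terms of the sigmoid itself."""
    s = sigmoid(x)
    return s * (1.0 - s)


# Evaluate at the points mentioned in the plot caption, plus the midpoint.
for x in (-5.0, 0.0, 5.0):
    print(x, round(sigmoid(x), 4), round(sigmoid_derivative(x), 4))
```

At x = 0 the sigmoid is exactly 0.5 and its slope is at its maximum of 0.25. At x = -5 and x = 5 the output is already within 0.01 of 0 and 1, and the slope is close to zero.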

As said before, four things are happening inside the neuron. First, each input value is weighted by multiplying it by its corresponding weight.

First operation: input values are weighted.

Next, the weighted inputs are summed and the bias is added,

Second operation: sum weighted inputs and the bias

and lastly, the sigmoid activation function is applied to the above result.

Third operation: apply sigmoid function.

That is it. The output of the neuron is 0.627. If the given neuron is in the hidden layer, then this output becomes the input of the next neuron(s). On the other hand, if this value is the output of the last layer, then it can be interpreted as the final prediction of the model (it can be seen as the probability of a given class).
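The three operations can be reproduced with the values from Figure 3. This is a small sketch in plain Python; the variable names (`weighted`, `z`, `output`) are illustrative and not taken from the article.

```python
import math

inputs = [2, 1, -4]
weights = [0.8, 0.12, 0.3]
bias = 0

# First operation: weight every input (approximately 1.6, 0.12 and -1.2).
weighted = [w * x for w, x in zip(weights, inputs)]

# Second operation: sum the weighted inputs and add the bias (about 0.52).
z = sum(weighted) + bias

# Third operation: apply the sigmoid activation function.
output = 1 / (1 + math.exp(-z))

print(round(output, 3))  # prints: 0.627
```

The intermediate values are described as approximate because floating-point arithmetic can introduce tiny rounding differences; the rounded result still matches the 0.627 quoted above.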

Important Note: To simplify the mathematical operations done within the neuron, we can use a more compact matrix form of the first two operations. In this case, a dot product between the vector of input values and the vector of weights comes in handy.
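For the example above, the compact form could be written as follows. The symbol z for the pre-activation value is a notational assumption made here; x, w, b and g follow the article's usage.

```latex
\[
\mathbf{x} = \begin{bmatrix} 2 \\ 1 \\ -4 \end{bmatrix}, \quad
\mathbf{w} = \begin{bmatrix} 0.8 \\ 0.12 \\ 0.3 \end{bmatrix}, \quad
b = 0
\]
\[
z = \mathbf{w}^{\top}\mathbf{x} + b
  = (0.8)(2) + (0.12)(1) + (0.3)(-4) + 0
  = 0.52
\]
\[
g(z) = \frac{1}{1 + e^{-0.52}} \approx 0.627
\]
```

The dot product folds the weighting and summing steps into a single operation, which is why it extends naturally to whole layers, where the weights of all nodes are stacked into a matrix.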

As said earlier, the operation of the artificial neuron, the basic building block of the NN, is inspired by how the human brain works. In the next section, we will discuss this relationship in detail.

Connection between Biological and Artificial Neuron

The nervous system in the biological brain consists of two categories of cells: neurons and glial cells. Glial cells provide supportive functions to the nervous system. Specifically, these cells are tasked with maintaining homeostasis, forming the myelin sheath that insulates the nerves, and participating in signal transmission.

A neuron is made up of the cell body, axon, and dendrites. The dendrites are projections that act as the input to the neuron. They receive electro-chemical information from other neurons and propagate it to the cell body. The axon, on the other hand, is a long projection of the neuron that carries information from the cell body to other neurons, glands, and muscles. The axon connects to the cell body at a conical projection called the axon hillock. The hillock is responsible for summing the inhibitory and excitatory signals, and if the sum exceeds some threshold, the neuron fires a signal (called an action potential). Two neurons connect at the synapses. The synapses are located between the axon terminal of the first neuron and the dendrites of the second neuron.

Biological (left) and artificial neuron (right).

Artificial neuron

An artificial neuron (also called a unit or a node) mimics the biological neuron in structure and function (in a loose sense; see the note below). The artificial neuron takes several input values (analogous to the dendrites in the biological neuron) with weights assigned to them (analogous to the role of synapses). Inside the node, the weighted inputs are summed up, and an activation function is applied to get the result. This operation matches the role of the cell body and the axon hillock in the biological neuron. The output of the node is passed on to other units, an operation that mimics the process of electro-chemical information being passed on from one neuron to another or to other parts of the nervous system.

Note: Over the last few years, scientists have begun to argue against describing the relationship between ANNs and their biological counterparts in a direct sense.

“As a matter of fact, artificial neural networks do not resemble their biological counterparts in terms of functioning and scale, however they are indeed motivated by the BNNs and several of the terms used to describe artificial neural networks are borrowed from the neuroscience literature” — Aghdam, H.H. and Heravi, E.J. (2017). Guide to convolutional neural networks.
