4 Key Design Considerations for Your Neural Network Model


From the layout of layers to optimization rules, pay attention to these elements.

Photo by Martin Adams on Unsplash

Artificial neural networks (ANNs) are a commonly used tool in deep learning. An earlier tutorial covers what they are, their basic structure, and how to code a simple neural network with only one neuron. When you design your own neural networks, there are several considerations to take into account. This article describes a few.

Layout of Network Layers

Neural networks are rarely as simple as one neuron; most have at least several layers. If the data cannot be separated linearly, you will need hidden layers: layers of artificial neurons between the input and output layers.

You can think of each hidden neuron as a linear classifier. As a rough heuristic, the number of neurons in the first hidden layer should equal the number of lines you would need to draw to classify your data. The later hidden layers and the output layer then combine those linear classifiers.
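As a minimal sketch of this idea, the tiny network below (with hand-picked, illustrative weights, not trained ones) uses two hidden neurons, each acting as one linear classifier, to approximate XOR, a classic problem that no single neuron can solve:

```python
import numpy as np

def sigmoid(z):
    # Squash values into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hand-picked weights for a tiny network: 2 inputs -> 2 hidden neurons -> 1 output.
# Each hidden neuron draws one "line" through the data:
# the first approximates OR, the second approximates NAND.
W1 = np.array([[ 20.0,  20.0],
               [-20.0, -20.0]])   # hidden-layer weights, shape (2 hidden, 2 inputs)
b1 = np.array([-10.0,  30.0])    # hidden-layer biases
W2 = np.array([ 20.0,  20.0])    # output weights combine the two classifiers
b2 = -30.0                       # output bias

def forward(x):
    h = sigmoid(W1 @ x + b1)     # hidden layer: two linear classifiers
    return sigmoid(W2 @ h + b2)  # output layer: combines them (OR AND NAND = XOR)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, round(float(forward(np.array(x)))))
# (0, 0) -> 0, (0, 1) -> 1, (1, 0) -> 1, (1, 1) -> 0
```

The point is not the specific weights but the layout: the output layer can only combine whatever lines the hidden layer provides, which is why the hidden layer needs enough neurons to draw them.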

Activation Functions

In neural networks, an activation function determines the artificial neuron’s output, given specific inputs. In my earlier tutorial, we used a sigmoid function, which has the advantage of forcing outputs into a specific range. Another advantage is that a sigmoid function is monotonic — in other words, the value order of inputs is the same as the value order of outputs. A disadvantage of sigmoid functions is that where the sigmoid curve is relatively flat, the gradient is close to zero, so learning can be slow.

Another popular type of activation function is the rectified linear unit (ReLU). Its value is simply 0 if x is less than 0; otherwise, it is x. ReLUs enable a faster learning process, even though they draw an arbitrary distinction between negative and positive values of x. In their advantages and disadvantages, hyperbolic tangent (tanh) activation functions tend to be a happy medium between sigmoid and ReLU.
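The three activation functions discussed above can be written in a few lines each; the example below evaluates them on the same inputs so the differences are visible:

```python
import numpy as np

def sigmoid(z):
    # Monotonic, output bounded in (0, 1); gradient vanishes where the curve is flat.
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # 0 for negative inputs, identity otherwise; cheap, and does not saturate for z > 0.
    return np.maximum(0.0, z)

def tanh(z):
    # Like sigmoid but bounded in (-1, 1) and centered at zero.
    return np.tanh(z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))  # ~[0.119, 0.5, 0.881]
print(relu(z))     # [0., 0., 2.]
print(tanh(z))     # ~[-0.964, 0., 0.964]
```

Note how sigmoid and tanh squash large inputs toward their bounds, while ReLU passes positive values through unchanged.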

The hyperbolic sine, cosine, and tangent functions. Image from Fylwind on Wikipedia

Loss Functions

Loss is simply the prediction error of the neural network, computed on each pass through the data; ideally, it should be minimized. The loss function, which measures how far the predicted values are from the true values, is used to update the weights of the neural network for the next pass. The calculation of the new weights is based in some way on the gradient, which represents the slope of the loss function at each point. Different types of loss functions should be used for different types of regression or classification tasks.
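As a concrete example, mean squared error (MSE) is a common loss for regression tasks. The sketch below computes the loss and its gradient with respect to the predictions, which is the quantity a weight update is ultimately built from:

```python
import numpy as np

def mse_loss(y_pred, y_true):
    # Mean squared error: average of the squared prediction errors.
    return np.mean((y_pred - y_true) ** 2)

def mse_gradient(y_pred, y_true):
    # Derivative of MSE with respect to the predictions:
    # d/dy_pred mean((y_pred - y_true)^2) = 2 * (y_pred - y_true) / n
    return 2.0 * (y_pred - y_true) / y_pred.size

y_true = np.array([1.0, 0.0, 1.0])   # example targets
y_pred = np.array([0.8, 0.2, 0.6])   # example network outputs

print(mse_loss(y_pred, y_true))      # 0.08
print(mse_gradient(y_pred, y_true))  # points in the direction of increasing loss
```

Stepping the predictions (via the weights) against this gradient is what reduces the loss on the next pass.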

Optimization Rules

An optimizer is an algorithm or other method that updates the attributes of the neural network in order to minimize the loss. For example, it can account for the history of gradient updates, rather than only the gradient from a single set — or batch — of data samples. It may incorporate momentum — in other words, the newest update is a weighted average of all previous updates, with the older contributions decayed exponentially.
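A minimal sketch of a momentum update, assuming illustrative values for the learning rate and decay factor, shows the exponentially decaying average of past gradients in action on the toy objective f(w) = w², whose gradient is 2w:

```python
def momentum_step(w, grad, velocity, lr=0.1, beta=0.9):
    # The velocity is an exponentially decaying average of past gradients:
    # older updates contribute less and less as beta compounds.
    velocity = beta * velocity + (1.0 - beta) * grad
    # Step the weight against the accumulated direction, scaled by the learning rate.
    w = w - lr * velocity
    return w, velocity

# Minimize f(w) = w^2, starting away from the minimum at w = 0.
w, v = 5.0, 0.0
for _ in range(200):
    w, v = momentum_step(w, 2.0 * w, v)
print(w)  # close to 0: the optimizer has driven the loss toward its minimum
```

Setting beta to 0 recovers plain gradient descent, where each step uses only the current batch's gradient.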
