8 — Artificial Noise: Since we are using a custom dataset with no noisy images, we have to add artificial noise to our images ourselves. We will generate many noisy images from each clean image; this is a form of image augmentation, but it also acts as a kind of feature engineering:
- Full Noisy Image: We simply add noise to the entire image; this trains the model to handle fully noisy images.
- Partly Noisy: We apply noise to only part of the image; this trains the model to handle partially noisy images.
- Pixel Modification: We take a small random sample of pixels and replace them with 0 or 1 (black or white), or add some noise to them.
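The three noise types above can be sketched with numpy. This is a minimal illustration, assuming grayscale float images scaled to [0, 1]; the function names and parameter values are my own, not from the original article:

```python
import numpy as np

def full_noise(img, sigma=0.1, seed=0):
    """Add Gaussian noise to every pixel of a [0, 1] grayscale image."""
    rng = np.random.default_rng(seed)
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def partial_noise(img, sigma=0.3, seed=0):
    """Add noise to a random rectangular region only (partly noisy)."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    y0, x0 = rng.integers(0, h // 2), rng.integers(0, w // 2)
    noisy = img.copy()
    region = noisy[y0:y0 + h // 2, x0:x0 + w // 2]
    noisy[y0:y0 + h // 2, x0:x0 + w // 2] = np.clip(
        region + rng.normal(0.0, sigma, region.shape), 0.0, 1.0)
    return noisy

def salt_and_pepper(img, frac=0.05, seed=0):
    """Replace a small random sample of pixels with 0 or 1 (black/white)."""
    rng = np.random.default_rng(seed)
    noisy = img.copy()
    mask = rng.random(img.shape) < frac
    noisy[mask] = rng.choice([0.0, 1.0], size=int(mask.sum()))
    return noisy
```

Applying all three to each clean image triples the training pairs (noisy input, clean target) available to the denoiser.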
9 — Auto-Encoders: An auto-encoder is a neural network that sets the target values to be equal to the inputs. Autoencoders are used to compress our inputs into a smaller representation; if anyone needs the original data, they can reconstruct it from the compressed representation. Sample architecture.
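A minimal numpy sketch of the encode/compress/decode structure, assuming a flattened 784-pixel input and a hypothetical 32-value code; the weights here are random and untrained, whereas a real autoencoder learns them by minimizing the reconstruction error between input and output:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 784-pixel input compressed to a 32-value code.
n_input, n_code = 784, 32

# Untrained random weights; training would adjust these to minimize
# the reconstruction loss ||x - decode(encode(x))||.
W_enc = rng.normal(0, 0.01, (n_input, n_code))
W_dec = rng.normal(0, 0.01, (n_code, n_input))

def encode(x):
    return np.tanh(x @ W_enc)      # the smaller representation

def decode(code):
    return np.tanh(code @ W_dec)   # reconstruction of the input

x = rng.random((1, n_input))       # one flattened image
code = encode(x)
x_hat = decode(code)               # same shape as the input
```

The key point is structural: the target of the network is the input itself, and everything must pass through the narrow `code`.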
- About Layers: Let me explain each layer in detail.
i. Input Layer: The input layer is the entry point of the whole CNN. In image-processing networks, it generally represents the pixel matrix of the image. It is essential: without it there would be no flow of data into the network.
ii. Convolution Layer: This layer takes an input matrix/tensor X and performs an elementwise multiplication (followed by a sum) with another matrix K, the kernel, at each position. The values of K are initialized from zeros, a uniform distribution, or any random distribution, and K is updated during training via backpropagation until the loss is minimized.
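The slide-multiply-sum operation can be written out explicitly. A minimal numpy sketch (valid padding, stride 1; real frameworks vectorize this):

```python
import numpy as np

def conv2d(X, K):
    """Valid 2-D convolution (cross-correlation): slide K over X,
    elementwise-multiply each window by K and sum the products."""
    kh, kw = K.shape
    oh, ow = X.shape[0] - kh + 1, X.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(X[i:i + kh, j:j + kw] * K)
    return out
```

For a 4x4 input and a 2x2 kernel of ones, each output value is just the sum of the 2x2 window it covers, and the output shrinks to 3x3 because no padding is used.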
iii. Max Pool: Max pooling is a convolution-like process where the kernel extracts the maximum value of the area it covers. Max pooling tells the CNN to carry forward only the strongest information, i.e. the largest value (amplitude-wise) in each window.
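The "keep only the largest value per window" rule in numpy, for the common 2x2 window with stride 2 (a sketch; frameworks provide this as a built-in layer):

```python
import numpy as np

def max_pool(X, size=2, stride=2):
    """2x2 max pooling: each output value is the maximum of the
    corresponding window, so height and width are halved."""
    oh = (X.shape[0] - size) // stride + 1
    ow = (X.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = X[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out
```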
iv. Up-sampling: The up-sampling layer is a simple layer with no weights that doubles the dimensions of its input. This kind of layer is very useful when rebuilding the image/data and is typically followed by a traditional convolutional layer.
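The weight-free doubling is just value repetition along both axes. A one-line numpy sketch (this is the same behavior as Keras's `UpSampling2D` with its default nearest-neighbor mode):

```python
import numpy as np

def upsample(X, factor=2):
    """Weight-free up-sampling: repeat every value `factor` times
    along both axes, doubling height and width when factor=2."""
    return np.repeat(np.repeat(X, factor, axis=0), factor, axis=1)
```

Because the layer has no weights, the following convolutional layer is what learns to smooth the blocky repeated values into a plausible reconstruction.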
10 — UNET: The architecture is symmetric and consists of two major parts: the left part is the contracting path, built from the usual convolutional operations; the right part is the expansive path, built from transposed convolutional layers. We can use the segmentation_models library to build this model.
The Architecture: It consists of the repeated application of two 3×3 convolutions (unpadded convolutions), each followed by a rectified linear unit (ReLU), and a 2×2 max pooling operation with stride 2 for downsampling. At each downsampling step we double the number of feature channels. Every step in the expansive path consists of an upsampling of the feature map followed by a 2×2 convolution ("up-convolution") that halves the number of feature channels, a concatenation with the correspondingly cropped feature map from the contracting path, and two 3×3 convolutions, each followed by a ReLU. The cropping is necessary due to the loss of border pixels in every convolution. At the final layer a 1×1 convolution is used to map each 64-component feature vector to the desired number of classes. In total the network has 23 convolutional layers. To allow a seamless tiling of the output segmentation map, it is important to select the input tile size such that all 2×2 max-pooling operations are applied to a layer with an even x- and y-size.
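The size arithmetic above can be traced with a short helper. This is a sketch of the spatial bookkeeping only (each unpadded 3×3 conv loses 2 pixels, each 2×2 max-pool halves, each 2×2 up-convolution doubles), using the 572-pixel input tile of the original U-Net paper; the function name is hypothetical:

```python
def unet_sizes(tile=572, depth=4):
    """Trace the spatial size through the original U-Net: two
    unpadded 3x3 convs (each -2 px) per step, 2x2 max-pool going
    down, 2x2 up-convolution coming up."""
    def two_convs(s):
        return s - 2 - 2      # two 3x3 unpadded convolutions

    sizes, s = [tile], tile
    for _ in range(depth):    # contracting path
        s = two_convs(s)
        # the even-size constraint from the text, checked explicitly:
        assert s % 2 == 0, "pick a tile size that is even before each pool"
        sizes.append(s)
        s //= 2               # 2x2 max pool, stride 2
        sizes.append(s)
    s = two_convs(s)          # bottom of the "U"
    sizes.append(s)
    for _ in range(depth):    # expansive path
        s = two_convs(s * 2)  # 2x2 up-conv doubles, then two convs
        sizes.append(s)
    return sizes
```

Running `unet_sizes()` starts at 572 and ends at 388, matching the paper's 572×572 input tile and 388×388 output segmentation map; the cropping of skip connections exists precisely because 568, 280, 136, 64 (left side) differ from 56, 104, 200, 392 (right side).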
Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot