Unpacking DenseNet to understand it, then building it with TensorFlow


What’s DenseNet? It’s a network which is Dense. That’s it over, bye.

Kidding. But the truth is not far from it. DenseNet is short for Densely Connected Convolutional Networks. A DenseNet is built from blocks in which multiple convolution layers are connected to each other: every layer i in a block is connected to all of its successive layers, i.e. i+1, i+2, … up to the last one. This type of connection is called dense connectivity.

Residual Networks (ResNets) improve information flow by adding a skip connection around each layer. To further improve the information flow between layers, the authors of the DenseNet paper propose a different connectivity pattern: direct connections from any layer to all subsequent layers, with the feature maps concatenated rather than summed. The core idea behind it is feature reuse, which leads to very compact models. As a result, it requires fewer parameters than comparable CNNs, as there is no need to relearn redundant feature maps. Milking till the features run dry!!!
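To make the connectivity pattern concrete, here is a small sketch that counts direct connections inside one block. It assumes the counting convention used in the paper, where a block of L layers has L(L+1)/2 connections because layer i receives the block input plus the outputs of the i - 1 layers before it. The function names are made up for this illustration.

```python
def chain_connections(num_layers):
    """Connections in a plain stack: each layer only feeds the next one."""
    return num_layers


def dense_connections(num_layers):
    """Connections in a dense block.

    Layer i takes the block input plus the outputs of the i - 1 earlier
    layers, i.e. i inputs in total, so the sum is L * (L + 1) / 2.
    """
    return sum(range(1, num_layers + 1))


if __name__ == "__main__":
    for n in (6, 12, 24, 16):
        print(n, "layers:", chain_connections(n), "vs", dense_connections(n))
```

The gap widens quickly with depth, which is why the later sections spend so much effort on keeping channel counts under control.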

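DenseNet has two main building blocks, the Dense Block and the Transition Layer, which come after an initial convolution stem.

The first convolution block has 64 filters of size 7×7 and a stride of 2. It is followed by a max pooling layer with a 3×3 window and a stride of 2. These two steps, with batch normalization and ReLU in between, can be written in Python as follows.

```python
from tensorflow.keras.layers import (
    Input, Conv2D, BatchNormalization, ReLU, MaxPool2D)

# input_shape is the (height, width, channels) tuple passed to the model builder
inputs = Input(input_shape)
x = Conv2D(64, 7, strides=2, padding="same")(inputs)  # 64 filters, 7x7, stride 2
x = BatchNormalization()(x)
x = ReLU()(x)
x = MaxPool2D(3, strides=2, padding="same")(x)  # 3x3 max pooling, stride 2
```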
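As a rough check on what the stem does to the image size, the sketch below works out the output shape by hand. It relies on the fact that with padding="same" a strided layer outputs ceil(size / stride) positions per dimension. The 224×224 input is an assumption for the example (the article does not fix an input size), and the helper names are hypothetical.

```python
import math


def same_padding_output(size, stride):
    """Output length of a conv or pooling layer with padding='same'."""
    return math.ceil(size / stride)


def stem_output_shape(height, width, filters=64):
    """Shape after the stride-2 convolution followed by stride-2 max pooling."""
    h = same_padding_output(height, 2)  # 7x7 convolution, stride 2
    w = same_padding_output(width, 2)
    h = same_padding_output(h, 2)       # 3x3 max pooling, stride 2
    w = same_padding_output(w, 2)
    return (h, w, filters)


if __name__ == "__main__":
    # Assumed 224x224 RGB input; the channel count of the input does not matter here.
    print(stem_output_shape(224, 224))
```

So under this assumption the stem shrinks each spatial dimension by a factor of 4 before the first dense block sees the data.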

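A Dense Block has two parts: a 1×1 convolution block and a 3×3 convolution block. Because every layer's input is the concatenation of all previous feature maps, the number of input channels, and with it the memory and compute cost, grows quickly. The 1×1 convolution acts as a bottleneck that keeps this growth in check before the more expensive 3×3 convolution.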

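First, let's define the convolution block in Python.

```python
from tensorflow.keras.layers import BatchNormalization, ReLU, Conv2D

def conv_layer(x, filters, kernel=1, strides=1):
    # Batch normalization -> ReLU -> convolution
    x = BatchNormalization()(x)
    x = ReLU()(x)
    x = Conv2D(filters, kernel, strides=strides, padding="same")(x)
    return x
```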
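To put a rough number on what the 1×1 bottleneck buys, the sketch below compares convolution weight counts with and without it. It is a back-of-the-envelope estimate under stated assumptions: only kernel weights are counted (biases and batch normalization parameters are ignored), the growth value is 32, and the function names are invented for this example.

```python
def direct_3x3_weights(in_channels, growth=32):
    """Weights of a 3x3 convolution applied straight to the concatenated input."""
    return 3 * 3 * in_channels * growth


def bottleneck_weights(in_channels, growth=32):
    """Weights of a 1x1 convolution to 4 * growth channels, then a 3x3 convolution."""
    squeeze = 1 * 1 * in_channels * (4 * growth)
    conv = 3 * 3 * (4 * growth) * growth
    return squeeze + conv


if __name__ == "__main__":
    for channels in (64, 512, 992):
        print(channels, direct_3x3_weights(channels), bottleneck_weights(channels))
```

By this count the bottleneck actually costs more at the start of the network, when only 64 channels come in, and pays off once the concatenated input has grown to a few hundred channels, which is exactly the situation deep inside a dense block.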

Now coming to the Dense Block. This block has a parameter that decides how many feature maps each layer adds. The paper calls it the growth rate (the code below calls it filters). It is tunable, and the default value in the code is 32. The growth rate is simply the number of filters of the 3×3 convolution. The 1×1 block before it has 4 times that number of filters. And then comes the main highlight of the paper: each layer's output is concatenated with its input, so every later layer sees the feature maps of all earlier ones. Here, the concatenation is done with the Keras concatenate function.

Each dense block repeats this pair of convolutions n times, where n depends on the block's position in the network and on the DenseNet variant. To implement this, we create a list of repetitions and loop over it. For DenseNet-121 the repetitions are [6, 12, 24, 16].

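```python
from tensorflow.keras.layers import concatenate

def dense_block(x, repetition, filters):
    for _ in range(repetition):
        y = conv_layer(x, 4 * filters)  # 1x1 bottleneck with 4 * growth rate filters
        y = conv_layer(y, filters, 3)   # 3x3 convolution producing `filters` new maps
        x = concatenate([y, x])         # append the new maps to everything seen so far
    return x
```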
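The effect of the repeated concatenation on the channel count can be traced without building any layers. The sketch below is plain arithmetic mirroring the loop above: each repetition appends `growth` new feature maps to its input. The helper name and the 64-channel starting point (the stem's filter count) are choices made for this illustration.

```python
def dense_block_channels(in_channels, repetition, growth=32):
    """Channel count after each repetition of a dense block."""
    channels = [in_channels]
    for _ in range(repetition):
        # concatenate([y, x]) appends `growth` new maps to the current input
        channels.append(channels[-1] + growth)
    return channels


if __name__ == "__main__":
    # First dense block: 64 channels in, 6 repetitions
    print(dense_block_channels(64, 6))
```

Six repetitions take the 64 stem channels to 256, which is the width the first transition layer then has to deal with.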

The output of the dense block is passed on to the Transition Layer. It consists of a 1×1 convolutional layer and a 2×2 average pooling layer with a stride of 2, which halves the spatial size of the feature maps. The kernel size of 1×1 is already the default in conv_layer, so we do not need to pass it again. In the transition layer, we also reduce the number of channels to half of the existing channels.

```python
from tensorflow.keras.layers import AvgPool2D

def transition_layer(x):
    # Integer division keeps the filter count an int
    x = conv_layer(x, x.shape[-1] // 2)  # 1x1 convolution, half the channels
    x = AvgPool2D(2, strides=2, padding="same")(x)  # halve height and width
    return x
```
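As a quick sanity check on the transition step, here is a shape-only sketch of what it does to a feature map. It assumes padding="same" pooling, so each spatial dimension becomes ceil(size / 2), and it uses integer division for the channels. The function name and the 56×56×256 example input are illustrative assumptions.

```python
import math


def transition_output_shape(height, width, channels):
    """Shape after the 1x1 convolution (half the channels) and 2x2 average pooling."""
    return (math.ceil(height / 2), math.ceil(width / 2), channels // 2)


if __name__ == "__main__":
    # Assumed output of the first dense block for a 224x224 input
    print(transition_output_shape(56, 56, 256))
    # Note that plain division would give a float filter count in Python 3
    print(256 / 2, 256 // 2)
```

Both the spatial size and the channel count are halved, so each transition cuts the number of activations to roughly one eighth.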

Complete DenseNet 121 architecture:

Now that we have all the blocks together, let’s merge them to see the entire Densenet 121 architecture.

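```python
from tensorflow.keras import Model
from tensorflow.keras.layers import (
    Input, Conv2D, BatchNormalization, ReLU, MaxPool2D, AvgPool2D,
    GlobalAveragePooling2D, Dense, concatenate)


def conv_layer(x, filters, kernel=1, strides=1):
    # Batch normalization -> ReLU -> convolution
    x = BatchNormalization()(x)
    x = ReLU()(x)
    x = Conv2D(filters, kernel, strides=strides, padding="same")(x)
    return x


def dense_block(x, repetition, filters):
    for _ in range(repetition):
        y = conv_layer(x, 4 * filters)  # 1x1 bottleneck
        y = conv_layer(y, filters, 3)   # 3x3 convolution
        x = concatenate([y, x])         # reuse all earlier feature maps
    return x


def transition_layer(x):
    x = conv_layer(x, x.shape[-1] // 2)  # halve the channels
    x = AvgPool2D(2, strides=2, padding="same")(x)  # halve height and width
    return x


def densenet(input_shape, n_classes, filters=32):
    inputs = Input(input_shape)
    x = Conv2D(64, 7, strides=2, padding="same")(inputs)
    x = BatchNormalization()(x)
    x = ReLU()(x)
    x = MaxPool2D(3, strides=2, padding="same")(x)

    for repetition in [6, 12, 24, 16]:
        d = dense_block(x, repetition, filters)
        x = transition_layer(d)

    # Pool the last dense block's output `d`; the transition built after it is unused
    x = GlobalAveragePooling2D()(d)
    output = Dense(n_classes, activation="softmax")(x)
    model = Model(inputs, output)
    return model
```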
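To tie the pieces together, the sketch below traces the channel count through the whole network and shows where the "121" in the name comes from. It is pure arithmetic under the values used in this article (64 stem filters, growth of 32, repetitions [6, 12, 24, 16]). It assumes the usual counting convention of one layer for the stem convolution, two per dense-block repetition, one per transition between blocks, and one for the final classifier. The function names are hypothetical.

```python
def trace_channels(repetitions=(6, 12, 24, 16), growth=32, stem=64):
    """Channel count after the stem, each dense block, and each transition."""
    trace = [("stem", stem)]
    channels = stem
    for i, repetition in enumerate(repetitions, start=1):
        channels += repetition * growth
        trace.append((f"dense_block_{i}", channels))
        # The last dense block feeds global average pooling, not a transition
        if i < len(repetitions):
            channels //= 2
            trace.append((f"transition_{i}", channels))
    return trace


def count_weighted_layers(repetitions=(6, 12, 24, 16)):
    """Stem conv + two convs per repetition + transition convs + classifier."""
    return 1 + 2 * sum(repetitions) + (len(repetitions) - 1) + 1


if __name__ == "__main__":
    for name, channels in trace_channels():
        print(f"{name:>15}: {channels}")
    print("weighted layers:", count_weighted_layers())
```

The final dense block ends with 1024 channels, which global average pooling turns into a 1024-dimensional vector for the softmax classifier.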

And that’s how we can implement the Densenet 121 architecture using TensorFlow.

For a more readable presentation of the code, check out the code on GitHub.

References:

Gao Huang, Zhuang Liu, Laurens van der Maaten and Kilian Q. Weinberger, Densely Connected Convolutional Networks, arXiv:1608.06993 (2016)
