Introduction
A convolutional operation is defined by its number of filters, kernel size, padding, and stride, and is typically followed by an activation function and pooling layers. The majority of convolutional neural network models use kernel sizes such as 3x3 or 7x7, which contributes to high model complexity.
A drawback of deep convolutional neural networks is that the number of feature maps often grows with the depth of the network. When larger filter sizes such as 5x5 and 7x7 are used, this can cause a huge rise in the number of parameters and the amount of computation required.
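To see why, note that a convolutional layer with a k x k kernel, c_in input channels, and c_out filters has k*k*c_in*c_out weights plus c_out biases, so the parameter count grows with the square of the kernel size. A minimal sketch (the channel counts here are arbitrary, chosen purely for illustration):

# parameter count of a conv layer: k*k*c_in*c_out weights plus c_out biases
c_in, c_out = 256, 256
for k in (1, 3, 5, 7):
    params = k * k * c_in * c_out + c_out
    print('%dx%d kernel: %s parameters' % (k, k, format(params, ',')))
# 1x1 kernel: 65,792 parameters
# 3x3 kernel: 590,080 parameters
# 5x5 kernel: 1,638,656 parameters
# 7x7 kernel: 3,211,520 parameters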
To address this, a 1x1 convolutional layer, which enables channel-wise pooling and is also known as feature map pooling or a projection layer, can be used. This simple technique reduces the number of feature maps while retaining their important features. It can also be used to pool features across channels, or to increase the number of feature maps, for example after conventional pooling layers, since it projects the feature maps at a one-to-one spatial scale.
Table of contents:
- Convolution over channels
- 1x1 Convolution operation
- Benefits of 1x1 Convolution operation
- Implementation of 1x1 Convolution using Keras
- Conclusion
- Convolution over channels
The convolutional operation produces an output feature map by linearly applying a smaller filter across a larger input.
Each time a filter is applied to an input image or input feature map, it produces a single number. Applying the filter systematically, left to right and top to bottom across the input, produces a two-dimensional feature map, and one such feature map is produced per filter. For the filter to cover all channels of the input, its depth must match the depth of the input; however, regardless of the depth of the input and the filter, each application still produces a single number.
For example, if the input image has three channels for red, green, and blue, then a 3x3 filter will be applied to 3x3x3 blocks.
The depth of the output of a convolutional operation is defined only by the number of filters applied to the input image.
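A minimal NumPy sketch of this behaviour (random values, purely for illustration):

import numpy as np

patch = np.random.rand(3, 3, 3)   # one 3x3 block of a 3-channel input
kernel = np.random.rand(3, 3, 3)  # the filter depth must match the input depth
output = (patch * kernel).sum()   # element-wise multiply, then sum
print(output)                     # a single number, regardless of the depth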
A convolutional operation with a 3x3 or 7x7 kernel requires a large amount of computation, since the filter must be applied to every 3x3 (or 7x7) patch extracted from the input. By contrast, filters with a 1x1 kernel are much more efficient.
In the next section, we will look at the 1x1 convolution operation in detail.
- 1x1 Convolution operation
A 1x1 filter, like any other filter, produces a single output value, and it has a single parameter or weight for each input channel. Because it takes its input from the same position across all of the input feature maps, the 1x1 filter behaves like a single neuron. When this neuron is applied systematically with a stride of one, left to right and top to bottom, the output is a feature map with the same width and height as the input.
Because the 1x1 filter involves no neighbouring pixels in the input, it may not be regarded as a convolutional operation in the usual sense; instead, the input is linearly projected, or weighted. As with other convolutional layers, a nonlinearity is applied afterwards, which allows the projection to carry out non-trivial computations on the input feature maps.
This simple 1x1 filter thus provides a succinct summary of the input feature maps. Using multiple 1x1 filters adjusts the number of these summaries, essentially allowing the depth of the feature maps to be increased or decreased as necessary. A convolutional layer with a 1x1 filter can therefore be employed at any point in a convolutional neural network to control the number of feature maps. Because of this, it is frequently referred to as a projection operation, projection layer, or feature map (channel) pooling layer.
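To make the projection interpretation concrete, here is a minimal NumPy sketch (random data and arbitrary shapes, for illustration only) showing that a 1x1 convolution is simply a linear map applied independently at every spatial position:

import numpy as np

h, w, c_in, c_out = 8, 8, 512, 64          # arbitrary illustrative shapes
feature_maps = np.random.rand(h, w, c_in)  # input feature maps (H x W x C_in)
weights = np.random.rand(c_in, c_out)      # 64 filters, each of shape 1x1x512

# a 1x1 convolution is a matrix multiply over the channel axis,
# applied independently at each of the h*w spatial positions
projected = feature_maps @ weights
print(projected.shape)  # (8, 8, 64): same width and height, fewer channels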
- Benefits of 1x1 Convolution operation
In CNNs, the number of activation maps a network produces increases with its depth. The issue is further aggravated if the convolutions employ large filters, such as 5x5 or 7x7, which results in a disproportionately high number of parameters. Pooling layers accomplish spatial downsampling by aggregating data along the spatial dimensions, progressively reducing the input's height and width. Although pooling preserves the most significant spatial details to some extent, it trades downsampling against information loss. The bottom line is that pooling can only be relied on up to a point.
The 1x1 convolution can be used to address this issue: it provides filter-wise pooling, serves as a projection layer that pools (or projects) information across channels, and enables dimensionality reduction by reducing the number of filters while maintaining the critical, feature-related information.
Moreover, because a 1x1 convolution learns a separate weight for each input channel, it can choose which input feature maps to look at, rather than mixing every neighbouring pixel as larger kernels do.
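As a sketch of this benefit (the input shape and filter counts below are arbitrary, loosely inspired by Inception-style bottleneck blocks), inserting a 1x1 convolution before a 5x5 convolution cuts the parameter count by roughly a factor of four:

from keras.models import Sequential
from keras.layers import Conv2D

# direct 5x5 convolution over 256 input channels
direct = Sequential()
direct.add(Conv2D(256, (5,5), padding='same', input_shape=(32, 32, 256)))
direct.summary()  # 1,638,656 parameters

# 1x1 bottleneck down to 64 channels, followed by the same 5x5 convolution
bottleneck = Sequential()
bottleneck.add(Conv2D(64, (1,1), input_shape=(32, 32, 256)))  # channel reduction
bottleneck.add(Conv2D(256, (5,5), padding='same'))
bottleneck.summary()  # 426,304 parameters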
- Implementation of 1x1 Convolution using Keras
As we saw in the section above, a 1x1 convolution can be used for dimensionality reduction without significant loss of information. Now let's look at an implementation of the 1x1 convolution using Keras to achieve this.
In this example, we first apply a standard 3x3 convolutional layer with 512 filters, then use a 1x1 convolutional layer to project its output down to 64 feature maps.
# example of a 1x1 filter for dimensionality reduction using keras API
from keras.models import Sequential
from keras.layers import Conv2D
# create model
model = Sequential()
model.add(Conv2D(512, (3,3), padding='same', activation='relu', input_shape=(256, 256, 3))) # 3x3 convolution
model.add(Conv2D(64, (1,1), activation='relu')) # 1x1 convolution
# summarize model
model.summary()
The output of the above code is shown below:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 256, 256, 512) 14336
_________________________________________________________________
conv2d_2 (Conv2D) (None, 256, 256, 64) 32832
=================================================================
Total params: 47,168
Trainable params: 47,168
Non-trainable params: 0
_________________________________________________________________
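These parameter counts follow directly from the layer shapes: the 3x3 layer has 3x3x3x512 weights plus 512 biases (14,336 parameters), while the 1x1 layer has 1x1x512x64 weights plus 64 biases (32,832 parameters). Note that the spatial dimensions stay at 256x256 throughout; only the channel depth drops from 512 to 64.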
- Conclusion
In this blog, we learned in depth how 1x1 convolutional filters perform a convolution over channels, overcoming the information loss that comes with spatial dimensionality reduction.
We also saw a practical implementation of the 1x1 convolution operation using the Keras API.