Table of contents:
- Introduction to Pooling operations
- MaxPool: Maximum Pooling
- Advantage of using MaxPool
- Use & Alternatives of MaxPool
Let us get started with MaxPool.
Introduction to Pooling operations
A pooling layer is an important building block of a Convolutional Neural Network. Max Pooling and Average Pooling layers are among the most popular and most effective pooling layers. We shall learn which of the two will work best for you!
Pooling layers are a part of Convolutional Neural Networks (CNNs). What makes CNNs different is that, unlike regular neural networks, they work on volumes of data.
Inputs are multi-channel images, e.g. RGB images have three channels.
Features from such images are extracted by means of convolutional layers.
To know which pooling layer works best, you must know how pooling helps.
Convolutional layers represent the presence of features in an input image. But they present a problem: they are sensitive to the location of features in the input.
This gives us location-specific data rather than generalised data, which worsens overfitting and leads to poor results on data outside the training set.
So we need to generalise the presence of features. This is done by means of pooling layers.
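To make the location problem concrete, here is a minimal NumPy sketch. The helper `max_pool_2x2` and the two toy feature maps are assumptions made for illustration; they are not part of any library. The same activation is shifted by one pixel, and the pooled outputs come out identical as long as the shift stays inside one pooling window.

```python
import numpy as np

def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling (stride 2) on a 2D array with even sides."""
    h, w = x.shape
    # Split into (h/2, 2, w/2, 2) blocks and take the maximum inside each block.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# A 4x4 feature map with a single strong activation.
a = np.zeros((4, 4))
a[0, 0] = 1.0

# The same feature shifted one pixel to the right.
b = np.zeros((4, 4))
b[0, 1] = 1.0

print(np.array_equal(a, b))                              # False
print(np.array_equal(max_pool_2x2(a), max_pool_2x2(b)))  # True
```

The invariance is only local: a shift that moves the activation into a neighbouring window would change the pooled output.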
Here, we need to select a pooling layer.
In this article we deal with the Max Pooling layer and the Average Pooling layer.
Max Pooling - The feature with the most activated presence shall shine through.
Average Pooling - The Average presence of features is reflected.
As you may observe above, the max pooling layer gives a sharper image, focused on the maximum values (which, for understanding purposes, may be thought of as the intensity of light here), whereas average pooling gives a smoother image that retains the essence of the features in the image.
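The difference between the two can also be seen numerically. The sketch below is a NumPy illustration with a made-up 4x4 feature map and a hypothetical helper `pool_2x2`; it pools the same values once with the maximum and once with the mean.

```python
import numpy as np

def pool_2x2(x, reduce_fn):
    """Apply reduce_fn (e.g. np.max or np.mean) to each non-overlapping 2x2 block."""
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    return reduce_fn(blocks, axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [5, 7, 4, 2],
                        [0, 0, 9, 1],
                        [0, 4, 1, 1]], dtype=float)

max_pooled = pool_2x2(feature_map, np.max)
avg_pooled = pool_2x2(feature_map, np.mean)

print(max_pooled)  # [[7. 4.]
                   #  [4. 9.]]
print(avg_pooled)  # [[4. 2.]
                   #  [1. 3.]]
```

In the bottom-right block the single strong value 9 dominates the max-pooled result, while the average is pulled down to 3 by its weaker neighbours.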
MaxPool: Maximum Pooling
The algorithm of 2D MaxPool (with a stride of 1) is:
1. Input: a 2D image IN of size NxN and a kernel of size KxK
2. Define an output of size (N-K+1) x (N-K+1)
3. For every sub-matrix S1 of size KxK in IN:
3.1. Find the maximum element in S1, say M1
3.2. Let the top-left element of S1 have index (i, j)
3.3. Set the output at index (i, j) to M1
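The steps above can be sketched directly in NumPy. This is a minimal, unoptimised illustration that assumes a square input and a stride of 1 (which is what the N-K+1 output size implies); the function name `max_pool_2d` is hypothetical.

```python
import numpy as np

def max_pool_2d(image, k):
    """Stride-1 max pooling of an NxN array with a KxK window."""
    n = image.shape[0]
    out_size = n - k + 1
    output = np.empty((out_size, out_size), dtype=image.dtype)
    for i in range(out_size):
        for j in range(out_size):
            # Sub-matrix S1 whose top-left element is at (i, j).
            window = image[i:i + k, j:j + k]
            # M1, the maximum of S1, is stored at the same index (i, j).
            output[i, j] = window.max()
    return output

image = np.array([[1, 2, 3, 0],
                  [4, 5, 6, 1],
                  [7, 8, 9, 2],
                  [0, 1, 2, 3]])

result = max_pool_2d(image, 2)
print(result)  # [[5 6 6]
               #  [8 9 9]
               #  [8 9 9]]
```

Deep learning libraries usually apply a stride larger than 1, which skips windows and shrinks the output further.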
Similarly, MaxPool can be done on 3D and 4D input data as well.
Common kernel sizes KxK are 2x2 and 3x3, usually applied with a stride of 2.
As the name suggests, pooling with the maximum retains the most prominent features of the feature map.
Below is an example of the same, using the Keras library:
import numpy as np
from keras.models import Sequential
from keras.layers import MaxPooling2D
import matplotlib.pyplot as plt

# define input image
image = np.array([[3, 7, 2, 2, 1, 5, 6, 8, 2, 0, 5],
                  [1, 6, 4, 9, 3, 3, 2, 2, 6, 1, 8],
                  [4, 2, 5, 8, 1, 9, 9, 2, 9, 3, 3],
                  [6, 2, 1, 3, 1, 2, 3, 4, 5, 6, 7],
                  [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 1],
                  [2, 4, 6, 8, 2, 4, 6, 8, 2, 4, 6],
                  [5, 1, 1, 9, 3, 5, 7, 7, 2, 3, 4],
                  [0, 0, 2, 9, 4, 6, 5, 5, 1, 1, 2],
                  [8, 5, 4, 3, 2, 1, 8, 5, 4, 3, 2],
                  [1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1],
                  [0, 1, 9, 8, 7, 6, 5, 4, 3, 2, 1]])

# pictorial representation of the input image
plt.imshow(image, cmap="gray")
plt.show()

# reshape to (batch, height, width, channels) and use floats for the layer
image = image.reshape(1, 11, 11, 1).astype("float32")

# define a model containing just a single max pooling layer
model = Sequential([MaxPooling2D(pool_size=2, strides=2)])

# generate pooled output
output = model.predict(image)

# print output image matrix
output = np.squeeze(output)
print(output)

# show output image
plt.imshow(output, cmap="gray")
plt.title("Max pooling")
plt.show()
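You may observe that the greatest value from each 2x2 block is retained. This is maximum pooling: only the largest value in each window is kept. Because the 11x11 input is not divisible by 2 and no padding is applied by default, the last row and column are dropped, giving a 5x5 output.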
[[7. 9. 5. 8. 6.]
 [6. 8. 9. 9. 9.]
 [4. 8. 6. 8. 9.]
 [5. 9. 6. 7. 3.]
 [8. 4. 6. 8. 9.]]
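As a cross-check, the same matrix can be reproduced without Keras. The sketch below is an assumption-laden NumPy equivalent, not the library's implementation: it drops the incomplete 11th row and column (mirroring pooling without padding) and takes the maximum of each 2x2 block.

```python
import numpy as np

image = np.array([[3, 7, 2, 2, 1, 5, 6, 8, 2, 0, 5],
                  [1, 6, 4, 9, 3, 3, 2, 2, 6, 1, 8],
                  [4, 2, 5, 8, 1, 9, 9, 2, 9, 3, 3],
                  [6, 2, 1, 3, 1, 2, 3, 4, 5, 6, 7],
                  [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 1],
                  [2, 4, 6, 8, 2, 4, 6, 8, 2, 4, 6],
                  [5, 1, 1, 9, 3, 5, 7, 7, 2, 3, 4],
                  [0, 0, 2, 9, 4, 6, 5, 5, 1, 1, 2],
                  [8, 5, 4, 3, 2, 1, 8, 5, 4, 3, 2],
                  [1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1],
                  [0, 1, 9, 8, 7, 6, 5, 4, 3, 2, 1]])

# The 11th row and column cannot fill a complete 2x2 window, so they are dropped.
cropped = image[:10, :10]

# Group the 10x10 array into 5x5 blocks of size 2x2 and keep each block's maximum.
pooled = cropped.reshape(5, 2, 5, 2).max(axis=(1, 3))
print(pooled)
```

The values match the Keras output above, printed as integers because the input array was never converted to floats.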
The matrix used in this coding example represents a grayscale image of blocks, as visible below.
Advantage of using MaxPool
While selecting a layer you must be well versed with:
- Your data
- How pooling works, and how it benefits your data set.
The properties of MaxPool are:
- MaxPool discards a large share of the data (a 2x2 window with a stride of 2 keeps only one value in four).
- MaxPool extracts only the most salient features of the data.
- MaxPool restricts the CNN to only the very important features, and might miss some details.
- Max pooling works better when the features are brighter than the background (for example, a light line on a dark background), and the smaller feature map it produces reduces the computation cost of later layers.
Hence, the choice of pooling method depends on what you expect from the pooling layer and the CNN.
Type of image
You may observe from the above two cases that, for the same kind of image, exchanging the foreground and background has a drastic impact on the effectiveness of the max pooling output, whereas average pooling maintains its smooth and average character.
Max pooling worked really well for generalising the line on the black background, but the line on the white background disappeared totally!
Average pooling can save you from such drastic effects, but if your images share a similar dark background, max pooling will be more effective.
In short, max pooling works better for bright features on darker backgrounds, whereas average pooling shows a similar effect irrespective of the background.
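The foreground/background effect can be reproduced on a toy example. The sketch below uses an assumed 4x4 image with a one-pixel-wide vertical line and a hypothetical helper `pool_2x2`; it is an illustration of the behaviour described above, not the images used in this article.

```python
import numpy as np

def pool_2x2(x, reduce_fn):
    """Apply reduce_fn (np.max or np.mean) to each non-overlapping 2x2 block."""
    h, w = x.shape
    return reduce_fn(x.reshape(h // 2, 2, w // 2, 2), axis=(1, 3))

# Bright vertical line (1.0) on a dark background (0.0).
dark_bg = np.zeros((4, 4))
dark_bg[:, 1] = 1.0

# Same image with foreground and background exchanged.
light_bg = 1.0 - dark_bg

print(pool_2x2(dark_bg, np.max))    # left column 1, right column 0: the line survives
print(pool_2x2(light_bg, np.max))   # all ones: the dark line has disappeared
print(pool_2x2(dark_bg, np.mean))   # left column 0.5, right column 0
print(pool_2x2(light_bg, np.mean))  # left column 0.5, right column 1
```

With average pooling the line remains distinguishable from the background in both cases, only at lower contrast.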
Use & Alternatives of MaxPool
MaxPool is a widely used pooling operation in Convolutional Neural Networks, for example:
- ResNet50 variants like ResNet50, ResNet50v1.5
Alternatives of MaxPool involve:
- AvgPool (a widely used alternative and used in models like MobileNetV1)
- MinimumPool (not used in production)
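For comparison, the three operations differ only in the reduction applied to each window. The NumPy sketch below uses a made-up feature map and a hypothetical helper `pool_2x2`; it also shows that minimum pooling can be obtained from maximum pooling by negating the input and the output.

```python
import numpy as np

def pool_2x2(x, reduce_fn):
    """Apply reduce_fn to each non-overlapping 2x2 block of a 2D array."""
    h, w = x.shape
    return reduce_fn(x.reshape(h // 2, 2, w // 2, 2), axis=(1, 3))

x = np.array([[1, 3, 2, 0],
              [5, 7, 4, 2],
              [0, 0, 9, 1],
              [0, 4, 1, 1]], dtype=float)

max_pooled = pool_2x2(x, np.max)    # [[7. 4.] [4. 9.]]
avg_pooled = pool_2x2(x, np.mean)   # [[4. 2.] [1. 3.]]
min_pooled = pool_2x2(x, np.min)    # [[1. 0.] [0. 1.]]

# Minimum pooling expressed through maximum pooling: min(x) = -max(-x).
min_via_max = -pool_2x2(-x, np.max)
print(np.array_equal(min_pooled, min_via_max))  # True
```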
With this article at OpenGenus, you should now have a strong idea of MaxPool, which is one of the most important layers in CNN models.