MaxPool

In this article, we explore MaxPool (Maximum Pooling), the most commonly used pooling operation in CNN models. It is used in models such as ResNet50.

Table of contents:

Introduction to Pooling operations
MaxPool: Maximum Pooling
Advantage of using MaxPool
Use & Alternatives of MaxPool

Let us get started with MaxPool.

Introduction to Pooling operations

A pooling layer is an important building block of a Convolutional Neural Network. Max pooling and average pooling layers are among the most popular and effective. We shall learn which of the two works best for you!

Pooling layers are a part of Convolutional Neural Networks (CNNs). What makes CNNs different is that, unlike regular neural networks, they work on volumes of data.
Inputs are multichannel images, e.g. RGB images have three channels.
Features from such images are extracted by means of convolutional layers.
To know which pooling layer works best, you must know how pooling helps.

Convolutional layers represent the presence of features in an input image. But they present a problem: they are sensitive to the location of features in the input.

This yields location-specific rather than generalised features, which worsens overfitting and gives poor results on data outside the training set.
So we need to generalise the presence of features. This is done by means of pooling layers.
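
A small NumPy sketch of this idea, under the assumption that a "feature" is a single strong activation in an otherwise empty map. The helper `max_pool_2x2` is a hypothetical name used only for illustration, not a library function. Moving the activation by one pixel changes the feature map, but the pooled output can stay the same:

```python
import numpy as np

def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling (stride 2) on a 2D array with even sides."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Two 4x4 feature maps: the same activation, shifted one pixel to the right
a = np.zeros((4, 4))
a[0, 0] = 1.0
b = np.zeros((4, 4))
b[0, 1] = 1.0

print(np.array_equal(a, b))                              # False: the raw maps differ
print(np.array_equal(max_pool_2x2(a), max_pool_2x2(b)))  # True: the pooled maps match
```

The invariance is only local: a shift that crosses a window boundary does change the pooled output.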

Here, we need to select a pooling layer.
In this article, we deal with the Max Pooling layer and the Average Pooling layer.

Max Pooling - The most strongly activated feature in each window shines through.
Average Pooling - The average presence of features in each window is reflected.
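
To make the difference concrete, here is a minimal NumPy sketch on a made-up 4x4 feature map (the values are arbitrary and chosen only for illustration), pooled with non-overlapping 2x2 windows:

```python
import numpy as np

# A 4x4 feature map, split into four non-overlapping 2x2 windows
x = np.array([[1, 3, 2, 0],
              [5, 7, 4, 2],
              [0, 2, 9, 1],
              [4, 6, 3, 3]], dtype=float)

# axes after reshape: (block row, row in block, block column, column in block)
windows = x.reshape(2, 2, 2, 2)
max_pooled = windows.max(axis=(1, 3))    # strongest activation per window
avg_pooled = windows.mean(axis=(1, 3))   # average activation per window

print(max_pooled)
print(avg_pooled)
```

For this input, max pooling gives [[7, 4], [6, 9]] while average pooling gives [[4, 2], [3, 4]]: the single strong value 9 dominates its window under max pooling but is diluted to 4 under average pooling.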

PoolingEg1

As you may observe above, max pooling gives a sharper image, focused on the maximum values (which, for understanding purposes, may be read as the intensity of light here), whereas average pooling gives a smoother image that retains the essence of the features in the image.

MaxPool: Maximum Pooling

The algorithm of 2D MaxPool (with stride 1 and no padding) is:

1. Input: 2D image IN of size NxN, a kernel of size KxK
2. Define Output of size (N-K+1) x (N-K+1)
3. For every sub-matrix S1 of size KxK in IN:
3.1. Find the maximum element in S1, say M1
3.2. Let the top-left element of S1 have index (i, j)
3.3. Set Output at index (i, j) to be M1
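
The steps above can be sketched directly in NumPy. This is a plain, unoptimised translation that assumes a square input, a stride of 1 and no padding; the function name `max_pool_2d` is hypothetical:

```python
import numpy as np

def max_pool_2d(image, k):
    """Stride-1 max pooling of an N x N array with a k x k window."""
    n = image.shape[0]
    out = np.empty((n - k + 1, n - k + 1), dtype=image.dtype)
    for i in range(n - k + 1):
        for j in range(n - k + 1):
            # sub-matrix whose top-left element is at (i, j)
            out[i, j] = image[i:i + k, j:j + k].max()
    return out

image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
print(max_pool_2d(image, 2))
```

For this 3x3 input and a 2x2 kernel, the result is the 2x2 matrix [[5, 6], [8, 9]].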

Similarly, MaxPool can be done on 3D and 4D input data as well.
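
One reading of this, sketched below, is a 4D batch of multichannel images in a (batch, height, width, channels) layout, where each image and each channel is pooled independently. The layout, the helper name `max_pool_nhwc`, and the requirement that height and width be divisible by the window size are all assumptions made for this sketch:

```python
import numpy as np

def max_pool_nhwc(x, k):
    """Non-overlapping k x k max pooling on a (batch, height, width, channels) array.

    Assumes height and width are divisible by k.
    """
    n, h, w, c = x.shape
    windows = x.reshape(n, h // k, k, w // k, k, c)
    # reduce over the two in-window axes only; batch and channels are untouched
    return windows.max(axis=(2, 4))

batch = np.arange(2 * 4 * 4 * 3, dtype=float).reshape(2, 4, 4, 3)
pooled = max_pool_nhwc(batch, 2)
print(pooled.shape)  # (2, 2, 2, 3)
```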

Common kernel sizes KxK are 2x2 and 3x3, usually applied with a stride of 2; ResNet50, for example, uses a 3x3 MaxPool with stride 2.
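
The output size N-K+1 given in the algorithm holds for a stride of 1. A small sketch of the more general case without padding follows; the name `pooled_size` is hypothetical:

```python
def pooled_size(n, k, stride=1):
    """Output side length for an n-wide input and a k-wide window, no padding."""
    return (n - k) // stride + 1

print(pooled_size(11, 3))            # 9, the same as N-K+1 for stride 1
print(pooled_size(11, 2, stride=2))  # 5
```

With a stride larger than 1, any trailing rows or columns that do not fill a complete window are simply dropped.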

As the name suggests, pooling with the maximum retains the most prominent features of the feature map.

Below is an example of the same, using the Keras library:

import numpy as np
from keras.models import Sequential
from keras.layers import MaxPooling2D
import matplotlib.pyplot as plt

# define input image
image = np.array([[3, 7, 2, 2, 1, 5, 6, 8, 2, 0, 5],
                  [1, 6, 4, 9, 3, 3, 2, 2, 6, 1, 8],
                  [4, 2, 5, 8, 1, 9, 9, 2, 9, 3, 3],
                  [6, 2, 1, 3, 1, 2, 3, 4, 5, 6, 7],
                  [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 1],
                  [2, 4, 6, 8, 2, 4, 6, 8, 2, 4, 6],
                  [5, 1, 1, 9, 3, 5, 7, 7, 2, 3, 4],
                  [0, 0, 2, 9, 4, 6, 5, 5, 1, 1, 2],
                  [8, 5, 4, 3, 2, 1, 8, 5, 4, 3, 2],
                  [1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1],
                  [0, 1, 9, 8, 7, 6, 5, 4, 3, 2, 1]])

# for pictorial representation of the image
plt.imshow(image, cmap="gray")
plt.show()

# reshape to (batch, height, width, channels) and cast to float for Keras
image = image.reshape(1, 11, 11, 1).astype("float32")

# define model containing just a single max pooling layer
model = Sequential(
    [MaxPooling2D(pool_size=2, strides=2)])

# generate pooled output
output = model.predict(image)

# print output image matrix
output = np.squeeze(output)
print(output)

# print output image
plt.imshow(output, cmap="gray")
plt.title('Max pooling')
plt.show()

Output Matrix

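You may observe that the greatest value from each 2x2 block is retained. This is maximum pooling: only the largest value in each window is kept. Since the input is 11x11 and the stride is 2, the last row and column do not fill a complete window and are dropped, giving a 5x5 output.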

[[7. 9. 5. 8. 6.]
 [6. 8. 9. 9. 9.]
 [4. 8. 6. 8. 9.]
 [5. 9. 6. 7. 3.]
 [8. 4. 6. 8. 9.]]
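
As a cross-check that does not need Keras, the same matrix can be reproduced with plain NumPy. This sketch assumes the layer's default "valid" padding, so the 11th row and column are cropped before pooling; the variable names are illustrative only:

```python
import numpy as np

image = np.array([[3, 7, 2, 2, 1, 5, 6, 8, 2, 0, 5],
                  [1, 6, 4, 9, 3, 3, 2, 2, 6, 1, 8],
                  [4, 2, 5, 8, 1, 9, 9, 2, 9, 3, 3],
                  [6, 2, 1, 3, 1, 2, 3, 4, 5, 6, 7],
                  [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 1],
                  [2, 4, 6, 8, 2, 4, 6, 8, 2, 4, 6],
                  [5, 1, 1, 9, 3, 5, 7, 7, 2, 3, 4],
                  [0, 0, 2, 9, 4, 6, 5, 5, 1, 1, 2],
                  [8, 5, 4, 3, 2, 1, 8, 5, 4, 3, 2],
                  [1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1],
                  [0, 1, 9, 8, 7, 6, 5, 4, 3, 2, 1]])

k = 2
n = (image.shape[0] // k) * k   # largest size divisible by k (10 here)
cropped = image[:n, :n]         # drop the incomplete last row and column
# split into k x k blocks and take the maximum of each block
pooled = cropped.reshape(n // k, k, n // k, k).max(axis=(1, 3))
print(pooled)
```

The printed values match the matrix above, as integers rather than floats because the input is not cast.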

The transition

The matrix used in this coding example represents a grayscale image of blocks, as visible below.

MaxPool

Advantage of using MaxPool

While selecting a layer, you must be well versed with:

Your data
How pooling works, and how it is beneficial for your data set

The properties of MaxPool are:

MaxPool rejects a big chunk of data: with a 2x2 window and a stride of 2, only one of every four values is kept.
MaxPool extracts only the most salient features of the data.
MaxPool restricts the CNN to only the very important features, and might miss out on some details.
Max pooling works better for images with darker backgrounds, where the features of interest have the higher values; like any strided pooling, it also saves computation cost in later layers by shrinking the feature map.

Hence, the choice of pooling method depends on what is expected from the pooling layer and the CNN.

Type of image

Pooling-Eg2

As you may observe in the two cases above, exchanging the foreground and background of the same kind of image has a drastic impact on the effectiveness of the max pooling output, whereas average pooling maintains its smooth, averaged character.

Max pooling worked really well for generalising the line on the black background, but the line on the white background disappeared totally!
Average pooling can save you from such drastic effects, but if the images share a similar dark background, max pooling will be more effective.

Max pooling works better for darker backgrounds, where the features of interest have the higher values, whereas average pooling shows a similar effect irrespective of the background.
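
The effect can be reproduced on a tiny synthetic image. The sketch below assumes pixel values of 0.0 for black and 1.0 for white, and uses a hypothetical helper `pool_2x2` that applies a NumPy reduction over non-overlapping 2x2 windows:

```python
import numpy as np

def pool_2x2(x, reduce):
    """Apply `reduce` (e.g. np.max or np.mean) over non-overlapping 2x2 windows."""
    h, w = x.shape
    return reduce(x.reshape(h // 2, 2, w // 2, 2), axis=(1, 3))

# White (1.0) vertical line on a black (0.0) background
dark_bg = np.zeros((4, 4))
dark_bg[:, 1] = 1.0
# The same image with foreground and background exchanged
light_bg = 1.0 - dark_bg

print(pool_2x2(dark_bg, np.max))    # line survives: [[1, 0], [1, 0]]
print(pool_2x2(light_bg, np.max))   # line vanishes: every value is 1
print(pool_2x2(light_bg, np.mean))  # line still visible: [[0.5, 1], [0.5, 1]]
```

Max pooling keeps the line only when it is brighter than its surroundings, while average pooling leaves a trace of it in both cases.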

Use & Alternatives of MaxPool

MaxPool is a widely used pooling operation in Convolutional Neural Networks, for example:

ResNet50 variants like ResNet50, ResNet50v1.5
GoogLeNet

Alternatives to MaxPool include:

AvgPool (a widely used alternative and used in models like MobileNetV1)
MinimumPool (not used in production)

With this article at OpenGenus, you must have a strong idea of MaxPool, which is one of the most important layers in CNN models.