MobileNetV2 architecture
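In this article, we explore the MobileNetV2 architecture in depth. The MobileNetV2 model has 53 convolution layers and 1 AvgPool layer, and needs roughly 300 million multiply-accumulate operations (about 0.3 GMACs) for one 224x224 input image. It has two main components: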
- Inverted Residual Block
- Bottleneck Residual Block
Apart from the initial standard 3x3 convolution, there are two types of convolution layers in the MobileNetV2 architecture:
- 1x1 Convolution
- 3x3 Depthwise Convolution
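To see why this pairing is cheap, the sketch below compares multiply-accumulate (MAC) counts for a standard 3x3 convolution against a 3x3 depthwise convolution followed by a 1x1 convolution. The function names and the 56x56, 144-channel example are chosen for illustration only, and bias terms are ignored.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution producing an h x w x c_out output."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a k x k depthwise convolution followed by a 1x1 convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # the 1x1 convolution mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    standard = standard_conv_macs(56, 56, 144, 144, 3)
    separable = depthwise_separable_macs(56, 56, 144, 144, 3)
    print(f"standard:  {standard:,} MACs")
    print(f"separable: {separable:,} MACs")
    print(f"ratio:     {standard / separable:.1f}x")
```

For 3x3 kernels the saving approaches a factor of 9 as the number of output channels grows.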
Each of these blocks has 3 different layers:
- 1x1 Convolution with ReLU6
- 3x3 Depthwise Convolution with ReLU6
- 1x1 Convolution without any non-linearity
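The three layers can be summarized by their channel widths and activations. The sketch below is illustrative only: `relu6` follows the usual definition min(max(x, 0), 6), and `block_layers` is a hypothetical helper that takes the expansion factor listed in the layer table later in this article.

```python
def relu6(x):
    """ReLU6 clamps its input to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def block_layers(c_in, c_out, expansion):
    """Return (layer, input channels, output channels, activation) per layer."""
    hidden = c_in * expansion  # the first 1x1 convolution widens the tensor
    return [
        ("1x1 convolution", c_in, hidden, "ReLU6"),
        ("3x3 depthwise convolution", hidden, hidden, "ReLU6"),
        ("1x1 convolution", hidden, c_out, "none (linear)"),
    ]


if __name__ == "__main__":
    for layer in block_layers(c_in=24, c_out=24, expansion=6):
        print(layer)
    print(relu6(-1.5), relu6(3.0), relu6(7.5))
```

With 24 input channels and an expansion factor of 6, the widths go from 24 to 144 and back to 24, which matches convolutions 7 to 9 in the list of convolutions.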
There are Stride 1 Blocks and Stride 2 Blocks. The internal components of the two blocks are as follows:

Stride 1 Block:
- Input
- 1x1 Convolution with ReLU6
- Depthwise Convolution with ReLU6
- 1x1 Convolution without any non-linearity
- Add (residual connection from Input)
Stride 2 Block:
- Input
- 1x1 Convolution with ReLU6
- Depthwise Convolution with stride=2 and ReLU6
- 1x1 Convolution without any non-linearity
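The difference between the two block types can be written down as a small function. This is a sketch, not a reference implementation: `block_ops` is a hypothetical helper, and the extra condition that the Add is only applied when the input and output channel counts match comes from the MobileNetV2 paper rather than from the lists above.

```python
def block_ops(c_in, c_out, stride):
    """List the operations of one block for the given stride (1 or 2)."""
    if stride not in (1, 2):
        raise ValueError("stride must be 1 or 2")
    ops = [
        "1x1 convolution + ReLU6",
        f"3x3 depthwise convolution (stride={stride}) + ReLU6",
        "1x1 convolution (linear)",
    ]
    # Assumption: the shortcut needs matching shapes, so it is only used
    # when the block keeps both the resolution and the channel count.
    if stride == 1 and c_in == c_out:
        ops.append("add input")
    return ops


if __name__ == "__main__":
    print(block_ops(24, 24, stride=1))  # Stride 1 block with shortcut
    print(block_ops(24, 32, stride=2))  # Stride 2 block, no shortcut
```

A Stride 1 block that changes the channel count, such as the first block of a new group, would therefore run without the Add under this assumption.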
Layers in MobileNetV2
| # | Operation | Expansion factor | Repeat |
|---|---|---|---|
| 1 | Convolution | - | 1 |
| 2 | Bottleneck | 1 | 1 |
| 3 | Bottleneck | 6 | 2 |
| 4 | Bottleneck | 6 | 3 |
| 5 | Bottleneck | 6 | 4 |
| 6 | Bottleneck | 6 | 3 |
| 7 | Bottleneck | 6 | 3 |
| 8 | Bottleneck | 6 | 1 |
| 9 | Convolution | - | 1 |
| 10 | AvgPool | - | 1 |
| 11 | Convolution | - | 1 |
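Each Bottleneck row stands for a group of inverted residual (bottleneck) blocks, repeated the given number of times. Where a group reduces the resolution, only its first block is a Stride 2 block; all remaining blocks are Stride 1 blocks.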
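As a quick check, the table can be expanded into a count of convolution layers. The sketch assumes that a bottleneck with expansion 1 skips the first 1x1 convolution and so has 2 convolutions instead of 3, which is consistent with convolutions 2 and 3 in the list of convolutions; `STAGES` and `count_convolutions` are illustrative names.

```python
# (operation, expansion factor, repeat), copied from the table above.
STAGES = [
    ("conv", None, 1),
    ("bottleneck", 1, 1),
    ("bottleneck", 6, 2),
    ("bottleneck", 6, 3),
    ("bottleneck", 6, 4),
    ("bottleneck", 6, 3),
    ("bottleneck", 6, 3),
    ("bottleneck", 6, 1),
    ("conv", None, 1),
    ("avgpool", None, 1),
    ("conv", None, 1),
]


def count_convolutions(stages):
    """Count the convolution layers described by the stage table."""
    total = 0
    for op, expansion, repeat in stages:
        if op == "conv":
            total += repeat
        elif op == "bottleneck":
            # Assumption: expansion 1 means no separate expansion convolution.
            per_block = 2 if expansion == 1 else 3
            total += per_block * repeat
        # The average pooling layer is not a convolution and adds nothing.
    return total


if __name__ == "__main__":
    print(count_convolutions(STAGES))  # 53
```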
Convolutions in MobileNetV2
The following table lists the 53 convolution layers in the MobileNetV2 architecture, with the input size, kernel size, stride, padding and output size of each layer:
| Conv # | Input H/W | Input C | Kernel H/W | Stride H/W | Padding H/W | Output H/W | Output C |
|---|---|---|---|---|---|---|---|
| 1 | 224 | 3 | 3 | 2 | 0 | 112 | 32 |
| 2 | 112 | 32 | 3 | 1 | 1 | 112 | 32 |
| 3 | 112 | 32 | 1 | 1 | 0 | 112 | 16 |
| 4 | 112 | 16 | 1 | 1 | 0 | 112 | 96 |
| 5 | 112 | 96 | 3 | 2 | 0 | 56 | 96 |
| 6 | 56 | 96 | 1 | 1 | 0 | 56 | 24 |
| 7 | 56 | 24 | 1 | 1 | 0 | 56 | 144 |
| 8 | 56 | 144 | 3 | 1 | 1 | 56 | 144 |
| 9 | 56 | 144 | 1 | 1 | 0 | 56 | 24 |
| 10 | 56 | 24 | 1 | 1 | 0 | 56 | 144 |
| 11 | 56 | 144 | 3 | 2 | 0 | 28 | 144 |
| 12 | 28 | 144 | 1 | 1 | 0 | 28 | 32 |
| 13 | 28 | 32 | 1 | 1 | 0 | 28 | 192 |
| 14 | 28 | 192 | 3 | 1 | 1 | 28 | 192 |
| 15 | 28 | 192 | 1 | 1 | 0 | 28 | 32 |
| 16 | 28 | 32 | 1 | 1 | 0 | 28 | 192 |
| 17 | 28 | 192 | 3 | 1 | 1 | 28 | 192 |
| 18 | 28 | 192 | 1 | 1 | 0 | 28 | 32 |
| 19 | 28 | 32 | 1 | 1 | 0 | 28 | 192 |
| 20 | 28 | 192 | 3 | 2 | 0 | 14 | 192 |
| 21 | 14 | 192 | 1 | 1 | 0 | 14 | 64 |
| 22 | 14 | 64 | 1 | 1 | 0 | 14 | 384 |
| 23 | 14 | 384 | 3 | 1 | 1 | 14 | 384 |
| 24 | 14 | 384 | 1 | 1 | 0 | 14 | 64 |
| 25 | 14 | 64 | 1 | 1 | 0 | 14 | 384 |
| 26 | 14 | 384 | 3 | 1 | 1 | 14 | 384 |
| 27 | 14 | 384 | 1 | 1 | 0 | 14 | 64 |
| 28 | 14 | 64 | 1 | 1 | 0 | 14 | 384 |
| 29 | 14 | 384 | 3 | 1 | 1 | 14 | 384 |
| 30 | 14 | 384 | 1 | 1 | 0 | 14 | 64 |
| 31 | 14 | 64 | 1 | 1 | 0 | 14 | 384 |
| 32 | 14 | 384 | 3 | 1 | 1 | 14 | 384 |
| 33 | 14 | 384 | 1 | 1 | 0 | 14 | 96 |
| 34 | 14 | 96 | 1 | 1 | 0 | 14 | 576 |
| 35 | 14 | 576 | 3 | 1 | 1 | 14 | 576 |
| 36 | 14 | 576 | 1 | 1 | 0 | 14 | 96 |
| 37 | 14 | 96 | 1 | 1 | 0 | 14 | 576 |
| 38 | 14 | 576 | 3 | 1 | 1 | 14 | 576 |
| 39 | 14 | 576 | 1 | 1 | 0 | 14 | 96 |
| 40 | 14 | 96 | 1 | 1 | 0 | 14 | 576 |
| 41 | 14 | 576 | 3 | 2 | 0 | 7 | 576 |
| 42 | 7 | 576 | 1 | 1 | 0 | 7 | 160 |
| 43 | 7 | 160 | 1 | 1 | 0 | 7 | 960 |
| 44 | 7 | 960 | 3 | 1 | 1 | 7 | 960 |
| 45 | 7 | 960 | 1 | 1 | 0 | 7 | 160 |
| 46 | 7 | 160 | 1 | 1 | 0 | 7 | 960 |
| 47 | 7 | 960 | 3 | 1 | 1 | 7 | 960 |
| 48 | 7 | 960 | 1 | 1 | 0 | 7 | 160 |
| 49 | 7 | 160 | 1 | 1 | 0 | 7 | 960 |
| 50 | 7 | 960 | 3 | 1 | 1 | 7 | 960 |
| 51 | 7 | 960 | 1 | 1 | 0 | 7 | 320 |
| 52 | 7 | 320 | 1 | 1 | 0 | 7 | 1280 |
| 53 | 1 | 1280 | 1 | 1 | 0 | 1 | 1001 |
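The rows above can be regenerated from a short stage description, which also gives a rough operation count. This is a minimal sketch: the `(expansion, channels, repeat, stride)` tuples are read off the table, `conv_layers` and `total_macs` are hypothetical helpers, output sizes assume each stride-2 layer halves the resolution, and bias and activation costs are ignored.

```python
# (expansion, output channels, repeat, stride of the first block)
BOTTLENECKS = [
    (1, 16, 1, 1), (6, 24, 2, 2), (6, 32, 3, 2), (6, 64, 4, 2),
    (6, 96, 3, 1), (6, 160, 3, 2), (6, 320, 1, 1),
]


def conv_layers(size=224, num_classes=1001):
    """Return (in_size, c_in, kernel, stride, out_size, c_out, depthwise) rows."""
    rows = []

    def add(c_in, c_out, kernel, stride, depthwise=False):
        nonlocal size
        out = size // stride
        rows.append((size, c_in, kernel, stride, out, c_out, depthwise))
        size = out

    add(3, 32, 3, 2)                            # initial standard convolution
    c_in = 32
    for expansion, c_out, repeat, first_stride in BOTTLENECKS:
        for i in range(repeat):
            hidden = c_in * expansion
            if expansion != 1:
                add(c_in, hidden, 1, 1)         # 1x1 expansion
            stride = first_stride if i == 0 else 1
            add(hidden, hidden, 3, stride, depthwise=True)
            add(hidden, c_out, 1, 1)            # 1x1 linear projection
            c_in = c_out
    add(c_in, 1280, 1, 1)                       # final 1x1 feature convolution
    size = 1                                    # after global average pooling
    add(1280, num_classes, 1, 1)                # classifier as a 1x1 convolution
    return rows


def total_macs(rows):
    """Sum multiply-accumulates over all rows."""
    total = 0
    for _, c_in, kernel, _, out, c_out, depthwise in rows:
        per_output = kernel * kernel * (1 if depthwise else c_in)
        total += out * out * c_out * per_output
    return total


if __name__ == "__main__":
    rows = conv_layers()
    print(len(rows), "convolutions")
    print(f"{total_macs(rows) / 1e6:.0f} million MACs")
```

Under these assumptions the sum comes to roughly 300 million multiply-accumulates, most of them in the 1x1 convolutions.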
The columns of the table are, in order:
- Input Height/Width
- Input Channels
- Kernel Height/Width
- Stride Height/Width
- Padding Height/Width (the stride-2 rows list 0, although reaching the listed output size needs at least one padded row and column)
- Output Height/Width
- Output Channels
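These columns are tied together by the usual convolution output-size formula, sketched below. `conv_output_size` is a hypothetical helper that takes separate padding before and after the input. One plausible reading of the stride-2 rows, which is an assumption since the article does not name a framework, is that the listed 0 is the padding before the input and one extra row and column is padded after it, as TensorFlow-style "SAME" padding would do.

```python
def conv_output_size(size, kernel, stride, pad_before, pad_after=None):
    """Output height or width of a convolution along one spatial dimension."""
    if pad_after is None:
        pad_after = pad_before  # symmetric padding by default
    return (size + pad_before + pad_after - kernel) // stride + 1


if __name__ == "__main__":
    # Convolution 2: 3x3 kernel, stride 1, padding 1 keeps 112 x 112.
    print(conv_output_size(112, kernel=3, stride=1, pad_before=1))
    # Convolution 5 with padding 0 on both sides falls short of 56.
    print(conv_output_size(112, kernel=3, stride=2, pad_before=0))
    # With one extra padded row/column after the input it reaches 56.
    print(conv_output_size(112, kernel=3, stride=2, pad_before=0, pad_after=1))
```

Symmetric padding of 1, as used by many other implementations, gives the same output sizes for these layers.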
With this, you should have a complete picture of the architecture of the MobileNetV2 model. Enjoy.