MobileNetV2 architecture

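This article explores the MobileNet V2 architecture in depth. The MobileNet V2 model has 53 convolution layers and 1 AvgPool layer, and needs roughly 300 million multiply-accumulate operations (about 0.3 GMACs) for a single 224x224 input. It has two main components: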

  • Inverted Residual Block
  • Bottleneck Residual Block

Apart from the standard 3x3 convolution at the input, the MobileNet V2 architecture uses two types of convolution layers:

  • 1x1 Convolution
  • 3x3 Depthwise Convolution

The two components of the MobileNet V2 model are shown in the following figure:

[Figure: conv_mobilenet_v2]

Each block has 3 different layers:

  • 1x1 Convolution with ReLU6
  • 3x3 Depthwise Convolution with ReLU6
  • 1x1 Convolution without any non-linearity
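
ReLU6 is a ReLU whose output is capped at 6. The following is a minimal scalar sketch of that activation; the function name `relu6` is chosen here for illustration only, and deep learning frameworks ship their own tensor versions.

```python
def relu6(x: float) -> float:
    """ReLU6: clamp the activation to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


# Negative inputs are cut to 0, large inputs are cut to 6.
for value in (-2.0, 3.5, 9.0):
    print(value, "->", relu6(value))
```

The last 1x1 convolution of a block skips this activation, so its output stays a plain linear combination of its inputs.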

The blocks come in two variants: Stride 1 Blocks and Stride 2 Blocks. The internal components of the two variants are as follows:

[Figure: stride_block_mobilenet]

Stride 1 Block:

  • Input
  • 1x1 Convolution with ReLU6
  • Depthwise Convolution with ReLU6
  • 1x1 Convolution without any non-linearity
  • Add (the block input is added to the output)

Stride 2 Block:

  • Input
  • 1x1 Convolution with ReLU6
  • Depthwise Convolution with stride=2 and ReLU6
  • 1x1 Convolution without any non-linearity
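
The two block variants can be sketched as plain data, without any deep learning framework. The names `Conv` and `bottleneck_block` are hypothetical helpers written for this article. Two details are assumptions taken from common reference implementations rather than from the lists above: the Add is only applied when the input and output channel counts match, and the first 1x1 convolution is skipped when the expansion factor is 1.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Conv:
    kind: str        # "pointwise" (1x1) or "depthwise" (3x3)
    in_c: int        # input channels
    out_c: int       # output channels
    stride: int
    activation: str  # "relu6" or "linear"


def bottleneck_block(in_c: int, out_c: int, stride: int,
                     expansion: int) -> Tuple[List[Conv], bool]:
    """Return the convolutions of one block and whether it ends with Add."""
    hidden = in_c * expansion
    layers: List[Conv] = []
    if expansion != 1:
        # Assumption: no expansion convolution when the factor is 1.
        layers.append(Conv("pointwise", in_c, hidden, 1, "relu6"))
    layers.append(Conv("depthwise", hidden, hidden, stride, "relu6"))
    layers.append(Conv("pointwise", hidden, out_c, 1, "linear"))
    # Assumption: Add needs matching shapes, so stride 1 and equal channels.
    use_add = stride == 1 and in_c == out_c
    return layers, use_add


stride1_layers, stride1_add = bottleneck_block(24, 24, stride=1, expansion=6)
stride2_layers, stride2_add = bottleneck_block(16, 24, stride=2, expansion=6)
print(stride1_add, [c.kind for c in stride1_layers])
print(stride2_add, [c.kind for c in stride2_layers])
```

With an expansion factor of 6, a 24-channel input is widened to 144 channels for the depthwise convolution and then projected back down, which is where the name "inverted" residual comes from.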

Layers in MobileNetV2

#    Op            Expansion   Repeat
1    Convolution   -           1
2    Bottleneck    1           1
3    Bottleneck    6           2
4    Bottleneck    6           3
5    Bottleneck    6           4
6    Bottleneck    6           3
7    Bottleneck    6           3
8    Bottleneck    6           1
9    Convolution   -           1
10   AvgPool       -           1
11   Convolution   -           1
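
As a quick check, the count of 53 convolution layers can be reproduced from the table above. This is a small sketch with hypothetical names (`LAYERS`, `convs_per_row`); it assumes that a block with expansion 1 contains 2 convolutions (no expansion convolution) and every other block contains 3.

```python
# (op, expansion, repeat) rows copied from the layer table.
LAYERS = [
    ("Convolution", None, 1),
    ("Bottleneck", 1, 1),
    ("Bottleneck", 6, 2),
    ("Bottleneck", 6, 3),
    ("Bottleneck", 6, 4),
    ("Bottleneck", 6, 3),
    ("Bottleneck", 6, 3),
    ("Bottleneck", 6, 1),
    ("Convolution", None, 1),
    ("AvgPool", None, 1),
    ("Convolution", None, 1),
]


def convs_per_row(op, expansion, repeat):
    """Number of convolution layers contributed by one table row."""
    if op == "AvgPool":
        return 0
    if op == "Convolution":
        return repeat
    # Assumption: expansion 1 means the first 1x1 convolution is omitted.
    per_block = 2 if expansion == 1 else 3
    return per_block * repeat


total_blocks = sum(repeat for op, _, repeat in LAYERS if op == "Bottleneck")
total_convs = sum(convs_per_row(*row) for row in LAYERS)
print("bottleneck blocks:", total_blocks)
print("convolution layers:", total_convs)
```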

Each Bottleneck row stands for one of the blocks described above (Inverted Residual Block or Bottleneck Residual Block), built as either a Stride 1 Block or a Stride 2 Block.

Convolutions in MobileNetV2

The following table lists the 53 convolution layers in the MobileNetV2 architecture with their parameters. Height and width are equal in every layer, so each pair is given as a single H/W value:

Conv #   Input H/W   Input C   Kernel H/W   Stride H/W   Padding H/W   Output H/W   Output C
1        224         3         3            2            0             112          32
2        112         32        3            1            1             112          32
3        112         32        1            1            0             112          16
4        112         16        1            1            0             112          96
5        112         96        3            2            0             56           96
6        56          96        1            1            0             56           24
7        56          24        1            1            0             56           144
8        56          144       3            1            1             56           144
9        56          144       1            1            0             56           24
10       56          24        1            1            0             56           144
11       56          144       3            2            0             28           144
12       28          144       1            1            0             28           32
13       28          32        1            1            0             28           192
14       28          192       3            1            1             28           192
15       28          192       1            1            0             28           32
16       28          32        1            1            0             28           192
17       28          192       3            1            1             28           192
18       28          192       1            1            0             28           32
19       28          32        1            1            0             28           192
20       28          192       3            2            0             14           192
21       14          192       1            1            0             14           64
22       14          64        1            1            0             14           384
23       14          384       3            1            1             14           384
24       14          384       1            1            0             14           64
25       14          64        1            1            0             14           384
26       14          384       3            1            1             14           384
27       14          384       1            1            0             14           64
28       14          64        1            1            0             14           384
29       14          384       3            1            1             14           384
30       14          384       1            1            0             14           64
31       14          64        1            1            0             14           384
32       14          384       3            1            1             14           384
33       14          384       1            1            0             14           96
34       14          96        1            1            0             14           576
35       14          576       3            1            1             14           576
36       14          576       1            1            0             14           96
37       14          96        1            1            0             14           576
38       14          576       3            1            1             14           576
39       14          576       1            1            0             14           96
40       14          96        1            1            0             14           576
41       14          576       3            2            0             7            576
42       7           576       1            1            0             7            160
43       7           160       1            1            0             7            960
44       7           960       3            1            1             7            960
45       7           960       1            1            0             7            160
46       7           160       1            1            0             7            960
47       7           960       3            1            1             7            960
48       7           960       1            1            0             7            160
49       7           160       1            1            0             7            960
50       7           960       3            1            1             7            960
51       7           960       1            1            0             7            320
52       7           320       1            1            0             7            1280
53       1           1280      1            1            0             1            1001
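
The rows of this table follow a regular pattern, so they can be regenerated from a short stage configuration. The sketch below is one way to do that, not the original implementation: `STAGES`, `build_convs` and `macs` are hypothetical names, the per-stage output channels and strides are read off the table itself, and the output size is assumed to be the input size divided by the stride, rounded up. It also estimates the multiply-accumulate (MAC) cost of each layer, treating every 3x3 convolution inside a block as depthwise.

```python
# (expansion, output channels, repeats, stride of first block) per stage.
STAGES = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def build_convs(input_hw=224, num_classes=1001):
    """List every convolution as
    (in_hw, in_c, kernel, stride, out_hw, out_c, is_depthwise)."""
    convs = []

    def add(in_hw, in_c, kernel, stride, out_c, depthwise=False):
        out_hw = -(-in_hw // stride)  # ceiling division
        convs.append((in_hw, in_c, kernel, stride, out_hw, out_c, depthwise))
        return out_hw

    hw = add(input_hw, 3, 3, 2, 32)           # standard 3x3 input convolution
    channels = 32
    for expansion, out_c, repeats, first_stride in STAGES:
        for i in range(repeats):
            stride = first_stride if i == 0 else 1
            hidden = channels * expansion
            if expansion != 1:
                add(hw, channels, 1, 1, hidden)            # 1x1 expansion
            hw = add(hw, hidden, 3, stride, hidden, True)  # 3x3 depthwise
            add(hw, hidden, 1, 1, out_c)                   # 1x1 projection
            channels = out_c
    add(hw, channels, 1, 1, 1280)             # final 1x1 feature convolution
    add(1, 1280, 1, 1, num_classes)           # classifier after AvgPool
    return convs


def macs(conv):
    """Multiply-accumulates of one convolution (bias ignored)."""
    in_hw, in_c, kernel, stride, out_hw, out_c, depthwise = conv
    per_output = kernel * kernel * (1 if depthwise else in_c)
    return out_hw * out_hw * out_c * per_output


convs = build_convs()
total_macs = sum(macs(c) for c in convs)
print("layers:", len(convs))
print("total MACs (millions):", round(total_macs / 1e6, 1))
```

Under these assumptions the list has 53 entries that match the table row by row, and the total comes to about 300 million multiply-accumulates for one 224x224 image.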

The columns of the table, in order, are:

  • Input height and width
  • Input channels
  • Kernel height and width
  • Stride height and width
  • Padding height and width
  • Output height and width
  • Output channels
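
The Padding column can look surprising: the stride 2 convolutions list a padding of 0 and still halve the input exactly (for example 224 to 112). One explanation that fits every row is TensorFlow-style "SAME" padding, where the padding may be asymmetric and the table reports the leading (top/left) amount. This is an assumption about how the numbers were produced, not something stated in the table. A minimal sketch, with `same_padding` as a hypothetical helper:

```python
import math


def same_padding(in_size: int, kernel: int, stride: int):
    """Return (out_size, pad_before, pad_after) for 'SAME'-style padding."""
    out_size = math.ceil(in_size / stride)
    # Total padding needed so the last window still fits.
    total = max((out_size - 1) * stride + kernel - in_size, 0)
    pad_before = total // 2          # leading side gets the smaller half
    pad_after = total - pad_before
    return out_size, pad_before, pad_after


# (input size, kernel, stride) taken from table rows 1, 2, 3 and 41.
for args in [(224, 3, 2), (112, 3, 1), (112, 1, 1), (14, 3, 2)]:
    print(args, "->", same_padding(*args))
```

With a symmetric padding of 0, a 3x3 kernel at stride 2 on a 224 wide input would give 111 outputs, not 112, so the extra trailing pixel of padding is what makes the table's output sizes work out.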

With this, you have a complete picture of the architecture of the MobileNetV2 model. Enjoy.