In this article, we explore reshaping tensors in TensorFlow with tf.reshape() and look at which models, such as GoogleNet, use it.
Table of contents:
- Usage of reshape in Models
TensorFlow is a library that allows engineers to create and build models by representing sets of data as tensors. A tensor is a generalization of vectors and matrices to potentially higher dimensions. TensorFlow represents tensors as n-dimensional arrays of specified data types. A single scalar is a rank-0 tensor, although it can also be represented as a 1x1 matrix, similar to a language like MATLAB. Part of its usefulness is also its flexibility to work with different data types using tensors.
During the design and implementation of a model with TensorFlow, certain operations are required. Data (tensors) is transformed throughout the flow of the model by numerous operations and computations to present useful results for the engineer.
TensorFlow hence has a variety of matrix/tensor operations, for example matrix multiplication, which is available as tf.linalg.matmul (the MatMul operation) in the linalg module of the TensorFlow library. However, in this article we will discuss the tf.reshape operation and see its application in real problems as well.
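As a small illustration of the points above, the sketch below builds a scalar, views it as a 1x1 matrix, and shows that the data type travels with the tensor. The variable names are made up for this example, and it assumes TensorFlow 2.x running in eager mode.

```python
import tensorflow as tf

# A scalar is a rank-0 tensor: its shape is empty.
scalar = tf.constant(7)

# The same single value viewed as a 1x1 matrix (rank 2).
as_matrix = tf.reshape(scalar, [1, 1])

# The data type is inferred from the values unless given explicitly.
floats = tf.constant([[1.5, 2.5], [3.5, 4.5]])        # float32
wide_ints = tf.constant([1, 2, 3], dtype=tf.int64)    # explicit dtype

print("scalar rank:", scalar.shape.rank)
print("1x1 matrix shape:", as_matrix.shape.as_list())
print("dtypes:", scalar.dtype.name, floats.dtype.name, wide_ints.dtype.name)
```

The reshape from rank 0 to rank 2 works because both shapes hold exactly one element.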
As you might guess, the tf.reshape() operation is used to change the shape of a tensor. The general definition of the operation is as follows:
tf.reshape(tensor, shape, name=None)
What this does is: given a tensor with some initial shape, tf.reshape() returns a tensor with the same elements, in the same order, with the same data type, but with a different arrangement (i.e. shape). Note that the product of the dimensions you request for the new tensor must be equal to the product of the dimensions of the original tensor, or else you will get an error. Let's explore this.
    import tensorflow as tf

    # Initialize tensors a and b, and a nested Python list c
    a = tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3])  # returns a 2x3 int32 matrix
    b = tf.constant([7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18], shape=[4, 3])  # returns a 4x3 int32 matrix
    c = [[0, 4, 9, 1], [6, 3, 2, 5]]
    print(a)
    print(b)
    print(tf.shape(c).numpy())
    # Output >>
    # tf.Tensor(
    # [[1 2 3]
    #  [4 5 6]], shape=(2, 3), dtype=int32)
    # tf.Tensor(
    # [[ 7  8  9]
    #  [10 11 12]
    #  [13 14 15]
    #  [16 17 18]], shape=(4, 3), dtype=int32)
    # [2 4]

    # Reshaping tensors a, b, c into tensors d, e, f
    d = tf.reshape(a, [6, 1])
    e = tf.reshape(b, [6, 2])
    f = tf.reshape(c, [8, -1])
    print(d, e, f)
    # Output (line breaks omitted) >>
    # tf.Tensor([[1] [2] [3] [4] [5] [6]], shape=(6, 1), dtype=int32)
    # tf.Tensor([[ 7  8] [ 9 10] [11 12] [13 14] [15 16] [17 18]], shape=(6, 2), dtype=int32)
    # tf.Tensor([[0] [4] [9] [1] [6] [3] [2] [5]], shape=(8, 1), dtype=int32)

    # Reshaping into incompatible shapes (g, h) raises an error
    try:
        g = tf.reshape(b, [2, 3])
    except Exception as ge:
        print(ge)
    try:
        h = tf.reshape(c, [3, 3])
    except Exception as he:
        print(he)
    # Output >>
    # Input to reshape is a tensor with 12 values, but the requested shape has 6 [Op:Reshape]
    # Input to reshape is a tensor with 8 values, but the requested shape has 9 [Op:Reshape]
From the code we can observe the conditions under which an error occurs, which confirms the limitation I stated earlier. A tensor can only be reshaped into matching dimensions, and dimensions match if the resulting tensor holds the same number of elements as the original tensor. If there are 12 elements, the reshaped tensor must hold 12 elements as well (e.g. [3,4], [4,3], [12,1], [1,12], [6,2], [2,6]).
If you read through the code, you will notice f was reshaped a little differently. Read it one more time if you didn't notice it. The new shape was given as [8, -1] and there was no error. When you don't know the matching dimension that makes the resulting shape compatible with the original shape, -1 tells TensorFlow to infer it: here, reshape the tensor into 8 rows and however many columns are needed for tf.reshape() to work without errors. At most one dimension can be -1. In the case of rank-1 tensors this might not come in handy, but for tensors with higher ranks it becomes a very handy tool for efficient coding. See below for another example.
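The list of valid 2-D shapes for 12 elements can be generated rather than memorized. The helper below is a plain-Python sketch written for this article (two_d_shapes is a hypothetical name, not a TensorFlow function): it collects every [rows, cols] pair whose product equals the element count.

```python
def two_d_shapes(num_elements):
    """Return every [rows, cols] pair whose product equals num_elements."""
    return [
        [rows, num_elements // rows]
        for rows in range(1, num_elements + 1)
        if num_elements % rows == 0  # rows must divide the element count exactly
    ]


shapes_for_12 = two_d_shapes(12)
print(shapes_for_12)
# [[1, 12], [2, 6], [3, 4], [4, 3], [6, 2], [12, 1]]
```

Any shape outside this list, such as [5, 2] or [3, 3], would leave elements over or fall short, which is exactly when tf.reshape() raises an error.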
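Before moving on to that example, here is a rough sketch of the arithmetic behind -1. It is plain Python and only mirrors the rule described above; resolve_shape is a hypothetical helper, not TensorFlow's actual implementation.

```python
from math import prod


def resolve_shape(num_elements, requested):
    """Fill in a single -1 so that the requested shape holds num_elements values."""
    if requested.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    # Product of the dimensions that were given explicitly.
    known = prod(d for d in requested if d != -1)
    if -1 not in requested:
        if known != num_elements:
            raise ValueError(f"{num_elements} values do not fit shape {requested}")
        return list(requested)
    if known == 0 or num_elements % known != 0:
        raise ValueError(f"{num_elements} values do not fit shape {requested}")
    # The missing dimension is whatever is left after dividing by the known ones.
    return [num_elements // known if d == -1 else d for d in requested]


print(resolve_shape(8, [8, -1]))      # [8, 1]
print(resolve_shape(12, [-1, 6, 1]))  # [2, 6, 1]
```

With 8 elements and 8 rows requested, the only column count that works is 1, which is why f ended up with shape (8, 1).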
    # Initializing i, which is a 2x3x2 rank-3 tensor
    i = [[[1, 9], [2, 0], [6, 1]], [[4, 12], [5, 3], [6, 5]]]
    print(tf.shape(i).numpy())

    # Reshaping i into j, k, l, m
    j = tf.reshape(i, [6, 2])
    k = tf.reshape(i, [4, 3, 1])
    l = tf.reshape(i, [3, 1, 2, 2])
    m = tf.reshape(i, [-1, 6, 1])
    print(j)
    print(k)
    print(l)
    print(m)
    # Output (for j, k, l, m) >>
    # tf.Tensor(
    # [[ 1  9] [ 2  0] [ 6  1] [ 4 12] [ 5  3] [ 6  5]],
    # shape=(6, 2), dtype=int32)
    # tf.Tensor(
    # [[[ 1] [ 9] [ 2]]
    #  [[ 0] [ 6] [ 1]]
    #  [[ 4] [12] [ 5]]
    #  [[ 3] [ 6] [ 5]]],
    # shape=(4, 3, 1), dtype=int32)
    # tf.Tensor(
    # [[[[ 1  9]
    #    [ 2  0]]]
    #  [[[ 6  1]
    #    [ 4 12]]]
    #  [[[ 5  3]
    #    [ 6  5]]]], shape=(3, 1, 2, 2), dtype=int32)
    # tf.Tensor(
    # [[[ 1] [ 9] [ 2] [ 0] [ 6] [ 1]]
    #  [[ 4] [12] [ 5] [ 3] [ 6] [ 5]]],
    # shape=(2, 6, 1), dtype=int32)
Hopefully you paid keen attention this time. If so, you will also notice that we can reshape a tensor into a different rank as well. For the operation for l, we reshaped the rank-3 tensor into a rank-4 tensor and it still worked. Hence, tf.reshape() does not necessarily retain the rank of a tensor, unless the engineer demands it in code. Also notice in all the examples that tf.reshape() does not change the order of the elements in the tensor. These features make tf.reshape() a relatively fast operation regardless of how big a tensor it is reshaping.
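One way to check the claim about element order is to flatten every reshaped tensor back to rank 1 and compare it with the flattened original. The sketch below does that for a few of the shapes used above; it assumes TensorFlow 2.x in eager mode, and the results dictionary is just a convenience for this example.

```python
import tensorflow as tf

i = tf.constant([[[1, 9], [2, 0], [6, 1]], [[4, 12], [5, 3], [6, 5]]])

# [-1] flattens the tensor into rank 1, keeping the row-major element order.
flat = tf.reshape(i, [-1])

results = {}
for shape in ([6, 2], [4, 3, 1], [3, 1, 2, 2]):
    reshaped = tf.reshape(i, shape)
    # Flatten again and compare element by element with the original order.
    same_order = bool(tf.reduce_all(tf.equal(tf.reshape(reshaped, [-1]), flat)).numpy())
    results[tuple(shape)] = (reshaped.shape.rank, same_order)
    print(shape, "-> rank", reshaped.shape.rank, "| order preserved:", same_order)
```

The rank changes from 2 to 3 to 4 across the three shapes, while the flattened sequence of values stays identical each time.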
Usage of reshape in Models
There are real Machine Learning models that employ the reshaping of tensors in their neural networks. An example is the LeNet-5 model, which we will focus on in this article.
LeNet is a multi-layer convolutional neural network used for image classification. In fact, in the paper in which it was presented, it was used for text recognition, covering both handwritten and machine-printed characters. Proposed by Yann LeCun, LeNet can also serve as a pre-trained model for transfer learning. Transfer learning involves training a model on a sufficiently general dataset and reusing it in another model. You are, in essence, "transferring" the knowledge learned from the general dataset and applying it to another dataset of similar nature and parameters.
LeNet-5 gained its popularity because of its simple architecture. It has 5 layers with learnable parameters, about 60,000 parameters in total: 3 convolution layers and 2 fully connected layers for softmax classification, together with 2 average pooling layers. Find below a link to the full code for LeNet and the paper that details its approach and architecture.
However, for the purpose of this article we will focus on the use of reshape, which you will find in the fully connected layers. Firstly, the parameters of the layer are initialized by the xavier_init function based on the kernel size. In the xavier_init function, the parameters are generated at random based on the kernel size and reshaped to fit the size of the fully connected layer. Remember one of the features of reshape: none of our data is lost or mixed. The data, its order and its type remain the same; the shape is the only thing that is altered.
In the fully connected layer itself, during forward propagation, a dot product must be calculated between the layer's input and the initialized parameters of the fully connected layer. The input is the output of the mean pool for the first fully connected layer, and the output of the first ReLU function for the second fully connected layer. For the dot product to be calculated, both tensors (sets of data) must be compatible. This is where reshape comes in: it reshapes the input from the previous node so that it is compatible with the parameter tensor of the fully connected layer and the dot operation can work.
Similarly, for back propagation, the dimensions of the derivative (the change in the overall output of the model with respect to that specific node/layer) should match those of the output of the layer before it, for compatibility's sake.
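The forward step described here can be sketched in a few lines. This is not the code of any particular LeNet implementation: the batch size, the 5x5x16 pooling output and the 120-unit layer are assumed sizes chosen for illustration, and the weights are random stand-ins for initialized parameters.

```python
import tensorflow as tf

batch = 4
# Assumed output of the mean-pooling layer: [batch, height, width, channels].
pooled = tf.random.normal([batch, 5, 5, 16])

# Assumed parameters of the first fully connected layer: 5*5*16 = 400 inputs, 120 units.
weights = tf.random.normal([5 * 5 * 16, 120])
bias = tf.zeros([120])

# A rank-4 tensor cannot be multiplied with the rank-2 weights directly,
# so each example is flattened into one row of 400 features first.
flat = tf.reshape(pooled, [batch, -1])

# Dot product with the parameters, followed by the ReLU activation.
fc1 = tf.nn.relu(tf.matmul(flat, weights) + bias)

print("flattened input:", flat.shape.as_list())   # [4, 400]
print("layer output:", fc1.shape.as_list())       # [4, 120]
```

Using -1 for the feature dimension keeps the code independent of the exact pooling output size, as long as the weight matrix is built to match.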
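The shape matching in the backward pass can be observed with tf.GradientTape. The sketch below reuses the same assumed sizes (a 5x5x16 pooling output and a 400x120 weight matrix) and a made-up sum as the loss; it only illustrates how gradient shapes follow the forward shapes.

```python
import tensorflow as tf

pooled = tf.random.normal([4, 5, 5, 16])   # assumed pooling output
weights = tf.random.normal([400, 120])     # assumed fully connected weights

with tf.GradientTape() as tape:
    tape.watch(pooled)                     # constants are not tracked by default
    flat = tf.reshape(pooled, [-1, 400])
    loss = tf.reduce_sum(tf.matmul(flat, weights))  # stand-in for a real loss

# Gradients with respect to the flattened input and the original pooling output.
grad_flat, grad_pooled = tape.gradient(loss, [flat, pooled])

print("gradient at flattened input:", grad_flat.shape.as_list())   # [4, 400]
print("gradient at pooling output:", grad_pooled.shape.as_list())  # [4, 5, 5, 16]
```

The gradient arriving at the pooling layer has the pooling layer's own shape: backpropagating through a reshape simply reshapes the incoming derivative back, without changing any values.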
Some models using the reshape op are as follows:
- Lenet5 model
    cnn.conv(32, 5, 5)
    cnn.mpool(2, 2)
    cnn.conv(64, 5, 5)
    cnn.mpool(2, 2)
    cnn.reshape([-1, 64 * 7 * 7])
    cnn.affine(512)
- AlexNet: Last few layers:
    cnn.mpool(3, 3, 2, 2)
    cnn.reshape([-1, 256 * 6 * 6])
    cnn.affine(4096)
    cnn.dropout()
    cnn.affine(4096)
    cnn.dropout()
- InceptionV3 model
The InceptionV3 model has 2 reshape ops. One reshape op is at the end of the model and the second reshape op is in the auxiliary part.
Last reshape op:
    cnn.apool(8, 8, 1, 1, 'VALID')  # 8 x 8 x 2048
    cnn.reshape([-1, 2048])  # 1 x 1 x 2048
Reshape op in auxiliary part:
    cnn.conv(128, 1, 1, mode='SAME')
    cnn.conv(768, 5, 5, mode='VALID', stddev=0.01)
    cnn.reshape([-1, 768])
- Overfeat model
The Overfeat model has one reshape op, which comes just after the last pooling op:
    cnn.mpool(2, 2)
    cnn.reshape([-1, 1024 * 6 * 6])
    cnn.affine(3072)
    cnn.dropout()
    cnn.affine(4096)
    cnn.dropout()
- GoogleNet model
GoogleNet has one reshape op, which is the last op in the model:
    inception_v1(cnn, 384, 192, 384, 48, 128, 128)
    cnn.apool(7, 7, 1, 1, mode='VALID')
    cnn.reshape([-1, 1024])
- VGG models
VGG models have one reshape op and it comes after the pool op.
    cnn.mpool(2, 2)
    cnn.reshape([-1, 512 * 7 * 7])
    cnn.affine(4096)
    cnn.dropout()
    cnn.affine(4096)
    cnn.dropout()
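In every snippet above, the reshape target has the form [-1, N]: the batch dimension is left for TensorFlow to infer, and N is the number of features per example once the last feature map is flattened. The plain-Python sketch below just works out N for each model from the numbers written in the reshape calls. The (channels, height, width) reading of those numbers is an interpretation made for this example.

```python
from math import prod

# Feature-map sizes as they appear in the reshape calls listed above,
# read as (channels, height, width). 1 x 1 means pooling already reduced the map.
feature_maps = {
    "Lenet5": (64, 7, 7),
    "AlexNet": (256, 6, 6),
    "InceptionV3": (2048, 1, 1),
    "Overfeat": (1024, 6, 6),
    "GoogleNet": (1024, 1, 1),
    "VGG": (512, 7, 7),
}

# Width of each flattened row, i.e. the N in cnn.reshape([-1, N]).
flat_width = {name: prod(dims) for name, dims in feature_maps.items()}

for name, width in flat_width.items():
    print(f"{name}: reshape([-1, {width}])")
```

For VGG, for instance, this gives 512 * 7 * 7 = 25088 features per example, which is the input width the following affine(4096) layer has to accept.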
Among the many operations we can use in TensorFlow, tf.reshape() allows us to manipulate the shape and rank of a tensor without changing the individual elements or the order in which they appear. This makes the function very flexible and fast regardless of the size and rank of the initial tensor.
The shape of the resulting tensor is specified by the engineer and must be compatible with the shape of the original tensor. The shapes are compatible if they both hold the same number of elements, meaning the product of the dimensions of the resulting tensor and of the original tensor must be the same regardless of their ranks. If not, you will get an error.
Many real-world models like LeNet-5 use the reshape function to keep their operations compatible. The dimensions of the parameters in each layer need to match the input being fed into them. During forward propagation, the output from the mean pooling layer needs to be reshaped so that it matches the shape of the parameters in the first fully connected layer, and the output of the first ReLU function must match the shape of the tensor containing the parameters of the second fully connected layer. In back propagation (learning), the size of the derivative of the second fully connected layer needs to match the size of the first ReLU function, and that of the first fully connected layer has to match the size of the mean pooling layer.
Hopefully, from this article at OpenGenus, you have understood the function of reshape, its uses and its nature, and also have an intuitive understanding of how it is applied in the LeNet-5 model.