Search anything:

Data augmentation Techniques

Binary Tree book by OpenGenus

Open-Source Internship opportunity by OpenGenus for programmers. Apply now.

Reading time: 35 minutes

Data augmentation is the technique of increasing the size of data used for training a model. For reliable predictions, the deep learning models often require a lot of training data, which is not always available. Therefore, the existing data is augmented in order to make a better generalized model.

Although data augmentation can be applied in various domains, it's commonly used in computer vision. Some of the most common data augmentation techniques used for images are:

  • Position augmentation
    • Scaling
    • Cropping
    • Flipping
    • Padding
    • Rotation
    • Translation
    • Affine transformation
  • Color augmentation
    • Brightness
    • Contrast
    • Saturation
    • Hue

Let's go through the above techniques one-by-one and implement them in PyTorch. First, let's define a helper function to plot the images.

import PIL.Image
import matplotlib.pyplot as plt
import torch
from torchvision import transforms

def imshow(img, transform):
    """helper function to show data augmentation
    :param img: path of the image
    :param transform: data augmentation technique to apply"""
    img = PIL.Image.open(img)
    fig, ax = plt.subplots(1, 2, figsize=(15, 4))
    ax[0].set_title(f'original image {img.size}')
    img = transform(img)
    ax[1].set_title(f'transformed image {img.size}')

Position augmentation

In position augmentations, the pixel positions of an image is changed.


In scaling or resizing, the image is resized to the given size e.g. the width of the image can be doubled.

loader_transform = transforms.Resize((140, 140))
imshow('/home/harshit/Pictures/tiger.jpg', loader_transform)



In cropping, a portion of the image is selected e.g. in the given example the center cropped image is returned.

loader_transform = transforms.CenterCrop(140)
imshow('/home/harshit/Pictures/tiger.jpg', loader_transform)



In flipping, the image is flipped horizontally or vertically.

# horizontal flip with probability 1 (default is 0.5)
loader_transform = transforms.RandomHorizontalFlip(p=1)
imshow('/home/harshit/Pictures/tiger.jpg', loader_transform)



In padding, the image is padded with a given value on all sides.

# left, top, right, bottom
loader_transform = transforms.Pad((2, 5, 0, 5))
imshow('/home/harshit/Pictures/tiger.jpg', loader_transform)



The image is rotated randomly in rotation.

loader_transform = transforms.RandomRotation(30)
imshow('/home/harshit/Pictures/tiger.jpg', loader_transform)



In translation, the image is moved either along the x-axis or y-axis.

Affine transformation

The affine transformation preserves points, straight lines, and planes. It can be used for scaling, tranlation, shearing, rotation etc.

# random affine transformation of the image keeping center invariant
loader_transform = transforms.RandomAffine(0, translate=(0.4, 0.5))
imshow('/home/harshit/Pictures/tiger.jpg', loader_transform)


Color augmentation

Color augmentation or color jittering deals with altering the color properties of an image by changing its pixel values.


One way to augment is to change the brightness of the image. The resultant image becomes darker or lighter compared to the original one.


The contrast is defined as the degree of separation between the darkest and brightest areas of an image. The contrast of the image can also be changed.


Saturation is the separation between colors of an image.


Hue can be described of as the shade of the colors in an image.

img = PIL.Image.open('/home/harshit/Pictures/tiger.jpg')
fig, ax = plt.subplots(2, 2, figsize=(16, 10))

# brightness
loader_transform1 = transforms.ColorJitter(brightness=2)
img1 = loader_transform1(img)
ax[0, 0].set_title(f'brightness')
ax[0, 0].imshow(img1)

# contrast
loader_transform2 = transforms.ColorJitter(contrast=2)
img2 = loader_transform2(img)
ax[0, 1].set_title(f'contrast')
ax[0, 1].imshow(img2)

# saturation
loader_transform3 = transforms.ColorJitter(saturation=2)
img3 = loader_transform3(img)
ax[1, 0].set_title(f'saturation')
ax[1, 0].imshow(img3)
fig.savefig('color augmentation', bbox_inches='tight')

# hue
loader_transform4 = transforms.ColorJitter(hue=0.2)
img4 = loader_transform4(img)
ax[1, 1].set_title(f'hue')
ax[1, 1].imshow(img4)

fig.savefig('color augmentation', bbox_inches='tight')



The color image can be converted into grayscale for augmentation.

loader_transform = transforms.Grayscale()
imshow('/home/harshit/Pictures/tiger.jpg', loader_transform)


Advanced methods

  • The Generative Adversarial Networks are used to generate new samples of images. The generated new samples can also be augmented to the training set.

  • Neural Style transfer is used to combine the content of one image with the style of another. Though fairly new compared to the classical methods, these methods can also be used for data augmentation.


The above mentioned data augmentation techniques are often applied in combination e.g. cropping after resizing. Also, note that data augmentation is only applied on the training set, not on the testing set.

# random rotation > resizing > cropping > flipping
loader_transform = transforms.Compose([
imshow('/home/harshit/Pictures/tiger.jpg', loader_transform)


Data augmentation not only helps in increasing the size of the training set but also in avoiding overfitting. By increasing the size of data and adding diversity in data, data augmentation helps the model generalize better, hence preventing overfitting.


Which of the follwing methods can't be used for data augmentation?

Affine transformations
Random cropping
Generative Adversarial Networks
The affine transformations, cropping and GANs can be used for data augmentation. Cross-validation, on the other hand, is a method to reduce overfitting, not for data augmentation.
Data augmentation Techniques
Share this