Types of Data Formats in Machine Learning
Reading time: 20 minutes
Each data format describes how the input data is laid out in memory. This matters because each machine learning application performs well for a particular data format and worse for others. Converting between data formats and choosing the correct one is a major optimization technique. For example, TensorFlow is built around the NHWC format while MKLDNN is built around the NCHW format.
There are four types of data formats:
- NHWC
- NCHW
- NCDHW
- NDHWC
Note that one can work with other data format combinations as well but the above four are the commonly used ones.
General guidance
Each letter in the formats denotes a particular dimension of the data:
- N: Batch size : the number of images passed together as a group for inference
- C: Channel : the number of data components that make up one data point of the input. It is 3 for RGB images and 4 for images with an alpha (transparency) channel.
- H: Height : the measurement of the input data along the y axis
- W: Width : the measurement of the input data along the x axis
- D: Depth : the depth of the input data (for example, the number of frames in a video clip)
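As a minimal sketch of what these letters mean in practice, the following NumPy snippet builds a hypothetical NHWC batch (the batch size, image dimensions, and channel count are made up for illustration):

```python
import numpy as np

# Hypothetical batch: 8 RGB images, each 32x32 pixels, in NHWC order.
batch = np.zeros((8, 32, 32, 3), dtype=np.float32)

# Unpacking the shape recovers each dimension by name.
n, h, w, c = batch.shape
print(n, h, w, c)  # 8 32 32 3
```

The same data in NCHW order would simply have shape `(8, 3, 32, 32)`.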
Points to consider
- Operations like convolutions and gradients operate channel-wise, so they benefit if memory is contiguous with respect to the pixels within a channel.
- Languages like Python with libraries like NumPy use a row-major memory layout by default, which makes CHW the 'natural' memory layout for an image.
- For batches, it makes sense for the memory of each image to be contiguous, which results in NCHW.
- Column-major is the default layout in Fortran and MATLAB, in which case the ordering should be HWC, which is also the layout produced by most image loading software.
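The contiguity point above can be demonstrated with NumPy, whose arrays are row-major (C order) by default; this is a sketch with a tiny made-up 2x2 RGB image:

```python
import numpy as np

# A single 2x2 RGB image in CHW layout: NumPy is row-major, so the
# 4 pixel values of each channel form one contiguous run of memory.
chw = np.arange(12).reshape(3, 2, 2)

# The same data viewed as HWC: the 3 channel values of each pixel
# now sit together logically, but transpose() only changes strides,
# so the view is no longer contiguous in memory.
hwc = chw.transpose(1, 2, 0)

# Copying in C order makes the HWC layout physically contiguous.
hwc_contig = np.ascontiguousarray(hwc)

print(chw.flags['C_CONTIGUOUS'])         # True
print(hwc.flags['C_CONTIGUOUS'])         # False (strided view)
print(hwc_contig.flags['C_CONTIGUOUS'])  # True
```

This is why converting between formats is not free: a genuine layout change requires copying the data, not just relabeling the axes.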
NHWC
NHWC denotes (Batch size, Height, Width, Channel). This means the data is a 4D array where the first dimension is the batch size, the second is height, the third is width, and the fourth is channel. This 4D array is laid out in memory in row-major order.
Hence, you can visualize the memory layout to see which operations access consecutive memory (fast) and which access memory separated by other data (slow).
Commonly used data: Images
Software: TensorFlow
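A small sketch of the NHWC memory layout, using a hypothetical batch filled with sequential values so that flat memory offsets are visible:

```python
import numpy as np

# Hypothetical NHWC batch: 2 images, 4x4 pixels, 3 channels,
# filled with 0, 1, 2, ... in row-major memory order.
nhwc = np.arange(2 * 4 * 4 * 3).reshape(2, 4, 4, 3)

# In NHWC, the channel values of a single pixel are adjacent in
# memory, which is why per-pixel operations are fast in this layout.
pixel = nhwc[0, 0, 0, :]
print(pixel)  # [0 1 2] -- consecutive flat offsets
```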
NCHW
NCHW denotes (Batch size, Channel, Height, Width). This means the data is a 4D array where the first dimension is the batch size, the second is channel, the third is height, and the fourth is width. This 4D array is laid out in memory in row-major order.
Commonly used data: Images
Software: MKLDNN
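Converting between the two 4D layouts is a single transpose plus a copy; this sketch (with made-up dimensions) shows NHWC to NCHW, the direction needed when moving TensorFlow-style data to an MKLDNN-style layout:

```python
import numpy as np

# Hypothetical NHWC batch: 2 images, 4x4 pixels, 3 channels.
nhwc = np.arange(2 * 4 * 4 * 3).reshape(2, 4, 4, 3)

# Reorder axes N,H,W,C -> N,C,H,W and copy so the new layout is
# physically contiguous in memory.
nchw = np.ascontiguousarray(nhwc.transpose(0, 3, 1, 2))
print(nchw.shape)  # (2, 3, 4, 4)

# In NCHW, each channel plane is contiguous: the 16 values of the
# first channel of image 0 occupy one unbroken run of memory.
plane = nchw[0, 0]
print(plane.shape)  # (4, 4)
```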
NCDHW
NCDHW denotes (Batch size, Channel, Depth, Height, Width). This means the data is a 5D array where the first dimension is the batch size, the second is channel, the third is depth, the fourth is height, and the fifth is width. This 5D array is laid out in memory in row-major order.
Commonly used data: Video
Software: MKLDNN
NDHWC
NDHWC denotes (Batch size, Depth, Height, Width, Channel). This means the data is a 5D array where the first dimension is the batch size, the second is depth, the third is height, the fourth is width, and the fifth is channel. This 5D array is laid out in memory in row-major order.
Commonly used data: Video
Software: TensorFlow
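The two 5D video layouts relate to each other the same way NHWC and NCHW do; this sketch uses a hypothetical video batch (clip count, frame count, and resolution are made up for illustration):

```python
import numpy as np

# Hypothetical NDHWC video batch: 2 clips, 16 frames each,
# 64x64 pixels, 3 channels (D is the temporal depth here).
ndhwc = np.zeros((2, 16, 64, 64, 3), dtype=np.float32)

# Reorder axes N,D,H,W,C -> N,C,D,H,W to get the NCDHW layout
# that MKLDNN favors.
ncdhw = ndhwc.transpose(0, 4, 1, 2, 3)

print(ndhwc.shape)  # (2, 16, 64, 64, 3)
print(ncdhw.shape)  # (2, 3, 16, 64, 64)
```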