SGEMM


In this article, we have covered SGEMM (Single Precision General Matrix Multiplication) which is a standard library function for Matrix Multiplication and a variant of GEMM.

Table of contents:

Introduction to SGEMM
Use of SGEMM
Difference between sgemm and other gemm functions

Introduction to SGEMM

SGEMM stands for "Single Precision General Matrix Multiplication".

It is a standard GEMM routine provided by BLAS implementations such as OpenBLAS and BLIS, and it is used for matrix multiplication. It performs the standard GEMM operation, that is, matrix-matrix multiplication, with all matrices stored as 32-bit floating point (FP32) values.

The API of SGEMM is as follows:

status sgemm(
    char transa,
    char transb,
    dim_t M,
    dim_t N,
    dim_t K,
    float alpha,
    const float* A,
    dim_t lda,
    const float* B,
    dim_t ldb,
    float beta,
    float* C,
    dim_t ldc
    )

Note:

All three matrices A, B and C use the float datatype.

SGEMM operation is defined as follows:

C = alpha * op(A) * op(B) + beta * C

where

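op(X) = X or X^T, depending on the corresponding transposition flag (transa for A, transb for B). X is a matrix.
alpha and beta are scalars
A, B, and C are matrices
op(A) is an MxK matrix
op(B) is a KxN matrix
C is an MxN matrix (read as input when beta is non-zero, and overwritten with the output)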

The parameters are as follows:

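transa: Transposition flag for matrix A. If it is set to 0, op(A) = A, and if it is set to 1, op(A) = A^T. (Classic character-based BLAS interfaces spell these two choices as 'N' and 'T'.)
transb: Transposition flag for matrix B. If it is set to 0, op(B) = B, and if it is set to 1, op(B) = B^T.
M, N, K: Matrix dimensions. M is the number of rows of op(A) and C, N is the number of columns of op(B) and C, and K is the number of columns of op(A) and rows of op(B).
alpha: Scalar that scales the product op(A) * op(B).
A: Input matrix. op(A) has size MxK.
lda: Leading dimension of A, that is, the distance in elements between the start of consecutive rows (row-major storage) or consecutive columns (column-major storage) of A in memory.
B: Input matrix. op(B) has size KxN.
ldb: Leading dimension of B (same meaning as lda, for B).
beta: Scalar that scales the existing contents of C before the product is added.
C: Input and output matrix of size MxN. The result overwrites it.
ldc: Leading dimension of C (same meaning as lda, for C).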

Depending on the two transposition flags, SGEMM computes one of four combinations:

C = alpha * A * B + beta * C
C = alpha * A^T * B + beta * C
C = alpha * A * B^T + beta * C
C = alpha * A^T * B^T + beta * C

GEMM routines such as SGEMM are typically heavily optimized by each library for particular hardware platforms and workloads.

Use of SGEMM

SGEMM is used in many operations in Machine Learning models, including MatMul, Convolution and others. This is because single precision (float, 32 bits) is generally sufficient for Machine Learning calculations. Moreover, current research is moving towards even lower precision such as FP16, INT8 and INT4.

SGEMM is often not preferred for scientific calculations, where single precision can be too low. In such cases, DGEMM is used, which works in double precision (double, 64 bits).

SGEMM calls are available in different libraries like:

BLAS implementations such as OpenBLAS
BLIS (from the FLAME project)
FBGEMM
oneDNN

and others.

Difference between sgemm and other gemm functions

SGEMM vs DGEMM

The major difference is that SGEMM deals with single precision (float 32 bits) data while DGEMM deals with double precision (double 64 bits) data.

The float datatype used in SGEMM is the IEEE 754 single-precision format: 32 bits in total, which gives roughly 7 significant decimal digits of precision. The double datatype used in DGEMM is the IEEE 754 double-precision format: 64 bits in total, which gives roughly 15 to 16 significant decimal digits.

SGEMM vs GEMM

The main difference is that GEMM is the general operation, while SGEMM is one specific variant of it, fixed to a single datatype.

GEMM functions come in several variations that use different datatypes for the 3 matrices involved, such as:

gemm_u8s8s32: GEMM with A of datatype unsigned INT8, B of datatype signed INT8 and output as signed INT32.
gemm_s8s8s32: GEMM with A of datatype signed INT8, B of datatype signed INT8 and output as signed INT32.

and many more.

The datatype in SGEMM is fixed: all matrices are 32-bit floating point (FP32).

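SGEMM vs CGEMM and ZGEMM

CGEMM and ZGEMM deal with matrices of complex datatype, where every element has a real and an imaginary part: CGEMM uses single-precision complex numbers and ZGEMM uses double-precision complex numbers. SGEMM deals only with real numbers.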

With this article at OpenGenus, you should now have a complete idea of SGEMM.
