KMP_AFFINITY [complete guide]

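KMP_AFFINITY is an environment variable read by the Intel OpenMP runtime that controls how OpenMP threads are bound to the hardware threads of a machine and how those threads are placed relative to each other. It is often used together with KMP_HW_SUBSET, which restricts the hardware resources the runtime may use, for finer control over thread placement.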

Table of contents:

When to use KMP_AFFINITY?
Basics of using KMP_AFFINITY
KMP_AFFINITY in depth

When to use KMP_AFFINITY?

Use KMP_AFFINITY when you want to:

Control how threads are distributed across the available CPU topology
Place threads within a hardware subset that you have selected explicitly with KMP_HW_SUBSET
Run compute-intensive applications efficiently

Basics of using KMP_AFFINITY

The basic usage of KMP_AFFINITY is as follows:

export KMP_AFFINITY=<type>

The main values for type are:

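compact: Places each thread as close as possible to the previous one.
disabled: Disables the thread affinity interface entirely, so no threads are pinned.
explicit: Pins threads to the processors listed in the proclist modifier.
none: Does not pin threads, but the OpenMP runtime may still query the machine topology.
scatter: Distributes threads as evenly as possible across the whole system (the opposite of compact).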

Example (use one of the following; a later export overrides an earlier one):

export KMP_AFFINITY=compact
export KMP_AFFINITY=scatter

KMP_AFFINITY=compact is similar to OMP_PROC_BIND=close and KMP_AFFINITY=scatter is similar to OMP_PROC_BIND=spread.
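
The correspondence above can be captured in a small shell helper. This is only a sketch: the function name kmp_type_to_proc_bind is made up for illustration, and it covers just the two pairs named in this article. The settings are similar rather than identical, so it is safest to set either KMP_AFFINITY or OMP_PROC_BIND, not both.

```shell
# Translate a KMP_AFFINITY type into the roughly similar OMP_PROC_BIND value.
# Hypothetical helper: only the two pairs mentioned in the text are handled.
kmp_type_to_proc_bind() {
  case "$1" in
    compact) echo "close" ;;
    scatter) echo "spread" ;;
    *)
      echo "no similar OMP_PROC_BIND value listed for '$1'" >&2
      return 1
      ;;
  esac
}

# Example: use the OpenMP-standard variable instead of KMP_AFFINITY=scatter.
OMP_PROC_BIND=$(kmp_type_to_proc_bind scatter)
export OMP_PROC_BIND
echo "OMP_PROC_BIND=$OMP_PROC_BIND"
```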

KMP_AFFINITY in depth

The full syntax for setting KMP_AFFINITY is:

export KMP_AFFINITY=[<modifier>,]<type>[,<permute>][,<offset>]

The three parameters modifier, permute and offset are optional. Only the parameter type is compulsory.

The parameters are used as follows:


Parameters for KMP_AFFINITY
Parameter	Required?	Use	Options
modifier	Optional	Controls binding granularity, the explicit processor list, and log messages	Any combination of: granularity=<level>, proclist=<list>, respect, norespect, verbose, noverbose, warnings, nowarnings, reset, noreset
type	Yes	Controls how threads are distributed	One of: balanced, compact, disabled, explicit, none, scatter (logical and physical are deprecated)
permute	Optional	Controls which topology levels are most significant when ordering thread placement	Non-negative integer (default 0); not valid with type explicit, none, or disabled
offset	Optional	Selects the starting position for thread assignment	Non-negative integer (default 0); not valid with type explicit, none, or disabled
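
Modifiers are combined with the type in one comma-separated string. As a small illustration, the sketch below prepends the verbose modifier to an existing setting so that the runtime reports the binding it applied at startup. The helper name with_verbose is an assumption made for this example and is not part of any tool.

```shell
# Prepend the "verbose" modifier unless the setting already contains it.
# Hypothetical helper for illustration only.
with_verbose() {
  case ",$1," in
    *,verbose,*) printf '%s\n' "$1" ;;
    *) printf 'verbose,%s\n' "$1" ;;
  esac
}

KMP_AFFINITY=$(with_verbose "granularity=fine,compact")
export KMP_AFFINITY
echo "KMP_AFFINITY=$KMP_AFFINITY"
```

Once the placement looks right, the verbose modifier can be dropped again to keep the program output quiet.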

The main points to keep in mind when setting KMP_AFFINITY are as follows:

Use type=compact if you want threads placed close to each other, which can help when neighbouring threads share data in cache.
Use type=scatter if you want threads spread as evenly as possible across cores. This reduces contention for cache and memory bandwidth, which often helps memory-bound workloads.
With type=explicit, you can tie threads to the specific logical processors listed in proclist.

For example:

export KMP_AFFINITY="explicit,proclist=[0,1,4,5],verbose"
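
When the processor list is computed rather than typed by hand, the same string can be assembled in the shell. The sketch below is one possible way to do that; the helper name join_with_commas and the chosen processor IDs are assumptions for illustration.

```shell
# Join the arguments with commas, e.g. "0 1 4 5" -> "0,1,4,5".
join_with_commas() {
  ( IFS=,; printf '%s\n' "$*" )
}

# Hypothetical choice of logical processors; adjust to the target machine.
proclist=$(join_with_commas 0 1 4 5)
KMP_AFFINITY="explicit,proclist=[$proclist],verbose"
export KMP_AFFINITY
echo "KMP_AFFINITY=$KMP_AFFINITY"
```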

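Set granularity=core to bind each thread to a physical core (the thread may still move between that core's hardware threads), or set granularity=fine to bind each thread to a single hardware thread (logical processor).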

Consider this setting:

export KMP_AFFINITY="granularity=fine,compact,1,0"

This means:

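granularity=fine: each thread is pinned to a single logical processor.
compact: threads are placed as close to each other as possible.
permute is 1, which changes the placement order so that consecutive threads go to different cores first; a core's second hardware thread is used only after every core has one thread.
offset is 0, so thread assignment starts at the first position in that order.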
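
To tie the pieces together, here is a minimal sketch of a shell function that assembles a KMP_AFFINITY value from its four parts. The function name build_kmp_affinity and its argument order are assumptions made for this example. Because offset follows permute in the syntax, the sketch fills in permute=0 when only an offset is supplied.

```shell
# Build a KMP_AFFINITY string from: modifier, type, permute, offset.
# Pass "" for an optional part to leave it out. Hypothetical helper.
build_kmp_affinity() {
  modifier=${1-}
  aff_type=${2-}
  permute=${3-}
  offset=${4-}

  if [ -z "$aff_type" ]; then
    echo "type is required" >&2
    return 1
  fi

  value=$aff_type
  if [ -n "$modifier" ]; then
    value="$modifier,$value"
  fi
  if [ -n "$offset" ]; then
    # offset can only follow permute, so default permute to 0 here
    value="$value,${permute:-0},$offset"
  elif [ -n "$permute" ]; then
    value="$value,$permute"
  fi
  printf '%s\n' "$value"
}

KMP_AFFINITY=$(build_kmp_affinity "granularity=fine" compact 1 0)
export KMP_AFFINITY
echo "KMP_AFFINITY=$KMP_AFFINITY"
```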

With this article at OpenGenus, you should have a good understanding of KMP_AFFINITY and how to set it to get better performance.
