KMP_AFFINITY [complete guide]

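KMP_AFFINITY is an environment variable, read by the Intel OpenMP runtime (and the LLVM OpenMP runtime derived from it), that controls how OpenMP threads are bound to hardware threads and how they are placed relative to each other. It is often used together with KMP_HW_SUBSET, which restricts the set of hardware resources the runtime may use, for finer control over thread placement.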

Table of contents:

  1. When to use KMP_AFFINITY?
  2. Basics of using KMP_AFFINITY
  3. KMP_AFFINITY in depth

When to use KMP_AFFINITY?

KMP_AFFINITY is worth setting when you want to:

  • Control how threads are distributed across the available CPU topology
  • Place threads within the hardware subset selected by an explicit KMP_HW_SUBSET setting
  • Run compute-intensive applications efficiently

Basics of using KMP_AFFINITY

The basic usage of KMP_AFFINITY is as follows:

export KMP_AFFINITY=<type>

The main values for type are:

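  • compact: Places each thread as close as possible to the previous one in the topology
  • disabled: Disables the thread affinity interface entirely, so no threads are pinned
  • explicit: Pins threads to the OS processor IDs listed in the proclist modifier
  • none: Does not pin threads; the runtime may still query the machine topology, but placement is left to the operating system
  • scatter: Distributes threads as evenly as possible across the whole system (the opposite of compact)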

Example:

export KMP_AFFINITY=compact
export KMP_AFFINITY=scatter

KMP_AFFINITY=compact is similar to OMP_PROC_BIND=close and KMP_AFFINITY=scatter is similar to OMP_PROC_BIND=spread.
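
A common way to check that a setting took effect is to add the verbose modifier, which asks the runtime to print the detected topology and the thread bindings at startup. The following is a minimal sketch, assuming a bash-like shell; the thread count of 4 and the program name ./my_openmp_app are placeholders and not taken from this article:

```shell
# Assumed values for illustration: 4 threads, scatter placement.
export OMP_NUM_THREADS=4
export KMP_AFFINITY=verbose,scatter

# Show what a child process will inherit.
env | grep -E '^(OMP_NUM_THREADS|KMP_AFFINITY)='

# Then launch the OpenMP program (hypothetical binary name):
# ./my_openmp_app
```

Once the bindings look right, verbose can be dropped to keep the program output clean.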

KMP_AFFINITY in depth

The full syntax of KMP_AFFINITY is:

export KMP_AFFINITY=[<modifier>,]<type>[,<permute>][,<offset>]

The three parameters modifier, permute and offset are optional. Only the type parameter is compulsory.

The parameters are used as follows:

Parameters for KMP_AFFINITY

| Parameter | Required? | Use | Options |
|-----------|-----------|-----|---------|
| modifier | Optional | Controls binding granularity, the explicit processor list, and log messages | Any combination of: granularity=<level>, proclist=<list>, respect / norespect, verbose / noverbose, warnings / nowarnings, reset / noreset |
| type | Yes | Controls how threads are distributed | Exactly one of: balanced, compact, disabled, explicit, none, scatter (logical and physical are deprecated) |
| permute | Optional | Controls which topology levels are most significant when the machine topology is sorted | A non-negative integer (default 0) |
| offset | Optional | Selects the starting position for thread assignment | A non-negative integer (default 0) |
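
To make the ordering of the four parts concrete, here is a small sketch of a shell helper that assembles a KMP_AFFINITY value. The function name build_kmp_affinity is made up for illustration and is not part of any OpenMP tooling; it only concatenates strings and does not validate them:

```shell
# Hypothetical helper: assemble a KMP_AFFINITY value from its four parts.
# Usage: build_kmp_affinity <modifier-or-empty> <type> [permute] [offset]
build_kmp_affinity() {
    modifier=${1:-}
    affinity_type=${2:-}
    permute=${3:-}
    offset=${4:-}

    value=$affinity_type
    # Modifiers, if any, are placed before the type.
    if [ -n "$modifier" ]; then
        value="$modifier,$value"
    fi
    # The integers are positional: the first is permute, the second is offset,
    # so an offset is always emitted together with a permute value (0 if unset).
    if [ -n "$offset" ]; then
        value="$value,${permute:-0},$offset"
    elif [ -n "$permute" ]; then
        value="$value,$permute"
    fi
    printf '%s\n' "$value"
}

export KMP_AFFINITY="$(build_kmp_affinity "granularity=fine,verbose" compact 1 0)"
echo "$KMP_AFFINITY"   # granularity=fine,verbose,compact,1,0
```

Writing the value by hand is just as valid; the helper only illustrates that modifiers come first, then the type, then the two integers.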

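The main points to keep in mind when setting KMP_AFFINITY are as follows:

  • Use type=compact if you want consecutive threads placed as close to each other as possible, for example on the same core or socket so that they can share cache.
  • Use type=scatter if you want the threads spread as evenly as possible across cores and sockets. This reduces contention for cache and memory bandwidth, which often helps memory-bound workloads; it is not guaranteed to be the fastest choice for every application.
  • With type=explicit, we can tie threads to the specific OS processor IDs listed in proclist.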

For example:

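export KMP_AFFINITY="explicit,proclist=[0,1,4,5],verbose"

  • Set granularity=core to let each thread float among the hardware threads of the physical core it is bound to, or set granularity=fine (equivalently, granularity=thread) to pin each thread to a single hardware thread (logical processor).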
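
Because the value contains square brackets and commas, it is safest to quote it in the shell. The sketch below builds an explicit setting from a list of OS processor IDs. The IDs 0, 1, 4 and 5 are the ones used in the example above, and the helper name make_proclist is made up for illustration:

```shell
# Hypothetical helper: turn OS processor IDs (one per argument) into a proclist.
make_proclist() {
    list=""
    for id in "$@"; do
        # Append a comma before every ID except the first.
        list="${list:+$list,}$id"
    done
    printf 'proclist=[%s]\n' "$list"
}

# Modifiers first, then the type; the quotes keep the brackets away from the shell.
export KMP_AFFINITY="granularity=fine,$(make_proclist 0 1 4 5),explicit"
echo "$KMP_AFFINITY"   # granularity=fine,proclist=[0,1,4,5],explicit
```

The IDs must exist on the target machine; adding the verbose modifier is one way to see which OS processor IDs the runtime detects.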

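Consider the following setting:

export KMP_AFFINITY="granularity=fine,compact,1,0"

This means:

  • granularity=fine: each thread is pinned to a single hardware thread (logical processor).
  • compact: consecutive threads are placed as close to each other as the remaining settings allow.
  • permute is 1, so the order of significance of the topology levels changes when the topology map is sorted. In practice, consecutive threads are placed on different cores first, and the second hardware thread of each core is used only after every core has received one thread.
  • offset is 0, so thread assignment starts from the first position in the topology map.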
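
The effect of permute is easier to see on a concrete machine. The following sketch is a simplified model, not the runtime's actual algorithm: it assumes a hypothetical machine with one package, 4 cores and 2 hardware threads per core, and prints where each OpenMP thread would land with permute 0 and with permute 1. The real placement should always be confirmed with the verbose modifier:

```shell
# Assumed machine: 1 package, 4 cores, 2 hardware threads (contexts) per core.
cores=4
contexts=2

# Model of compact,0,0: fill both hardware threads of a core before moving on.
place_compact_0() {
    echo "core $(( ($1 / contexts) % cores )) context $(( $1 % contexts ))"
}

# Model of compact,1,0: one thread per core first, then the second hardware threads.
place_compact_1() {
    echo "core $(( $1 % cores )) context $(( ($1 / cores) % contexts ))"
}

for n in 0 1 2 3 4 5; do
    echo "thread $n: compact,0,0 -> $(place_compact_0 "$n") | compact,1,0 -> $(place_compact_1 "$n")"
done
```

In this model, threads 0 and 1 share core 0 under compact,0,0, while under compact,1,0 they sit on cores 0 and 1 and only thread 4 returns to core 0.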

With this article at OpenGenus, you should have a clear idea of KMP_AFFINITY and how to set it to get good performance.
