KMP_AFFINITY [complete guide]
KMP_AFFINITY is an environment variable of the Intel OpenMP runtime that controls how OpenMP threads are bound to the hardware threads (logical processors) of a machine and how they are placed relative to each other. It is often used along with KMP_HW_SUBSET for finer control over thread placement.
Table of contents:
- When to use KMP_AFFINITY?
- Basics of using KMP_AFFINITY
- KMP_AFFINITY in depth
When to use KMP_AFFINITY?
Use KMP_AFFINITY when you want to:
- Control how threads are distributed across the available CPU topology
- Control thread placement after KMP_HW_SUBSET has been set explicitly
- Run compute-intensive applications efficiently
Basics of using KMP_AFFINITY
The basic usage of KMP_AFFINITY is as follows:
export KMP_AFFINITY=<type>
The main values for type are:
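- compact: Places each thread as close as possible to the previous one
- disabled: Disables the thread affinity interface completely; threads are not pinned
- explicit: Pins threads to the processors listed with the proclist modifier
- none: Does not pin threads, but the runtime still uses the affinity interface to determine the machine topology
- scatter: Distributes threads as evenly as possible across the whole system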
Examples (set one or the other, since a later export overrides an earlier one):
export KMP_AFFINITY=compact
export KMP_AFFINITY=scatter
KMP_AFFINITY=compact is similar to OMP_PROC_BIND=close and KMP_AFFINITY=scatter is similar to OMP_PROC_BIND=spread.
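A variable set with export stays in effect for the rest of the shell session. To try a setting for a single run, the variable can instead be passed to one command only. The following is a minimal sketch, assuming a POSIX shell: run_with_affinity is a hypothetical helper name, and sh -c merely stands in for a real OpenMP program.

```shell
# Hypothetical helper: run one command with KMP_AFFINITY set only for that command.
run_with_affinity() {
    affinity_value=$1
    shift
    # env sets the variable in the environment of the child process only.
    env KMP_AFFINITY="$affinity_value" "$@"
}

# sh -c stands in for an OpenMP application; it just prints the value it received.
run_with_affinity scatter sh -c 'echo "KMP_AFFINITY=$KMP_AFFINITY"'
```

This makes it easy to compare compact and scatter back to back without changing the session-wide setting.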
KMP_AFFINITY in depth
The finer way of setting KMP_AFFINITY is:
export KMP_AFFINITY=[<modifier>,]<type>[,<permute>][,<offset>]
The three parameters modifier, permute and offset are optional. Only the parameter type is compulsory.
The use of the different parameters is as follows:
Parameters for KMP_AFFINITY

Parameter | Need? | Use | Options |
---|---|---|---|
modifier | Optional | To control binding granularity, the explicit processor list and log messages | Any combination of: granularity (for example granularity=fine), norespect, noverbose, nowarnings, noreset, proclist (a bracketed list of processor IDs), respect, verbose, warnings, reset |
type | Yes | To control distribution of threads | One of: balanced, compact, disabled, explicit, none, scatter, logical (deprecated), physical (deprecated) |
permute | Optional | Control which level of topology is most important | A non-negative integer (default 0); not valid with type explicit, none, or disabled |
offset | Optional | Select the starting position of thread assignment | A non-negative integer (default 0); not valid with type explicit, none, or disabled |
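To make the ordering of the four parts concrete, the value can be assembled piece by piece. This is only an illustrative sketch: build_kmp_affinity is a hypothetical helper, not part of any OpenMP tooling, and filling in permute as 0 when only an offset is given is a choice made here because the two numbers are positional.

```shell
# Hypothetical helper: build a KMP_AFFINITY value from its four parts.
# Usage: build_kmp_affinity MODIFIERS TYPE [PERMUTE] [OFFSET]
# Pass an empty string for MODIFIERS or PERMUTE to leave them out.
build_kmp_affinity() {
    modifiers=${1:-}
    affinity_type=${2:-}
    permute=${3:-}
    offset=${4:-}

    if [ -z "$affinity_type" ]; then
        echo "build_kmp_affinity: type is required" >&2
        return 1
    fi
    # offset is positional, so it can only follow an explicit permute value.
    if [ -n "$offset" ] && [ -z "$permute" ]; then
        permute=0
    fi

    value=$affinity_type
    if [ -n "$modifiers" ]; then value="$modifiers,$value"; fi
    if [ -n "$permute" ]; then value="$value,$permute"; fi
    if [ -n "$offset" ]; then value="$value,$offset"; fi
    printf '%s\n' "$value"
}

build_kmp_affinity "granularity=fine" compact 1 0   # prints granularity=fine,compact,1,0
build_kmp_affinity "" scatter                       # prints scatter
```

The result can be exported directly, for example with export KMP_AFFINITY="$(build_kmp_affinity verbose scatter)".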
The main points to set KMP_AFFINITY are as follows:
- Use type=compact, if you want the threads to be placed close to each other.
- Use type=scatter, if you want the threads to be distributed as evenly as possible across cores. This reduces contention for cache and memory bandwidth, which often helps memory-bound workloads.
- With type=explicit, we can tie threads to specific cores defined by proclist.
For example:
export KMP_AFFINITY="explicit,proclist=[0,1,4,5],verbose"
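- Set granularity=core to bind each thread to a physical core (the thread may move between that core's hardware threads), or set granularity=fine to bind each thread to a single hardware thread (logical core).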
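Before using type=explicit, it can help to check that every ID in the proclist exists on the machine. The sketch below is a hypothetical helper (check_proclist is not a standard tool), and it assumes that processor IDs are numbered contiguously from 0, which is typical but not guaranteed on every system.

```shell
# Hypothetical helper: check that every processor ID is a number below CPU_COUNT.
# Usage: check_proclist CPU_COUNT ID...
check_proclist() {
    cpu_count=$1
    shift
    for id in "$@"; do
        case $id in
            ''|*[!0-9]*)
                echo "invalid processor ID: $id" >&2
                return 1
                ;;
        esac
        if [ "$id" -ge "$cpu_count" ]; then
            echo "processor ID $id is out of range (0..$((cpu_count - 1)))" >&2
            return 1
        fi
    done
    return 0
}

# Example with an assumed machine of 8 logical processors.
check_proclist 8 0 1 4 5 && echo "proclist OK"
```

On Linux, the assumed count of 8 could be replaced by the output of getconf _NPROCESSORS_ONLN, and the verbose modifier can then be used to confirm the binding that the runtime actually applied.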
Consider the following setting:
export KMP_AFFINITY="granularity=fine,compact,1,0"
This means:
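- Each thread will be pinned to a single logical core (hardware thread).
- Threads will be placed as close to each other as possible.
- permute is set to 1, so consecutive threads are placed on different cores first; the second hardware thread of each core is used only after every core has received one thread.
- offset is 0, so thread assignment starts from the first position in the topology map.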
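The effect of permute on a compact placement can be illustrated with a small simulation. This is a simplified model, not the runtime's actual algorithm: it assumes a hypothetical machine with 4 cores and 2 hardware threads per core, granularity=fine, offset 0, and only the permute values 0 and 1. The function name compact_placement is made up for this sketch.

```shell
# Simplified model of compact placement on cores x hardware threads.
# Usage: compact_placement PERMUTE CORES THREADS_PER_CORE NUM_OMP_THREADS
compact_placement() {
    permute=$1
    cores=$2
    smt=$3
    nthreads=$4
    n=0
    while [ "$n" -lt "$nthreads" ]; do
        if [ "$permute" -eq 0 ]; then
            # permute 0: fill all hardware threads of a core before moving on.
            core=$(( (n / smt) % cores ))
            ctx=$(( n % smt ))
        else
            # permute 1: visit every core once before reusing a core.
            core=$(( n % cores ))
            ctx=$(( (n / cores) % smt ))
        fi
        echo "OpenMP thread $n -> core $core, hardware thread $ctx"
        n=$((n + 1))
    done
}

# Models granularity=fine,compact,1,0 with 8 OpenMP threads.
compact_placement 1 4 2 8
```

In this model, threads 0 to 3 land on cores 0 to 3, and threads 4 to 7 reuse the same cores on their second hardware thread. Calling compact_placement 0 4 2 8 instead shows threads 0 and 1 sharing core 0. The real binding on a given machine should be confirmed with the verbose modifier.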
With this article at OpenGenus, you should now have a complete idea of KMP_AFFINITY and how to set it for the best performance.