Cache associativity is a property of a cache that determines how many different cache lines a given memory block may be placed in. In this OpenGenus article, we shall classify caches on the basis of their associativity property into 4 main categories: Direct Mapped, 2-Way Set Associative, 4-Way Set Associative and Fully Associative.
Pre-requisites: basic knowledge of computer memory and of the concepts covered in the article Types of Cache Misses.
Please note that in this OpenGenus article, you may find the terms cache line and cache block used interchangeably; they refer to the same concept.
Table of Contents:
1} Brief Explanation of Cache Associativity
2} Direct-Mapped Cache
3} n-Way Set Associative Cache
4} Fully Associative Cache
5} Impact on Software Performance & Associativity for Different Systems
6} Key Takeaways
Brief Explanation of Cache Associativity
Cache associativity decides how many cache lines a given memory block may be placed in. In a cache with higher associativity, each block can be placed in any line of a larger group of lines (a set). This reduces the chance of conflict misses but increases hardware cost and complexity. If the associativity is lower, each block has fewer candidate lines, and the trade-off is reversed: the hardware is simpler and faster, but conflict misses become more likely.
To understand the classification of cache memory on the basis of its associativity type, we also need to understand the basic division of an incoming address to a cache line.
A memory address used to locate and retrieve data from a cache is divided into 3 parts: tag, index and offset. The sizes of these three fields are determined by the block size, the number of sets and, as we will see, the associativity. The offset identifies a byte/word within the cache line. The index selects the cache set where the desired block may reside. The tag identifies the desired block within that set.
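The three-way split above can be sketched with a few bit operations. The parameters here (64-byte lines, 128 sets) are illustrative assumptions, not values from the article:

```python
# Illustrative cache geometry (assumed, not from the article):
BLOCK_SIZE = 64   # bytes per cache line -> 6 offset bits
NUM_SETS = 128    # sets in the cache   -> 7 index bits

OFFSET_BITS = BLOCK_SIZE.bit_length() - 1   # log2(64) = 6
INDEX_BITS = NUM_SETS.bit_length() - 1      # log2(128) = 7

def split_address(addr):
    """Split a memory address into (tag, index, offset)."""
    offset = addr & (BLOCK_SIZE - 1)                # byte within the line
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)  # which set
    tag = addr >> (OFFSET_BITS + INDEX_BITS)        # identifies the block
    return tag, index, offset
```

Note that recombining the three fields with shifts and ORs reproduces the original address, which is a quick sanity check on any such split.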
Direct Mapped Cache
In a direct mapped cache model, a memory block can only go into a single cache line. Thus, each memory address maps to only one particular cache block. In other words, a memory block can be associated with one and only one cache line. This mapping is usually decided using a modulo operation on the memory block number, which yields the unique line for that block. An advantage of this model is, as you probably understood, its simplicity, which allows quick access and significantly reduces the required hardware complexity. A disadvantage is that multiple memory blocks may end up mapping to the same cache line (since there is no unique mapping in the opposite direction), which increases the probability of encountering cache misses.
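A minimal sketch of this behavior, assuming an 8-line cache (the size is an illustrative choice, not from the article). Each block number maps to exactly one line via modulo, so two blocks that share a line evict each other:

```python
NUM_LINES = 8  # illustrative cache size

class DirectMappedCache:
    def __init__(self):
        # Each line holds at most one block's tag (None = empty).
        self.lines = [None] * NUM_LINES

    def access(self, block_number):
        """Return 'hit' or 'miss' for an access to the given memory block."""
        line = block_number % NUM_LINES   # the one line this block may use
        tag = block_number // NUM_LINES   # identifies the block on that line
        if self.lines[line] == tag:
            return "hit"
        self.lines[line] = tag            # miss: evict whatever was there
        return "miss"
```

For example, blocks 0 and 8 both map to line 0 here, so accessing them alternately produces a miss every time even though the rest of the cache sits empty.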
n-Way Set Associative Cache
This cache associativity model offers a compromise between the direct mapped and fully associative models in terms of both simplicity and hardware complexity. A direct mapped cache is an n-way set associative model in which n=1. In such a model, the cache is divided into sets, each of which has the capacity for 'n' cache blocks. An incoming memory block is first mapped to a set in the cache, and then to one of the 'n' cache blocks within that set. The index of its memory address is used to ascertain the set, and the tag identifies the cache block. Finding the set still uses a simple modulo, but the cache must compare the tag against all 'n' lines in the set in parallel and, on a miss, choose a victim line via a replacement policy; this extra hardware is what enables much better handling of cache conflicts.
Fully Associative Cache
In a fully associative cache, a memory block can be mapped to any cache line anywhere in the cache. Essentially, it is an m-way set associative cache with a single set, for a cache of 'm' blocks. This allows for maximum flexibility, significantly reducing cache misses and the possibility of cache conflicts, but comes at the cost of very high hardware complexity: to find a block, every tag in the cache must be compared, which increases the access time.
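The single-set view makes a sketch straightforward: with no index bits, the whole block number becomes the tag, and any line can hold any block (LRU eviction is again an illustrative choice, not from the article):

```python
from collections import OrderedDict

class FullyAssociativeCache:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()   # tag -> True, ordered oldest-first

    def access(self, block_number):
        """Return 'hit' or 'miss' for an access to the given memory block."""
        tag = block_number           # no index bits: whole block number is the tag
        if tag in self.lines:
            self.lines.move_to_end(tag)          # refresh LRU position
            return "hit"
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)       # evict least recently used
        self.lines[tag] = True
        return "miss"
```

Blocks 0, 8 and 16, which would collide in a small direct-mapped cache, all fit at once here; the price in real hardware is one tag comparator per line.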
It essentially boils down to the cache designer's priority: is it to maximize the hit rate, or to minimize hardware complexity and access time? The appropriate cache placement policy can be chosen accordingly.
Impact on Software Performance & Associativity for Different Systems
It has been observed that transposing a 512 x 512 matrix (swapping elements across the main diagonal, so that the element at position (i, j) moves to position (j, i)) is much slower than transposing a 513 x 513 matrix. This is due to cache conflicts, which arise because the row length is a power of 2 (2^9 words in our case). The elements of a single column are then exactly 512 words apart, and with a power-of-2 number of cache sets these addresses all map to the same few sets. Walking down a column therefore keeps evicting the lines just loaded, forcing frequent reloading and poor cache performance.
In a 513 x 513 matrix, we do not encounter this issue: the column elements are 513 words apart, and since 513 is not a power of 2, consecutive column elements spread across different cache sets and end up in different cache lines.
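The effect can be checked arithmetically rather than by timing. The sketch below counts how many distinct cache sets the elements of one matrix column fall into, assuming a row-major matrix of 8-byte elements, 64-byte lines and 64 sets (all illustrative parameters, not from the article):

```python
# Illustrative cache geometry (assumed, not from the article):
LINE_SIZE = 64   # bytes per cache line
NUM_SETS = 64    # sets in the cache

def distinct_sets_for_column(width, num_rows):
    """How many distinct cache sets hold the elements of column 0
    of a row-major matrix with `width` 8-byte elements per row."""
    # Element i of the column lives at byte address i * width * 8.
    sets = {(i * width * 8 // LINE_SIZE) % NUM_SETS for i in range(num_rows)}
    return len(sets)
```

With a row width of 512 every column element lands in the same single set, while a width of 513 spreads the column over all 64 sets, which is exactly why the 513 x 513 transpose runs faster.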
Another observation is that in modern processors, the cache is generally 8-way (or more) set associative. This is good enough for most purposes, as applications rarely touch more than 8 blocks that map to the same set at the same time; when they do, the result is aliasing, where multiple memory blocks compete for the same set, causing conflicts and inefficiency. In fact, newer processors can use even higher associativity, as high as 12-way, to further improve cache performance.
Key Takeaways
- One can choose from different cache associativity models based on the desired trade-off between hit rate and hardware complexity.
- The three general types of models are direct-mapping, n-way associativity and full associativity.
- All 3 models are essentially n-way set associative models, with n=1, n=n and n = the number of blocks in the cache, respectively.
- A direct mapping model involves the mapping of an incoming block to a unique cache block and offers the advantage of minimized hardware complexity and access time, although it brings forth the disadvantages of increased cache misses and conflicts.
- An n-way set associative model divides the cache into multiple sets, each of which holds 'n' cache blocks, and strikes a compromise between the direct mapped and fully associative models.
- A fully associative model is the pinnacle of flexibility, as the incoming block can be placed anywhere within the cache. It does, however, come with considerably higher hardware complexity and access time than the previously mentioned models.