LLVM Memory and SSA
Do not miss this exclusive book on Binary Tree Problems. Get it now for free.
In this article, we learn all about SSA and LLVM memory and how the two are related.
Table of contents.
- Introduction.
- Mutable variables.
- LLVM Memory.
- Summary.
- References.
Prerequisites.
User-defined Operators in Kaleidoscope.
Introduction.
In previous articles, we learned how to build and represent AST, generating LLVM IR, optimizing the code and JIT compilation.
SSA is s property of IR (Intermediate representation) that requires each variable to only be assigned once and each variable to be declared before it is used.
SSAs are useful since they simplify and improves results from compiler optimizations. Compiler optimization algorithms thay are improved by SSA include;, constant propagation, dead code elimination, global variable numbering, value range propagation etc.
Functional languages such as Kaleidoscope makes it easy to generate a LLVM IR directly in SSA form. LLVM also requires its input to be in SSA form.
In general, there is no need for the compiler front-end to build SSA for since LLVM provides a highly tuned tested support.
Mutable variables.
Mutable variable cause complexities during SSA construction, for example, lets look at the following code;
int G, H;
int test(_Bool Condition) {
int X;
if (Condition)
X = G;
else
X = H;
return X;
}
Above, we have X which is a variable whose value is dependent on the path executed in the program. There are two possibilities for X, we insert a PHI node which merges the two values.
We have the resulting LLVM IR;
@G = weak global i32 0 ; type of @G is i32*
@H = weak global i32 0 ; type of @H is i32*
define i32 @test(i1 %Condition) {
entry:
br i1 %Condition, label %cond_true, label %cond_false
cond_true:
%X.0 = load i32, i32* @G
br label %cond_next
cond_false:
%X.1 = load i32, i32* @H
br label %cond_next
cond_next:
%X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
ret i32 %X.2
}
Above, the loads from the G and H global variables are explicit and live in the then/else branches of the if statement.
To merge incoming values, X.2 phi node in cond_next block chooses the right value for use based on the direction control flow is coming from.
LLVM Memory.
LLVM requires all register values to be in SSA form, on the other hand it doe not require memory objects to be in SSA form. In LLVm instead of data flow analysis of memory into the LLVM IR, we handle this useing Analysis Passes which are computed on demand.
For more on LLVM Analysis Passes this link is helpful.
The idea is that we want a stack variable that lives in memory for each mutable object in a function.
To take advantage of this trick, lets first discuss how stack variable are represented in LLVM.
Memory access in LLVM are explicit with load/store instructions. They are designed not to need an address-of operator. Stack variables are declared iwthin the LLVM allocal instruction.
The code shows how we can declare and manipulate a stack variable in the LLVM IR;
define i32 @example() {
entry:
%X = alloca i32 ; type of %X is i32*.
...
%tmp = load i32, i32* %X ; load the stack value %X from the stack.
%tmp2 = add i32 %tmp, 1 ; increment it
store i32 %tmp2, i32* %X ; store it back
...
Stack memory that is allocated with the alloca instruction is general, we can pass the stack slot address to functions or store it in other variables.
We can write the above example using alloca technique and avoid using a PHI node.
@G = weak global i32 0 ; type of @G is i32*
@H = weak global i32 0 ; type of @H is i32*
define i32 @test(i1 %Condition) {
entry:
%X = alloca i32 ; type of %X is i32*.
br i1 %Condition, label %cond_true, label %cond_false
cond_true:
%X.0 = load i32, i32* @G
store i32 %X.0, i32* %X ; Update X
br label %cond_next
cond_false:
%X.1 = load i32, i32* @H
store i32 %X.1, i32* %X ; Update X
br label %cond_next
cond_next:
%X.2 = load i32, i32* %X ; Read X
ret i32 %X.2
}
We should not the following concerning the handling of arbitrary mutable variables without creating phi nodes;
- Each mutable variable becomes a stack allocation.
- Each read of a variable becomes the load of the stack.
- Taking a variable address uses the stack address directly.
We have solved the intermediate problem but introduced another problem, a performance issue. The problem is caused by alot of stakc traffic for simple and common expresions.
LLVM optimizer has a optimizer pass named mem2reg which handles such cases.
When we execute the above example, we will have the following result;
$ llvm-as < example.ll | opt -mem2reg | llvm-dis
@G = weak global i32 0
@H = weak global i32 0
define i32 @test(i1 %Condition) {
entry:
br i1 %Condition, label %cond_true, label %cond_false
cond_true:
%X.0 = load i32, i32* @G
br label %cond_next
cond_false:
%X.1 = load i32, i32* @H
br label %cond_next
cond_next:
%X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
ret i32 %X.01
}
We use mem2reg pass to implement the standad algorithm for building a SSA for and has a couple of optimizations that speed up degenerate cases.
mem2reg is the solution for dealing with mutable variables. It only works on variables in specific circumstances;
- First, mem2reg is alloca-driven -> it finds alloca and if it can handle them, it promotes them. This is not applicable to global variables or heap allocations.
- mem2reg only looks for alloca instructions in the function entry block.
Being in the entry block is a guarantee that alloca is executed once, this makes analysis simpler. - mem2reg only promotes allocas whose uses are direct loads and stores.
- Finally mem2reg works on allocas only if the array size of allocation is 1. It however does not support structs or array to registers.
The above properties
Summary.
SSA is s property of IR(Intermediate representation) that requires each variable to only be assigned once and each variable to be declared before it is used.
Functional languages such as Kaleidoscope makes it easy to generate a LLVM IR directly in SSA form.
Mutable variable cause complexities during SSA construction.
LLVM requires all register values to be in SSA form.
Memory access in LLVM are explicit with load/store instructions.
References.
Sign up for FREE 3 months of Amazon Music. YOU MUST NOT MISS.