Add debugging support in Programming Language

A good programming language should support debugging and emit debug information. In this article, we add support for debugging and debug information to Kaleidoscope.

Table of contents.

  1. Introduction.
  2. Ahead-of-time (AOT) compilation.
  3. Summary.
  4. References.

Prerequisites.

Compiling to object code.

Introduction.

So far, we can use functions and variables in Kaleidoscope. However, a compiler should also help us find errors, so that we can debug code before shipping it to the client.

In this article, we discuss how to add debugging information to our programming language and emit it as DWARF. When debugging a program, we expect to work in terms of the code we wrote, so the debugger must map the compiled binary back to the source previously written by the programmer. This is referred to as source-level debugging. LLVM uses the DWARF format to represent debugging information; it is a compact encoding of types, source locations, and variable locations.
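To make source-level debugging more concrete, here is a minimal C++ sketch of the kind of address-to-line lookup a debugger performs. It is a simplified illustration and not the actual DWARF encoding: the `LineEntry` struct, the `lookupLine` function, and any addresses used with them are hypothetical names chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical, simplified line-table row: the first machine-code address
// generated for a given source line. Real DWARF line tables are encoded far
// more compactly and carry more fields (file, column, flags).
struct LineEntry {
  std::uint64_t Address; // start address of the code for this line
  unsigned Line;         // 1-based source line number
};

// Returns the source line covering Addr, or 0 if Addr precedes every entry.
// The table must be sorted by ascending address.
unsigned lookupLine(const std::vector<LineEntry> &Table, std::uint64_t Addr) {
  assert(std::is_sorted(Table.begin(), Table.end(),
                        [](const LineEntry &A, const LineEntry &B) {
                          return A.Address < B.Address;
                        }));
  // Find the first entry that starts after Addr; the entry before it covers Addr.
  auto It = std::upper_bound(
      Table.begin(), Table.end(), Addr,
      [](std::uint64_t A, const LineEntry &E) { return A < E.Address; });
  if (It == Table.begin())
    return 0;
  return std::prev(It)->Line;
}
```

Given a table sorted by address, `lookupLine` answers the question "which source line does this instruction belong to?", which is one of the pieces of information DWARF stores for the debugger.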

The following simple program computes the nth Fibonacci number. In this case, the function returns 55, since 55 is the 10th Fibonacci number:

def fib(n)
  if n < 3 then
    1
  else
    fib(n - 1) + fib(n - 2);

fib(10)
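For readers who want to check the result outside Kaleidoscope, the same function can be sketched in C++. This is only an illustrative translation and not part of the Kaleidoscope compiler. It uses `double` because every Kaleidoscope value is a double, and marking it `constexpr` is an assumption of this sketch so that the result can be checked at compile time.

```cpp
// A C++ rendering of the Kaleidoscope fib above. The conditional operator
// mirrors Kaleidoscope's if/then/else, which is an expression that yields
// a value.
constexpr double fib(double n) {
  return n < 3 ? 1.0 : fib(n - 1) + fib(n - 2);
}
```

Calling `fib(10)` evaluates to 55, matching the Kaleidoscope program.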

Ahead-of-time compilation.

Ahead-of-time (AOT) compilation translates a high-level programming language into a lower-level one before the program is executed. The work is done at build time, which reduces the work needed at run time.

We will make changes to Kaleidoscope to support the compilation of the IR produced into a standalone program that can be executed and debugged.

First, we give the anonymous function that contains our top-level statement the name main, so that it becomes the entry point of the program:

-    auto Proto = std::make_unique<PrototypeAST>("", std::vector<std::string>());
+    auto Proto = std::make_unique<PrototypeAST>("main", std::vector<std::string>());

We then remove the code that prints the interactive command-line prompt, as follows:

@@ -1129,7 +1129,6 @@ static void HandleTopLevelExpression() {
 /// top ::= definition | external | expression | ';'
 static void MainLoop() {
   while (1) {
-    fprintf(stderr, "ready> ");
     switch (CurTok) {
     case tok_eof:
       return;
@@ -1184,7 +1183,6 @@ int main() {
   BinopPrecedence['*'] = 40; // highest.

   // Prime the first token.
-  fprintf(stderr, "ready> ");
   getNextToken();

Finally, we disable all optimization passes and the JIT (just-in-time) compiler. Once parsing and code generation are done, the only thing that happens is that the LLVM IR is written to standard error:

@@ -1108,17 +1108,8 @@ static void HandleExtern() {
 static void HandleTopLevelExpression() {
   // Evaluate a top-level expression into an anonymous function.
   if (auto FnAST = ParseTopLevelExpr()) {
-    if (auto *FnIR = FnAST->codegen()) {
-      // We're just doing this to make sure it executes.
-      TheExecutionEngine->finalizeObject();
-      // JIT the function, returning a function pointer.
-      void *FPtr = TheExecutionEngine->getPointerToFunction(FnIR);
-
-      // Cast it to the right type (takes no arguments, returns a double) so we
-      // can call it as a native function.
-      double (*FP)() = (double (*)())(intptr_t)FPtr;
-      // Ignore the return value for this.
-      (void)FP;
+    if (!FnAST->codegen()) {
+      fprintf(stderr, "Error generating code for top level expr");
     }
   } else {
     // Skip token for error recovery.
@@ -1439,11 +1459,11 @@ int main() {
   // target lays out data structures.
   TheModule->setDataLayout(TheExecutionEngine->getDataLayout());
   OurFPM.add(new DataLayoutPass());
+#if 0
   OurFPM.add(createBasicAliasAnalysisPass());
   // Promote allocas to registers.
   OurFPM.add(createPromoteMemoryToRegisterPass());
@@ -1218,7 +1210,7 @@ int main() {
   OurFPM.add(createGVNPass());
   // Simplify the control flow graph (deleting unreachable blocks, etc).
   OurFPM.add(createCFGSimplificationPass());
-
+  #endif
   OurFPM.doInitialization();

   // Set the global so the code gen can use this.

We can now compile a Kaleidoscope program to an executable (a.out, or an .exe file on Windows). To do so, we pipe the IR, which our compiler writes to standard error, into clang with the following command:

Kaleidoscope-Ch9 < fib.ks |& clang -x ir -

Summary.

Creating debug information for a language is considered a hard problem for the following reasons:

  • Code is usually optimized, and keeping source locations accurate after optimization is hard.
  • Optimization moves variables around: they may be optimized out, share memory with other variables, or be difficult to track.
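As a small illustration of the second point, consider the following C++ sketch. The function `sumTo` is a hypothetical example, not part of Kaleidoscope, and the comments describe what an optimizing compiler may do rather than what any particular compiler is guaranteed to do.

```cpp
#include <cassert>

// Hypothetical example: sums the integers 1..n.
// In an unoptimized build, `total` and `i` each typically live in a stack
// slot, so a debugger can print them at any line. With optimization enabled,
// a compiler may keep them only in registers or rewrite the loop entirely,
// leaving the debugger with no location to report for them.
int sumTo(int n) {
  assert(n >= 0 && "n must be non-negative");
  int total = 0;
  for (int i = 1; i <= n; ++i)
    total += i;
  return total;
}
```

This is one reason we disabled the optimization passes above: with unoptimized code, each variable keeps a stable location that debug information can describe.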
In this article, we have learned how to prepare Kaleidoscope for debugging and debug information by compiling it ahead of time with optimizations disabled. As we stated earlier, debugging support is an important aspect of any good programming language.

References.

Debugging: DWARF, Functions, Source locations, Variables.
