Unsafe Rust

Rust's safety checks guarantee memory safety and let us avoid many of the common errors that plague languages with unrestricted memory access. But sometimes that freedom is exactly what we need to do our job as programmers, and that is where Unsafe Rust comes in.

Table of contents

  1. Why and when to use Unsafe
  2. The Five Superpowers of Unsafe Rust
  3. Superpowers explained

Why and when to use Unsafe

Using unsafe code to take advantage of one or more of the superpowers listed below is neither wrong nor frowned upon. If you have a reason to use unsafe code, do it. Because every unsafe block is explicitly marked, any bugs that pop up are easier to track down. But keep in mind that the compiler cannot enforce or ensure memory safety inside unsafe code. With that word of warning and encouragement out of the way, let's go on to explain what these superpowers are.

The Five Superpowers of Unsafe Rust

Unsafe Rust is like a second language hidden inside Rust. It exists because the compiler's static analysis is, by design, conservative: when faced with code that might be right or might be wrong, it will choose not to compile it. It would rather reject some acceptable code than accept some invalid code.

Using unsafe gives you access to 5 things that normal Rust doesn't allow, but that in some circumstances are necessary. These are:

  1. Dereference a raw pointer
  2. Call an unsafe function or method
  3. Access or modify a mutable static variable
  4. Implement an unsafe trait
  5. Access fields of unions

We've covered some of these topics in the past using Safe Rust. I'll go through each of them below and borrow some examples from The Rust Book to accompany the explanation!

It's important to note that using unsafe does not turn off the borrow checker or any of Rust's other safety checks. If you use a reference in unsafe code, it is still checked as usual. Using unsafe only gives you access to these 5 things, and only these 5 things are exempt from the compiler's memory-safety checks. Also, unsafe does not mean dangerous. Some very basic things we can do with unsafe are not dangerous at all. But the possibility is there, so be extra careful. Unsafe essentially tells the compiler "Don't worry too much about these, I know what I'm doing" when it comes to the Big 5 below.

Superpowers Explained

Dereference a raw pointer

In previous articles I've talked about the compiler and how it enforces the validity of any references we use, mainly through the ownership and borrowing rules (can't reuse a value after moving it, can't modify it while it is borrowed, etc.). Unsafe Rust adds two pointer types that can only be dereferenced in unsafe code: immutable raw pointers and mutable raw pointers, comparable to the pointers of a language like C.

They are *const T and *mut T, immutable and mutable respectively. The asterisk here is not the dereference operator we've seen before; it is part of the type name. In the context of raw pointers, immutable means that the pointer can't be directly assigned to after being dereferenced.

Safe Rust has smart pointers and references; Unsafe Rust adds raw pointers. What are the differences?
Unlike references and smart pointers, raw pointers:

  1. Are allowed to ignore the borrowing rules by having both immutable and mutable pointers or multiple mutable pointers to the same location.
  2. Aren’t guaranteed to point to valid memory.
  3. Are allowed to be null.
  4. Don’t implement any automatic cleanup.

By giving up these guarantees and this automatic functionality, you also give up any assurance that what you're using is valid. So you had better be sure that what you are doing is valid, OR implement safety checks yourself. (This is common practice in C, for example, where you almost always check whether a pointer is null before attempting to dereference it.)
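As a minimal sketch of such a manual check, the helper below (read_or_default is a hypothetical name, not something from the standard library or the book) tests a raw pointer for null before dereferencing it. Note that a null check only rules out one kind of invalid pointer; the sketch still assumes that any non-null pointer passed in points to a live i32.

```rust
/// Hypothetical helper: reads through a raw pointer only after a null check.
fn read_or_default(ptr: *const i32, default: i32) -> i32 {
    if ptr.is_null() {
        return default;
    }
    // SAFETY (assumed by this sketch): a non-null `ptr` points to a live,
    // properly aligned i32. The null check alone cannot prove that.
    unsafe { *ptr }
}

fn main() {
    let value = 42;
    let valid: *const i32 = &value;
    let null: *const i32 = std::ptr::null();

    println!("valid pointer: {}", read_or_default(valid, -1));
    println!("null pointer:  {}", read_or_default(null, -1));
}
```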

    let mut num = 5;

    let r1 = &num as *const i32;
    let r2 = &mut num as *mut i32;

Note that the unsafe keyword is not used. That is because you can actually create raw pointers in Safe Rust. What we can't do is dereference them in safe code.

We've created two raw pointers by casting references with the as keyword. Because they come from references that we know are valid, we can be sure that the pointers themselves are valid, but this might not always be the case.

How do we dereference them then?

    let mut num = 5;

    let r1 = &num as *const i32;
    let r2 = &mut num as *mut i32;

    unsafe {
        println!("r1 is: {}", *r1);
        println!("r2 is: {}", *r2);
    }

We use an unsafe block, as seen above. Inside it, you can use the dereference operator * (asterisk) to access the value our raw pointers point at. Lots of power comes from these, but at the cost of a greater chance of bugs. With great power comes great responsibility, said Uncle Ben, and it really applies here.

The biggest use case for raw pointers is interfacing with external code, that is, calling functionality written in another language from our Rust program. Usually that language is C, but many others can be used. Another case is building safe abstractions that the borrow checker doesn't understand.
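To make the first property in the list above concrete, here is a small sketch in which a mutable and an immutable raw pointer refer to the same location at the same time, something references would not be allowed to do. The function name write_then_read is made up for this illustration, and both pointers are derived from a single mutable reference to keep the aliasing well-behaved.

```rust
/// Hypothetical helper: writes through one raw pointer and reads the result
/// back through a second raw pointer to the same location.
fn write_then_read(target: &mut i32, new_value: i32) -> i32 {
    let writer: *mut i32 = target; // &mut i32 coerces to *mut i32
    let reader: *const i32 = writer; // *mut i32 coerces to *const i32

    // SAFETY: both pointers come from `target`, which is valid and
    // exclusively borrowed for the duration of this function.
    unsafe {
        *writer = new_value;
        *reader
    }
}

fn main() {
    let mut num = 5;
    let seen = write_then_read(&mut num, 10);
    println!("read back: {}, num is now: {}", seen, num);
}
```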

Call an unsafe function or method

Another reason to use Unsafe Rust is to call unsafe functions. Unsafe functions and methods look exactly like normal functions and methods, but with the added unsafe keyword in front of the definition. You must call unsafe functions from within an unsafe block (or from another unsafe function); otherwise the code will not compile.

    unsafe fn dangerous() {}

    unsafe {
        dangerous();
    }

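Even if the function doesn't do anything 'unsafe' or dangerous, calling it outside an unsafe block is a compile error, simply because it is labeled unsafe. The body of an unsafe function has traditionally been treated as one big unsafe block. Starting with the 2024 edition, though, the compiler warns when an unsafe operation inside an unsafe function is not wrapped in its own unsafe block. In earlier editions you don't need to nest blocks like this: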

unsafe fn dangerous() {
    unsafe {
    
    }
}
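For a slightly more realistic picture, here is a sketch of an unsafe function that actually has a contract the caller must uphold. The function read_at is hypothetical and exists only for this illustration; the "# Safety" doc section and the SAFETY comments are a common convention, not something the compiler checks.

```rust
/// Hypothetical unsafe function: reads the element `index` places past `ptr`.
///
/// # Safety
/// `ptr` must point to at least `index + 1` valid, initialized i32 values.
unsafe fn read_at(ptr: *const i32, index: usize) -> i32 {
    // The explicit inner block is accepted by every edition.
    unsafe { *ptr.add(index) }
}

fn main() {
    let data = [10, 20, 30];

    // SAFETY: index 2 is within the 3-element array.
    let third = unsafe { read_at(data.as_ptr(), 2) };
    println!("third element: {}", third);
}
```

The caller, not the compiler, is responsible for checking the contract, which is exactly what the unsafe keyword on the function communicates.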

The book walks through an interesting case study on this topic: split_at_mut, a safe function in the standard library that is implemented with unsafe code. I highly suggest you go check it out!
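The idea behind that case study can be sketched roughly as follows. This is a simplified version for i32 slices, closely modeled on the book's example rather than on the actual standard library source: a safe function checks its precondition with assert! and then uses raw pointers internally to hand out two non-overlapping mutable slices.

```rust
use std::slice;

/// Simplified sketch of a split_at_mut-style function for i32 slices.
fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();

    // Panic instead of creating out-of-bounds slices.
    assert!(mid <= len);

    // SAFETY: `mid <= len`, so both halves lie inside the original slice,
    // and they do not overlap.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut v = [1, 2, 3, 4, 5, 6];
    let (left, right) = split_at_mut(&mut v, 3);
    left[0] = 10;
    right[0] = 40;
    println!("{:?}", v);
}
```

Callers never write unsafe themselves; the unsafe block is an implementation detail hidden behind a safe signature.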

Access or modify a mutable static variable

Remember my article about concurrency, where I mentioned something called a data race? Quick recap: a data race happens when two threads access the same memory at the same time, at least one of them is writing, and nothing synchronizes the accesses. There's no telling which one will get there first, so the results can vary wildly from run to run.

Data races can happen with global variables as well. Rust allows global variables, but they can quickly become problematic. Here they are called static variables.

static STRING_TO_PRINT: &str = "This will be printed";

fn main() {
    println!("Let's print: {}", STRING_TO_PRINT);
}

Constants and immutable static variables might seem similar, but a subtle difference is that values in a static variable have a fixed address in memory. Using the value will always access the same data. Constants, on the other hand, are allowed to duplicate their data whenever they’re used.

Another difference between constants and static variables is that static variables can be mutable. Accessing and modifying mutable static variables is unsafe.

Here is how we declare a mutable static variable. Note that both modifying it and reading it require an unsafe block.

static mut COUNTER: u32 = 0;

fn add_to_count(inc: u32) {
    unsafe {
        COUNTER += inc;
    }
}

fn main() {
    add_to_count(3);

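    unsafe {
        // Copy the value out instead of borrowing the static mut,
        // which recent compilers lint against.
        let current = COUNTER;
        println!("COUNTER: {}", current);
    }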
}

Output is:

    COUNTER: 3
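When the only goal is a shared counter, unsafe can often be avoided altogether. As a hedged sketch of one alternative (not something the original example uses), the same counter can be written with an atomic integer from the standard library, which makes concurrent updates safe:

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// An immutable static holding an atomic: no `static mut`, no unsafe.
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn add_to_count(inc: u32) {
    COUNTER.fetch_add(inc, Ordering::SeqCst);
}

fn main() {
    // Four threads each add 3; the atomic keeps the updates race-free.
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| add_to_count(3)))
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("COUNTER: {}", COUNTER.load(Ordering::SeqCst));
}
```

A static mut remains useful when no such safe wrapper fits, but then it is up to you to guarantee that only one thread touches it at a time.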

Implement an unsafe trait

A trait is unsafe when at least one of its methods has an invariant that the compiler can't verify. To declare a trait unsafe, we add the unsafe keyword before trait, and every implementation must be marked unsafe as well, like so:

unsafe trait Foo {
    // methods go here
}

unsafe impl Foo for i32 {
    // method implementations go here
}
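To show what such an unverifiable invariant might look like, here is a sketch built around a hypothetical marker trait, Zeroable (the trait and the zeroed helper are invented for this example). Implementing the trait is a promise that the all-zero bit pattern is a valid value of the type, and a safe function then relies on that promise.

```rust
/// Hypothetical marker trait.
///
/// # Safety
/// Implementors promise that the all-zero bit pattern is a valid value.
unsafe trait Zeroable {}

// SAFETY: 0 is a valid u32 and 0.0 is a valid f32.
unsafe impl Zeroable for u32 {}
unsafe impl Zeroable for f32 {}

/// Safe function that relies on the promise made by each `unsafe impl`.
fn zeroed<T: Zeroable>() -> T {
    // SAFETY: `T: Zeroable` guarantees that all-zero bits form a valid `T`.
    unsafe { std::mem::zeroed() }
}

fn main() {
    let a: u32 = zeroed();
    let b: f32 = zeroed();
    println!("a = {}, b = {}", a, b);
}
```

Writing unsafe impl for a type where the promise does not hold (a reference type, for instance) would compile, but it would make zeroed unsound. That is why the implementation, not just the trait, carries the unsafe keyword.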

Access fields of unions

Finally, accessing fields of unions. A union is similar to a struct, but only one declared field is used in a particular instance at a time. Unions are primarily used to interface with unions in C code, so I will leave a link to the Unions reference below so you can take a look for yourself!

#[repr(C)]
union MyUnion {
    f1: u32,
    f2: f32,
}

As you can see, a union is declared exactly like a struct, but with the union keyword instead. Creating one also looks the same, except that you must initialize exactly one field. Reading a field uses the same syntax as for a struct but requires an unsafe block:

let u = MyUnion { f1: 1 };
let f = unsafe { u.f1 };
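The reason reads are unsafe is that the compiler does not track which field was written last. The sketch below reuses MyUnion and adds a hypothetical helper, float_bits, that stores an f32 and reads the same four bytes back as a u32. This is well-defined here only because every bit pattern is a valid u32; for other field types that assumption may not hold.

```rust
#[repr(C)]
union MyUnion {
    f1: u32,
    f2: f32,
}

/// Hypothetical helper: stores an f32 and reads the same bytes back as a u32.
fn float_bits(value: f32) -> u32 {
    let u = MyUnion { f2: value };
    // SAFETY: both fields are 4 bytes wide and every bit pattern is a valid u32.
    unsafe { u.f1 }
}

fn main() {
    let bits = float_bits(1.0);
    println!("1.0f32 as raw bits: {:#010x}", bits);

    // Cross-check against the safe standard library conversion.
    assert_eq!(bits, 1.0f32.to_bits());
}
```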

This article was a quick and dirty explanation of the main aspects of Unsafe Rust. Use them carefully, and only when needed, to avoid those nasty memory management bugs. Chances are that if you know you need unsafe, you're fully capable of dealing with the bugs and debugging that may come with it.

See you in the next one!

References

Unsafe Rust Chapter in The Rust Book

Unions Reference