“There is no truth. There is only perception.”

— Gustave Flaubert

Prologue

Illegal memory accesses and memory leaks are among the most common bugs in C/C++ programs. However, neither language detects such errors on its own, and these bugs often show up seemingly at random and are difficult to reproduce.

Recently, I was working on a program repair project where I had to fix bugs in C/C++ code. I found that sanitizers are very useful for detecting certain types of bugs, especially memory-related ones. In this article, I will show you how to use sanitizers to find hidden bugs in your C/C++ code.

The demonstration is done on Ubuntu 20.04 LTS in my WSL2 environment, but it should work everywhere as long as you have the compiler and the necessary tools installed.

Quite a lot of the content in this article is generated by Copilot.😉


Basic Knowledge

What are Sanitizers?

Sanitizers are a set of tools that help you find bugs in your code. They are built into the compiler to enable runtime checks that help you find bugs that are difficult to catch with static analysis tools. The most common sanitizers are:

  • AddressSanitizer (ASan): Detects memory errors such as buffer overflows, use-after-free, and other memory corruption bugs.
  • UndefinedBehaviorSanitizer (UBSan): Detects undefined behavior in your code.
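  • MemorySanitizer (MSan): Detects reads of uninitialized memory. It is available in Clang but not in GCC.
  • LeakSanitizer (LSan): Detects memory leaks. On Linux it is enabled by default as part of AddressSanitizer.

There are others as well, such as ThreadSanitizer (TSan) for data races. Individual checks can also be enabled on their own; for example, -fsanitize=integer-divide-by-zero turns on just UBSan's division-by-zero check.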

How does it work?

Sanitizers work by instrumenting your code with additional checks at compile time. For example, AddressSanitizer surrounds stack, heap, and global objects with poisoned “redzones” and checks every memory access against them. If the program touches a redzone, it is terminated and a report is printed.

How to install?

As mentioned earlier, sanitizers are built into the compiler, so there is usually nothing extra to install (some distributions package the runtime libraries, such as libasan, separately). You just need to pass the appropriate flags to the compiler to enable them.


Getting Started

Let’s start by writing some buggy C programs and then use sanitizers to unveil the bugs. ASan and UBSan are the most commonly used sanitizers, so we will focus on them. You can try the other sanitizers on your own.

AddressSanitizer

AddressSanitizer is the most commonly used sanitizer. It detects common memory errors in your code. Let’s start with a simple example, saved as test.c:

#include <stdio.h>

int main()
{
    char buffer[10] = "123456789";
    printf("The 11th character is: %c\n", buffer[10]);
    return 0;
}

First, compile the program without any sanitizers:

gcc -o test test.c && ./test

You will see that the program runs without any error or warning, and that’s how the bug goes unnoticed. Now, let’s compile the program with AddressSanitizer enabled:

gcc -fsanitize=address -o test test.c && ./test

Ka-boom!💥 The bug is now obvious: AddressSanitizer gives a quite comprehensive report. Notably, it even comes with colors.

=================================================================
==8595==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffca6b4ee6a at pc 0x55e6bbf67380 bp 0x7ffca6b4ee30 sp 0x7ffca6b4ee20
READ of size 1 at 0x7ffca6b4ee6a thread T0
#0 0x55e6bbf6737f in main (/home/tonix/buaa/repair/omega/gdbtest/test+0x137f)
#1 0x7fc1a6679082 in __libc_start_main ../csu/libc-start.c:308
#2 0x55e6bbf6718d in _start (/home/tonix/buaa/repair/omega/gdbtest/test+0x118d)

Address 0x7ffca6b4ee6a is located in stack of thread T0 at offset 42 in frame
#0 0x55e6bbf67258 in main (/home/tonix/buaa/repair/omega/gdbtest/test+0x1258)

This frame has 1 object(s):
[32, 42) 'buffer' (line 5) <== Memory access at offset 42 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (/home/tonix/buaa/repair/omega/gdbtest/test+0x137f) in main
Shadow bytes around the buggy address:
0x100014d61d70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100014d61d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100014d61d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100014d61da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100014d61db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x100014d61dc0: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00[02]f3 f3
0x100014d61dd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100014d61de0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100014d61df0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100014d61e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100014d61e10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==8595==ABORTING

The report is quite detailed and tells you exactly what went wrong. buffer occupies offsets [32, 42) of main’s stack frame, and the program read one byte at offset 42, that is, buffer[10], the 11th character of a 10-character array. This is a classic buffer overflow bug, and AddressSanitizer caught it.

Besides buffer overflow, AddressSanitizer can also detect use-after-free bugs, double-free bugs, and other memory corruption bugs. It is a very powerful tool for finding memory-related bugs in your code.

UndefinedBehaviorSanitizer

UndefinedBehaviorSanitizer detects undefined behavior in your code. You may ask, what is undefined behavior? It is anything a program does for which the C standard imposes no requirements, so the compiler is free to do whatever it likes. Examples include dividing an integer by zero, dereferencing a null pointer, and signed integer overflow. Here is one:

#include <stdio.h>

int main()
{
    int a = 10;
    int b = 0;
    int c = a / b;
    printf("The result is: %d\n", c);
    return 0;
}

Without sanitizers, the program may simply crash with a floating point exception, which is not very informative. However, if you compile the program with UndefinedBehaviorSanitizer enabled, you will get a more detailed report.

gcc -fsanitize=undefined -o test test.c && ./test
test.c:7:15: runtime error: division by zero
Floating point exception (core dumped)

Now we know exactly what went wrong, and we can fix the bug.


More Options

There are some more options you can use with sanitizers to get more detailed reports or to suppress certain warnings.

By default, AddressSanitizer stops the program at the first error it detects, but it exits through _exit() rather than abort(), so no core dump is produced. If you want the process to call abort() instead, for example to get a core dump or to land in a debugger, set the abort_on_error option through the environment variable ASAN_OPTIONS. UndefinedBehaviorSanitizer, by contrast, prints each error and keeps running by default; compile with -fno-sanitize-recover=all to make it stop at the first one.

ASAN_OPTIONS=abort_on_error=1

If you want more verbose logs from the sanitizer, set the verbosity option in the corresponding environment variable. For LeakSanitizer, for example:

LSAN_OPTIONS=verbosity=1:log_threads=1

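Finally, if the program you run depends on dynamic libraries, you may encounter an error saying that the ASan runtime does not come first in the initial library list. The following option disables that check: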

ASAN_OPTIONS=verify_asan_link_order=0

When you want to specify multiple options, separate them with a colon (:), the separator used in the LSAN_OPTIONS example above. A comma (,) is accepted too:

ASAN_OPTIONS=abort_on_error=1,verify_asan_link_order=0

Epilogue

Sanitizers are really useful for uncovering the 🐞 hidden in your programs. Use them to save your life. ᓚᘏᗢ