Build Essentials

Like any other compiled language, C++ requires a complete toolchain. A typical toolchain includes a compiler (e.g., g++/clang++/MSVC), an assembler, a linker, and binary utilities (e.g., archivers and inspection tools such as ar, nm, objdump/llvm-objdump). In addition, most real projects rely on a build system to manage dependencies and configurations.

One exception is Cling, the C++ interpreter used in ROOT. It is built on top of Clang and LLVM and JIT-compiles C++ so that code can be run interactively, as if it were interpreted.

Compiler

The most important component is, of course, the compiler. It bridges high-level programming languages and low-level machine instructions. The assembler handles the narrower step of translating assembly into machine code.

Beyond translation, the compiler (and its accompanying toolchain) abstracts away differences in architecture and ABI. The output of compilation is usually a relocatable object file (or an intermediate representation), not a final executable. Object files contain code and data in sections (commonly .text, .data, .rodata, etc.), plus symbol and relocation information. They may also carry debug information (e.g., DWARF on many ELF-based platforms).

This separation makes large builds modular: you can compile many translation units independently, then link them together into an executable or a library.

Note that the object file is not executable. It contains data and instructions, but the target addresses of many references are not yet resolved. An executable needs these addresses resolved, which is the job of the linker.

The ABI (Application Binary Interface) is the contract at the binary level. It usually defines, among other things:

memory layout
calling convention
name mangling
RTTI/type information conventions
exception handling and unwinding
dynamic linking conventions (symbol visibility and relocation model)
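To make the memory layout item concrete, here is a minimal sketch that prints the size, alignment, and member offsets the platform ABI assigns to two structs holding the same members in a different order. `Loose`, `Tight`, and `print_layouts` are names invented for this illustration, and the exact numbers depend on the target ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical types for illustration: same members, different order.
struct Loose { char a; double b; char c; };
struct Tight { double b; char a; char c; };

// Prints the layout facts that the platform ABI fixes for each type.
void print_layouts() {
    std::printf("Loose: size=%zu align=%zu offsets a=%zu b=%zu c=%zu\n",
                sizeof(Loose), alignof(Loose),
                offsetof(Loose, a), offsetof(Loose, b), offsetof(Loose, c));
    std::printf("Tight: size=%zu align=%zu offsets b=%zu a=%zu c=%zu\n",
                sizeof(Tight), alignof(Tight),
                offsetof(Tight, b), offsetof(Tight, a), offsetof(Tight, c));
    // On common ABIs the reordered struct needs no more padding than the
    // original; this is an observation about typical targets, not a rule
    // of the language.
    assert(sizeof(Tight) <= sizeof(Loose));
}
```

Calling `print_layouts()` from `main` shows where the padding goes on your platform. Two components that disagree on any of these numbers cannot safely exchange the type.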

For C++, templates can be a source of ABI issues because their definitions live in headers and are instantiated in each translation unit. ABI mismatches typically come from incompatible headers or ABI settings across components (for example, mixing different standard library ABIs/versions, or toggles that change the representation of standard types). This is usually not about optimization level changing “which template gets instantiated”, but about whether all components were built against the same ABI expectations.

Inlining can also be a problem. If you change an inline (or template) function in a header, any code that used it must be recompiled; otherwise old call sites may still contain the old inlined machine code. Also, if a translation unit expects an out-of-line definition but that definition is missing (or not linked), you get an undefined symbol at link time.

Why do we need a stack?

It is a cheap and simple way to allocate per-call storage and save state. Recall the Von Neumann architecture, which includes Control Unit, Arithmetic/Logic Unit, Memory Unit, Input and Output. The processor (CU/ALU) uses registers for fast computation, and RAM to store and retrieve data.

When you call a function, the stack provides a stack frame: locals, saved registers, and bookkeeping (such as the return address). The compiler can access frame data efficiently using a base register (frame pointer or stack pointer) plus a fixed offset.
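A small sketch can make the "one frame per call" idea visible. It records the address of a local variable at each level of a recursive call; because all those calls are active at the same time, each local must live in its own frame. The function names are invented for this example, and the direction in which addresses move (usually downward) is platform-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Records the address of one local variable per active call.
void record_frames(int depth, std::vector<std::uintptr_t>& out) {
    volatile int local = depth;  // volatile keeps the local in memory
    out.push_back(reinterpret_cast<std::uintptr_t>(&local));
    if (depth > 1) {
        record_frames(depth - 1, out);
    }
    // `local` is still intact here: every nested call had its own copy.
    assert(local == depth);
}

// Prints the recorded addresses, outermost call first.
void print_frames() {
    std::vector<std::uintptr_t> addrs;
    record_frames(4, addrs);
    for (std::size_t i = 0; i < addrs.size(); ++i) {
        std::printf("call %zu: local at 0x%llx\n", i,
                    static_cast<unsigned long long>(addrs[i]));
    }
}
```

On most platforms the printed addresses differ by a constant stride, which is the size of one frame of `record_frames`.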

Linker

The compiled object files need to be linked together to become an executable, or a shared library. Linking happens at build time; with dynamic linking, some symbol resolution/relocation is deferred to load time or runtime by the dynamic loader.

The linker's primary job is to combine object files, resolve symbols, and apply relocations. A static library is typically an archive of object files; at link time, the linker pulls in only the needed objects and then performs final address layout.

With dynamic linking, references to external shared libraries are left to be resolved by the platform’s dynamic loader at program startup (or lazily on first call). On ELF systems, calls often go through PLT/GOT stubs generated in the executable; on Windows, imports are typically resolved via the Import Address Table (IAT).

PIC/PIE and ASLR (and why it matters)

Don’t confuse relocatable objects with position-independent code.

Relocatable object files (.o) are designed to be relocated by the linker at build time.
Shared libraries (.so) are typically built as PIC so they can be mapped at (almost) any runtime address without requiring expensive text relocations.
PIE (Position Independent Executable) is the same idea applied to the main executable: building the executable as PIE allows the loader to place it at a randomized base address.

PIE is important for enabling ASLR (Address Space Layout Randomization) for the main executable’s code segment. In practice, ASLR randomizes the base addresses of several memory mappings (commonly the main executable if it is PIE, shared libraries, heap, stack, and mmap regions), not necessarily every byte.
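One way to observe this is to print one address from each kind of mapping and run the program several times. The sketch below does that; `Addresses`, `sample_addresses`, and `print_addresses` are names invented here. Whether the values change between runs depends on how the binary was built (PIE or not) and on the operating system's ASLR settings.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <memory>

int g_sample_global = 42;  // lives in the executable's writable data

// Hypothetical record holding one sample address per kind of mapping.
struct Addresses {
    std::uintptr_t code, data, heap, stack;
};

Addresses sample_addresses() {
    int on_stack = 0;
    auto on_heap = std::make_unique<int>(0);
    Addresses a;
    a.code  = reinterpret_cast<std::uintptr_t>(&sample_addresses);
    a.data  = reinterpret_cast<std::uintptr_t>(&g_sample_global);
    a.heap  = reinterpret_cast<std::uintptr_t>(on_heap.get());
    a.stack = reinterpret_cast<std::uintptr_t>(&on_stack);
    return a;
}

void print_addresses() {
    const Addresses a = sample_addresses();
    std::printf("code:  0x%llx\n", static_cast<unsigned long long>(a.code));
    std::printf("data:  0x%llx\n", static_cast<unsigned long long>(a.data));
    std::printf("heap:  0x%llx\n", static_cast<unsigned long long>(a.heap));
    std::printf("stack: 0x%llx\n", static_cast<unsigned long long>(a.stack));
}
```

With a non-PIE build you would expect the code and data addresses to stay the same from run to run while the stack and heap may still move; with a PIE build and ASLR enabled, all four usually change.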

Further reading on PLT/GOT:
Global Offset Table (GOT)
GOT vs PLT in Binary Analysis

Build System

The most useful tool at scale is a build system. It manages configuration and dependencies, then drives the toolchain to compile/link in the right order. In practice there is often a generator (e.g., CMake) that produces build files, and a backend (e.g., Ninja/Make) that executes them. Key benefits include incremental builds (only rebuilding what changed) and automation.
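The incremental-build decision can be sketched in a few lines: a target is stale if any of its inputs is newer than it, or if an input had to be rebuilt first. The model below is hypothetical and keeps everything in memory (`Node`, `Graph`, `update`, and `example_graph` are invented names); real build systems read timestamps or content hashes from the file system and track much more state.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory model of a build graph.
struct Node {
    long mtime = 0;                 // last-modified time, arbitrary units
    std::vector<std::string> deps;  // inputs this node is built from
};

using Graph = std::map<std::string, Node>;

// Brings `name` up to date. Returns true if it was rebuilt and appends
// every rebuilt target to `rebuilt` in dependency order.
// Assumes `now` is later than every timestamp already in the graph.
bool update(Graph& g, const std::string& name, long now,
            std::vector<std::string>& rebuilt) {
    Node& node = g.at(name);
    if (node.deps.empty()) return false;  // a source file: nothing to build
    bool stale = false;
    for (const std::string& dep : node.deps) {
        if (update(g, dep, now, rebuilt)) stale = true;  // input just rebuilt
        if (g.at(dep).mtime > node.mtime) stale = true;  // input is newer
    }
    if (stale) {
        node.mtime = now;  // stand-in for running the compiler or linker
        rebuilt.push_back(name);
    }
    return stale;
}

// Example graph: main.cpp was edited after main.o was last built.
Graph example_graph() {
    Graph g;
    g["main.cpp"].mtime = 5;
    g["util.cpp"].mtime = 1;
    g["main.o"].mtime = 3;
    g["main.o"].deps = {"main.cpp"};
    g["util.o"].mtime = 3;
    g["util.o"].deps = {"util.cpp"};
    g["app"].mtime = 4;
    g["app"].deps = {"main.o", "util.o"};
    return g;
}
```

Calling `update(g, "app", 10, rebuilt)` on the example graph rebuilds only `main.o` and `app`; `util.o` is left alone because `util.cpp` has not changed. A second call rebuilds nothing.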

Resources

Inspection Tools

Compiler Explorer: Mainly for checking assembly code.
Quick C++ Benchmark: Quickly benchmark performance.
CPP Insights: See the de-sugared C++ code.

Online Resources

Appendix: Self-Check

ABI + 3 concrete C++ ABI breaks

ABI (Application Binary Interface) is the binary-level contract that lets separately-compiled code interoperate: calling convention (how args/returns are passed), object layout (size/alignment/field offsets), name mangling/symbols, exception/unwind and RTTI conventions, and how dynamic linking expects symbols/relocations.

Three concrete ways it breaks:

Change a struct/class layout that crosses a library boundary (e.g., reorder fields, add a field, change #pragma pack, change base classes) → sizeof, alignment, and/or member offsets can change; callers and callees disagree.
Build and link against different C++ standard library ABIs/versions (e.g., libstdc++ dual ABI toggled by _GLIBCXX_USE_CXX11_ABI) → the mangled names and/or layout of std::string, std::list, etc. differ, causing link errors or worse, runtime corruption.
Mismatch exception/RTTI settings or compiler ABI conventions across boundaries (e.g., one side compiled with -fno-exceptions or different unwinder/exception ABI expectations; mixing incompatible toolchains) → throws/catches or typeinfo/dynamic_cast may fail or behave unpredictably.
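For the second item, a quick check is to ask each component which standard library ABI it was compiled against. The sketch below reads the configuration macros that libstdc++, libc++, and the MSVC STL define once any standard header is included; `stdlib_abi` and `print_stdlib_abi` are names invented for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>  // any standard header pulls in the library's config macros

// Hypothetical helper: reports which standard library ABI this translation
// unit was compiled against.
const char* stdlib_abi() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
  #if _GLIBCXX_USE_CXX11_ABI
    return "libstdc++, new (cxx11) ABI";
  #else
    return "libstdc++, old (pre-C++11) ABI";
  #endif
#elif defined(_LIBCPP_VERSION)
    return "libc++";
#elif defined(_MSVC_STL_VERSION)
    return "MSVC STL";
#else
    return "unknown";
#endif
}

void print_stdlib_abi() {
    // sizeof(std::string) is one of the layout facts that differs between
    // the two libstdc++ ABIs.
    std::printf("%s, sizeof(std::string) = %zu\n",
                stdlib_abi(), sizeof(std::string));
}
```

If a library and its client print different answers, passing `std::string` or `std::list` across their boundary is unsafe.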

Object file sections vs executable segments

Sections are linker-level containers in object files and final binaries (ELF .text, .data, .rodata, .bss, .debug_*; COFF has similar concepts). They’re about how the linker groups code/data and metadata.

Segments (ELF program headers) are loader-level views used at runtime: they describe what to map into memory and with what permissions (e.g., a PT_LOAD RX segment that contains .text + some read-only data, and an RW segment that contains .data + .bss).

Roughly: sections are for linking, segments are for loading; many sections can be packed into one segment.
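As a rough guide to what ends up where, the sketch below annotates a few definitions with the section they usually land in on ELF toolchains. The placement comments describe typical GCC/Clang behaviour rather than a guarantee, and the names are invented for this example.

```cpp
#include <cassert>

// Typical ELF placement (an assumption about common toolchains):
const char kGreeting[] = "hello";  // .rodata: read-only data
int g_counter = 7;                 // .data: initialized, writable
int g_buffer[1024];                // .bss: zero-initialized, no file space

int bump() {                       // .text: machine code
    return ++g_counter;
}

char first_letter() {              // .text; reads from .rodata
    return kGreeting[0];
}
```

After compiling, `readelf -S` (or `objdump -h`) lists the sections, and `readelf -l` on the linked binary shows how they were grouped into loadable segments.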

Static archive vs shared library vs PIE executable (one paragraph)

A static library (.a/.lib) is an archive of relocatable object files; when you link an executable, the linker copies in only the needed objects and performs final relocation to produce a fixed-address (non-PIE) or PIE executable. A shared library (.so/.dll) is a loadable binary module with exported symbols and dynamic relocation metadata, designed to be mapped by the dynamic loader and shared across processes; calls to it are resolved via the dynamic linking machinery. A PIE executable is an executable built as position-independent code so it can be loaded at a randomized base address (ASLR); on ELF it has file type ET_DYN, the same type as a shared object, even though it is the main program, and it can be dynamically linked (common) or, in some setups, statically linked while still being relocatable at load time.

High-level PLT/GOT call flow

Consider an ELF executable calling printf from libc:

The compiler emits a call to printf@PLT (a stub in the executable).
The PLT stub uses the GOT entry for printf to jump to the real function address.
On the first call with lazy binding enabled, the GOT entry doesn’t yet point to libc’s printf. On typical x86-64 layouts it points back into the PLT stub itself, which pushes an identifier for “printf” and transfers control to the dynamic loader’s resolver.
The dynamic loader (ld.so) looks up printf in the loaded shared objects, computes its address, and then patches the GOT entry so future calls jump directly to libc’s printf.
Subsequent calls: printf@PLT immediately jumps through the now-resolved GOT entry (no resolver overhead).
(If eager binding is enabled, resolution happens at load time and the first-call resolver step is skipped.)
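The lazy-binding steps above can be mimicked in plain C++ with a function pointer standing in for the GOT entry. This is only a simulation of the idea, not how the loader is implemented; every name in it (`got_slot`, `resolver_stub`, `call_via_plt`, `real_function`) is invented for the sketch.

```cpp
#include <cassert>
#include <cstdio>

using Fn = int (*)(int);

// Stands in for the real function living in a shared library.
int real_function(int x) { return x * 2; }

int resolver_stub(int x);  // defined below

// "GOT entry": starts out pointing at the resolver path.
Fn got_slot = resolver_stub;
int resolver_calls = 0;

// "Dynamic loader resolver": looks up the target, patches the slot,
// then completes the call that triggered it.
int resolver_stub(int x) {
    ++resolver_calls;
    got_slot = real_function;
    return got_slot(x);
}

// "PLT stub": an indirect jump through the slot.
int call_via_plt(int x) { return got_slot(x); }

void demo_lazy_binding() {
    int first = call_via_plt(3);   // goes through the resolver
    int second = call_via_plt(5);  // jumps straight to real_function
    std::printf("results: %d %d, resolver ran %d time(s)\n",
                first, second, resolver_calls);
}
```

The resolver runs only on the first call; after the slot is patched, every later call costs one indirect jump. Eager binding corresponds to patching `got_slot` before the first call.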

A header-only change that causes an ABI mismatch: example + mechanism

Example: a library exposes a struct in a public header used by both the library and its clients:

    // v1 header
    struct Config { int mode; bool enabled; };

You change it to:

    // v2 header
    struct Config { bool enabled; int mode; };

On common ABIs with a 4-byte int, sizeof(Config) stays at 8 in both versions, but the member offsets swap: mode moves from offset 0 to offset 4, and enabled moves from offset 4 to offset 0. Because the size is unchanged, nothing fails loudly. If a client binary was compiled against v1 but runs with a v2 library (or vice versa), passing Config by value, storing it, or accessing fields across the boundary misinterprets memory and can corrupt data. Mechanism: ABI break due to changed object layout (field order + padding/alignment), even though the source-level type name is identical.
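The mismatch can be reproduced inside a single program by giving the two header versions different names and copying the raw bytes from one to the other, which is roughly what happens when a v1 client hands a Config to a v2 library. `ConfigV1`, `ConfigV2`, and the helper functions are names invented for this sketch, and it assumes both versions have the same size, as they do on common platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// The two header versions, renamed so both can coexist in one program.
struct ConfigV1 { int mode; bool enabled; };
struct ConfigV2 { bool enabled; int mode; };

// Simulates a library built against v2 reading bytes written by a client
// built against v1.
int mode_seen_by_v2_library(const ConfigV1& from_client) {
    static_assert(sizeof(ConfigV1) == sizeof(ConfigV2),
                  "this sketch assumes both versions have the same size");
    ConfigV2 as_library_sees_it;
    std::memcpy(&as_library_sees_it, &from_client, sizeof(ConfigV2));
    return as_library_sees_it.mode;  // actually v1's `enabled` plus padding
}

void demo_mismatch() {
    ConfigV1 client;
    std::memset(&client, 0, sizeof client);  // make padding deterministic
    client.mode = 3;
    client.enabled = true;
    std::printf("offset of mode: v1=%zu v2=%zu\n",
                offsetof(ConfigV1, mode), offsetof(ConfigV2, mode));
    std::printf("client wrote mode=%d, library reads mode=%d\n",
                client.mode, mode_seen_by_v2_library(client));
}
```

The library never sees the value 3: it reads the bytes of `enabled` and its padding as `mode`. No compiler or linker diagnostic is produced, because the type name and its mangled symbols are unchanged.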