kae3g 9594: Build Systems - From Source to Binary
Phase 1: Foundations & Philosophy | Week 4 | Reading Time: 16 minutes
What You'll Learn
- How source code becomes executable programs
- Compilation vs interpretation
- The build pipeline (preprocess, compile, assemble, link)
- Build tools (make, ninja, Bazel, Nix)
- Why reproducible builds matter
- Incremental builds (rebuild only what changed)
- Nix as the ultimate build system
- Build systems as recipe management (plant lens)
Prerequisites
- 9560: Text Files - Source code is text
- 9504: What Is Clojure? - Compilation in dynamic languages
- 9590: Filesystem - Where build artifacts live
From Seeds to Harvest
Source code (text files): Dormant potential, like seeds
Build process: Transformation into executable, like growing plants
Binary executable: Running program, like harvested crop
Plant lens: "Build systems are recipes for growing crops from seeds—taking source (seeds) and producing binaries (harvest)."
Compilation vs Interpretation
Compiled Languages
C, Rust, Go:
Source code (.c, .rs, .go)
↓ compile
Machine code (binary)
↓ run
Executable program
Pros:
- Fast (already machine code)
- Distributable (ship the binary, not the source)
Cons:
- Compilation time (must rebuild to test)
- Platform-specific (x86 binary won't run on ARM)
Interpreted Languages
Python, JavaScript, Ruby:
Source code (.py, .js, .rb)
↓ run
Interpreter (reads source, executes)
Pros:
- Quick iteration (edit → run immediately)
- Portable (same source runs on any platform with interpreter)
Cons:
- Slower (interpreting overhead)
- Requires interpreter (can't ship just binary)
Hybrid (JVM, Clojure)
Source code (.clj, .java)
↓ compile
Bytecode (.class)
↓ run
JVM (interprets/JITs bytecode)
Best of both: Fast-ish (JIT compilation), portable (the same bytecode runs wherever a JVM is installed).
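The same two-stage model can be observed in CPython, which also compiles source to bytecode before running it. The sketch below is an analogy to the JVM flow above, not JVM code; the variable names are illustrative.

```python
import dis

source = "answer = 6 * 7"

# Stage 1: compile source text into a bytecode object (nothing runs yet).
code = compile(source, "<example>", "exec")

# Stage 2: hand the bytecode to the virtual machine, which executes it.
namespace = {}
exec(code, namespace)

print(type(code).__name__)  # code
print(namespace["answer"])  # 42

# Human-readable listing of the bytecode the VM actually executes.
dis.dis(code)
```

The exact bytecode listing differs between Python versions, which is one reason bytecode portability always means "portable across machines running a compatible VM".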
The Compilation Pipeline
For a C program:
Step 1: Preprocessing
// hello.c
#include <stdio.h>
#define MESSAGE "Hello, Valley!"
int main() {
    printf("%s\n", MESSAGE);
}
Preprocessor expands macros, includes headers:
gcc -E hello.c -o hello.i
# hello.i now has entire stdio.h contents + "Hello, Valley!" inline
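To make macro expansion concrete, here is a minimal sketch of what the preprocessor does with a simple #define. It is a toy under stated assumptions: expand_defines is a hypothetical helper, it handles only object-like macros, and unlike a real C preprocessor it would also replace names inside string literals.

```python
import re


def expand_defines(source: str) -> str:
    """Expand simple object-like #define macros (toy model, not a real cpp)."""
    macros = {}
    output = []
    for line in source.splitlines():
        match = re.match(r"\s*#define\s+(\w+)\s+(.*)", line)
        if match:
            # Remember the macro and drop the directive from the output.
            macros[match.group(1)] = match.group(2).strip()
            continue
        for name, value in macros.items():
            # A function replacement avoids backslash processing in the value.
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m, v=value: v, line)
        output.append(line)
    return "\n".join(output)


hello_c = '''#define MESSAGE "Hello, Valley!"
int main() {
    printf("%s\\n", MESSAGE);
}'''

print(expand_defines(hello_c))
```

The printed result no longer contains the #define line, and MESSAGE has been replaced by the string literal, which is the same textual substitution gcc -E performs before compilation proper begins.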
Step 2: Compilation
C → Assembly:
gcc -S hello.i -o hello.s
# hello.s contains assembly code, for example (x86-64; exact output varies by compiler and flags):
# mov edi, OFFSET FLAT:.LC0
# call puts
# ...
Step 3: Assembly
Assembly → Machine code:
as hello.s -o hello.o
# hello.o is binary (object file)
# Contains machine instructions, but not yet executable
Step 4: Linking
Combine object files + libraries:
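gcc hello.o -o hello
# gcc drives the linker (ld) and adds the C runtime startup files and libc.
# Calling ld by hand needs those paths spelled out, and they vary by system.
# hello is now executable!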
Or all at once:
gcc hello.c -o hello
# (gcc runs all 4 steps internally)
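The four stages can be written down as data, which is roughly what a build tool does before it runs anything. In this sketch pipeline_commands is a hypothetical helper; it only constructs the command lines shown above and does not execute them, so it runs even on a machine without gcc.

```python
from pathlib import Path


def pipeline_commands(source: str) -> list[list[str]]:
    """Return the four commands that turn a C file into an executable."""
    stem = Path(source).stem  # "hello.c" -> "hello"
    return [
        ["gcc", "-E", source, "-o", f"{stem}.i"],       # 1. preprocess
        ["gcc", "-S", f"{stem}.i", "-o", f"{stem}.s"],  # 2. compile to assembly
        ["as", f"{stem}.s", "-o", f"{stem}.o"],         # 3. assemble
        ["gcc", f"{stem}.o", "-o", stem],               # 4. link
    ]


for command in pipeline_commands("hello.c"):
    print(" ".join(command))
```

To actually run the pipeline, each list could be passed to subprocess.run(command, check=True), assuming gcc and as are installed.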
Build Tools: Automating the Process
Make (1976, Stuart Feldman)
Makefile:
# Target: dependencies
#     command (each recipe line must start with a tab character)
hello: hello.o
	gcc hello.o -o hello
hello.o: hello.c
	gcc -c hello.c -o hello.o
clean:
	rm -f hello hello.o
Run:
make hello
# Output:
# gcc -c hello.c -o hello.o
# gcc hello.o -o hello
make clean
# rm -f hello hello.o
Benefit: Incremental (only rebuilds what changed).
Problem: Recipes are imperative shell commands (you spell out HOW to build each target, not just WHAT you want), and every dependency has to be listed by hand.
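Behind the rules, Make works on a dependency graph: a target can only be built after everything it depends on. A minimal sketch of that ordering, using Python's standard graphlib module (Python 3.9+) and a hand-written copy of the hello Makefile's dependencies:

```python
from graphlib import TopologicalSorter

# target -> the files it depends on (mirrors the hello Makefile)
dependencies = {
    "hello": {"hello.o"},
    "hello.o": {"hello.c"},
}

# static_order() yields each node only after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['hello.c', 'hello.o', 'hello']
```

This shows only the ordering step. Real Make additionally checks timestamps to decide which of these nodes can be skipped.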
Ninja (2012, Evan Martin)
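Faster than Make:
- Minimal syntax (meant to be generated by tools, not hand-written)
- Parallel by default (runs jobs across the available CPU cores)
- Used by: Chromium, LLVM (through CMake's Ninja generator), and projects built with Meson
Too low-level to write by hand; generators such as CMake, Meson, and GN emit the build.ninja files.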
Bazel (Google)
Scalable (handles huge codebases):
- Hermetic builds (isolated, reproducible)
- Distributed (can farm out to build servers)
- Incremental (caches aggressively)
Used by: Google (entire codebase), large projects.
Problem: Complex (learning curve, overhead for small projects).
Nix (The Ultimate!)
Declarative, reproducible, isolated:
# default.nix
{ pkgs ? import <nixpkgs> {} }:
pkgs.stdenv.mkDerivation {
  name = "hello";
  src = ./.;
  buildInputs = [ pkgs.gcc ];
  buildPhase = ''
    gcc hello.c -o hello
  '';
  installPhase = ''
    mkdir -p $out/bin
    cp hello $out/bin/
  '';
}
Build:
nix-build
# Result: ./result/bin/hello
Benefits:
- Reproducible (same inputs → same output, always)
- Isolated (dependencies don't conflict)
- Cacheable (binary cache - never rebuild same thing)
- Rollbackable (old versions stay around)
This is sovereignty (Essay 9503, 9960 - grainhouse strategy!).
Why Reproducible Builds Matter
Problem: "Works on my machine!"
Developer: Builds fine ✅
CI server: Build fails ❌
Production: Different binary ⚠️
Causes:
- Different dependency versions
- Different OS
- Different timestamps (embedded in binary)
- Non-deterministic build steps
Solution: Reproducible builds (same source + same environment → bit-identical binary).
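Nix gets most of the way there: sandboxed, hermetic builds with pinned dependencies remove the environmental causes above, although a build step that is itself non-deterministic (for example, one that embeds the current time) can still vary.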
Why it matters:
- Security: Can verify binary matches source (no backdoor injection)
- Debugging: If production binary differs, you can't reproduce bugs
- Trust: Users can build from source, verify it matches distributed binary
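Verification comes down to comparing cryptographic digests of two independently produced artifacts. The sketch below uses a hypothetical toy_build function as a stand-in for a compiler, purely to show how an embedded timestamp breaks bit-identical output.

```python
import hashlib
from typing import Optional


def toy_build(source: bytes, build_time: Optional[int] = None) -> bytes:
    """Hypothetical stand-in for a compiler, used only for illustration."""
    artifact = b"BIN:" + source
    if build_time is not None:
        # Embedding a timestamp is a classic source of non-reproducibility.
        artifact += b"|built-at:" + str(build_time).encode()
    return artifact


def digest(artifact: bytes) -> str:
    """SHA-256 of the artifact; equal digests mean bit-identical output."""
    return hashlib.sha256(artifact).hexdigest()


source = b"int main() { return 0; }"

same = digest(toy_build(source)) == digest(toy_build(source))
drifted = digest(toy_build(source, 1000)) == digest(toy_build(source, 2000))

print(same)     # True: deterministic build, identical digests
print(drifted)  # False: same source, but the timestamp changed the bits
```

In practice the same comparison is done on real binaries: if a user's local build and the distributed binary hash to the same value, the binary provably came from that source and toolchain.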
Plant lens: "Reproducible builds are like saving seeds: plant the same seed in the same soil → get an identical plant."
Incremental Builds
Problem: Rebuilding everything wastes time.
Solution: Track dependencies, rebuild only what changed.
Example (make):
main: main.o utils.o
	gcc main.o utils.o -o main
main.o: main.c utils.h
	gcc -c main.c
utils.o: utils.c utils.h
	gcc -c utils.c
Change main.c:
make
# Only recompiles main.c → main.o
# Then relinks main
# Skips utils.c (unchanged!)
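Make's decision rule is simple: rebuild a target if it is missing or if any dependency has a newer modification time. A minimal sketch of that check, where needs_rebuild is a hypothetical helper and the timestamps are set explicitly so the result does not depend on clock resolution:

```python
import os
import tempfile


def needs_rebuild(target: str, dependencies: list[str]) -> bool:
    """Timestamp rule: rebuild if target is missing or older than a dependency."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "main.c")
    obj = os.path.join(workdir, "main.o")
    for path in (src, obj):
        open(path, "w").close()

    os.utime(src, (100, 100))  # source last modified at t=100
    os.utime(obj, (200, 200))  # object built later, at t=200
    print(needs_rebuild(obj, [src]))  # False: object is newer than source

    os.utime(src, (300, 300))  # source edited after the object was built
    print(needs_rebuild(obj, [src]))  # True: object is now stale
```

The weakness of this rule is that it trusts timestamps: touching a file without changing it triggers a rebuild, and clock skew can hide a real change.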
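Nix approach: Hash-based (often described as content-addressed):
- Each build gets a hash computed from all of its inputs
- If the hash is unchanged, the cached result is reused
- Dependencies are declared once in the expression; there are no timestamps to keep in sync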
The Nix Build Model
Nix is special (Essay 9504 mentioned it):
Hermetic Builds
Isolated from system:
Traditional build:
gcc hello.c -o hello
# Uses: system gcc, system libc, system headers
# (Depends on what's installed!)
Nix build:
# Uses: /nix/store/abc123-gcc-11.2/bin/gcc
#       /nix/store/def456-glibc-2.35/
# Everything explicit, isolated
# (Doesn't depend on system state!)
Result: The same Nix expression with the same pinned inputs → the same build environment and, for deterministic build steps, the same binary.
Content-Addressed
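Store paths carry a hash of the derivation's inputs (strictly speaking, this default scheme is input-addressed: the hash comes from what goes into the build, not from the bytes that come out):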
/nix/store/abc123-hello-1.0
└─────┘
Hash of: source + dependencies + build script
If ANY input changes (source, dependencies, script), hash changes → rebuild.
If nothing changed, use cached result → instant!
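The idea of hashing every input into the output path can be sketched in a few lines. This is an illustration only: store_path and build are hypothetical helpers, and the hashing scheme here is an assumption made for clarity, not the algorithm Nix really uses to compute store paths.

```python
import hashlib


def store_path(name: str, source: str, dependencies: list[str], build_script: str) -> str:
    """Derive a store-like path from every build input (illustrative scheme)."""
    hasher = hashlib.sha256()
    for part in (source, *sorted(dependencies), build_script):
        data = part.encode()
        # Length-prefix each part so adjacent inputs cannot blur together.
        hasher.update(len(data).to_bytes(8, "big"))
        hasher.update(data)
    return f"/nix/store/{hasher.hexdigest()[:32]}-{name}"


cache = {}       # path -> build result
build_count = 0  # how many times the "expensive" step actually ran


def build(name: str, source: str, dependencies: list[str], build_script: str) -> str:
    """Build only if this exact combination of inputs has not been seen before."""
    global build_count
    path = store_path(name, source, dependencies, build_script)
    if path not in cache:
        build_count += 1
        cache[path] = f"built {name}"  # stand-in for running the compiler
    return path


first = build("hello-1.0", "int main(){}", ["gcc-11.2", "glibc-2.35"], "gcc hello.c")
again = build("hello-1.0", "int main(){}", ["gcc-11.2", "glibc-2.35"], "gcc hello.c")
changed = build("hello-1.0", "int main(){}", ["gcc-11.2", "glibc-2.35"], "gcc -O2 hello.c")

print(first == again)    # True: identical inputs, cache hit
print(first == changed)  # False: build script changed, new path
print(build_count)       # 2
```

Because the path itself encodes the inputs, two versions built with different flags or dependencies can sit side by side in the store without overwriting each other.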
This is perfect for grainhouse strategy (Essay 9960).
Try This
Exercise 1: Manual Compilation
# Write simple C program
cat > hello.c <<EOF
#include <stdio.h>
int main() {
printf("Hello, Valley!\n");
return 0;
}
EOF
# Compile step-by-step
gcc -E hello.c -o hello.i # Preprocess
gcc -S hello.i -o hello.s # Compile to assembly
as hello.s -o hello.o # Assemble to object file
gcc hello.o -o hello # Link to executable
# Run
./hello
Observe: Four distinct steps (usually hidden by gcc hello.c -o hello).
Exercise 2: Make Incremental Build
# Create Makefile (each recipe line must start with a tab character)
cat > Makefile <<EOF
hello: hello.o
	gcc hello.o -o hello
hello.o: hello.c
	gcc -c hello.c -o hello.o
clean:
	rm -f hello hello.o
EOF
# First build
make hello
# No changes, rebuild:
make hello
# Output: make: 'hello' is up to date.
# Change source:
echo "// comment" >> hello.c
# Rebuild (incremental!)
make hello
# Recompiles hello.c, then relinks hello
Observe: Make tracks timestamps, rebuilds only what's needed.
Exercise 3: Nix Build (if you have Nix)
# Create simple Nix expression
cat > default.nix <<EOF
{ pkgs ? import <nixpkgs> {} }:
pkgs.writeShellScriptBin "hello" ''
  echo "Hello from Nix!"
''
EOF
# Build
nix-build
# Run
./result/bin/hello
Observe: Nix handles dependencies, isolation, caching automatically.
Going Deeper
Related Essays
- 9504: What Is Clojure? - JVM compilation
- 9560: Text Files - Source as text
- 9595: Package Managers - Dependency management
- 9960: The Grainhouse - Nix for sovereignty
External Resources
- man gcc - Compiler documentation
- "The Nix Manual" - Complete Nix reference
- Bazel documentation - Google's build system
- "Recursive Make Considered Harmful" (Peter Miller) - Classic paper on the flaws of recursive Make
Reflection Questions
- Why do build systems exist? (Can't we just gcc *.c? What breaks at scale?)
- Is reproducibility always achievable? (Timestamps, randomness, network - how to handle?)
- Should all builds be hermetic? (Nix says yes - but what's the cost? Disk space, complexity)
- What if source code directly executed? (Interpreted languages do this - trade-offs?)
- How would Nock build systems work? (Pure functions (noun → noun) - deterministic by definition!)
Summary
Build Systems Transform:
- Source code (text) → Executables (machine code)
Compilation Pipeline:
- Preprocess: Expand macros, include headers
- Compile: C → Assembly
- Assemble: Assembly → Object code
- Link: Object files → Executable
Build Tools:
- Make (1976): Incremental, imperative, timestamp-based
- Ninja (2012): Fast, parallel, generated
- Bazel (Google): Scalable, hermetic, distributed
- Nix (Ultimate): Declarative, reproducible, content-addressed
Key Concepts:
- Incremental builds: Rebuild only what changed
- Reproducible builds: Same inputs → same output (always)
- Hermetic builds: Isolated (no undeclared system dependencies)
- Content-addressed: Hash-based caching
Why Reproducibility:
- Security: Verify binary matches source
- Debugging: Reproduce exact production binary
- Trust: Users can build and verify
Nix Advantages:
- Truly reproducible (hermetic, locked deps)
- Content-addressed (perfect caching)
- Declarative (what, not how)
- Sovereignty (grainhouse strategy!)
In the Valley:
- We prefer Nix (reproducibility, sovereignty)
- We value incremental (fast iteration)
- We verify builds (reproducible → security)
- We understand the pipeline (not just black box)
Plant lens: "Build systems are recipes—transform seeds (source) into crops (binaries) through systematic cultivation (compilation pipeline)."
Next: We'll explore package managers—how to manage dependencies at scale, the problem Nix solves beautifully, and why dependency hell exists!
Metadata:
- Phase: 1 (Foundations)
- Week: 4
- Prerequisites: 9560, 9504, 9590
- Concepts: Compilation, linking, build systems, Make, Nix, reproducible builds, incremental builds
- Next Concepts: Package managers, dependency resolution, Nix deep dive
- Plant Lens: Seeds (source) → crops (binaries), recipes (build systems), cultivation (compilation)
Copyright © 2025 kae3g | Dual-licensed under Apache-2.0 / MIT
Competitive technology in service of clarity and beauty