MiniLisp C++: A Compile-Time Lisp Interpreter in C++20

TL;DR: I built a Lisp interpreter that evaluates expressions at compile-time using C++20 constexpr. The same code works at runtime too—no duplication needed. Along the way, I discovered that macOS adds ~28KB of constant overhead to all C++ binaries, and that Mach-O is surprisingly more efficient than Linux ELF for small programs.


A few weeks back I saw that Dan Lemire had a PR open on simdjson that added an expression to parse whole JSON. That intrigued me, and I took it as a challenge to see if I could write a Lisp interpreter. Here’s a minimal godbolt playground if you’d like to play around. I don’t have this problem as much anymore, but I used to need little DSLs in programs all the time. Long term, Lua may be the right choice, but I can see something like this being useful in a very small form factor, such as a verified binary where you want to minimize dependencies and pulling in a whole new library would add complexity.

The Magic: Compile-Time Lisp

Here’s what this looks like:

// This is evaluated by the compiler, not at runtime!
constexpr auto val = "(+ 10 (* 2 5))"_lisp;
static_assert(val == 20);

constexpr auto head = "(car '(10 20 30))"_lisp;
static_assert(head == 10);

If the Lisp expression is invalid, your code won’t compile. The compiler becomes your Lisp interpreter.
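
To make the "won't compile" part concrete, here is a minimal sketch of the general mechanism, not the project's code: if constant evaluation reaches a throw, the expression stops being a constant expression and the compiler rejects it. The parse_digit helper is a hypothetical name used only for illustration.

```cpp
#include <stdexcept>

// Hypothetical helper (not from the project): converts one character to a digit.
// At runtime a bad character throws. During constant evaluation, reaching the
// throw makes the expression non-constant, so the build fails instead.
constexpr int parse_digit(char c) {
    if (c < '0' || c > '9') {
        throw std::invalid_argument("not a digit");
    }
    return c - '0';
}

constexpr int seven = parse_digit('7');   // evaluated by the compiler
static_assert(seven == 7);

// constexpr int bad = parse_digit('x');  // uncommenting this line breaks the build
```

The same function still works as an ordinary runtime call, where the bad input surfaces as an exception rather than a build error.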

Why This Matters

Catch errors at compile time - Invalid Lisp expressions fail during build, not at 3 AM in production
Zero runtime cost - The result is baked directly into the binary
Type safety - The compiler verifies your Lisp code before you ship

The C++20 Features That Make This Possible

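C++20 made compile-time programming dramatically more powerful. Some of the pieces below predate it, but C++20 is what lets them work together in constant expressions:

constexpr everything - std::vector, most standard algorithms, and even dynamic memory allocation now work at compile-time
User-defined literals - Available since C++11; C++20 adds class-type template parameters, which let the _lisp suffix receive the whole string at compile-time
std::variant - Type-safe unions (from C++17) for representing S-expressions
std::span - Zero-copy parameter passing for operand lists
consteval - Forces compile-time-only evaluation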

Implementation Highlights

The interpreter is built on a few key components:

FixedString - A template struct that captures string literals at compile-time:

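#include <algorithm>    // std::copy
#include <cstddef>      // std::size_t
#include <string_view>

template <std::size_t N>
struct FixedString {
    char data[N];  // includes the trailing '\0'
    consteval FixedString(const char (&str)[N]) {
        std::copy(str, str + N, data);
    }
    // View of the characters without the trailing '\0'
    constexpr std::string_view get() const {
        return std::string_view(data, N - 1);
    }
};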

S-Expression Types - The classic Lisp data structures:

// An "Atom" is either a number or a symbol
using Atom = std::variant<long, std::string_view>;

// Forward declaration, since List names SExpr before its definition
struct SExpr;

// A "List" is a vector of S-Expressions
using List = std::vector<SExpr>;

// An S-Expression is either an Atom or a List
struct SExpr {
    std::optional<Atom> atom;
    std::optional<List> list;
};
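
Here is a rough sketch of how these types can hold the expression (+ 1 2) as data. The get_long accessor is used later in this post but not shown, so the version below is an assumption about its shape; make_sum_expr is a hypothetical helper for illustration.

```cpp
#include <optional>
#include <stdexcept>
#include <string_view>
#include <utility>
#include <variant>
#include <vector>

struct SExpr;  // forward declaration so List can name it
using Atom = std::variant<long, std::string_view>;
using List = std::vector<SExpr>;

struct SExpr {
    std::optional<Atom> atom;
    std::optional<List> list;
};

// Assumed shape of get_long: returns the number held by an atom,
// or throws if the expression is a symbol or a list.
constexpr long get_long(const SExpr& e) {
    if (!e.atom || !std::holds_alternative<long>(*e.atom)) {
        throw std::runtime_error("expected a number");
    }
    return std::get<long>(*e.atom);
}

// Hypothetical helper: builds the list (+ 1 2) by hand.
inline SExpr make_sum_expr() {
    List items;
    items.push_back(SExpr{Atom{std::string_view{"+"}}, std::nullopt});
    items.push_back(SExpr{Atom{1L}, std::nullopt});
    items.push_back(SExpr{Atom{2L}, std::nullopt});
    return SExpr{std::nullopt, std::move(items)};
}
```

An evaluator would read the first element as the operator symbol and pass the remaining elements to the matching branch.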

The User-Defined Literal - The magic entry point:

template <FixedString S>
consteval auto operator""_lisp() {
    std::string_view s = S.get();
    auto ast = MiniLisp::parse(s);
    auto result_sexpr = MiniLisp::eval(ast);
    // Extract and return the final long value
    return std::get<long>(*result_sexpr.atom);
}

Functional Arithmetic - Using std::transform_reduce for clean, constexpr-compatible operations:

if (op == "+") {
    long result = std::transform_reduce(
        operands.begin(), operands.end(),
        0L,                                    // Initial value
        std::plus<long>(),                     // Reduce operation
        [](const SExpr& e) { return get_long(e); } // Transform
    );
    return SExpr{Atom{result}};
}

Comparison with Other Approaches

Approach	Example	Pros	Cons
Runtime OOP	ofan’s Lisp	Simple, ~200 lines	Runtime only
Template metaprogramming	Crisp, Templisp	Compile-time	Ugly syntax, hard to debug
constexpr (this project)	minilisp-cpp	Clean, dual-mode, debuggable	Requires C++20

The ofan gist shows a classic runtime interpreter in ~200 lines of clean C++. But with C++20 constexpr, we get the same readable code that also works at compile time—that’s the key insight.
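
To illustrate the difference in style between the last two rows, here is the same computation written both ways. This is a generic factorial example chosen for brevity, not code from any of the projects in the table.

```cpp
// Template metaprogramming: recursion through class template specializations.
// The argument must be a compile-time constant.
template <unsigned N>
struct FactorialTMP {
    static constexpr unsigned long value = N * FactorialTMP<N - 1>::value;
};
template <>
struct FactorialTMP<0> {
    static constexpr unsigned long value = 1;
};

// constexpr: an ordinary function with a loop, usable in both modes.
constexpr unsigned long factorial(unsigned n) {
    unsigned long result = 1;
    for (unsigned i = 2; i <= n; ++i) result *= i;
    return result;
}

static_assert(FactorialTMP<5>::value == 120);
static_assert(factorial(5) == 120);
```

Only the constexpr version accepts a value read at runtime, and it can be stepped through in a debugger like any other function.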

Extending the Interpreter

Adding new functions is straightforward. Here’s how to add a max function:

else if (op == "max") {
    p_assert(!operands.empty(), "'max' requires at least one argument");
    long result = get_long(operands[0]);
    for (size_t i = 1; i < operands.size(); ++i) {
        long val = get_long(operands[i]);
        if (val > result) result = val;
    }
    return SExpr{Atom{result}};
}

This automatically works at both compile-time and runtime—no extra effort needed.

The Binary Size Deep Dive

While optimizing the interpreter for size, I learned two lessons that would have saved me hours if I’d known them upfront.

Lesson 1: Know When to Stop

After applying every optimization I could find, the macOS binary sat stubbornly at 34KB. I spent time trying to squeeze out more bytes before realizing: 34KB is the floor. On macOS, the Mach-O binary format has ~28KB of unavoidable overhead. Once you hit that limit, further code optimization is wasted effort.

Lesson 2: Measure the Right Thing

File size (ls -l) is misleading—it’s dominated by format overhead you can’t control. What you can control is actual code size, measured with the size command. My real win was a 32% reduction in executable code (10.7KB → 7.3KB), even though the file size barely budged.

Build Configurations

Build	macOS	Linux	WASM	Techniques
Default	39KB	-	-	`-O2`
Small	36KB	-	-	`-Os`, LTO, strip
Ultra-small	34KB	66KB (10KB UPX)	27KB	POSIX I/O, no iostream, wasm-opt

What We Actually Removed

Here’s the real code reduction (measured with size):

DEFAULT BUILD (with iostream):
  Code section:      8,484 bytes
  Exception tables:    844 bytes
  Total code:       10,753 bytes

ULTRA-SMALL BUILD (POSIX I/O):
  Code section:      5,752 bytes  (32% reduction!)
  Exception tables:    288 bytes  (66% reduction!)
  Total code:        7,273 bytes  (32% reduction!)
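
The percentages can be checked directly from the byte counts. The snippet below is just that arithmetic, with percent_reduction as a hypothetical helper that rounds to the nearest whole percent.

```cpp
// Rounded percentage reduction from `before` to `after`, using integer math.
constexpr long percent_reduction(long before, long after) {
    return ((before - after) * 100 + before / 2) / before;
}

static_assert(percent_reduction(8484, 5752) == 32);    // code section
static_assert(percent_reduction(844, 288) == 66);      // exception tables
static_assert(percent_reduction(10753, 7273) == 32);   // total code
```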

The techniques:

Replace <iostream> with POSIX write()/read()
Replace std::string with fixed buffers
Simplify exception handling

Build flags: -Os -flto -fno-rtti -ffunction-sections -fdata-sections -Wl,-dead_strip
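
As a rough sketch of what the first two techniques can look like, the code below formats a number into a fixed stack buffer and emits it with a single POSIX write() call. format_long and print_long are hypothetical helpers, not the project's actual functions, and the buffer sizes assume long is at most 64 bits.

```cpp
#include <cstddef>
#include <unistd.h>   // POSIX write(), STDOUT_FILENO

// Hypothetical helper: formats `value` as decimal into `buf`, which must hold
// at least 20 characters, and returns the number of characters written.
// No std::string and no iostream involved.
inline std::size_t format_long(long value, char* buf) {
    char reversed[20];
    std::size_t n = 0;
    // Work in unsigned space so the most negative long does not overflow.
    unsigned long magnitude = value < 0
        ? 0UL - static_cast<unsigned long>(value)
        : static_cast<unsigned long>(value);
    do {
        reversed[n++] = static_cast<char>('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude != 0);

    std::size_t len = 0;
    if (value < 0) buf[len++] = '-';
    while (n > 0) buf[len++] = reversed[--n];
    return len;
}

// Writes the number and a newline to stdout; returns true if all bytes went out.
inline bool print_long(long value) {
    char buf[22];
    std::size_t len = format_long(value, buf);
    buf[len++] = '\n';
    return ::write(STDOUT_FILENO, buf, len) == static_cast<ssize_t>(len);
}
```

Avoiding the stream and string machinery means none of its locale, formatting, or allocation code needs to be linked in, which is the kind of saving the numbers above reflect.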

Why macOS Has a 34KB Floor

Let’s dig into why you can’t go below 34KB on macOS—understanding this saves you from chasing impossible optimizations.

Mach-O Segment Layout

Running size -m lisp_repl reveals the structure:

Segment __PAGEZERO: 4294967296  (4GB virtual, catches NULL pointers)
Segment __TEXT: 16384           (contains ~7KB code + padding)
Segment __DATA_CONST: 16384     (contains 328 bytes + padding)
Segment __LINKEDIT: varies      (symbols, code signature)

Full output of size -m lisp_repl for the default build (its section sizes match the default-build numbers above):

Segment __PAGEZERO: 4294967296 (zero fill)
Segment __TEXT: 16384
        Section __text: 8484
        Section __stubs: 336
        Section __gcc_except_tab: 844
        Section __cstring: 737
        Section __unwind_info: 352
        total 10753
Segment __DATA_CONST: 16384
        Section __got: 328
        total 328
Segment __LINKEDIT: 16384
total 4295016448

The key insight: Mach-O on Apple Silicon uses 16KB segment alignment, matching the 16KB page size (Intel Macs use 4KB pages). Each segment must start on a 16KB boundary, so even tiny segments consume 16KB of disk space.

3 on-disk segments × 16KB = ~48KB baseline
After strip: ~34KB (removes some __LINKEDIT)

This means 34KB is essentially the floor for any C++ program on macOS—even “hello world” is ~33KB.
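
The alignment arithmetic can be checked against the size -m output above. This is only a sketch of the rounding rule, with align_up and kSegmentAlign as illustrative names.

```cpp
#include <cstdint>

constexpr std::uint64_t kSegmentAlign = 16 * 1024;  // 16KB, as reported above

// Rounds `size` up to the next multiple of `align` (a power of two).
constexpr std::uint64_t align_up(std::uint64_t size, std::uint64_t align) {
    return (size + align - 1) & ~(align - 1);
}

static_assert(align_up(10753, kSegmentAlign) == 16384);  // __TEXT contents
static_assert(align_up(328, kSegmentAlign) == 16384);    // __DATA_CONST contents
static_assert(3 * kSegmentAlign == 49152);               // three on-disk segments

// __PAGEZERO (virtual only) plus the three segments matches the size -m total.
static_assert(4294967296ULL + 3 * kSegmentAlign == 4295016448ULL);
```

Shrinking __TEXT from roughly 10.5KB to 7KB of content leaves align_up unchanged at 16384, which is why the file size barely moves.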

The Counter-Intuitive Comparison

Metric	macOS Mach-O	Linux ELF
Stripped size	35,016 bytes	67,952 bytes
Actual code (text)	~7KB	~11KB
Format overhead	~28KB	~56KB
Page alignment	16KB	4KB

Despite 16KB pages, Mach-O is MORE efficient than ELF!

Why ELF is larger:

More section headers and metadata
Debug info remnants even after strip
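Symbol table overhead
Possibly segment alignment as well: AArch64 linkers (GNU ld, LLD) typically default to a 64KB max-page-size regardless of the kernel’s 4KB pages; readelf -l shows the alignment actually used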

Inspecting Your Own Binaries

# macOS - see segment sizes
size -m your_binary

# macOS - detailed Mach-O structure
otool -l your_binary | grep -A5 "segname"

# Linux - section sizes
size your_binary

# Linux - detailed sections
readelf -S your_binary

The UPX Factor (Linux Only)

UPX (Ultimate Packer for eXecutables) compresses binaries:

Algorithm	Size	Compression
Uncompressed	67,952 bytes	-
UPX NRV (default)	10,288 bytes	85% reduction
UPX LZMA	11,528 bytes	83% reduction

NRV beats LZMA on this small binary by about 11%! This surprised me; I expected LZMA to always win.

Why not macOS?

UPX is officially unsupported for Mach-O
Code signing conflicts with compressed binaries
--force-macos often causes segfaults

WebAssembly Build

The interpreter also compiles to WebAssembly, producing a 27KB binary after optimization.

wasi-sdk vs Emscripten

I chose wasi-sdk over Emscripten for one reason: no JavaScript bloat.

Toolchain	Output Size	What You Get
wasi-sdk + wasm-opt	27KB	Single `.wasm` file
Emscripten	100KB+	`.wasm` + JavaScript runtime

Emscripten provides a full POSIX-like environment with filesystem emulation. For a simple eval function, that’s overkill. wasi-sdk produces a minimal WASI-compliant binary that only needs stub implementations for a handful of syscalls.

Build Flags

# Compile with wasi-sdk (flags only; add your source files and -o output)
clang++ -std=c++20 -Os -fno-exceptions -Wl,--no-entry -Wl,--export-dynamic

# Optimize with wasm-opt (from Binaryen)
wasm-opt -Oz --strip-debug --strip-producers lisp.wasm -o lisp.wasm

Key choices:

-fno-exceptions - Errors via __builtin_trap(), reduces binary size
-Wl,--export-dynamic - Export the eval function for JS access
-Wl,--no-entry - Library mode, no main()
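
Here is a hedged sketch of what an exported function can look like under these choices. It is not the project's eval: parse_number is a hypothetical stand-in that only reads a non-negative decimal integer, and the explicit visibility attribute is an assumption about how a symbol gets picked up by -Wl,--export-dynamic.

```cpp
#include <cstddef>

// Hypothetical exported entry point; the project's real export may differ.
// extern "C" keeps the symbol name unmangled so JavaScript can look it up,
// and default visibility makes it eligible for --export-dynamic.
extern "C" __attribute__((visibility("default")))
long parse_number(const char* text, std::size_t length) {
    if (text == nullptr || length == 0) __builtin_trap();
    long value = 0;
    for (std::size_t i = 0; i < length; ++i) {
        const char c = text[i];
        // With -fno-exceptions there is nothing to throw, so invalid input traps.
        if (c < '0' || c > '9') __builtin_trap();
        value = value * 10 + (c - '0');
    }
    return value;
}
```

In WebAssembly a trap aborts the current call and surfaces in the host as an error, so the JavaScript side is where a failed evaluation would be caught.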

wasm-opt Optimization

Stage	Size	Reduction
After wasi-sdk compile	33KB	—
After wasm-opt -Oz	33KB	~0% (already optimized)
After --strip-debug	28KB	15%
After --strip-producers	27KB	18% total

The -Oz flag alone doesn’t help much since wasi-sdk already optimizes well, but stripping debug info and producer metadata saves ~6KB.

Try It Yourself

Clone and Build

git clone https://github.com/prasincs/minilisp-cpp
cd minilisp-cpp

# Default build
make

# Size-optimized
make small

# Ultra-small (POSIX I/O)
make ultra-small

Verify Compile-Time Evaluation

The static_assert statements in main.cpp prove compile-time evaluation works:

constexpr auto val = "(+ 10 (* 2 5))"_lisp;
static_assert(val == 20);  // Fails to compile if wrong!

constexpr auto val3 = "(car '(10 20 30))"_lisp;
static_assert(val3 == 10);

Try introducing an error—the compiler will catch it:

// This fails at compile time with a parse error
constexpr auto bad = "(+ 1"_lisp;

Cross-Compile for Linux (from macOS)

./build-linux.sh
# Uses Docker to build Linux ARM64 binary
# Shows size comparison automatically

Measure Binary Sections

# macOS
size -m lisp_repl
otool -l lisp_repl | grep -A5 "segname"

# Linux
size lisp_repl
readelf -S lisp_repl

Key Takeaways

On binary size optimization:

Know when to stop - macOS has a ~34KB floor due to Mach-O format overhead. Once you hit it, further code optimization is wasted effort.
Measure actual code size - Use size, not ls -l. File size is dominated by format overhead; code size is what you can control.
iostream is expensive - Removing it saved 32% of actual executable code. If size matters, use POSIX I/O.

On C++20:

constexpr is powerful - A full Lisp interpreter at compile time in readable code
Same code, dual modes - No template metaprogramming gymnastics required

Conclusion

Building a compile-time Lisp interpreter turned out to be a journey through modern C++ and binary format archaeology. The compile-time evaluation is genuinely useful for catching errors early, but the binary size investigation taught me more about platform-specific behavior than I expected.

The source code is available at github.com/prasincs/minilisp-cpp, as well as a standalone playground. Try adding new operations—they’ll automatically work at both compile-time and runtime.

Sometimes the journey of optimization teaches more than the destination.
