Each section compares how the three languages approach the same concern, side by side. Tags: ⚡ Perf · 🔐 Safety · 🧹 DX · 🔍 Debug · 📦 Binary · 🔒 SecOps
Notes on reading this document: performance figures are from specific benchmarks, not guarantees — they vary with workload, input size, and hardware. Library names are current as of June 2026; ecosystems move. Where a language lacks a capability, that is stated plainly rather than softened.
🔐 Safety — Rust: impossible states unrepresentable; ADTs + exhaustive match vs Go's open interface world
⚡ Perf — Rust/Zig: static dispatch via monomorphization (zero overhead); Go: interfaces always carry a vtable, Zig builds vtables by hand
🧹 DX — Go: structural typing means zero boilerplate to satisfy an interface; Rust: impl Trait is explicit but more powerful
This section is the foundation. Every other section in this guide builds on how each language models data and expresses abstraction.
All three have similar numeric towers but differ on width defaults, distinct vs alias types, and special types.
Rust:
// Signed integers — explicit width always
let a: i8 = -128;
let b: i32 = 2_147_483_647;
let c: i128 = 170_141_183_460_469_231_731_687_303_715_884_105_727;
// Unsigned integers
let x: u8 = 255;
let y: u64 = 18_446_744_073_709_551_615;
let z: usize = vec.len(); // platform-width, used for indexing
// Floats — IEEE 754
let f: f32 = 3.14_f32;
let g: f64 = 3.141_592_653_589_793;
// char is a Unicode scalar value (4 bytes) — NOT a byte
let ch: char = '€'; // valid; ch as u32 == 0x20AC
let emoji: char = '🦀';
// bool, unit, never
let b: bool = true;
let u: () = (); // unit type — zero-size; used as "void"
// ! (never) — type of diverging expressions (panic!, return, loop{})
// Integer literals — all legal
let hex = 0xFF_u8;
let bin = 0b1010_0101_u8;
let oct = 0o77_u8;
Go:
// int and uint are platform-width (32-bit on 32-bit OS, 64-bit on 64-bit OS)
var a int = -42 // platform-width signed
var b int64 = math.MaxInt64
var c uint8 = 255 // byte alias
// float32 / float64 — IEEE 754 (no f suffix on literals)
var f float32 = 3.14
var g float64 = math.Pi
// rune is int32 — a Unicode code point
var r rune = '€' // r == 0x20AC
var e rune = '🦀'
// string — immutable byte sequence (UTF-8 by convention, not enforced)
var s string = "hello, 世界"
// complex numbers — built into the language (Rust and Zig provide them as library types)
var z complex128 = 3 + 4i
fmt.Println(real(z), imag(z), cmplx.Abs(z))
// bool, byte (= uint8), rune (= int32)
var ok bool = true
var b2 byte = 'A' // byte == uint8
Key differences:
- Rust char is always 4 bytes (Unicode scalar); Go rune is int32 (alias, not distinct type); Zig has no char type — character literals are integers, and strings are []const u8
- Rust has i128/u128 natively; Go's largest is int64/uint64; Zig has arbitrary-width integers (u7, i23, up to u65535) as first-class types
- Go has built-in complex64/complex128; Rust and Zig need a library
- Rust usize/isize and Zig usize/isize are the indexing types; Go uses plain int for indexing
- Rust integer literals take their type from a suffix or from context and otherwise fall back to i32; Go untyped integer constants default to int; Zig requires the type be known (via the binding or a cast)
- Rust's ! (never type) is part of the type system; Zig has noreturn; Go has no equivalent
- Zig has no booleans-as-integers. Lossless widening (u8 to u16, f32 to f64) coerces implicitly, but every narrowing or int/float conversion needs an explicit builtin such as @intCast/@floatCast. Rust and Go are stricter on widening: neither converts between numeric types implicitly, so even u8 to u32 takes an explicit as/From in Rust or a conversion expression in Go
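To make the Rust side of these conversion rules concrete, here is a minimal sketch. The helper names (widen, narrow, truncate, char_facts) are hypothetical and exist only for this illustration; the conversions themselves (From, TryFrom, as) are standard library.

```rust
use std::convert::TryFrom;
use std::mem::size_of;

/// Widening is lossless, so it is available through `From`.
pub fn widen(x: u8) -> u32 {
    u32::from(x)
}

/// Narrowing can fail, so `TryFrom` reports the failure instead of wrapping.
pub fn narrow(x: u32) -> Option<u8> {
    u8::try_from(x).ok()
}

/// `as` always succeeds; for integer narrowing it keeps only the low bits.
pub fn truncate(x: u32) -> u8 {
    x as u8
}

/// A `char` is a 4-byte Unicode scalar value, whatever its UTF-8 length is.
/// Returns (size of char in bytes, UTF-8 length of `ch`, code point of `ch`).
pub fn char_facts(ch: char) -> (usize, usize, u32) {
    (size_of::<char>(), ch.len_utf8(), ch as u32)
}
```

In this sketch narrow(256) yields None while truncate(256) silently yields 0, which is the usual argument for preferring TryFrom over as when the value is not known to fit.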
Structs are the primary way to define composite data in all three languages.
Rust — three kinds of struct:
// Named-field struct (most common)
#[derive(Debug, Clone, PartialEq)]
struct User {
id: u64,
name: String,
email: String,
created_at: std::time::Instant,
active: bool,
}
// Tuple struct — fields accessed by position; useful for newtypes
struct Meters(f64);
struct Seconds(f64);
struct Color(u8, u8, u8); // RGB
// Unit struct — zero-size; often used as marker types or with impl blocks
struct Sentinel;
// Methods live in a separate impl block — not inside the struct definition
impl User {
// Associated function (constructor by convention — no "self")
pub fn new(id: u64, name: impl Into<String>, email: impl Into<String>) -> Self {
Self { id, name: name.into(), email: email.into(),
created_at: std::time::Instant::now(), active: true }
}
// Immutable method — borrows self
pub fn display_name(&self) -> &str { &self.name }
// Mutable method — exclusively borrows self
pub fn deactivate(&mut self) { self.active = false; }
// Consuming method — takes ownership of self
pub fn into_archived(self) -> ArchivedUser { ArchivedUser { id: self.id } }
}
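// Struct update syntax — set the fields you name, take the rest from regular_user
// (non-Copy fields such as email are moved out, so regular_user is partially moved)
let admin = User { name: "Admin".to_string(), ..regular_user };
Go: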
// Named-field struct
type User struct {
ID uint64
Name string
Email string
CreatedAt time.Time
Active bool
}
// Methods on structs — defined outside the struct body, anywhere in the package
// Value receiver — receives a copy; safe for reads, cannot mutate the original
func (u User) DisplayName() string { return u.Name }
// Pointer receiver — receives the address; can mutate; avoids copying large structs
func (u *User) Deactivate() { u.Active = false }
// Constructor function by convention (no language-enforced constructor mechanism)
func NewUser(id uint64, name, email string) *User {
return &User{ID: id, Name: name, Email: email,
CreatedAt: time.Now(), Active: true}
}
// Struct literal — all fields, or named subset (rest zero-initialised)
u := User{ID: 1, Name: "Alice", Email: "alice@example.com", Active: true}
u2 := User{Name: "Bob"} // ID=0, Email="", CreatedAt=zero, Active=false
Zig:
// A struct is a value of type `type`; methods live inside it as namespaced functions
const User = struct {
id: u64,
name: []const u8,
email: []const u8,
active: bool = true, // default field value
const Self = @This();
// "constructor" is just a function returning Self — no special syntax
pub fn init(id: u64, name: []const u8, email: []const u8) Self {
return .{ .id = id, .name = name, .email = email };
}
pub fn displayName(self: Self) []const u8 { return self.name; } // by-value receiver
pub fn deactivate(self: *Self) void { self.active = false; } // by-pointer receiver
};
const u = User.init(1, "Alice", "alice@example.com");
const u2 = User{ .id = 2, .name = "Bob", .email = "" }; // no zero-init: all non-default fields required
Key differences:
- Rust methods live in impl blocks (multiple per type allowed, anywhere in the crate that defines the type); Zig methods live inside the struct body as namespaced functions; Go methods are defined anywhere in the package with no impl/struct-body requirement
- Rust distinguishes &self/&mut self/self and the borrow checker enforces shared vs exclusive access; Zig distinguishes by-value self vs by-pointer *self (constness is checked, but nothing stops aliased mutation through two pointers); Go distinguishes value vs pointer receiver, and a value receiver that assigns to a field silently changes only its own copy
- Rust requires every field to be named, or the remainder filled from another value (commonly ..Default::default()); Zig requires all non-default fields (fields can carry default values inline); Go zero-initialises every field
- Go and Zig both have a notion of a zero/default value, but Zig makes you opt in per field rather than defaulting everything
Memory layout — the low-level reality. None of the three guarantees source-order layout for an ordinary struct, but the practical consequences differ:
- Rust uses repr(Rust), which is deliberately unspecified — the compiler is free to reorder fields to minimise padding, and it does. struct S { a: u8, b: u64, c: u8 } is reordered so the two u8s pack together, giving size_of::<S>() == 16 instead of the 24 a naive C layout would produce. You opt into a fixed layout with #[repr(C)] (for FFI), #[repr(packed)] (remove all padding — beware unaligned access UB), #[repr(align(N))] (force alignment, e.g. 64 to a cache line to avoid false sharing), or #[repr(transparent)] (single-field newtype guaranteed identical to its inner type). A struct's alignment is that of its most-aligned field; size is rounded up to a multiple of align.
- Go also does not guarantee field order matching source, but in practice the gc compiler does not reorder for packing — it lays fields out in declaration order with natural alignment padding. This means field ordering is a manual optimisation in Go: on a 64-bit target struct{ a bool; b int64; c bool } occupies 24 bytes, but reordering to b, a, c gives 16. The fieldalignment analyzer (from golang.org/x/tools; not part of the default go vet set) flags this. Go has no general #[repr] equivalent; for C layout across cgo you rely on matching field types and the unsafe package's Sizeof/Offsetof/Alignof.
- Zig lets the compiler reorder ordinary struct fields for packing (like Rust), and gives explicit control with extern struct (guaranteed C layout for FFI), packed struct (bit-exact, enables sub-byte integer fields like u3), and align(). @sizeOf/@alignOf/@offsetOf are builtins. So Zig matches Rust in offering both auto-packing and explicit layout, with extern/packed as the FFI/bit-layout contract.
The practical upshot: Rust and Zig both offer automatic packing and explicit layout control; Go gives neither by default and makes layout a manual, lint-assisted discipline. For cache-sensitive data structures (hot-loop structs, lock-free nodes, SIMD-aligned buffers), this matters.
use std::mem::{align_of, size_of};
#[repr(C)] struct CLayout { a: u8, b: u64, c: u8 } // 24 bytes (C rules)
struct RsLayout { a: u8, b: u64, c: u8 } // 16 bytes with current rustc (reordered; the layout itself is unspecified)
#[repr(align(64))] struct CacheLine { counter: u64 } // align 64, size 64
fn main() {
    assert_eq!(size_of::<CLayout>(), 24);
    assert_eq!(size_of::<RsLayout>(), 16);
    assert_eq!(align_of::<CacheLine>(), 64);
}
This is one of the sharpest differences among the three languages.
Rust — data-carrying enums (algebraic data types):
// Each variant can carry different data — or none at all
#[derive(Debug)]
enum PaymentStatus {
Pending, // unit variant — no data
Processing { transaction_id: String }, // struct variant — named fields
Completed(f64, chrono::DateTime<chrono::Utc>), // tuple variant — positional
Failed { code: u32, message: String }, // struct variant
Refunded(f64), // tuple variant
}
// Pattern match — compiler REQUIRES all variants to be handled
fn describe_payment(status: &PaymentStatus) -> String {
match status {
PaymentStatus::Pending => "Awaiting processing".into(),
PaymentStatus::Processing { transaction_id } => format!("Processing: {transaction_id}"),
PaymentStatus::Completed(amount, at) => format!("Paid ${amount:.2} at {at}"),
PaymentStatus::Failed { code, message } => format!("Error {code}: {message}"),
PaymentStatus::Refunded(amount) => format!("Refunded ${amount:.2}"),
// Omit any variant → compile error. No silent fall-through.
}
}
// Enums as state machines — self-documenting, impossible states prevented
enum TcpState {
Closed,
Listen,
SynSent { seq: u32 },
Established{ seq: u32, ack: u32, socket: TcpStream },
FinWait1 { seq: u32 },
// etc.
}
// A Closed state cannot carry a socket. An Established state must have a socket.
// These constraints are in the type — no runtime nil-check needed.
How a Rust enum is laid out. A data-carrying enum compiles to a tagged union: a
discriminant (the tag) plus storage sized to the largest variant, with alignment of the
strictest member. size_of is therefore max(variant sizes) + tag, rounded for alignment —
so a Result<(), [u8; 64]> is 65 bytes regardless of which variant is live. A notable
optimisation is niche-filling: if a variant contains a field with invalid bit patterns
(a "niche"), the compiler encodes the discriminant into that niche instead of adding a
separate tag. Option<&T> and Option<Box<T>> use the null pointer as None, so they are
exactly one word — no tag byte. Option<bool> fits in one byte (using values 2..=255 for
None). enum E { A, B(NonZeroU32) } is 4 bytes. This is why idiomatic Rust pays nothing
for Option/Result in the common case, and why "use the type system instead of a nil
pointer" is not a performance sacrifice.
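The sizes quoted in this paragraph can be checked directly with size_of. This is a small sketch, not a specification: only the pointer cases (Option<&T>, Option<Box<T>>) are documented guarantees, while the other figures reflect what current rustc does with the default, unspecified layout. The enum E and the two helper functions are hypothetical names mirroring the examples in the text.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical enum mirroring the `E` example in the text.
#[allow(dead_code)]
pub enum E {
    A,
    B(NonZeroU32),
}

/// Returns (size of Option<&u8>, size of Option<Box<u8>>, size of one machine word).
pub fn pointer_option_sizes() -> (usize, usize, usize) {
    (
        size_of::<Option<&u8>>(),
        size_of::<Option<Box<u8>>>(),
        size_of::<usize>(),
    )
}

/// Sizes where a niche exists (Option<bool>, E) and where it does not:
/// Result<(), [u8; 64]> has no spare bit pattern, so it needs a separate tag byte.
pub fn enum_sizes() -> (usize, usize, usize) {
    (
        size_of::<Option<bool>>(),
        size_of::<E>(),
        size_of::<Result<(), [u8; 64]>>(),
    )
}
```

The byte-array case shows the cost when no niche is available: the tag is paid once per value, whichever variant is live.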
Go — enums are typed integers with no data:
// iota pattern — no data attachment possible
type PaymentStatus int
const (
PaymentPending PaymentStatus = iota
PaymentProcessing
PaymentCompleted
PaymentFailed
PaymentRefunded
)
// To carry data per status, you need a struct with optional fields
// (most are nil for most statuses — illegal combinations are representable)
type Payment struct {
Status PaymentStatus
TransactionID *string // non-nil only when Processing
Amount *float64 // non-nil only when Completed or Refunded
FailCode *int // non-nil only when Failed
FailMessage *string // non-nil only when Failed
CompletedAt *time.Time // non-nil only when Completed
}
// Nothing prevents setting TransactionID when Status == PaymentFailed
// The programmer must enforce invariants manually
// Switch on the status value (not a type switch) — no exhaustiveness check
func describe(p Payment) string {
switch p.Status {
case PaymentPending: return "Awaiting"
case PaymentProcessing: return "Processing: " + *p.TransactionID
case PaymentCompleted: return fmt.Sprintf("Paid $%.2f", *p.Amount)
// Forget PaymentFailed and PaymentRefunded? No compile error.
}
return "unknown"
}
Zig — tagged unions are the ADT:
const PaymentStatus = union(enum) {
pending: void,
processing: struct { transaction_id: []const u8 },
completed: struct { amount: f64, at: i64 },
failed: struct { code: u32, message: []const u8 },
refunded: f64,
pub fn describe(self: PaymentStatus, buf: []u8) ![]const u8 { // const slice: the .pending arm returns a string literal
return switch (self) { // exhaustive — compile error if a tag is missed
.pending => "Awaiting processing",
.processing => |p| std.fmt.bufPrint(buf, "Processing: {s}", .{p.transaction_id}),
.completed => |c| std.fmt.bufPrint(buf, "Paid ${d:.2}", .{c.amount}),
.failed => |f| std.fmt.bufPrint(buf, "Error {d}: {s}", .{ f.code, f.message }),
.refunded => |amt| std.fmt.bufPrint(buf, "Refunded ${d:.2}", .{amt}),
};
}
};
// Like Rust: a `.completed` value carries its amount; illegal combinations are unrepresentable.
This is the deepest design divergence among the three languages.
Go interfaces — structural, implicit, always-dynamic:
An interface in Go is a set of method signatures. Any type that has those methods satisfies the interface — no declaration needed. This is structural typing (also called duck typing in a statically checked form).
// Define an interface — just method signatures
type Writer interface {
Write(p []byte) (n int, err error)
}
type ReadWriter interface {
Reader // interface embedding — compose interfaces
Writer
}
// bytes.Buffer satisfies Writer without knowing this interface exists
var w Writer = &bytes.Buffer{}
// os.File also satisfies Writer — cross-package, zero boilerplate
var w2 Writer = os.Stdout
// Interface with multiple concrete implementations
type Shape interface {
Area() float64
Perimeter() float64
String() string
}
type Circle struct { Radius float64 }
func (c Circle) Area() float64 { return math.Pi * c.Radius * c.Radius }
func (c Circle) Perimeter() float64 { return 2 * math.Pi * c.Radius }
func (c Circle) String() string { return fmt.Sprintf("Circle(r=%.2f)", c.Radius) }
// No "implements Shape" declaration — Go checks at the assignment site
var s Shape = Circle{Radius: 5.0} // Circle satisfies Shape implicitly
An interface value in Go is a fat pointer: two words (16 bytes on a 64-bit target) containing a pointer to the concrete value's data and a pointer to the interface's method table (itab). An interface call goes through the itab, that is, dynamic dispatch, unless the compiler can see the concrete type and devirtualize the call.
// The empty interface — accepts any value
func log(v any) { fmt.Printf("%T: %v\n", v, v) }
log(42)
log("hello")
log(Circle{Radius: 3})
// any == interface{} — no type information enforced
Interface nil trap — the most notorious Go footgun:
// An interface value is nil only if BOTH the type pointer and data pointer are nil
var err *MyError = nil // typed nil pointer
var iface error = err // interface wrapping a typed nil
fmt.Println(iface == nil) // false! — the type pointer is set, data is nil
// This causes "nil pointer dereference" bugs where iface != nil looks safe
Rust traits — explicit, static or dynamic, richly composable:
A trait defines behaviour. Types implement traits explicitly with impl Trait for Type.
There is no implicit satisfaction — if you want a type to implement Display, you write
the impl. The tradeoff: more code for simple cases; far more expressive for complex ones.
// Define a trait
trait Shape {
fn area(&self) -> f64;
fn perimeter(&self) -> f64;
// Default implementation — all implementors get this for free
fn describe(&self) -> String {
format!("Area: {:.2}, Perimeter: {:.2}", self.area(), self.perimeter())
}
}
struct Circle { radius: f64 }
struct Rectangle { width: f64, height: f64 }
impl Shape for Circle {
fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius }
fn perimeter(&self) -> f64 { 2.0 * std::f64::consts::PI * self.radius }
// describe() is inherited for free
}
impl Shape for Rectangle {
fn area(&self) -> f64 { self.width * self.height }
fn perimeter(&self) -> f64 { 2.0 * (self.width + self.height) }
}
// Shape is NOT a type — it is a constraint. To use polymorphism you choose:
// (A) Static dispatch — monomorphized at compile time, zero overhead
fn print_area_static<S: Shape>(s: &S) {
println!("{:.2}", s.area()); // call inlined per concrete type
}
// Or equivalently with impl Trait syntax:
fn print_area_impl(s: &impl Shape) { println!("{:.2}", s.area()); }
// (B) Dynamic dispatch — vtable, one compiled copy, heterogeneous collections
fn print_area_dynamic(s: &dyn Shape) { println!("{:.2}", s.area()); }
let shapes: Vec<Box<dyn Shape>> = vec![
Box::new(Circle { radius: 3.0 }),
Box::new(Rectangle { width: 4.0, height: 5.0 }),
];
Traits with associated types — expressing type families:
// Associated types bind output types to a trait implementation
// This is impossible with Go interfaces
trait Converter {
type Output; // associated type — defined per implementation
type Error: std::error::Error;
fn convert(&self) -> Result<Self::Output, Self::Error>;
}
struct JsonToProto;
impl Converter for JsonToProto {
type Output = ProtoMessage; // a concrete message type (hypothetical name); prost::Message is a trait and cannot be used as the associated type here
type Error = ConversionError; // an error type assumed to be defined elsewhere
fn convert(&self) -> Result<Self::Output, Self::Error> { ... }
}
// Caller uses the associated type without knowing its concrete form
fn run<C: Converter>(c: &C) -> Result<C::Output, C::Error> { c.convert() }
Blanket implementations — implement a trait for all types satisfying a bound:
// In the standard library: any type that implements Display gets to_string() for free
impl<T: fmt::Display> ToString for T {
fn to_string(&self) -> String { format!("{}", self) }
}
// Go has no blanket-impl equivalent: you would write to_string per type, or a free
// func ToString[T fmt.Stringer](v T) string — not a method added to every Display type at once
The orphan rule — coherence guarantee:
// You can only implement a trait for a type if you own the trait OR the type.
// This prevents two libraries from providing conflicting implementations.
// impl Display for Vec<i32> {} // compile error — neither Display nor Vec is yours
struct MyVec(Vec<i32>);
impl fmt::Display for MyVec { ... } // fine — you own MyVec
Go's structural typing avoids the orphan problem (a third-party type can satisfy your interface automatically) but has no equivalent coherence guarantee.
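The newtype workaround for the orphan rule, with the elided impl written out, might look like the sketch below. The bracketed, comma-separated output format is an arbitrary choice made for this example, not something the text prescribes.

```rust
use std::fmt;

/// Local wrapper type: because `MyVec` is defined in this crate, the orphan
/// rule allows implementing the foreign trait `Display` for it.
pub struct MyVec(pub Vec<i32>);

impl fmt::Display for MyVec {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "[")?;
        for (i, v) in self.0.iter().enumerate() {
            if i > 0 {
                write!(f, ", ")?; // separator between elements, not after the last
            }
            write!(f, "{}", v)?;
        }
        write!(f, "]")
    }
}
```

Because of the standard library's blanket impl of ToString for every Display type, MyVec gets to_string() without any further code.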
Zig — no traits or interfaces; comptime duck typing plus hand-built vtables:
Zig has neither Go's interfaces nor Rust's traits. It expresses the same two needs with two
different tools. For static polymorphism, anytype parameters are resolved per call site —
if the passed value has the operations the body uses, it compiles; otherwise the instantiation
is a compile error (structural/duck typing checked at comptime, no declared bound):
// "Generic over anything with an area() method" — the bound is "does the body compile"
fn printArea(shape: anytype) void {
std.debug.print("{d}\n", .{shape.area()}); // compile error if `shape` has no area()
}
For dynamic polymorphism (the dyn Trait/interface case), Zig has no language feature; the
idiom — used by its own stdlib (std.mem.Allocator, std.Io, std.Random) — is a manual
fat-pointer struct: a *anyopaque context plus a struct of function pointers (the vtable),
assembled explicitly. It is exactly what Rust's dyn and Go's interface compile to, written
in source rather than synthesised:
const Shape = struct {
ptr: *anyopaque,
vtable: *const VTable,
const VTable = struct { area: *const fn (*anyopaque) f64 };
pub fn area(self: Shape) f64 { return self.vtable.area(self.ptr); } // explicit dynamic dispatch
};
Zig has no associated types, no blanket impls, and no orphan rule (there are no traits to
implement coherently); the equivalent expressiveness comes from comptime (see §8). Compared
to Go, Zig's static path is monomorphized (no forced vtable) but its dynamic path is more
verbose (you write the vtable). Compared to Rust, it trades trait machinery and compile-time
coherence for one mechanism (comptime) plus explicit vtables.
This is where performance characteristics diverge sharply.
Rust — explicit choice:
// Static dispatch: compiler generates a specialised copy per concrete type
// ⚡ Perf: inlining, zero call overhead, LLVM can optimise per-type
fn process_static<T: Serialize + Validate>(item: &T) -> Result<Vec<u8>, Error> {
item.validate()?;
serde_json::to_vec(item).map_err(Error::from)
}
// process_static::<Order> — one compiled version for Order
// process_static::<Invoice> — separate compiled version for Invoice
// Each can be individually inlined and optimised
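// Dynamic dispatch: one compiled copy, vtable lookup per call
// 🧹 DX: heterogeneous collections, smaller binary when many types are involved
// ⚡ Perf: ~1–5ns per virtual call overhead; pointer indirection for data
// `dyn (Serialize + Validate)` does not compile: a trait object may name only one non-auto
// trait, and serde's Serialize is not dyn-compatible. Wrap both behind one object-safe trait
// (Processable is a hypothetical name):
trait Processable { fn process(&self) -> Result<Vec<u8>, Error>; }
impl<T: Serialize + Validate> Processable for T {
fn process(&self) -> Result<Vec<u8>, Error> { process_static(self) }
}
fn process_dynamic(item: &dyn Processable) -> Result<Vec<u8>, Error> {
item.process() // vtable lookup, then the monomorphized body for the concrete type
}
let items: Vec<Box<dyn Processable>> = load_mixed_items();
Go — interfaces are always dynamic: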
// All interface calls go through a vtable (itab) — no choice available
func process(item interface{ Validate() error }) error {
return item.Validate() // always vtable lookup
}
// Go's generics (1.18+) can avoid the itab when the compiler specialises the call
func processGeneric[T interface{ Validate() error }](item T) error {
return item.Validate() // may be inlined — depends on GC shapes
}
// But Go's monomorphization uses GC shapes: all pointer types share ONE compiled copy
// with a dictionary; true per-type specialisation like Rust is not guaranteed
Low-level mechanics — what these actually compile to:
A Rust trait object (&dyn Trait, Box<dyn Trait>) is a fat pointer: two machine words —
(data_ptr, vtable_ptr). The vtable is a static, per-(type, trait) table emitted once into
.rodata, laid out as [drop_in_place, size, align, method0, method1, ...]. A dynamic call
is call [vtable_ptr + offset] — one dependent load to fetch the function pointer, then an
indirect branch. The indirect branch defeats inlining and is a branch-predictor target;
mispredicts cost ~10–20 cycles on modern x86, correctly-predicted ~1–3 cycles plus the vtable
load's L1 latency (~4 cycles).
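The two-word claim is easy to observe from safe Rust. The sketch below compares a thin reference with a trait-object reference and shows the same method reached through static and dynamic dispatch; Shape and Square are hypothetical stand-ins defined only for this example.

```rust
use std::mem::size_of;

pub trait Shape {
    fn area(&self) -> f64;
}

pub struct Square(pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Returns (thin pointer size, trait-object pointer size) in bytes.
pub fn pointer_sizes() -> (usize, usize) {
    (size_of::<&Square>(), size_of::<&dyn Shape>())
}

/// Static dispatch: one specialised copy per concrete `S`.
pub fn area_static<S: Shape>(s: &S) -> f64 {
    s.area()
}

/// Dynamic dispatch: one compiled copy; the call goes through the vtable
/// half of the fat pointer.
pub fn area_dynamic(s: &dyn Shape) -> f64 {
    s.area()
}
```

Both functions return the same value for the same input; the difference lies only in how the call to area() is resolved.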
A Go interface value is also a two-word (itab_ptr, data_ptr) pair. The itab ("interface
table") holds the dynamic type descriptor plus the method function pointers, and is
computed once per (concrete type, interface) pair and cached in a global hash table the
first time that pairing is needed at runtime. Two consequences fall out of this design that
Rust does not share:
- A Go interface holding a non-pointer value (e.g. interface{} wrapping an int) must box it — heap-allocate the value so the data_ptr has something to point at. Small integers 0–255 are cached, and escape analysis can sometimes keep the copy off the heap, but in general "put a value type in an interface" is a heap allocation and a GC-tracked pointer. This is a real, frequently-overlooked allocation source in hot Go code. Rust's dyn never implicitly boxes — you opt in with Box<dyn>.
- The infamous typed-nil interface: an interface is nil only when both words are zero. A nil *T stored into an error makes itab_ptr non-nil, so err != nil is true even though the underlying pointer is nil — Rust's Option<&T>/Option<Box<T>> has no analogous trap because None is a single niche-optimised value.
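For contrast with the typed-nil trap, here is a minimal Rust sketch of the equivalent situation: an optional concrete error converted to an optional trait object. MyError, check, and as_dyn are hypothetical names introduced for this illustration.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
pub struct MyError;

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "my error")
    }
}

impl Error for MyError {}

/// Returns a concrete optional error; `fail` decides whether one is present.
pub fn check(fail: bool) -> Option<Box<MyError>> {
    if fail {
        Some(Box::new(MyError))
    } else {
        None
    }
}

/// Converts to a trait object. `None` carries no concrete type to remember,
/// so it stays `None`; there is no "non-nil interface holding a nil pointer".
pub fn as_dyn(e: Option<Box<MyError>>) -> Option<Box<dyn Error>> {
    e.map(|b| b as Box<dyn Error>)
}
```

The absence of a value and the presence of a typed value are separate variants, so the conversion cannot produce a state that compares as present while holding nothing.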
Static dispatch costs. Rust monomorphization stamps out a fresh, fully-specialised copy
of process_static per concrete T, each independently inlined, with T's methods
devirtualised and often inlined too. The win is peak speed; the costs are (1) compile time —
the backend optimises N copies — and (2) binary size / instruction-cache pressure ("code
bloat"), which can hurt runtime if the duplicated code blows the I-cache. Go's generics
take the opposite trade: the compiler groups instantiations by GC shape (roughly, identical
size and pointer-bitmap), so every pointer-typed instantiation shares one compiled body that
receives a hidden dictionary argument carrying the per-type metadata and method pointers.
Method calls through that dictionary are effectively dynamic dispatch again — so Go generics
can be slower than hand-written concrete code, and frequently no faster than an interface.
The payoff is small binaries and fast builds. Neither choice is strictly better; they are
different points on the speed/size/compile-time surface, and this is one reason "Rust is
faster" and "Go compiles faster" are two faces of the same decision.
A Zig "vtable struct" (the std.Io/std.mem.Allocator pattern) is the same two-word
(ptr, vtable_ptr) fat pointer as Rust's dyn, except you declare the vtable struct and
populate it — there is no compiler-synthesised table and no implicit boxing (a value placed
behind the interface is whatever you point ptr at; you choose where it lives). Zig's
comptime "generics" monomorphize like Rust's: Stack(i32) and Stack(u8) are distinct
generated types with no dictionary and no vtable, so Zig sits at Rust's end of the
speed/size/compile-time surface (fast code, larger output, more compiler work) rather than
Go's. Zig has no typed-nil-interface trap (an optional pointer ?*T uses the null address as its
empty state, like Rust's Option<&T>), and an anytype value is resolved structurally at the call site with no runtime
descriptor at all.
Rust — rich bounds system:
// Single bound
fn largest<T: PartialOrd>(list: &[T]) -> &T {
let mut largest = &list[0];
for item in list { if item > largest { largest = item; } }
largest
}
// Multiple bounds with where clause (cleaner for complex signatures)
fn serialize_and_log<T>(item: &T) -> Result<String, serde_json::Error>
where
T: Serialize + fmt::Debug + Send + Sync + 'static
{
let json = serde_json::to_string(item)?;
log::debug!("{:?} → {}", item, json);
Ok(json)
}
// Const generics — generic over a value, not just a type
struct Matrix<T, const ROWS: usize, const COLS: usize> {
data: [[T; COLS]; ROWS],
}
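impl<T: Default + Copy, const R: usize, const C: usize> Matrix<T, R, C> {
fn transpose(&self) -> Matrix<T, C, R> {
let mut data = [[T::default(); R]; C]; // C rows of R columns
for r in 0..R { for c in 0..C { data[c][r] = self.data[r][c]; } }
Matrix { data }
}
}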
let m: Matrix<f64, 3, 4> = Matrix { data: [[0.0; 4]; 3] };
// Matrix<f64, 3, 4> and Matrix<f64, 4, 3> are distinct types — size mismatch = compile error
// Generic Associated Types (GATs) — parameterise associated types with lifetimes or types
trait Repository {
type Item<'a> where Self: 'a; // Item borrows from the repository
type Error: std::error::Error;
fn get<'a>(&'a self, id: u64) -> Result<Self::Item<'a>, Self::Error>;
}
// impl Trait in argument position — anonymous generic
fn draw_all(shapes: impl Iterator<Item = impl Shape>) {
for shape in shapes { println!("{}", shape.describe()); }
}
Go — interface constraints and type sets:
// Type constraint using interface
type Ordered interface {
~int | ~int8 | ~int16 | ~int32 | ~int64 |
~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 |
~float32 | ~float64 | ~string
}
func Largest[T Ordered](list []T) T {
largest := list[0]
for _, v := range list[1:] {
if v > largest { largest = v }
}
return largest
}
// ~ means "any type whose underlying type is X" — covers newtypes
type Celsius float64
type Fahrenheit float64
var temps = []Celsius{98.6, 37.0, 100.0}
hottest := Largest(temps) // works — Celsius's underlying type is float64
// Combining method requirements and type sets
type Numeric interface {
~int | ~int64 | ~float64
String() string // must also have a String method
}
// Limitations vs Rust:
// - No const generics (generic over a value)
// - No associated types on interfaces
// - Type inference is less powerful in complex generic chains
// - Generic methods: NOT in 1.26, but ACCEPTED and targeted for 1.27 (Aug 2026) — see below
Zig — generics are comptime functions returning a type:
// No generics keyword. A generic type is a function evaluated at compile time.
fn List(comptime T: type) type {
return struct {
items: []T,
allocator: std.mem.Allocator,
const Self = @This();
pub fn append(self: *Self, v: T) !void { /* ... */ }
};
}
const IntList = List(i32); // instantiated at comptime — a distinct concrete type
// Generic functions take comptime type params (or use anytype for duck typing)
fn largest(comptime T: type, list: []const T) T {
var max = list[0];
for (list[1..]) |v| { if (v > max) max = v; }
return max;
}
// "Const generics" come free: comptime params can be values, not just types
fn Matrix(comptime T: type, comptime rows: usize, comptime cols: usize) type {
return struct { data: [rows][cols]T };
}
const M = Matrix(f64, 3, 4); // Matrix(f64,3,4) and Matrix(f64,4,3) are distinct types
Because the parameter is any comptime value, Zig gets const-generics, generic methods, and
type-valued parameters from one mechanism, with monomorphization like Rust. What it lacks is a
declared bound: there is no where T: Ord. The "constraint" is whether the body compiles for
the given type (if (v > max) requires T to be comparable), so a mismatch surfaces as a
compile error at instantiation rather than at the signature — more flexible than Go's type sets
and Rust's trait bounds, but with later and sometimes harder-to-read errors. (§8 covers the
comptime mechanism in depth.)
Go 1.26 — two shipped language refinements. Before the upcoming generic-methods work, Go 1.26
(February 2026) already made two relevant changes. First, the built-in new now accepts an
expression, so new(expr) allocates a variable initialized to that value and returns its
pointer — eliminating the ubiquitous func ptr[T any](v T) *T { return &v } helper, which is
especially handy for optional struct fields with serialization (Age: new(yearsSince(born))).
Second, the restriction that a generic type may not refer to itself in its own type-parameter
list was lifted, so self-referential constraints like type Adder[A Adder[A]] interface { Add(A) A }
now compile — the CRTP-style pattern used for fluent builders and numeric/tower abstractions. Note
this is the self-referential type feature (shipped in 1.26); generic methods are the separate,
still-forthcoming change below.
Go 1.27 — generic methods are coming. Since 1.18, a Go method could not declare its own
type parameters — only package-level functions and types could. This forced the awkward pattern
of free functions taking the receiver as their first argument (func MapCache[T,U any](c *Cache, …) instead of c.Map[U](…)), which don't chain and don't autocomplete as methods. In
January 2026 the Go team accepted proposal #77273 (authored by Robert Griesemer), reversing
a position held since generics shipped, and generic methods are targeted for Go 1.27
(expected August 2026; some sources note 1.27-or-1.28). The mechanism treats a generic concrete
method as a generic function with a receiver:
type Query[T any] struct { /* ... */ }
// Go 1.27: a method may declare its OWN type parameters
func (q *Query[T]) Include[F any](selector func(*T) *F) *Query[T] {
// ...
return q
}
The deliberate restriction: interface methods still may not declare type parameters, and a
generic method cannot implement an interface method. This is because Go cannot know at compile
time which instantiations a dynamically-satisfied interface would need. So 1.27 closes the
"generic functions in a type's namespace" gap (chainable, discoverable APIs — builders, ORMs,
functional helpers) without opening the genuinely hard problem of generic dispatch through
interfaces. This narrows one of the larger gaps versus Rust, where methods in an impl block
have always been able to introduce their own type parameters.
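For comparison, a minimal Rust sketch of the chainable style that per-method type parameters allow. The Query type and its new, filter, map, and into_rows methods are hypothetical names invented for this illustration, not an API from any library. The point is only that map introduces its own parameters U and F on top of the struct's T.

```rust
/// Hypothetical in-memory query over rows of type T (illustration only).
struct Query<T> {
    rows: Vec<T>,
}

impl<T> Query<T> {
    fn new(rows: Vec<T>) -> Self {
        Query { rows }
    }

    /// Generic over the predicate type P; the row type stays T.
    fn filter<P: Fn(&T) -> bool>(self, keep: P) -> Self {
        Query { rows: self.rows.into_iter().filter(|row| keep(row)).collect() }
    }

    /// The method declares its OWN type parameters U and F,
    /// so the call can change the row type in the middle of a chain.
    fn map<U, F: Fn(T) -> U>(self, f: F) -> Query<U> {
        Query { rows: self.rows.into_iter().map(f).collect() }
    }

    fn into_rows(self) -> Vec<T> {
        self.rows
    }
}

/// Chains the methods: keep even numbers, then turn them into labels.
fn even_labels(input: Vec<i32>) -> Vec<String> {
    Query::new(input)
        .filter(|n| n % 2 == 0)
        .map(|n| format!("#{n}"))
        .into_rows()
}
```

Because each call returns a value with methods of its own, the chain reads left to right and autocompletes, which is the ergonomic gap the Go change is aimed at.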
Type-level programming — Rust, and how closely Go and Zig can imitate it. The more useful lens than higher-kinded types is type-level programming: encoding facts and computation in types so the compiler enforces invariants and selects code, with no runtime cost. Rust supports a rich form of this; Go and Zig approximate parts of it by very different means.
What Rust offers:
- The typestate pattern: encode an object's state in its type so illegal operations don't compile. A Builder<Unset> vs Builder<Set>, or a Connection<Open> vs Connection<Closed>, makes "call send on a closed connection" a compile error. Each transition consumes self and returns a different type:
use std::marker::PhantomData;
struct Door<State> { _state: PhantomData<State> }
struct Open; struct Closed;
impl Door<Closed> { fn open(self) -> Door<Open> { Door { _state: PhantomData } } }
impl Door<Open> { fn close(self) -> Door<Closed> { Door { _state: PhantomData } }
fn walk_through(&self) { /* only callable while Open */ } }
// door.walk_through() does not compile unless the door's type is Door<Open>
PhantomData<T> carries a type parameter that influences type-checking, variance, and drop-checking without storing a value; it is the zero-sized lever that makes typestate and units-of-measure encodings free at runtime.
- Const generics (struct Matrix<const R: usize, const C: usize>) put values in types, so matrix dimensions are checked at compile time: a: Matrix<2,3> * b: Matrix<3,4> type-checks and * b2: Matrix<2,2> does not.
- Generic Associated Types (GATs), stable since 1.65: associated types that take their own generic/lifetime parameters (type Item<'a>;), which enable lending iterators and zero-copy views (the cases people historically wanted higher-kinded types for).
- Trait bounds + blanket impls + marker traits let the compiler select behaviour by type and prove properties (Send/Sync are entirely type-level facts).
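The const-generics point can be sketched as follows. This is a minimal illustration under stated assumptions, not a production matrix type: Matrix, its data field, and the product_2x3_3x2 helper are names chosen for this example, and the element type is fixed to i64 to keep the block short.

```rust
use std::ops::Mul;

/// Dimensions live in the type: Matrix<2, 3> and Matrix<3, 2> are different types.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Matrix<const R: usize, const C: usize> {
    data: [[i64; C]; R],
}

/// (R x K) * (K x C) = (R x C). The shared K is what the compiler checks.
impl<const R: usize, const K: usize, const C: usize> Mul<Matrix<K, C>> for Matrix<R, K> {
    type Output = Matrix<R, C>;

    fn mul(self, rhs: Matrix<K, C>) -> Matrix<R, C> {
        let mut out = [[0i64; C]; R];
        for r in 0..R {
            for c in 0..C {
                for k in 0..K {
                    out[r][c] += self.data[r][k] * rhs.data[k][c];
                }
            }
        }
        Matrix { data: out }
    }
}

/// Multiplies a fixed 2x3 matrix by a fixed 3x2 matrix.
fn product_2x3_3x2() -> Matrix<2, 2> {
    let a = Matrix::<2, 3> { data: [[1, 2, 3], [4, 5, 6]] };
    let b = Matrix::<3, 2> { data: [[7, 8], [9, 10], [11, 12]] };
    // `a * a` would not compile: no Mul impl exists for Matrix<2, 3> * Matrix<2, 3>.
    a * b
}
```

The mismatch is rejected because no impl matches the operand types, so there is no dimension check left to run at runtime.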
How close Go and Zig get:
- Go has no const generics, no PhantomData, no associated types, and (until 1.27) no generic methods, so full typestate is not expressible: you cannot make "send on closed connection" a compile error via types; you check at runtime. You can partially imitate typestate with distinct named types and methods that return the next type (OpenConn → ClosedConn), but with no shared generic machinery it is verbose and easily bypassed. Type sets (~int | ~string) are Go's one genuinely type-level construct, used for constraint satisfaction, not computation. Phantom-type-like tagging is sometimes faked with a zero-size field struct{ _ tag }, but without variance or inference support it stays a convention. Net: Go does value-level validation where Rust does type-level, by design; the language optimises for obviousness over compile-time proof.
- Zig has no traits, lifetimes, or PhantomData, but comptime reaches a surprising amount of the same ground from the other direction: because types are values and arbitrary code runs at compile time, you can compute types, branch on @typeInfo, and @compileError to reject invalid combinations, a form of type-level validation. A comptime-checked dimension on a matrix, or a comptime assertion that a state transition is legal, gives typestate-like guarantees expressed as compile-time if/assert rather than as distinct parameterised types. What Zig lacks is the declarative encoding (no Door<Open> type the signature can require); the check is imperative comptime code you must place at each boundary, and there is no coherence or variance system. So Zig imitates the outcome (compile-time rejection of illegal states) without the type-as-proposition machinery.
On higher-kinded types specifically: none of the three has true HKT (abstracting over an
unapplied constructor like F<_>). In Rust the practical substitute is GATs plus traits, and the
honest assessment is that HKT buys an eager, ownership-tracked language less than it buys a lazy
GC'd one — the and_then on Option, Result, Iterator, and Future have materially
different signatures, so a single Monad abstraction would be far less useful, which is why
const generics and GATs were prioritised instead. Go does not attempt it; Zig sidesteps it by
passing a type constructor as an ordinary comptime fn (type) type value. The capability that
matters in practice — parameterising over a container — is reachable in Rust (GATs) and Zig
(comptime) without it.
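To make the GAT substitute concrete, here is a small lending-iterator sketch. LendingIterator is not a std trait; it, WindowsMut, and prefix_sums are defined here purely for illustration. The key line is type Item<'a>, which lets each item borrow from the iterator itself, something the std Iterator trait cannot express.

```rust
/// A lending iterator: each item may borrow from the iterator itself.
/// (Not a std trait; defined here for illustration.)
trait LendingIterator {
    type Item<'a> where Self: 'a;
    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

/// Yields overlapping mutable windows over a slice, one at a time.
struct WindowsMut<'s> {
    slice: &'s mut [i32],
    start: usize,
    size: usize,
}

impl<'s> LendingIterator for WindowsMut<'s> {
    type Item<'a> = &'a mut [i32] where Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        let start = self.start;
        let end = start + self.size;
        if end > self.slice.len() {
            return None;
        }
        self.start += 1;
        // The window borrows from `self`, so the caller has to let go of it
        // before asking for the next one.
        Some(&mut self.slice[start..end])
    }
}

/// Turns a vector into its running totals using overlapping 2-element windows.
fn prefix_sums(mut data: Vec<i32>) -> Vec<i32> {
    let mut windows = WindowsMut { slice: &mut data, start: 0, size: 2 };
    while let Some(w) = windows.next() {
        w[1] += w[0];
    }
    data
}
```

Overlapping mutable windows are unsound as an ordinary Iterator (two live &mut views of the same element), which is why the item lifetime has to be tied to each next call.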
Go — struct embedding for composition:
type Logger struct { Level string }
func (l *Logger) Info(msg string) { fmt.Printf("[%s] INFO: %s\n", l.Level, msg) }
func (l *Logger) Error(msg string) { fmt.Printf("[%s] ERROR: %s\n", l.Level, msg) }
type MetricsCollector struct { prefix string }
func (m *MetricsCollector) Inc(name string) { /* increment counter */ }
func (m *MetricsCollector) Gauge(name string, v float64) { /* set gauge */ }
type Server struct {
Logger // all Logger methods promoted to Server
MetricsCollector // all MetricsCollector methods promoted
addr string
handler http.Handler
}
s := Server{
Logger: Logger{Level: "INFO"},
MetricsCollector: MetricsCollector{prefix: "server"},
addr: ":8080",
}
s.Info("starting up") // promoted — no delegation code written
s.Inc("requests_total") // promoted
s.Logger.Level = "DEBUG" // explicit access to embedded field when needed
Interface embedding composes interfaces:
type ReadWriter interface {
io.Reader // embed Reader interface
io.Writer // embed Writer interface
}
type ReadWriteCloser interface {
ReadWriter // embed ReadWriter (which embeds Reader and Writer)
io.Closer
}
Rust (trait-based composition):
// Traits compose via supertraits and blanket impls
use std::fmt;
trait Loggable: fmt::Debug { // supertrait: implementor must also implement Debug
fn log_level(&self) -> &str;
fn info(&self, msg: &str) { println!("[{}] INFO: {}", self.log_level(), msg); }
fn error(&self, msg: &str) { println!("[{}] ERROR: {}", self.log_level(), msg); }
}
// A struct that implements multiple traits
#[derive(Debug)]
struct Server { addr: String, log_level: String }
impl Loggable for Server {
fn log_level(&self) -> &str { &self.log_level }
}
// Compose capabilities via trait bounds
fn start<S>(server: S)
where
S: Loggable + Clone + Send + 'static
{
server.info("starting");
let handle = std::thread::spawn(move || { server.info("thread started"); });
handle.join().unwrap();
}
// Delegation (no embedding): must write it manually
#[derive(Debug)] // required because Loggable has Debug as a supertrait
struct MeteredServer {
inner: Server,
counter: std::sync::atomic::AtomicU64,
}
impl Loggable for MeteredServer {
fn log_level(&self) -> &str { self.inner.log_level() } // manual delegation
}
// A Rust RFC for delegation syntax exists, but nothing is available on stable as of 1.95
Go embedding nuances (the details that bite in practice):
- Ambiguity is silent until use. If two embedded types both have a method Close(), the promoted Close is ambiguous: calling s.Close() is a compile error, but only when you actually call it. You disambiguate explicitly: s.Logger.Close(). Embedding two types with overlapping method sets compiles fine until the collision is exercised.
- Shadowing: the outer type wins. If Server defines its own Info(), it shadows the embedded Logger.Info(). The promoted method is silently overridden; s.Info() calls Server's, and the embedded one is reachable only via s.Logger.Info(). There is no override keyword and no warning; this is how you "override" promoted behavior.
- Embedding satisfies interfaces. If Logger has Info(string) and that's all an interface Infoer needs, then Server (embedding Logger) satisfies Infoer for free, because the promoted method counts (with the pointer-receiver Info shown above, it is *Server that satisfies it). This is the common way to partially implement a large interface: embed a type (or even the interface itself) that provides most methods, override the few you care about.
- Embedding an interface, not a struct. You can embed an interface in a struct: struct{ io.Writer }. The struct then satisfies io.Writer by forwarding to whatever concrete value is stored, and panics with a nil-pointer deref if it's nil. This is the idiomatic "wrap and override one method" pattern (e.g. wrapping a http.ResponseWriter).
- Pointer vs value embedding. struct{ Logger } embeds by value (copied); struct{ *Logger } embeds a pointer (shared, nil-able). The method sets differ: embedding *Logger promotes both value- and pointer-receiver methods into the method sets of S and *S; embedding Logger by value promotes value-receiver methods into both, but pointer-receiver methods only into *S. An addressable S can still call them (the compiler takes &s), but a plain S value stored in an interface exposes only the value-receiver methods.
- The diamond is allowed. Embedding A and B that both embed Base is legal; the two Base subobjects are distinct (no virtual-inheritance merging like C++). Promoted Base methods become ambiguous and must be qualified.
Rust trait-composition nuances — the corresponding details:
- Supertraits express requirements, not inheritance. trait Loggable: Debug means "any Loggable must also be Debug," and a default method can call Debug methods on self. It is a bound, not subclassing: there is no data inheritance, only capability requirements.
- Default methods + override. A trait can provide default method bodies (as info/error above); an impl may accept the defaults or override any of them. This is Rust's "mixin" mechanism: provide one required method, get a family of derived methods for free (the entire Iterator adapter suite works this way).
- Blanket impls compose capabilities across all types. impl<T: Display> ToString for T { /* ... */ } gives every Display type a ToString, a cross-cutting capability added to a whole set of types at once. Go and Zig have no equivalent; you would write per-type code or a generic function instead.
- The orphan rule constrains composition. You may implement a trait for a type only if you own the trait or the type. This guarantees coherence (no two crates provide conflicting impls) but means you cannot impl ExternalTrait for ExternalType; you wrap it in a newtype (§1.9) first. Go's structural interfaces sidestep this (a foreign type satisfies your interface automatically) at the cost of any coherence guarantee.
- Associated types vs generic params shape composition. A trait with an associated type (trait Iterator { type Item; }) has one impl per type; a generic trait (trait From<T>) can be implemented many times per type for different T. Choosing between them is a composition-design decision Go and Zig don't face (Go interfaces have neither; Zig expresses both via comptime).
- Trait objects restrict composition. You can combine auto traits with one base trait in a dyn object (dyn Shape + Send + Sync), but not two arbitrary non-auto traits (dyn Read + Write is not allowed; you make a new trait ReadWrite: Read + Write). This is the dynamic-dispatch counterpart of Go's freely-composable interface embedding.
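Three of the points above (blanket impls, the orphan-rule newtype, and the combined trait for trait objects) fit in one short sketch. Shout, Csv, ReadWrite, echo, and echo_in_memory are hypothetical names introduced for this example. The in-memory stream is a VecDeque<u8>, which std implements both Read and Write for.

```rust
use std::collections::VecDeque;
use std::fmt::{self, Display};
use std::io::{self, Read, Write};

/// Hypothetical capability, given to EVERY Display type by one blanket impl.
trait Shout {
    fn shout(&self) -> String;
}
impl<T: Display> Shout for T {
    fn shout(&self) -> String {
        format!("{}!", self.to_string().to_uppercase())
    }
}

/// Newtype: Vec<i32> and Display are both foreign, so the orphan rule
/// forbids `impl Display for Vec<i32>`; wrapping the Vec makes the type local.
struct Csv(Vec<i32>);
impl Display for Csv {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let parts: Vec<String> = self.0.iter().map(|n| n.to_string()).collect();
        write!(f, "{}", parts.join(","))
    }
}

/// `dyn Read + Write` is rejected, so combine the two under one new trait
/// and blanket-implement it for everything that has both.
trait ReadWrite: Read + Write {}
impl<T: Read + Write> ReadWrite for T {}

/// Writes msg into the stream, then reads back whatever the stream holds.
fn echo(stream: &mut dyn ReadWrite, msg: &[u8]) -> io::Result<Vec<u8>> {
    stream.write_all(msg)?;
    let mut out = Vec::new();
    stream.read_to_end(&mut out)?;
    Ok(out)
}

/// Runs echo against an in-memory byte queue.
fn echo_in_memory(msg: &[u8]) -> io::Result<Vec<u8>> {
    let mut pipe: VecDeque<u8> = VecDeque::new();
    echo(&mut pipe, msg)
}
```

Csv picks up shout without any further code: implementing Display was enough for the blanket impl to apply.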
Zig composition nuances. Composition is explicit field nesting; there is no promotion, so
server.logger.info(...) is written in full (some projects add thin forwarding methods by
hand). usingnamespace used to mix another container's declarations (constants, functions) into
the current namespace, which was closer to "import these names" than to method promotion and did
not forward instance methods over a field; the keyword was removed in Zig 0.15, so current code
imports or re-exports declarations explicitly. To require that a composed type provides a capability,
you assert it at comptime (e.g. comptime { if (!@hasDecl(T, "deinit")) @compileError(...); }),
which is a manual, explicit stand-in for Rust's supertrait check and Go's interface satisfaction.
Rust — three closure traits tracking mutability:
// Closures are anonymous structs that capture their environment
// Fn: only reads its captures; can be called any number of times through a shared reference
// FnMut: may mutate its captures; can be called many times, but needs exclusive (&mut) access
// FnOnce: may consume its captures; guaranteed callable only once
// Thread-safety is a separate question: it depends on Send/Sync of the captured values, not on Fn vs FnMut
fn apply_twice<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 { f(f(x)) }
let double = |x| x * 2; // Fn — captures nothing
apply_twice(double, 3); // 12
fn run_once<F: FnOnce() -> String>(f: F) -> String { f() }
let name = String::from("Alice");
run_once(move || format!("Hello, {}!", name)); // name is moved into the closure
// println!("{name}"); // compile error: moved into closure
// Function pointers — for when you don't need closure captures
fn add(a: i32, b: i32) -> i32 { a + b }
let op: fn(i32, i32) -> i32 = add;
let result = op(3, 4); // 7
// Higher-order functions with closure type inference
let numbers = vec![1, 2, 3, 4, 5, 6];
let sum_of_even_squares: i32 = numbers.iter()
.filter(|&&x| x % 2 == 0)
.map(|&x| x * x)
.sum(); // 4 + 16 + 36 = 56
Go (first-class function types):
// Functions are first-class values — can be stored, passed, returned
type Predicate func(int) bool
type Transform func(int) int
func filter(nums []int, keep Predicate) []int {
var result []int
for _, n := range nums { if keep(n) { result = append(result, n) } }
return result
}
evens := filter([]int{1, 2, 3, 4, 5}, func(n int) bool { return n%2 == 0 })
// Closures capture by reference (shared mutable state — watch for goroutine races)
counter := 0
inc := func() int { counter++; return counter }
fmt.Println(inc(), inc(), inc()) // 1 2 3
// counter is mutated through the closure
// Method values — bound to a specific receiver
u := User{Name: "Alice"}
getName := u.DisplayName // a function value bound to u
fmt.Println(getName()) // "Alice"
Zig (function pointers and comptime closures, no capturing closures):
Zig deliberately has no capturing closures. A function value is a plain function pointer
(*const fn (i32) i32), which captures nothing. To carry state you pass it explicitly — the
same (context_ptr, fn_ptr) pattern Zig uses for interfaces — which is why stdlib callbacks
take a context: anytype alongside the function. This keeps "no hidden allocation" honest: a
closure that captured environment would need to allocate somewhere, so Zig makes the state
explicit instead.
// No capture: state is threaded through an explicit context parameter
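// `keep` is a function body type, which is comptime-only, so the parameter is marked comptime
fn filter(nums: []const i32, ctx: anytype, comptime keep: fn (@TypeOf(ctx), i32) bool, out: []i32) usize {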
var n: usize = 0;
for (nums) |v| { if (keep(ctx, v)) { out[n] = v; n += 1; } }
return n;
}
So Rust tracks capture mode in the type system (Fn/FnMut/FnOnce), Go captures by
reference implicitly (convenient, but a common goroutine data-race source), and Zig captures
nothing — you pass context by hand, trading ergonomics for zero hidden state and zero hidden
allocation.
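The Rust examples earlier cover Fn and FnOnce; the middle case, FnMut, can be sketched as below. The function names call_n_times and count_calls are made up for this illustration.

```rust
/// Accepts any closure that may mutate its captured state between calls.
fn call_n_times<F: FnMut() -> u32>(mut f: F, n: usize) -> Vec<u32> {
    (0..n).map(|_| f()).collect()
}

/// Returns the values the closure produced plus the final counter value.
fn count_calls() -> (Vec<u32>, u32) {
    let mut counter = 0;
    // The closure mutates `counter`, so it implements FnMut (and FnOnce) but not Fn.
    // Passing it to a function that requires `F: Fn() -> u32` would be a compile error.
    let seen = call_n_times(|| { counter += 1; counter }, 3);
    // The mutable borrow ended when the closure was dropped inside call_n_times.
    (seen, counter)
}
```

The capture is a mutable borrow tracked by the compiler, so while the closure is alive nothing else can touch counter. That is the compile-time counterpart of the Go closure that shares counter implicitly.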
Rust:
// Type alias — same type, just a shorter name (no new type, no safety)
type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;
type Callback = Box<dyn Fn(Event) -> Result<()> + Send + 'static>;
// Newtype pattern — a genuinely new type; the inner type is hidden behind a struct
// ⚡ Zero runtime cost; the wrapper is compiled away
// 🔐 Safety: units of measure, validated values, ID types that cannot be confused
struct Meters(f64);
struct Seconds(f64);
struct UserId(u64);
struct OrderId(u64);
// UserId and OrderId are not interchangeable even though both wrap u64
fn get_user(id: UserId) -> User { ... }
fn get_order(id: OrderId) -> Order { ... }
// get_user(OrderId(42)) → compile error: expected UserId, found OrderId
// Validated newtype — the constructor enforces the invariant
pub struct Email(String);
impl Email {
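// Spelled out in full: the one-parameter Result<T> alias above does not accept a second type argument
pub fn new(s: impl Into<String>) -> std::result::Result<Self, &'static str> {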
let s = s.into();
if s.contains('@') { Ok(Email(s)) } else { Err("invalid email") }
}
pub fn as_str(&self) -> &str { &self.0 }
}
// Once you have an Email, it is guaranteed to contain '@'
// You cannot construct a raw Email("not-an-email") from outside this module
Go:
// Type definition — creates a new named type with the same underlying representation
// New type does NOT inherit the methods of the underlying type (only operators)
type Meters float64
type Seconds float64
type UserID uint64
type OrderID uint64
// Same underlying type → assignable with explicit conversion, but not directly
var d Meters = 5.0
var t Seconds = 10.0
// var x Meters = t // compile error: cannot use Seconds as Meters
var x Meters = Meters(t) // explicit conversion — legal but defeats safety
// Type alias — same type, different name; fully interchangeable
type byte = uint8 // alias — not a new type
type rune = int32
// Methods can be added to defined types
func (m Meters) String() string { return fmt.Sprintf("%.2fm", float64(m)) }
Rust polymorphism decision tree:
// Pattern 1: Static dispatch via generics — best default
// ⚡ Inlined, zero overhead; ✓ when you know all concrete types at compile time
fn serialize<T: Serialize>(data: &T) -> String { serde_json::to_string(data).unwrap() }
// Pattern 2: Trait objects — dynamic dispatch
// ✓ When you need a heterogeneous collection or erased return type
fn handlers() -> Vec<Box<dyn Handler>> { vec![...] }
// Pattern 3: Enum dispatch — exhaustive, no heap allocation
// ⚡ Fastest; 🔐 Exhaustive; ✓ When you own all variants
enum Command { Quit, Move(i32, i32), Resize(u32, u32) }
fn handle(cmd: Command) { match cmd { ... } }
// Pattern 4: impl Trait in return position — opaque type, zero overhead
// ✓ When returning a single concrete type you don't want to name
fn evens_squared(n: u32) -> impl Iterator<Item=u32> {
(0..n).filter(|x| x%2==0).map(|x| x*x)
}
// Choosing:
// Own all variants, closed world → enum dispatch (fastest, safest)
// Open world, static dispatch → generics / impl Trait (fast, some code bloat)
// Open world, dynamic dispatch → dyn Trait (one binary copy, vtable overhead)
Go polymorphism decision tree:
// Pattern 1: Interface — the universal Go tool for polymorphism
// Always dynamic dispatch; structural, no declaration needed
type Stringer interface { String() string }
func print(s Stringer) { fmt.Println(s.String()) }
// Pattern 2: Type switch — recover concrete type from interface
func process(v any) {
switch x := v.(type) {
case int: fmt.Println("int:", x)
case string: fmt.Println("str:", x)
case fmt.Stringer: fmt.Println("stringer:", x.String())
}
}
// Pattern 3: Generics (type parameters checked at compile time; implemented by GC-shape
// stenciling with dictionaries, so not full monomorphization)
func Map[T, U any](slice []T, fn func(T) U) []U {
result := make([]U, len(slice))
for i, v := range slice { result[i] = fn(v) }
return result
}
// Pattern 4: Struct embedding — inherit and extend behaviour
type LoggingHandler struct {
http.Handler // promote all Handler methods
logger *slog.Logger
}
func (h LoggingHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
h.logger.Info("request", "path", r.URL.Path)
h.Handler.ServeHTTP(w, r) // delegate to wrapped handler
}
Zig polymorphism decision tree:
// Pattern 1: comptime generics — static dispatch, monomorphized (closed or open set)
fn Map(comptime T: type, comptime U: type) type { /* returns a typed mapper */ }
// Pattern 2: anytype — structural/duck typing, resolved at the call site
fn area(shape: anytype) f64 { return shape.area(); } // compiles iff shape has area()
// Pattern 3: tagged union — closed set, exhaustive switch, no heap, fastest
const Shape = union(enum) { circle: f64, rect: struct { w: f64, h: f64 } };
// Pattern 4: vtable struct — open set, dynamic dispatch, you declare the table by hand
const Drawable = struct { ptr: *anyopaque, vtable: *const struct { draw: *const fn (*anyopaque) void } };
// Choosing:
// Closed world → tagged union (fastest, exhaustive)
// Open world, static → comptime generics / anytype (monomorphized)
// Open world, dynamic → hand-built vtable struct (one copy, explicit indirection)
A note on language philosophy that recurs throughout this document: Zig's governing rule is "no hidden control flow, no hidden allocations, no hidden anything": no operator overloading, no destructors, no implicit conversions, no GC, and allocators passed explicitly. Rust's rule is that the compiler should prove memory and data-race safety (ownership + borrow checking). Go's rule is to keep the language small and let a GC plus first-party tooling carry the rest. Those three stances explain most of the differences in every section that follows.
Two more abstraction features differ sharply and are worth covering explicitly, because each language draws the line in a different place.
Iteration. How you write "loop over a custom collection" reveals each design:
- Rust: iteration is a trait. Iterator requires one method, fn next(&mut self) -> Option<Self::Item>, and you get ~70 adapter methods (map, filter, take, zip, fold, collect, …) as defaults for free. Iterators are lazy (nothing runs until consumed) and zero-cost (the adapter chain monomorphises and inlines into a loop with no allocation). for x in collection desugars to IntoIterator::into_iter + next(). This is the backbone of idiomatic Rust.
let total: u64 = (1..=100).filter(|n| n % 3 == 0).map(|n| n * n).sum(); // lazy, fused, no alloc
- Go: historically, iteration over custom types meant exposing a method and a manual loop, or a channel (which allocates and involves the scheduler). Go 1.23 added range-over-func iterators: a function with signature func(yield func(K, V) bool) can be ranged over directly with for k, v := range myIter. This standardised custom iteration without channels, and the iter package plus maps/slices helpers build on it. It is a closure-driven push iterator: values are produced on demand, but the stages are not fused into a single loop the way a Rust adapter chain is.
// Go 1.23+ range-over-func: a "push" iterator
func Multiples(of, max int) func(yield func(int) bool) {
return func(yield func(int) bool) {
for n := of; n <= max; n += of { if !yield(n) { return } }
}
}
for n := range Multiples(3, 100) { _ = n } // ranges over the function
- Zig: there is no iterator trait or language iterator protocol. The convention is a struct with a next() method returning an optional (?T), and you drive it with a while loop using optional-capture syntax. Standard-library types (std.mem.SplitIterator, std.fs.Dir.Iterator, hash-map iterators) all follow this shape by convention, not by an enforced interface.
var it = std.mem.splitScalar(u8, "a,b,c", ',');
while (it.next()) |part| { use(part); } // `while (optional) |capture|` is the iteration idiom
Higher-order-function chains on collections (map/filter/reduce): availability and performance cost. A separate question from "how do I iterate" is "can I write data.map(...).filter(...).reduce(...) as a chain," and the three diverge sharply on both whether it exists in the standard library and what it costs.
- Rust: full chains in std, zero-cost. The Iterator trait ships map, filter, filter_map, flat_map, fold/reduce, scan, take_while, zip, chain, enumerate, collect, sum, and ~60 more, all on any slice, array, Vec, HashMap, BTreeMap, or custom iterator. The performance is the headline: each adapter is a distinct generic type, so a chain monomorphizes, inlines, and fuses into a single loop with no intermediate collection, no heap allocation, and no per-element function-call overhead; in optimized builds the emitted machine code typically matches a hand-written for loop (verifiable with cargo asm/godbolt). CPU/memory benefit: one pass over the data, one set of bounds checks the optimizer often elides, and nothing touches the allocator, so the chain is as cache-friendly as the manual loop while reading like a specification. The one cost to know: laziness is what makes fusion possible, so an explicit intermediate .collect::<Vec<_>>() in the middle of a chain forces a real allocation and breaks fusion (the documented anti-pattern). Real-world use: data transformation pipelines, parsing (filter_map(|s| s.parse().ok())), and stream processing where you want expressiveness and C-speed. For adapters beyond std (group_by, chunk_by, cartesian_product, dedup, itertools::izip!), the itertools crate is the universal extension.
// One fused pass, no allocation: parse, keep evens, square, sum
let s: u64 = lines.iter().filter_map(|l| l.parse::<u64>().ok())
.filter(|n| n % 2 == 0).map(|n| n * n).sum();
- Go: iterator plumbing in std (1.23+), but no map/filter/reduce, by design. Go 1.23 stabilized range-over-func, added the iter package, and gave the slices and maps packages (in std since 1.21) iterator-aware functions, but those give you iterator construction and collection (slices.Values, slices.Collect, slices.Sorted, maps.Keys, maps.Values), not the transformation combinators. There is deliberately no slices.Map or slices.Filter: the Go team's stated position is that a stdlib Filter would "obscure allocation and overhead" and is easily overused, so they prefer you write the for loop (which makes the allocation visible). You can build chains on iter.Seq by writing your own Map/Filter combinators (a few lines each, lazy like Rust), but the ergonomics suffer because Go has no method-chaining on free functions and closures are verbose: a chain reads as nested calls Filter(Map(seq, f), pred), not seq.map(f).filter(pred). Performance cost: a range-over-func chain is closure-driven. Each stage is an indirect call through a yield function the compiler usually cannot inline across stages, so unlike Rust there is no fusion and a per-element call overhead; an eager helper (below) additionally allocates a new slice per stage. For hot loops the idiomatic Go answer remains the explicit for loop. Third-party libraries fill the ergonomic gap: samber/lo (the most popular, 100+ eager helpers such as lo.Map, lo.Filter, lo.Reduce, lo.GroupBy; each allocates a result slice), samber/lo/it (lazy iter.Seq versions, no buffering), samber/lo/parallel (worker-pool parallel map for CPU-bound transforms), and go-functional.
// stdlib gives iterator plumbing, not transforms — you collect, or use a lib:
evens := slices.Collect(func(yield func(int) bool) { // hand-written filter combinator
for _, n := range nums { if n%2 == 0 && !yield(n) { return } }
})
// or, with samber/lo (eager, allocates a new slice):
squares := lo.Map(lo.Filter(nums, func(n, _ int) bool { return n%2 == 0 }),
func(n, _ int) int { return n * n })
- Zig: no functional combinators in std at all. std has iterator structs with next() (SplitIterator, TokenIterator, map/array-hash-map iterators) but no map/filter/reduce adapters and no way to chain them, consistent with the "no hidden control flow, no hidden allocation" philosophy (a map that returns a new collection would allocate; a lazy adapter would add indirection). The idiomatic approach is an explicit while/for loop, which is maximally transparent about cost. Performance angle: the manual loop is as fast as Rust's fused chain (same single pass, no allocation); you simply write it out, trading expressiveness for visible control, and you choose the allocator if a result collection is needed. No functional-combinator library is part of the standard distribution, but community options have emerged: lo.zig is a Lodash-style utility library built around lazy, iterator-first chains with no hidden allocations, and Lazy-Zig ports LINQ-style operators, though both move with the language's pre-1.0 churn. When you genuinely need generic transformation logic, comptime is the usual in-language tool.
// No map/filter/reduce in std — the explicit loop IS the idiom (one pass, no alloc)
var sum: u64 = 0;
for (nums) |n| { if (n % 2 == 0) sum += n * n; }
Operator overloading. Rust allows it through traits (Add, Mul, Index, Deref, PartialEq,
…); implementing std::ops::Add makes + work on your type, and Deref even lets a smart
pointer transparently expose its target's methods (Box<T> behaves like T). Go and Zig
both deliberately omit operator overloading — in both, +, ==, [] work only on built-in
types, and custom types use named methods (a.Add(b), a.eql(b)). The rationale is identical in
spirit ("an operator should mean one obvious thing"), and it is one of the clearest examples of
Rust accepting more language surface for expressiveness while Go and Zig keep the surface small.
A consequence: numeric/matrix/big-integer libraries read naturally in Rust (a * b + c) and
verbosely in Go and Zig (a.Mul(b).Add(c) / c.add(a.mul(b))).
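A minimal sketch of what "implementing std::ops::Add makes + work" looks like in practice. Vec2 and scaled_sum are hypothetical names for this illustration; only the Add and Mul traits come from std.

```rust
use std::ops::{Add, Mul};

/// A small 2-D vector used only to demonstrate operator traits.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2 {
    x: f64,
    y: f64,
}

/// `a + b` for two vectors.
impl Add for Vec2 {
    type Output = Vec2;
    fn add(self, other: Vec2) -> Vec2 {
        Vec2 { x: self.x + other.x, y: self.y + other.y }
    }
}

/// `a * k` for a vector and a scalar; the right-hand type is a trait parameter.
impl Mul<f64> for Vec2 {
    type Output = Vec2;
    fn mul(self, k: f64) -> Vec2 {
        Vec2 { x: self.x * k, y: self.y * k }
    }
}

/// Reads like the maths: scale a, then add b.
fn scaled_sum(a: Vec2, k: f64, b: Vec2) -> Vec2 {
    a * k + b
}
```

Each operator is an ordinary trait method, so a * k + b is still static dispatch to mul and add. The Go or Zig equivalent would spell those calls out by name.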
⚡ Perf — slice/array layout determines bounds-check cost, allocation behavior, and cache locality 🔐 Safety — Rust encodes aliasing and nullability in pointer types; Go and Zig encode less, differently 🧹 DX — Go's
append/slice-of-slice model is uniquely convenient and uniquely footgun-prone
Every language here distinguishes a fixed-size array from a runtime-length slice/view, and each has a different pointer/reference vocabulary. These mechanics drive most real-world performance and aliasing bugs, so they get their own section.
Rust separates [T; N] (array, length in the type, stack-allocatable) from &[T]/&mut [T]
(slice, a fat pointer = (ptr, len), two words) and Vec<T> (owned, heap, (ptr, len, cap),
three words). A slice borrows; a Vec owns. Indexing is bounds-checked (panics on OOB) unless
you use get() (returns Option) or get_unchecked() (unsafe). Slicing is &v[a..b].
let arr: [i32; 4] = [1, 2, 3, 4]; // array — size in type, lives on stack
let s: &[i32] = &arr[1..3]; // slice — (ptr,len) borrow, no copy
let mut v: Vec<i32> = vec![1, 2, 3]; // owned heap buffer (ptr,len,cap)
v.push(4); // may reallocate when len==cap
let first = v.get(10); // None, not a panic
Go has arrays [N]T (value types: copied on assignment/pass, length in the type) and
slices []T, which are a 3-word header (ptr, len, cap) pointing into a backing array.
This is the single most important Go data structure and its semantics are subtle:
a := [3]int{1, 2, 3} // array — VALUE type; passing it copies all elements
s := []int{1, 2, 3} // slice — header into a backing array
s2 := s[1:3] // s2 shares s's backing array; no copy
s2[0] = 99 // mutates s[1] too — aliasing through the shared backing array
s = append(s, 4) // if cap exceeded, allocates a NEW backing array and copies;
// existing slices then point at the OLD array — a classic bug
b := make([]int, 0, 16) // len 0, cap 16: preallocate to avoid append reallocations
Go's append growth, shared backing arrays, and the len-vs-cap distinction are a frequent
source of aliasing bugs (mutating one slice changes another; or a re-slice keeps a huge backing
array alive). Rust's borrow checker forbids exactly these aliased-mutation patterns at compile
time; Go trades that safety for convenience.
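On the Rust side, the same two situations look roughly like this. It is a sketch with made-up function names (mutate_halves, grow_then_view); the commented-out lines mark the pattern the borrow checker rejects.

```rust
/// Two mutable views into one buffer are fine when they provably do not overlap:
/// split_at_mut hands out disjoint halves.
fn mutate_halves(v: &mut [i32]) {
    let mid = v.len() / 2;
    let (left, right) = v.split_at_mut(mid);
    for x in left.iter_mut() {
        *x *= 2;
    }
    for x in right.iter_mut() {
        *x = -*x;
    }
}

/// Growing a Vec while a slice of it is still in use does not compile.
fn grow_then_view() -> (Vec<i32>, i32) {
    let mut v = vec![1, 2, 3];
    // let view = &v[1..]; // a shared borrow that is still used after the push...
    // v.push(4);          // ...makes this line error E0502 (mutable borrow while borrowed)
    v.push(4);
    let view = &v[1..]; // taken after the growth, so it sees the current buffer
    let first = view[0];
    (v, first)
}
```

The stale-slice bug from the Go snippet therefore has no direct Rust spelling: the view either ends before the push or is taken after it.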
Zig separates arrays [N]T (value type, length in type) from slices []T and []const T
(a fat pointer (ptr, len), like Rust). Slicing is arr[a..b]. Bounds are checked at runtime
in Debug/ReleaseSafe; an out-of-bounds access is UB in ReleaseFast/ReleaseSmall. Zig has no Vec/append built into the
language — growth is std.ArrayList(T), which takes an allocator explicitly:
var arr = [_]i32{ 1, 2, 3, 4 }; // array, size inferred ([4]i32), value type
const s: []const i32 = arr[1..3]; // slice — (ptr,len) view, no copy
var list = std.ArrayList(i32){}; // growable, allocator passed to its methods
try list.append(allocator, 5); // explicit allocator: no hidden realloc source
Differences worth naming: Go arrays are copied by value (a real gotcha for the unwary, but
makes value semantics predictable); Rust and Zig arrays are also value types but you almost
always pass slices. Go's slice carries cap and grows via append with shared-backing-array
aliasing; Rust splits this into borrow (&[T]) vs owned (Vec) so aliasing is type-visible;
Zig keeps slices to (ptr, len) and pushes growth into ArrayList with an explicit allocator.
This is where the three diverge most, and where Rust has machinery the others lack entirely.
Go has exactly one pointer type, *T, and these rules: no pointer arithmetic (in safe
code), no *T→*U casts (without unsafe), nil is the zero value, and the GC tracks every
pointer. You take an address with &x and dereference with *p. There is no distinction
between owned and borrowed — everything reachable is kept alive by the GC. unsafe.Pointer is
the escape hatch for arithmetic and reinterpretation.
x := 42
p := &x // *int
*p = 43 // x is now 43
var q *int // nil
// q++ // illegal: no pointer arithmetic
Rust encodes ownership, mutability, and nullability in the type of the reference/pointer. This is the heart of the language:
- &T: shared (immutable) borrow; any number may coexist
- &mut T: exclusive (mutable) borrow; only one at a time, none alongside &T
- Box<T>: owned heap allocation, single owner, freed on drop (like C++ unique_ptr)
- Rc<T>: shared ownership via non-atomic reference counting (single-thread); Weak<T> breaks cycles
- Arc<T>: shared ownership via atomic reference counting (thread-safe)
- Cell<T> / RefCell<T>: interior mutability, i.e. mutation through a shared & reference. Cell is safe statically (values are copied or swapped in and out, so no reference to the interior escapes); RefCell checks the borrow rules at runtime and panics on violation. This is the safety valve for when the static borrow checker is too strict
- *const T / *mut T: raw pointers; deref requires unsafe
let b = Box::new(5); // owned heap value, freed when b drops
let shared = Rc::new(RefCell::new(0)); // shared owner + interior mutability
*shared.borrow_mut() += 1; // runtime-checked mutable borrow
let across_threads = Arc::new(data); // atomic refcount for multi-thread sharing
The combinations Rc<RefCell<T>> (single-thread shared mutable) and Arc<Mutex<T>> (multi-thread
shared mutable) are the idiomatic ways to get the shared mutability that Go gets implicitly (and
unsafely w.r.t. races) and that Zig leaves entirely to you.
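A small runnable sketch of both combinations, using only the standard library. The function names are made up for illustration; the point is that the single-thread version checks borrows at runtime while the multi-thread version serialises access through the lock.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

// Single-threaded shared mutation: two handles, one counter.
fn bump_shared_single_thread() -> i32 {
    let counter = Rc::new(RefCell::new(0));
    let alias = Rc::clone(&counter);
    *alias.borrow_mut() += 1;
    *counter.borrow_mut() += 1;
    let value = *counter.borrow();
    value
}

// Multi-threaded shared mutation: the Mutex serialises access.
fn bump_shared_multi_thread(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    assert_eq!(bump_shared_single_thread(), 2);
    assert_eq!(bump_shared_multi_thread(4, 1_000), 4_000);

    // RefCell enforces "one writer" at runtime rather than at compile time.
    let cell = RefCell::new(0);
    let first = cell.borrow_mut();
    assert!(cell.try_borrow_mut().is_err()); // a second writer is refused
    drop(first);
    assert!(cell.try_borrow_mut().is_ok());
}
```

Swapping Rc<RefCell<_>> into the threaded version does not compile, because Rc is not Send; that is the type system steering you to Arc<Mutex<_>>.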
Zig has the richest pointer vocabulary of the three, distinguishing kinds of pointer at the type level (but with no ownership tracking):
- *T — single-item pointer (points at exactly one T); deref is p.*
- [*]T — many-item pointer (C-array-like; supports pointer arithmetic and indexing, no length)
- []T — slice = many-item pointer plus a length ((ptr, len))
- [*:0]T — sentinel-terminated pointer (e.g. null-terminated C strings: [*:0]const u8)
- [*c]T — C pointer (allows null and arithmetic; only for C interop)
- ?*T — optional pointer; null is a distinct, niche-optimised state (pointer-sized)
- *const T vs *T — const-ness is in the type
var x: i32 = 42;
const p: *i32 = &x; // single-item pointer
p.* = 43; // deref with .*
const many: [*]i32 = ...; // many-item pointer, supports many[3] and arithmetic
const cstr: [*:0]const u8 = "hi"; // null-terminated, for C interop
const maybe: ?*i32 = null; // optional pointer; null is a separate state, not a value
Zig's distinction between "pointer to one" (*T), "pointer to many" ([*]T), and "pointer to
many with length" ([]T) goes further than either Rust or Go: Rust separates &T from &[T], but
a thin raw *const T says nothing about how many elements it addresses, and Go has only *T. The sentinel-terminated
pointer type ([*:0]T) makes null-terminated C strings first-class without conflating them with
length-carrying slices. Zig has no Box/Rc/Arc — ownership and lifetime are entirely manual
(you allocate, you free, you decide sharing), with no compile-time aliasing rules.
The three string models follow directly from the slice/pointer designs above:
- Rust: &str / String are UTF-8-validated slices/buffers (&str is (ptr, len) into UTF-8 bytes).
- Go: string is an immutable (ptr, len) header over bytes (UTF-8 by convention, not enforced); []byte is the mutable form.
- Zig: []const u8 is the string type — a byte slice; UTF-8 is convention (via std.unicode), and [*:0]const u8 is the C-string form.
(Encoding-correctness implications are covered in §10, Serialization & String Handling.)
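As a rough illustration of what "UTF-8-validated" means on the Rust side, the sketch below uses only standard-library calls; the helper names `lengths` and `parse_utf8` are invented for this example. Validation happens once, at the point where bytes become a String, and byte length and character count then diverge for non-ASCII text.

```rust
// Returns (byte length, Unicode scalar count) for an already-validated string.
fn lengths(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

// Bytes become a String only if they are well-formed UTF-8.
fn parse_utf8(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}

fn main() {
    // 'é' is encoded as two bytes, so the byte length exceeds the char count.
    assert_eq!(lengths("héllo"), (6, 5));

    assert_eq!(parse_utf8(vec![0x68, 0x69]), Some("hi".to_string()));
    assert_eq!(parse_utf8(vec![0xff, 0xfe]), None); // not valid UTF-8

    // Slicing is by byte offset and must land on a character boundary.
    assert_eq!("héllo".get(0..2), None); // would split 'é' in half
    assert_eq!("héllo".get(0..3), Some("hé"));
}
```

Go and Zig would accept the 0xff 0xfe bytes as a string without complaint; whether they are valid text is checked only if the program asks.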
Beyond the basic pointer/reference types, each language has distinctive low-level pointer machinery. Listing what is unique to each is the clearest way to see the design gaps.
Rust-only pointer machinery:
- Niche-optimised layout (§1.3). Because the type system knows which bit patterns are invalid, Option<&T>, Option<Box<T>>, Option<NonNull<T>>, and Option<NonZeroU32> cost zero extra bytes — the null/zero pattern encodes None. No other language here folds the null state into the pointer for free across arbitrary types.
- Pin<P> — a pointer wrapper guaranteeing the pointee never moves in memory, required for self-referential structures and the address-stability that async futures need (a future may hold a pointer into its own storage). There is no Go or Zig equivalent; they either don't move values that way (Go's runtime can relocate goroutine stacks, fixing up pointers as it does so, but never moves heap objects) or leave address-stability to manual discipline.
- PhantomData<T> — a zero-sized marker that makes a type "act like" it owns/borrows a T for the purposes of variance, drop-check, and lifetime analysis, without storing one. It lets you carry lifetime/ownership information on a raw pointer. Unique to Rust's type system.
- Lifetimes attached to references. &'a T carries, in the type, how long the borrow is valid; the compiler rejects any use that outlives 'a. This is the mechanism that makes a borrowed slice or pointer statically safe — neither Go (GC keeps everything alive) nor Zig (manual) expresses borrow duration in types.
- NonNull<T>, ptr::offset, ptr::read_volatile, strict provenance APIs. Raw-pointer work is possible but quarantined inside unsafe, with a documented memory model (Stacked/Tree Borrows under active formalisation) that miri can check.
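The PhantomData and NonNull items above can be seen working together in a small sketch. `RawView` is a hypothetical type written for this example, not a standard-library one: it stores a raw pointer, yet the zero-sized marker makes the borrow checker treat it as a borrow of the original slice.

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::ptr::NonNull;

// A non-owning view over a slice, built on a raw pointer.
// PhantomData<&'a T> ties the view to the borrowed data's lifetime
// without adding any bytes to the struct.
struct RawView<'a, T> {
    ptr: NonNull<T>,
    len: usize,
    _borrow: PhantomData<&'a T>,
}

impl<'a, T> RawView<'a, T> {
    fn new(slice: &'a [T]) -> Self {
        RawView {
            // A slice reference is never null, even when the slice is empty.
            ptr: NonNull::from(slice).cast::<T>(),
            len: slice.len(),
            _borrow: PhantomData,
        }
    }

    fn get(&self, index: usize) -> Option<&'a T> {
        if index < self.len {
            // SAFETY: index is in bounds and 'a keeps the slice alive.
            Some(unsafe { &*self.ptr.as_ptr().add(index) })
        } else {
            None
        }
    }
}

fn main() {
    let data = vec![10, 20, 30];
    let view = RawView::new(&data);
    assert_eq!(view.get(1), Some(&20));
    assert_eq!(view.get(3), None);

    // The marker is zero-sized, and NonNull's niche makes Option free.
    assert_eq!(size_of::<RawView<i32>>(), size_of::<(usize, usize)>());
    assert_eq!(size_of::<Option<NonNull<i32>>>(), size_of::<*mut i32>());

    // `drop(data); view.get(0);` would be rejected: data is still borrowed.
}
```

Without the PhantomData field the struct would not compile at all (the lifetime parameter would be unused), which is the compiler insisting that the borrow relationship be stated.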
Go-only pointer machinery:
- unsafe.Pointer conversion rules. Go defines a small set of legal patterns for unsafe.Pointer ↔ *T ↔ uintptr conversions; the GC understands unsafe.Pointer as a live reference but treats uintptr as a plain integer (not a reference). The famous footgun: computing an address as uintptr, then converting back, is only valid if done in a single expression, because between statements the GC may free the object or the runtime may relocate the goroutine stack it lives on. This "uintptr is not a pointer" hazard comes from pairing a precise, concurrent GC with movable stacks.
- Interface internals as two words. A *T stored in an interface{} can be recovered with a type assertion; the runtime tracks the dynamic type in the first word (a type descriptor for interface{}, an itab for non-empty interfaces; see §1.5). The "typed nil" trap (interface != nil even when the underlying pointer is nil) is a direct consequence of this two-word representation.
- Implicit escape analysis. &x on a local may keep the value on the stack or silently promote it to the heap depending on whether the compiler proves it escapes — go build -gcflags=-m reveals the decision. The programmer never writes Box; the compiler decides.
- No pointer arithmetic in safe code. This is a deliberate absence that distinguishes Go from both others — you cannot stride a pointer through an array without unsafe.
Zig-only pointer machinery:
- Distinct pointer shapes in the type (§2.2): *T (one), [*]T (many, arithmetic), []T (many + len), [*:0]T (sentinel-terminated), [*c]T (C pointer). No other language encodes "how many does this point to" in the type.
- @ptrCast / @alignCast / @constCast — explicit, greppable pointer reinterpretation builtins (no silent as-style coercion). Casting away alignment is a separate, named step, so unaligned-access bugs are visible in source.
- @fieldParentPtr — given a pointer to a struct field, recover a pointer to the containing struct. This is the idiom behind intrusive data structures (intrusive linked lists, the pattern the Linux kernel uses via container_of) and Zig's interface vtables. Neither Rust (safe) nor Go offers this directly.
- Sentinel-terminated everything. Sentinels aren't only for strings: [*:0]T and [:0]T generalise "terminated by a known value," so C-API boundaries and parser buffers are typed precisely.
- volatile and arbitrary-address pointers — @as(*volatile u32, @ptrFromInt(0x1000_0000)) for memory-mapped I/O, first-class for bare-metal/driver work, no unsafe block required because the whole language already has this power (safety is enforced by build-mode runtime checks, not by quarantine).
- Bit-level fields and alignment-typed pointers — a packed struct lays out sub-byte fields (e.g. u3, u1) at exact bit offsets, and pointer alignment is part of the pointer type, so *align(1) u32 is a legal under-aligned pointer the compiler handles correctly. (Two 0.16 refinements: pointers are now forbidden as fields inside packed struct / packed union — proposal #24657, because non-byte-aligned pointers can't be represented in most binary formats; store a usize and convert with @ptrFromInt / @intFromPtr instead. And *u8 vs *align(1) u8 are now formally distinct types — though they coerce to each other freely, so in practice it rarely matters, much like u32 vs c_uint.)
Technical background: Pin<P> and the self-referential-future problem. Rust values are
movable by default — the compiler is free to memcpy a value to a new address (returning it,
pushing it into a Vec, etc.). That is normally fine, but it breaks any value that contains a
pointer into itself: after a move, that internal pointer still points at the old address, now
garbage. Self-referential values arise naturally from async: when the compiler lowers an
async fn to a state-machine enum (§5), a borrow held across an .await becomes a field that
points into another field of the same future. If such a future moved after being polled, the
self-pointer would dangle. Pin<P> (where P is a pointer type like Box<T> or &mut T)
encodes "the pointee will never move again" in the type system: once a value is pinned, you can
only get a &mut to it through unsafe, so safe code cannot move it. The Unpin auto-trait
marks types that don't care (most types — moving an i32 is always fine); for them pinning is a
no-op, and Pin::get_mut hands back a plain &mut T in safe code. Only genuinely address-sensitive
types (compiler-generated futures, intrusive nodes) are !Unpin and actually constrained. This is
why Future::poll takes self: Pin<&mut Self> rather than &mut self. Go and Zig have no equivalent because they never auto-generate
self-referential state machines: Go's goroutines keep a real stack (the locals live at stable
stack addresses the runtime manages), and Zig's std.Io futures plus manual memory leave
address-stability to the programmer.
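The Unpin / !Unpin split is easier to see in code. The following is a minimal sketch using only std; `AddressSensitive` is a made-up stand-in for a compiler-generated future, and PhantomPinned is the standard marker that opts a type out of Unpin.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// A stand-in for an address-sensitive type. PhantomPinned makes it !Unpin.
struct AddressSensitive {
    value: u32,
    _pin: PhantomPinned,
}

// Returns the heap address of the pinned value.
fn address_of(p: &Pin<Box<AddressSensitive>>) -> usize {
    &**p as *const AddressSensitive as usize
}

fn pinned_address_is_stable() -> bool {
    let pinned = Box::pin(AddressSensitive { value: 7, _pin: PhantomPinned });
    let before = address_of(&pinned);
    // Moving the Pin<Box<_>> handle moves the pointer, not the pointee.
    let moved = pinned;
    let after = address_of(&moved);
    before == after && moved.value == 7
}

fn main() {
    // Unpin types (almost everything) are unaffected by pinning:
    let mut n = Box::pin(5_i32);
    *n.as_mut().get_mut() = 6; // get_mut is available because i32: Unpin
    assert_eq!(*n, 6);

    assert!(pinned_address_is_stable());
    // For AddressSensitive, Pin::get_mut does not compile (it requires
    // Unpin); only the unsafe get_unchecked_mut yields a &mut.
}
```

Note that Pin adds no runtime machinery here: the guarantee comes entirely from which methods the type system makes available.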
Technical background: pointer provenance and the aliasing model. "Provenance" is the idea
that a pointer carries not just an address but an invisible tag recording which allocation it
was derived from and what it's allowed to access. It matters because the optimiser relies on it:
if two pointers provably have different provenance, the compiler may assume they don't alias and
reorder/cache memory accesses around them. Rust formalises this with experimental Stacked
Borrows / Tree Borrows models — a discipline saying, roughly, that creating a &mut invalidates
other aliases to the same location, enforced as a per-location stack/tree of valid borrows that
miri can check at runtime. Casting a pointer to an integer and back (as usize → as *mut T)
can strip provenance and produce undefined behaviour under these models, which is why Rust added
the strict-provenance APIs (ptr::addr, ptr::with_addr, ptr::map_addr) that keep the tag
explicit. C and Zig have the looser C-style aliasing rules (with restrict/noalias as opt-in
hints); Go sidesteps the whole question by forbidding pointer arithmetic in safe code and letting
the GC own reachability. The practical upshot: Rust can hand the optimiser strong no-alias
guarantees safely (a speed advantage), at the cost of a memory model intricate enough to still be
under active formalisation.
A weak reference points at an object without keeping it alive: it does not contribute to
ownership/refcount, and it goes empty (None/nil) once the last strong reference is gone.
The two reasons weak references exist are (1) breaking reference cycles that would otherwise
leak, and (2) caches/observers that should not keep their targets alive. How much you need
them differs sharply by memory model.
Rust — Weak<T>, the explicit cycle-breaker (used deliberately, not often). Rc<T> and
Arc<T> are reference-counted; a cycle of strong Rcs never reaches refcount zero and leaks
(leaking is memory-safe in Rust — mem::forget, Box::leak, and Rc cycles are all safe, just wasteful). The fix
is Weak<T> (from Rc::downgrade/Arc::downgrade): it holds a non-owning handle you must
upgrade() to a strong Option<Rc<T>> before use, which returns None if the value was
dropped. The canonical case is a parent↔child tree where children point back at parents:
use std::rc::{Rc, Weak};
use std::cell::RefCell;
struct Node {
parent: RefCell<Weak<Node>>, // weak — child must NOT keep parent alive
children: RefCell<Vec<Rc<Node>>>, // strong — parent owns children
}
// access the parent only by upgrading:
if let Some(parent) = node.parent.borrow().upgrade() { /* parent still alive */ }
When to use in Rust: reach for Weak specifically when you have a back-edge in an
ownership graph (child→parent, observer→subject, cache→value) where the back-edge must not own.
Don't reach for it as a default — if your data is a tree or DAG with clear single ownership, plain
Box/& references are simpler and have no refcount cost. The presence of Rc/Arc at all is
already a signal you have shared ownership; Weak is the targeted tool for the cycle within that.
Weak also has a small runtime cost (a second weak-count, and upgrade() is an atomic op for
Arc), so it is a deliberate choice, not a free one.
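A complete, runnable version of the parent/child pattern might look like the sketch below. The helper functions (`new_node`, `adopt`, `parent_name`) are illustrative names, not a standard API; the reference counts in the assertions show that the back-edge does not own its target.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,      // weak: child must not keep parent alive
    children: RefCell<Vec<Rc<Node>>>, // strong: parent owns children
}

fn new_node(name: &str) -> Rc<Node> {
    Rc::new(Node {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

// Link child under parent: strong edge down, weak edge up.
fn adopt(parent: &Rc<Node>, child: &Rc<Node>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(Rc::clone(child));
}

// Upgrading yields None once the parent has been dropped.
fn parent_name(node: &Node) -> Option<String> {
    node.parent.borrow().upgrade().map(|p| p.name.clone())
}

fn main() {
    let root = new_node("root");
    let leaf = new_node("leaf");
    adopt(&root, &leaf);

    assert_eq!(parent_name(&leaf), Some("root".to_string()));
    assert_eq!(Rc::strong_count(&root), 1); // the back-edge does not own
    assert_eq!(Rc::weak_count(&root), 1);
    assert_eq!(Rc::strong_count(&leaf), 2); // `leaf` plus root's child list

    drop(root); // last strong ref gone: root is freed despite the back-edge
    assert_eq!(parent_name(&leaf), None);
    assert_eq!(Rc::strong_count(&leaf), 1); // root's child list was dropped too
}
```

Had `parent` been an Rc instead of a Weak, dropping `root` would have freed nothing, because root and leaf would keep each other's counts above zero.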
Go — the weak package (added 1.24), plus runtime.AddCleanup. Because Go is
garbage-collected, ordinary reference cycles do not leak — the tracing GC reclaims unreachable
cycles automatically, so you almost never need weak references for the cycle-breaking reason that
drives Rust's Weak. Go nonetheless added a weak package in 1.24 for the other reason: caches
and canonicalization maps that must not pin their entries in memory.
import "weak"
wp := weak.Make(obj) // weak.Pointer[T] — does not keep obj alive
if p := wp.Value(); p != nil { // nil once the GC has reclaimed obj
use(p)
}
Paired with runtime.AddCleanup (1.24, the replacement for the error-prone runtime.SetFinalizer),
this enables weak-keyed caches and interning tables (the stdlib unique package is built on
exactly these primitives). Two documented subtleties: reclamation takes at least two GC cycles,
and a weakly-keyed map must not let the value reference the key (or the key stays live).
When to use in Go: only for memory-sensitive caches, canonicalization/interning, and
observer registries where you explicitly want entries to disappear under memory pressure. Not
for cycle-breaking — the GC handles that — and as a general-purpose pointer it adds cost without
benefit: a weak pointer dereferences more slowly (it must check whether the target is still
live) and reclamation lags by at least two GC cycles, so a plain *T is the simpler choice
unless you specifically need the not-keeping-alive property.
Zig — no language weak references; lifetime is manual. Zig has no Rc/Arc/Weak in the
language and no GC, so there is no built-in weak reference. With manual memory management the
"weak" concept is your invention: you might store a raw ?*T optional pointer as a non-owning
back-edge and set it to null when the owner frees the target — but nothing enforces that you
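actually null it, so a stale non-owning pointer is a use-after-free waiting to happen (the debug
allocator can catch some of these at runtime, but Zig's built-in safety checks do not track pointer
lifetimes, so in general it is undefined behaviour in every build mode). If you want the Rust-style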
counted pointer rather than hand-rolling it, the community zigrc library provides Rc/Arc
equivalents (with weak variants) modeled on Rust's; otherwise a hand-written RefCounted(T)
means you build the weak half yourself, including the generation/validity check. Practically:
cycles in Zig are a manual ownership-design problem; you break them by deciding which edge is
non-owning and nulling it on teardown, with ?*T as the representation and defer/errdefer
to sequence the cleanup.
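The "generation/validity check" mentioned above is a general technique, so it can be sketched independently of Zig. The version below is written in Rust purely as an illustration; every name in it (`Handle`, `Slot`, `Arena`) is hypothetical. A hand-rolled Zig version would store the same two fields, an index and a generation counter, and perform the same comparison before dereferencing.

```rust
// A handle is an index plus the generation it was issued for.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle {
    index: usize,
    generation: u32,
}

struct Slot<T> {
    generation: u32,
    value: Option<T>,
}

struct Arena<T> {
    slots: Vec<Slot<T>>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { slots: Vec::new() }
    }

    fn insert(&mut self, value: T) -> Handle {
        // Reuse a free slot if there is one, otherwise grow.
        if let Some(index) = self.slots.iter().position(|s| s.value.is_none()) {
            let slot = &mut self.slots[index];
            slot.value = Some(value);
            Handle { index, generation: slot.generation }
        } else {
            self.slots.push(Slot { generation: 0, value: Some(value) });
            Handle { index: self.slots.len() - 1, generation: 0 }
        }
    }

    // Freeing bumps the generation, which invalidates every older handle.
    fn remove(&mut self, h: Handle) -> Option<T> {
        let slot = self.slots.get_mut(h.index)?;
        if slot.generation != h.generation {
            return None;
        }
        let value = slot.value.take()?;
        slot.generation += 1;
        Some(value)
    }

    // The "weak upgrade": succeeds only if the handle is still current.
    fn get(&self, h: Handle) -> Option<&T> {
        let slot = self.slots.get(h.index)?;
        if slot.generation == h.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }
}

fn main() {
    let mut arena = Arena::new();
    let old = arena.insert("a");
    assert_eq!(arena.get(old), Some(&"a"));
    assert_eq!(arena.remove(old), Some("a"));
    assert_eq!(arena.get(old), None); // stale handle detected, not dereferenced

    let new = arena.insert("b"); // reuses slot 0 with a newer generation
    assert_eq!(new.index, old.index);
    assert_eq!(arena.get(old), None);
    assert_eq!(arena.get(new), Some(&"b"));
}
```

The design choice is to trade a little memory per slot for a stale-reference check that costs one integer comparison, instead of relying on every owner remembering to null its back-edges.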
🔐 Safety — Rust: silent error drops are compile warnings/errors and match is exhaustive; Go: _ discard and non-exhaustive switch are always legal 🧹 DX — Go: multiple returns + named returns + defer wrapping; Rust: ? propagation, match/let-else, thiserror/anyhow 🔍 Debug — Rust: typed error enums + structured data; Go: errors.Is/As/Unwrap chains; Zig: built-in error-return-traces ⚡ Perf — Rust Result and Zig error unions are register values with no happy-path unwind; Go error is an interface value, occasionally heap-boxed
Error handling and control flow are one topic in all three languages because the mechanism you
use to handle an error is the same mechanism you use for control flow: match/if let in
Rust, if err != nil/switch in Go, and try/catch/switch in Zig. This section covers both together.
Rust splits failure into two type-system-visible kinds: recoverable (Result<T, E>,
Option<T>) and unrecoverable (panic!). The first is a value you must handle; the
second unwinds the stack.
#[derive(Debug, thiserror::Error)]
enum AppError {
#[error("database error: {0}")] Db(#[from] sqlx::Error),
#[error("config missing key: {key}")] MissingConfig { key: String },
#[error("IO: {0}")] Io(#[from] io::Error),
}
fn startup(config_path: &str) -> Result<App, AppError> {
let config = read_config(config_path)?; // io::Error → AppError::Io via From
let pool = connect_db(&config.dsn)?; // sqlx::Error → AppError::Db via From
Ok(App::new(config, pool))
}
What ? actually desugars to. The ? operator is not magic — it expands to a match
that early-returns the error after running it through the From conversion:
// let config = read_config(path)?; expands to roughly:
let config = match read_config(path) {
Ok(v) => v,
Err(e) => return Err(<AppError as From<_>>::from(e)),
};
The From call is why ? can convert an io::Error into your AppError automatically:
the compiler inserts From::from at the return site, monomorphised and usually inlined to
nothing. On the success path there is zero overhead — no exception table, no unwind, just a
branch on the discriminant the CPU's predictor handles trivially. (Before ? was stabilised in
Rust 1.13, the try! macro did the same job; today ? is defined through the still-unstable Try
trait, which is how it also works on Option and ControlFlow.) Because Result carries the error by value, propagating it is a move, not
a heap allocation — unless your E is a Box<dyn Error>, which trades one allocation for
type erasure (the anyhow/eyre approach).
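For a version that runs without thiserror or sqlx, here is a std-only sketch. `ConfigError`, `parse_port`, and `parse_port_desugared` are names invented for this example; the second function writes out by hand roughly what ? does in the first, so the two can be compared directly.

```rust
use std::num::ParseIntError;

#[derive(Debug, PartialEq)]
enum ConfigError {
    MissingKey(&'static str),
    BadPort(ParseIntError),
}

// This impl is what `?` calls when a ParseIntError reaches the return site.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadPort(e)
    }
}

fn parse_port(raw: Option<&str>) -> Result<u16, ConfigError> {
    let text = raw.ok_or(ConfigError::MissingKey("port"))?; // already a ConfigError
    let port = text.trim().parse::<u16>()?; // ParseIntError -> ConfigError via From
    Ok(port)
}

// The same function with `?` written out by hand.
fn parse_port_desugared(raw: Option<&str>) -> Result<u16, ConfigError> {
    let text = match raw {
        Some(t) => t,
        None => return Err(ConfigError::MissingKey("port")),
    };
    match text.trim().parse::<u16>() {
        Ok(port) => Ok(port),
        Err(e) => Err(ConfigError::from(e)),
    }
}

fn main() {
    assert_eq!(parse_port(Some(" 8080 ")), Ok(8080));
    assert_eq!(parse_port(None), Err(ConfigError::MissingKey("port")));
    assert!(matches!(parse_port(Some("http")), Err(ConfigError::BadPort(_))));

    // Both spellings agree on every input, including an out-of-range port.
    for input in [Some("443"), Some("70000"), None] {
        assert_eq!(parse_port(input), parse_port_desugared(input));
    }
}
```

Deleting the From impl turns the second ? into a compile error, which is a quick way to confirm that the conversion is resolved statically rather than at runtime.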
panic! unwinds the stack by default: it walks frames running each Drop (the same
mechanism C++ exceptions use, driven by the same DWARF/.eh_frame unwind tables), then
either terminates the thread or is caught at a boundary with catch_unwind. You can switch
the whole binary to panic = "abort" in the release profile, which drops the cleanup landing pads
(and, on most targets, much of the unwind-table data) — smaller binary, often slightly faster, but a panic becomes an immediate process abort (SIGABRT on Unix) with
no cleanup. Libraries that must not leak a panic across an FFI boundary (unwinding into C is
UB) wrap their entry points in catch_unwind:
// Fault-tolerant boundary: one bad request cannot unwind into the C caller or kill the server
let result = std::panic::catch_unwind(|| handle_request(req));
match result {
Ok(response) => response,
Err(_) => Response::internal_server_error(),
}
Pattern matching is the handling mechanism. match destructures, binds, guards, and is
exhaustive — omitting a variant is a compile error. The compiler lowers a match over an
enum to a jump table on the discriminant when the arms are dense, or a decision tree
otherwise — the same code a hand-written C switch would produce, but with the
exhaustiveness guarantee on top.
match event {
Event::Key { code: KeyCode::Enter, mods } if mods.ctrl() => submit(), // arm guard
Event::Key { code, .. } => echo(code),
Event::Click { x, y, btn: Btn::Left} => select(x, y),
Event::Resize(w, h) => resize(w, h),
Event::Quit => std::process::exit(0),
// miss a variant → compile error, not a runtime fall-through
}
let-else (stable 1.65) handles the "bind-or-diverge" case without nesting; if let
handles single-pattern matches; if let chains (stable 1.88, 2024 edition only) combine several:
let Some(user_id) = session.get("user_id") else { return Err(Unauthorized) };
let Ok(user_id) = user_id.parse::<u64>() else { return Err(BadRequest) };
As of Rust 1.95 (April 2026), if let may also appear inside a match-arm guard, so an arm
can both match a pattern and conditionally bind a second one without falling through to a nested
if:
match event {
// 1.95: an `if let` guard — match Resize AND succeed at the inner bind, else try later arms
Event::Resize(size) if let Some(win) = focused_window() => win.resize(size),
Event::Resize(_) => {} // no focused window
_ => {}
}
And because if, match, loop, and blocks are expressions, control flow composes into
values without temporaries — a match arm that returns or panic!s has type ! (never),
which unifies with any other arm's type, so the diverging branch needs no dummy value:
let port: u16 = match cfg.port {
Some(p) => p,
None => return Err(AppError::MissingConfig { key: "port".into() }), // type !
};
Go returns errors as the last value by convention. Multiple return values are a first-class language feature (not tuple sugar): the function ABI returns them in registers/stack slots directly, and the caller binds them positionally.
func readConfig(path string) (*Config, error) {
data, err := os.ReadFile(path)
if err != nil { return nil, fmt.Errorf("readConfig: %w", err) }
var cfg Config
if err := json.Unmarshal(data, &cfg); err != nil {
return nil, fmt.Errorf("readConfig: unmarshal: %w", err)
}
return &cfg, nil
}
What a Go error is, at the machine level. error is an interface: a two-word
(itab, data) value (see §1.5). errors.New("x") allocates a *errorString on the heap;
fmt.Errorf("...: %w", err) allocates a *wrapError holding the message and the wrapped
error, forming a linked chain. errors.Is walks that chain calling Unwrap(); errors.As
walks it doing type assertions. The cost is one allocation per wrap and a pointer-chase per
inspection — negligible for error paths, but it does mean errors are GC-tracked heap objects,
not stack values as in Rust. Sentinel errors (var ErrNotFound = errors.New(...)) are
allocated once at package init and compared by identity.
var ErrNotFound = errors.New("not found")
if errors.Is(err, ErrNotFound) { /* walks Unwrap() chain */ }
var pathErr *fs.PathError
if errors.As(err, &pathErr) { /* finds first *fs.PathError in the chain */ }
// Go 1.26: errors.AsType is a generic, type-safe (and faster) form of errors.As
if pe, ok := errors.AsType[*fs.PathError](err); ok { _ = pe.Path }
Named return values document each position and let a defer rewrite the result on any
return path — the standard idiom for adding context or converting a recovered panic into an
error:
func divide(a, b float64) (result float64, err error) {
defer func() {
if err != nil { err = fmt.Errorf("divide(%g, %g): %w", a, b, err) }
}()
if b == 0 { err = errors.New("division by zero"); return }
return a / b, nil
}
panic/recover is Go's unwinding mechanism. A panic runs deferred functions up the
stack; a recover() inside a deferred function stops the unwind and returns the panic
value. Unlike Rust's catch_unwind, there is no compile-time marker (UnwindSafe) for
state that might be left inconsistent — you reason about it manually.
func safeHandler(w http.ResponseWriter, r *http.Request) {
defer func() {
if rc := recover(); rc != nil {
log.Printf("panic: %v\n%s", rc, debug.Stack())
http.Error(w, "internal error", 500)
}
}()
actualHandler(w, r)
}
Control flow / matching. Go's switch is clean but statement-oriented, non-exhaustive,
and without destructuring. Its type switch is the idiomatic way to recover a concrete type
from an interface — effectively Go's pattern match, but only over dynamic type, and still
non-exhaustive:
switch v := shape.(type) {
case Circle: return math.Pi * v.Radius * v.Radius
case Rectangle: return v.Width * v.Height
// omit a case → no compile error; an unmatched type skips the switch silently (Go cases never fall through implicitly)
}
Zig's error handling is arguably its most refined feature, and it sits between Rust and Go:
errors are values like Rust, but they are a built-in language construct with their own
syntax rather than a generic Result enum.
An error set is an enum-like set of error tags; an error union E!T is a value that
is either an error from set E or a payload T. The !T shorthand lets the compiler
infer the error set from everything the function can return — you rarely write the set by
hand:
const OpenError = error{ NotFound, PermissionDenied };
fn readConfig(io: std.Io, path: []const u8) !Config { // inferred error set
const data = try io.readFileAlloc(allocator, path, max_size); // `try` = propagate on error
defer allocator.free(data);
return parseConfig(data) catch |err| { // `catch` handles or transforms
std.log.err("parse failed: {}", .{err});
return err;
};
}
What try and catch compile to. try expr is exactly expr catch |err| return err —
the same early-return-on-error that Rust's ? desugars to, but with no From conversion
inserted (Zig does not auto-convert error types; you either widen the set or map explicitly).
An error union is represented as a tagged value — typically the payload plus an error-code
discriminant — passed in registers; the happy path is a branch, no allocation, no unwinding.
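Ignoring an error union is a compile error: you must try it, catch it, or unwrap it with
if/switch. Even _ = is rejected for an error union, so a deliberate discard is spelled catch {}. This is stricter than Rust's unused-Result warning and beats Go's silent _.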
errdefer — cleanup only on the error path. Zig has no destructors, so cleanup is
manual via defer (always runs at scope exit) and errdefer (runs only if the scope
returns an error). This makes correct-by-construction resource handling explicit:
fn createConnection(allocator: std.mem.Allocator) !*Connection {
const conn = try allocator.create(Connection);
errdefer allocator.destroy(conn); // freed ONLY if a later step fails
conn.socket = try openSocket();
errdefer conn.socket.close(); // closed ONLY if a later step fails
try conn.handshake();
return conn; // success: neither errdefer runs
}
Error return traces. In Debug/ReleaseSafe, Zig attaches an error return trace — not a
full stack trace, but the chain of try propagation points the error passed through, which
is often more useful than a stack trace for "where did this error come from." Rust gets
similar context only via anyhow's backtraces; Go via manual %w wrapping.
No panic-as-error-handling. Zig distinguishes recoverable errors (error unions) from
programmer bugs (unreachable, failed asserts, integer overflow in safe builds) which
trigger a panic that aborts. There is no recover/catch_unwind equivalent for normal
control flow — panics are meant to be fatal. switch on a tagged union or error set is
exhaustive exactly like Rust's match.
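For contrast with Zig's always-fatal panics, the Rust boundary mechanism described earlier in this section can be exercised with a short std-only sketch. `handle` and `handle_at_boundary` are hypothetical names, and the example assumes the default panic = "unwind" setting; under panic = "abort" the process would terminate instead of returning an Err.

```rust
use std::panic;

// A stand-in request handler that panics on one kind of input.
fn handle(input: i32) -> i32 {
    if input < 0 {
        panic!("negative input: {input}");
    }
    input * 2
}

// Boundary wrapper: converts a panic into an ordinary error value.
fn handle_at_boundary(input: i32) -> Result<i32, String> {
    panic::catch_unwind(|| handle(input)).map_err(|payload| {
        // Panic payloads are usually a &str or a String.
        if let Some(s) = payload.downcast_ref::<&str>() {
            s.to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "unknown panic".to_string()
        }
    })
}

fn main() {
    assert_eq!(handle_at_boundary(21), Ok(42));
    // The default panic hook still prints the message to stderr here.
    assert_eq!(
        handle_at_boundary(-1),
        Err("negative input: -1".to_string())
    );
}
```

This is meant for isolation boundaries (FFI entry points, per-request workers), not as a substitute for Result in ordinary control flow.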
⚡ Perf — Rust/Zig: no GC, deterministic frees, custom/explicit allocators; Go: sub-ms Green Tea GC 🔐 Safety — Rust: use-after-free, double-free, dangling refs eliminated at compile time 🧹 DX — Go: no memory management burden; Rust: RAII handles resources automatically
Every value in Rust has exactly one owner. When the owner goes out of scope, the value is freed — deterministically, immediately, with no GC involvement. This is RAII: any resource (file, socket, mutex lock, database connection) attached to an owned value is released automatically, even through panics.
fn process_file(path: &str) -> io::Result<()> {
let file = File::open(path)?; // file opened here
let lock = mutex.lock().unwrap(); // lock acquired here
let client = db.connect()?; // connection opened here
// ... do work ... (mutex and db are assumed to be in scope)
Ok(())
} // client, lock, and file all Drop here, in reverse declaration order — guaranteed
// No defer, no try-finally, no risk of forgetting to close
The borrow checker enforces a fundamental rule: at any point in time, a value may have either one mutable reference or any number of immutable references — never both. This eliminates iterator invalidation, aliased mutations, and data races without any runtime cost.
let mut v = vec![1, 2, 3];
let first = &v[0]; // immutable borrow
v.push(4); // compile error: cannot mutate while borrowed
// (push might reallocate, invalidating `first`)
println!("{}", first);
Lifetime annotations prove that references do not outlive the data they point to. A dangling pointer is a compile error, not a runtime segfault.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
if x.len() > y.len() { x } else { y }
}
// The compiler proves the returned reference lives at least as long as both inputs.
// Returning a reference to a local variable is a compile error.
Memory layout can be precisely controlled for performance and FFI:
#[repr(C, align(64))] // C-compatible layout, cache-line aligned — avoids false sharing
struct WorkerState { active: bool, counter: u64 } // align(64) already rounds the size up to 64 bytes; manual padding is unnecessary
#[repr(C, packed)] // declared field order, no padding — network packet header, file format
struct Header { magic: u32, length: u16, checksum: u16 }
Custom allocators associate different allocation strategies with individual data structures:
// Per-request arena: all allocations freed in O(1) at request end
// Note: Vec::new_in needs nightly (#![feature(allocator_api)]); BumpAllocator is an illustrative arena type
let arena = BumpAllocator::new(64 * 1024);
let headers: Vec<Header, &BumpAllocator> = Vec::new_in(&arena);
let body_parts: Vec<&str, &BumpAllocator> = Vec::new_in(&arena);
// arena drops here — single deallocation, zero per-item free overhead
Niche optimisation gives Option<Box<T>> the same size as Box<T> (null pointer = None),
and Option<NonZeroU32> the same size as u32 (zero = None). No extra discriminant byte.
assert_eq!(size_of::<Option<Box<i32>>>(), size_of::<*const i32>()); // 8 bytes each on 64-bit targets
assert_eq!(size_of::<Option<bool>>(), 1); // None is stored in a bit pattern that is invalid for bool
Integer overflow panics in debug builds (finds bugs early) and wraps in release.
Explicit arithmetic families (checked_add, saturating_add, wrapping_add) let you
state your intent — each typically compiles to a handful of hardware instructions.
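The three arithmetic families behave the same in debug and release builds, which is the point of spelling them out. A short sketch follows; `add_three_ways` and `total_len` are illustrative helpers written for this example, not library functions.

```rust
// Returns (checked, saturating, wrapping) results of a + b for u8.
fn add_three_ways(a: u8, b: u8) -> (Option<u8>, u8, u8) {
    (a.checked_add(b), a.saturating_add(b), a.wrapping_add(b))
}

// A length computation where overflow must surface as an error.
fn total_len(header: u32, body: u32) -> Result<u32, &'static str> {
    header.checked_add(body).ok_or("length overflows u32")
}

fn main() {
    // In range: all three families agree.
    assert_eq!(add_three_ways(250, 5), (Some(255), 255, 255));

    // Out of range: None, clamp to MAX, or wrap modulo 256 (260 - 256 = 4).
    assert_eq!(add_three_ways(250, 10), (None, 255, 4));

    // overflowing_add returns the wrapped value plus an overflow flag.
    assert_eq!(250u8.overflowing_add(10), (4, true));

    assert_eq!(total_len(16, 1_000), Ok(1_016));
    assert_eq!(total_len(u32::MAX, 1), Err("length overflows u32"));
}
```

Using checked_add for sizes that come from untrusted input is the usual choice, since a wrapped length is the classic precursor to an undersized buffer.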
Go uses a concurrent, tricolor mark-and-sweep garbage collector that runs alongside application code. Go 1.26 enables the Green Tea GC by default. Instead of traversing the object graph pointer-by-pointer, Green Tea scans contiguous memory spans, which gives much better cache behaviour on modern multi-core CPUs (the release notes cite a 10–40% reduction in GC overhead for programs that lean heavily on the collector). Sub-millisecond stop-the-world pauses are typical for most server workloads.
// GOGC controls how much the heap grows before a GC cycle
// GOMEMLIMIT (since 1.19) sets a soft memory limit — critical for containers
// Both can also be set at runtime through the runtime/debug package:
debug.SetMemoryLimit(512 * 1024 * 1024) // 512 MiB soft limit
debug.SetGCPercent(200) // collect less often — lower CPU overhead
defer provides lightweight scope-based cleanup without implementing a destructor type.
Multiple defers run in LIFO order and execute even through panics.
func processFile(path string) (err error) {
f, err := os.Open(path)
if err != nil { return }
defer f.Close() // registered once, runs always
mu.Lock()
defer mu.Unlock() // lock and paired unlock visible together
// ...
return nil
}
sync.Pool reduces GC pressure for frequently allocated and freed objects. The pool holds
objects between GC cycles, allowing reuse without allocation.
var bufPool = sync.Pool{
New: func() any { return make([]byte, 0, 4096) },
}
buf := bufPool.Get().([]byte)
buf = doWork(buf)
bufPool.Put(buf[:0]) // return to pool; GC may reclaim if memory is tight
Go zero-initialises all memory. Every variable starts at its zero value (0, false,
nil, ""). This prevents certain uninitialised-memory bugs but can mask missing
initialisation — a false flag or 0 counter may look correct even when it was
never set.
Low-level: how the two allocators actually behave. Go's runtime allocator is a
TCMalloc-derived design: per-P (per-logical-processor) mcache thread-local free lists feed
from a central mcentral, backed by the mheap, which manages memory in spans (runs of 8 KB pages), each
dedicated to one of ~70 size classes. Small objects (up to 32 KB) are allocated from the mcache
with no lock on the fast path; large objects go straight to the mheap. While the concurrent GC is
marking, pointer writes go through a write barrier and allocating goroutines can be drafted into mark assists. The collector itself is a
concurrent, non-generational tricolor mark-sweep with a hybrid Dijkstra/Yuasa write barrier;
Go 1.26's Green Tea GC (experimental in 1.25, now on by default) changes the marking
strategy to scan memory span-by-span (contiguous, cache-friendly) rather than chasing the object
graph pointer-by-pointer. The official notes quote a 10–40% reduction in GC overhead on
collection-heavy real-world programs, with a further ~10% on newer amd64 CPUs (Intel Ice Lake /
AMD Zen 4+) where it uses vector instructions to scan small objects. GOGC sets the heap-growth
trigger (default 100% = collect when the heap doubles since last live size); GOMEMLIMIT (1.19+)
imposes a soft memory limit so containers are less likely to be OOM-killed — the pair GOGC=off + GOMEMLIMIT=<cap> is a
common latency-tuning pattern.
Rust has no runtime allocator of its own — it calls the system allocator (malloc/
jemalloc/mimalloc, selectable via #[global_allocator]) through the GlobalAlloc trait.
There is no write barrier, no GC metadata, no scanning. A Vec<T> is exactly
(ptr, len, cap); dropping it calls dealloc immediately. The allocator API (Vec::new_in,
Box::new_in) lets you bind an arena or pool allocator to individual structures, so a
per-request bump allocator frees thousands of objects with a single pointer reset — a pattern
Go can only approximate with sync.Pool object reuse, which recycles individual objects but
cannot do region-style bulk free. The cost of Rust's model is that you (well, the borrow
checker) must prove every free is safe; the benefit is deterministic, scan-free, barrier-free
memory traffic — which is why Rust holds flatter tail latencies under allocation-heavy load.
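A minimal sketch of the two claims above, using only the standard library: a Vec handle is three machine words, and memory is released at a point you can read off the source. `Noisy`, `drop_order`, and `vec_handle_words` are hypothetical names made up for this illustration, and the three-word size is what current compilers produce rather than a documented layout guarantee.

```rust
use std::cell::RefCell;
use std::mem::size_of;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical helper type: records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order in which values were destroyed.
pub fn drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Noisy { name: "a", log: Rc::clone(&log) };
        let b = vec![
            Noisy { name: "b0", log: Rc::clone(&log) },
            Noisy { name: "b1", log: Rc::clone(&log) },
        ];
        drop(b); // elements are destroyed and the buffer is deallocated right here
        log.borrow_mut().push("after-b");
    } // `_a` goes out of scope and is dropped here
    let order: Vec<&'static str> = log.borrow().clone();
    order
}

/// Size of the Vec handle in machine words (the heap buffer is separate).
pub fn vec_handle_words() -> usize {
    size_of::<Vec<u64>>() / size_of::<usize>()
}

fn main() {
    println!("{:?}", drop_order());
    println!("Vec handle = {} words", vec_handle_words());
}
```

Nothing here scans memory or defers the free: the deallocation happens at `drop(b)` and at the closing brace, which is the determinism the paragraph describes.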
Zig — Explicit Allocators Everywhere, defer, and No Hidden Allocation
Zig's memory model is the most explicit of the three: there is no GC and no borrow
checker, and — crucially — nothing in the standard library allocates without being handed
an allocator. An allocator is a value (std.mem.Allocator, the fat-pointer interface
struct from §1) that you pass into any function that needs heap memory. This makes
allocation a visible, auditable part of every signature.
// The allocator is a parameter — you can SEE that this function allocates
fn loadUsers(allocator: std.mem.Allocator, io: std.Io) ![]User {
var list = std.ArrayList(User){};
errdefer list.deinit(allocator); // free on error path
// ... fill list ...
return list.toOwnedSlice(allocator);
}
Allocator as a strategy you choose per scope. Because the allocator is injected, you pick the strategy at the call site, and the same data structure works with any of them:
- std.heap.GeneralPurposeAllocator (renamed/refined across versions): the debug default; detects leaks, double-frees, and use-after-free at runtime, printing the allocation stack.
- std.heap.ArenaAllocator: bump allocation; deinit() frees everything at once in O(1). This is the idiomatic answer to per-request allocation: wrap the request in an arena, never free individual objects, drop the arena at the end. (Rust reaches this via the allocator API; Go has no supported region free.) As of 0.16 the arena is itself thread-safe and lock-free, so several threads can allocate from one arena without a wrapping mutex. This is part of a broader 0.16 shift in which the separate ThreadSafeAllocator wrapper was removed as an anti-pattern (the guidance is to make the allocator itself lock-free rather than serialize it behind a lock, which also lets an allocator back a std.Io instance without needing one).
- std.heap.FixedBufferAllocator: allocates out of a caller-provided buffer (stack or static); zero heap, zero syscalls, well suited to embedded targets and hot paths.
- std.heap.page_allocator / c_allocator / smp_allocator: backends for raw pages, libc malloc, or a lock-free general-purpose multi-threaded allocator.
// Per-request arena: bulk-free pattern that Go can't express and Rust needs a feature for
var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
defer arena.deinit(); // frees EVERYTHING allocated below, at once
const a = arena.allocator();
const headers = try parseHeaders(a, raw); // none of these are individually freed
const body = try parseBody(a, raw);
// arena.deinit() reclaims it all in O(1)
defer/errdefer instead of destructors. Zig has no RAII and no Drop. Cleanup is
explicit: defer x.deinit() runs at scope exit. This is more verbose than Rust's automatic
drop and more honest than Go's GC — you can see every free. The cost is that forgetting a
defer leaks (caught by the GPA in debug) and there is no compile-time guarantee a freed
pointer isn't reused. Zig's safety net is runtime: in Debug and ReleaseSafe, allocators
poison freed memory and bounds-check slices; in ReleaseFast those checks are off and misuse
is UB.
The three models can borrow from one another. Below: how to get a Rust-style bump arena, a Go off-heap reusable arena that sidesteps the GC, and a Zig leak-proof resource discipline suitable for a memory-constrained engine.
Rust — bumpalo: a bump allocator for masses of small, same-lifetime allocations. When you
allocate thousands of small objects that all die at the same moment (an AST during one compile
pass, every entity spawned in one game frame, all nodes parsed from one request), individual
Box/drop traffic is wasteful: each allocation hits the global allocator and each drop frees
individually. A bump allocator holds one chunk and an offset pointer; each allocation just
returns the pointer and advances the offset (a few instructions, no free-list search), and the
entire region is reclaimed at once when the arena is dropped.
use bumpalo::Bump;
struct Particle { pos: [f32; 3], vel: [f32; 3], life: f32 }
fn simulate_frame(prev: &[Particle]) {
let bump = Bump::new(); // one chunk from the global allocator
// Thousands of per-frame allocations — each is just "advance the offset pointer"
let mut live: bumpalo::collections::Vec<&mut Particle> =
bumpalo::collections::Vec::new_in(&bump);
for p in prev {
if p.life > 0.0 {
let np = bump.alloc(Particle { pos: p.pos, vel: p.vel, life: p.life - 0.016 });
live.push(np); // reference tied to `bump`'s lifetime
}
}
render(&live);
} // `bump` drops here → all particles freed in O(1), no per-object deallocation
- CPU benefit: allocation is pointer-bump, not a malloc; deallocation is one chunk free instead of N, so measured wins are large in allocation-heavy passes. Memory/cache benefit: objects are laid out contiguously in the chunk, so iterating them is cache-friendly. IO/real-time benefit: no allocation stalls mid-frame, which is why game engines use per-frame ("scratch") arenas.
- The cost to know: bump.alloc(x) does not run x's destructor when the arena drops. The memory is reclaimed but Drop is skipped, so don't bump-allocate types that own resources (files, sockets) unless you wrap them in bumpalo::boxed::Box<T> (which does run Drop). The borrow checker still applies: a bump-allocated reference cannot outlive the arena, so use-after-free is a compile error here, unlike the Go equivalent below. Use Bump::set_allocation_limit to cap memory in constrained environments; bumpalo is no_std.
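The "one chunk and an offset pointer" mechanism can be sketched in safe Rust as follows. This is not bumpalo's implementation: `BumpArena` is a hypothetical type that hands out byte offsets instead of references, it never grows, and alignment is computed relative to the start of the chunk (an assumption made to keep the example free of `unsafe`).

```rust
/// Offset-based bump arena over one pre-allocated chunk (illustrative only).
pub struct BumpArena {
    chunk: Vec<u8>,
    offset: usize, // everything below this offset is in use
}

impl BumpArena {
    pub fn with_capacity(bytes: usize) -> Self {
        Self { chunk: vec![0; bytes], offset: 0 }
    }

    /// Reserve `size` bytes aligned to `align` (a power of two, relative to the chunk start).
    /// Returns the start offset, or None when the chunk is exhausted.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        let start = self.offset.checked_add(align - 1)? & !(align - 1); // round up
        let end = start.checked_add(size)?;
        if end > self.chunk.len() {
            return None;
        }
        self.offset = end; // the whole allocation is this one store
        Some(start)
    }

    /// Access a previously reserved region.
    pub fn bytes_mut(&mut self, start: usize, size: usize) -> &mut [u8] {
        &mut self.chunk[start..start + size]
    }

    pub fn used(&self) -> usize {
        self.offset
    }

    /// Bulk free: every allocation is released by rewinding the offset.
    pub fn reset(&mut self) {
        self.offset = 0;
    }
}

fn main() {
    let mut arena = BumpArena::with_capacity(64);
    let a = arena.alloc(3, 1).expect("fits");
    let b = arena.alloc(8, 8).expect("fits");
    arena.bytes_mut(b, 8).copy_from_slice(&42u64.to_le_bytes());
    println!("a={a} b={b} used={}", arena.used());
    arena.reset();
    println!("after reset used={}", arena.used());
}
```

The point of the sketch is the shape of the costs: `alloc` is a round-up, a bounds check, and one store, and `reset` releases everything regardless of how many allocations were made.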
Go — a GC-cooperative arena via unsafe, keeping the GC's safety while it skips the tracing.
Go's official arena package (proposal #51317) is experimental and on hold indefinitely
(GOEXPERIMENT=arena, may be removed). A naive workaround is raw mmap off-heap memory the GC
ignores, but that forfeits safety entirely (any internal pointer can dangle). A more refined
technique — laid out in Miguel Young de la Sota's "Cheating the Reaper" (2025) — instead stays
on the Go heap and exploits two GC implementation details to stay both fast and able to hold
pointers into itself: (1) any live pointer into an allocation keeps the whole allocation alive,
and (2) precise GC only traces words a type marks as pointers. The arena allocates large
pointer-free chunks ([N]uintptr, which the GC never scans through, so allocation is a cheap
pointer-bump), and stitches them together so the whole arena stays alive as long as any vended
pointer is.
// Bump-allocate into large word-arrays; the GC treats each chunk as pointer-free,
// so Alloc is just "advance a pointer" with no write barrier on the hot path.
type Arena struct {
next uintptr // a uintptr, NOT unsafe.Pointer — avoids GC write barriers
left, cap uintptr
chunks []unsafe.Pointer // keeps every chunk reachable (so nothing is dropped)
}
func New[T any](a *Arena) *T { // type-safe wrapper over Alloc
var t T
return (*T)(a.Alloc(unsafe.Sizeof(t), unsafe.Alignof(t)))
}
func (a *Arena) Alloc(size, align uintptr) unsafe.Pointer {
size = (size + 7) &^ 7 // round up to 8-byte (max) alignment
words := size / 8
if a.left < words { // grow: allocate a new chunk
a.cap = max(8, a.cap*2, nextPow2(words))
p := a.allocChunk(a.cap) // chunk has a trailing *Arena back-pointer
a.next, a.left = uintptr(p), a.cap
a.chunks = append(a.chunks, p)
}
p := a.next
a.next += size // pure pointer-bump, no branch, no barrier
a.left -= words
return unsafe.Pointer(p)
}
The subtlety that makes it safe to store arena pointers inside the arena: each chunk is
allocated (via reflect.StructOf) with one real unsafe.Pointer slot at its end holding a
back-pointer to the *Arena. Because that slot is a traced pointer, the GC, on seeing any pointer
into a chunk, marks the chunk, follows the back-pointer, marks the Arena, and through
a.chunks marks every other chunk — so an arena-allocated *T that points at another
arena-allocated value is kept alive correctly, which a raw mmap arena cannot guarantee.
- CPU benefit: the author's benchmarks show ~2–4× higher allocation throughput than new across object sizes (e.g. [64]int: ~7.4 GB/s vs ~2.9 GB/s), because the common path is a pointer-bump and the chunks are pointer-free so the GC skips scanning them. Replacing next unsafe.Pointer with next uintptr removes the write barrier from the hot store, worth ~20% under GC-heavy churn. Memory benefit: memory is requested in large chunks and the arena can be Reset into a sync.Pool for reuse, avoiding repeated zeroing. IO/latency benefit: far less per-object GC tracing means fewer and shorter mark phases, flattening tail latency for allocation-heavy request handlers, AST/IR construction, and protobuf parsing.
- The cost to know (the honest tradeoff): this relies on unsafe and on documented-but-unpromised GC behavior (the post argues Hyrum's Law makes it durable, but Go gives no compatibility guarantee for unsafe). go vet flags the uintptr-as-pointer trick. The safety property holds only if the pointers you store into the arena are themselves arena pointers: store a plain new(int) there and runtime.GC() will free it under you (use-after-free). The arena is not goroutine-safe without a lock. Packaged alternatives exist (storozhukBM/allocator, whose arena.Ptr is a value the GC won't trace; goumem for raw mmap). The point stands: Go can match C-style arena performance, but you implement and audit the GC-cooperation yourself rather than getting it free.
Zig — RAII-like discipline that guarantees no leak, for a beginner building an engine or TCP
stack under tight memory limits. Zig has no destructors, but its init/deinit + defer/
errdefer convention plus the allocator-as-a-value model let you build a proxy allocator — a
new allocator that wraps any underlying one (the heap, an arena, a fixed buffer) and layers on
conveniences (tracking, a hard budget, auto-free). Because std.mem.Allocator is just a
(ptr, vtable) fat pointer, your proxy implements the same interface and is a drop-in anywhere an
allocator is expected — so a beginner can get leak-proofing, a memory cap, and bulk cleanup
without a GC and without touching every call site.
const std = @import("std");
/// A proxy allocator: wraps ANY backing allocator and adds (1) a hard byte budget,
/// (2) live-usage tracking, and (3) deinit() that frees everything at once.
/// It *is* a std.mem.Allocator, so existing code takes it unchanged.
const TrackingAllocator = struct {
    const Alignment = std.mem.Alignment;
    /// Each live block is remembered with its alignment so deinit() can free it correctly.
    const Record = struct { buf: []u8, alignment: Alignment };

    backing: std.mem.Allocator, // the wrapped allocator (heap, arena, fixed buffer…)
    limit: usize,
    in_use: usize = 0,
    records: std.ArrayList(Record) = .empty,

    fn init(backing: std.mem.Allocator, limit: usize) TrackingAllocator {
        return .{ .backing = backing, .limit = limit };
    }
    /// Hand out an std.mem.Allocator backed by this proxy's vtable.
    fn allocator(self: *TrackingAllocator) std.mem.Allocator {
        return .{ .ptr = self, .vtable = &.{ .alloc = alloc, .resize = resize, .remap = remap, .free = free } };
    }
    fn alloc(ctx: *anyopaque, len: usize, a: Alignment, ra: usize) ?[*]u8 {
        const self: *TrackingAllocator = @ptrCast(@alignCast(ctx));
        if (self.in_use + len > self.limit) return null; // enforce the budget
        // Reserve the record slot first, so a bookkeeping failure cannot leave an untracked block.
        self.records.ensureUnusedCapacity(self.backing, 1) catch return null;
        const p = self.backing.rawAlloc(len, a, ra) orelse return null;
        self.in_use += len;
        self.records.appendAssumeCapacity(.{ .buf = p[0..len], .alignment = a });
        return p;
    }
    // In-place resize/remap are declined so the records stay exact;
    // callers then fall back to alloc + copy + free.
    fn resize(_: *anyopaque, _: []u8, _: Alignment, _: usize, _: usize) bool {
        return false;
    }
    fn remap(_: *anyopaque, _: []u8, _: Alignment, _: usize, _: usize) ?[*]u8 {
        return null;
    }
    fn free(ctx: *anyopaque, buf: []u8, a: Alignment, ra: usize) void {
        const self: *TrackingAllocator = @ptrCast(@alignCast(ctx));
        // Forget the block, otherwise deinit() would free it a second time.
        for (self.records.items, 0..) |r, i| {
            if (r.buf.ptr == buf.ptr) {
                _ = self.records.swapRemove(i);
                break;
            }
        }
        self.in_use -= buf.len;
        self.backing.rawFree(buf, a, ra);
    }
    /// Convenience the backing allocator doesn't offer: free EVERYTHING still live in one call.
    fn deinit(self: *TrackingAllocator) void {
        for (self.records.items) |r| self.backing.rawFree(r.buf, r.alignment, @returnAddress());
        self.records.deinit(self.backing);
        self.in_use = 0;
    }
};
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){}; // leak-detecting heap underneath
defer _ = gpa.deinit();
// Wrap the heap in our proxy: cap the whole subsystem at 2 MB, track usage, bulk-free.
var tracker = TrackingAllocator.init(gpa.allocator(), 2 * 1024 * 1024);
const a = tracker.allocator(); // a is a normal std.mem.Allocator
defer tracker.deinit(); // one call frees every allocation made through `a`
// Existing code uses `a` with zero awareness it's proxied:
const conn = try a.alloc(u8, 64 * 1024);
_ = conn;
std.debug.print("in use: {d} bytes\n", .{tracker.in_use});
// Allocating past 2 MB returns error.OutOfMemory instead of growing — a hard cap.
}
- Why this suits a constrained engine/stack: the proxy gives a junior dev three things C makes them hand-roll: a hard memory ceiling (allocation past limit returns error.OutOfMemory rather than OOM-killing the box), live accounting (tracker.in_use for a HUD or a leak check), and one-call teardown (deinit frees the whole subsystem, the same shape as ArenaAllocator), all without changing a single call site, because the proxy satisfies the std.mem.Allocator interface. Swap the backing allocator for a FixedBufferAllocator over a static buffer and the entire engine runs in a fixed, GC-free footprint. CPU/IO benefit: no GC pauses in the frame/packet loop; bulk deinit is O(records) with no per-object call-site churn. Memory benefit: the budget makes overrun a recoverable error, not a crash, which is exactly what you want when shipping to a memory-limited target.
- The cost to know: this is runtime safety, not compile-time proof. defer tracker.deinit() is still a line you can forget, caught by the GPA's leak report in Debug/ReleaseSafe (and undetected in ReleaseFast). The composition is a convention enforced by the allocator vtable, not a guarantee enforced by the type system the way Rust's ownership is. The payoff is that you can give a newcomer a single, reusable safety wrapper they apply once and get throughout the program.
This wrap-an-allocator pattern is idiomatic enough that the standard library already does it —
the GeneralPurposeAllocator is itself a wrapping allocator that adds leak/double-free detection
over a backing allocator, which is the first thing to reach for. Beyond stdlib, community projects
explore the same composition (e.g. comptime-composable allocator stacks, mimalloc-style general
allocators, address-stable virtual-memory arrays), all slotting in under the same
std.mem.Allocator interface so swapping strategy never touches call sites — though, like
anything depending on the allocator interface, check each project's Zig-version support, since
that interface has shifted across releases.
⚡ Perf — task overhead, scheduler efficiency, SPSC/MPMC throughput 🔐 Safety — Rust: data races are compile errors (Send/Sync); Go: runtime race detector; Zig: no race protection 🧹 DX — Go: goroutine-per-task is ergonomically simpler; Rust: more primitives, more control
This is the area where the languages diverge most sharply. Go bets everything on one elegant model. Rust gives you a toolbox and lets you assemble the right model per workload.
Goroutines are Go's fundamental concurrency primitive. They are green threads managed by the Go runtime scheduler using M:N multiplexing: many goroutines mapped onto a smaller number of OS threads, with work-stealing across threads. A goroutine starts with a small stack (about 2 KB) that grows and shrinks automatically, so one million idle goroutines cost roughly 2 GB of RAM. Serving the same load with a few hundred OS threads would mean writing your own I/O multiplexing and task coordination.
// Starting 10,000 goroutines is routine and cheap
for _, req := range requests {
go func() {
result := process(req) // Go 1.22+: loop var is per-iteration; no `req := req` needed
resultsCh <- result
}()
}
Channels are typed, first-class communication pipes between goroutines. They can be buffered (producer does not block until the buffer is full) or unbuffered (synchronous rendezvous: both sides must be ready simultaneously).
jobs := make(chan Job, 100) // buffered — producer can run ahead
results := make(chan Result) // unbuffered — each result is a rendezvous
// Producer goroutine
go func() {
for _, j := range work { jobs <- j }
close(jobs)
}()
// Pool of 8 worker goroutines
for i := 0; i < 8; i++ {
go func() {
for j := range jobs { results <- process(j) }
}()
}
select waits on multiple channel operations simultaneously; whichever is ready first
fires. It is the idiomatic way to handle timeouts, cancellation, and fan-in patterns.
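heartbeat := time.NewTicker(30 * time.Second) // one ticker for the whole loop
defer heartbeat.Stop()
for {
    select {
    case msg := <-dataCh:
        handle(msg)
    case <-ctx.Done(): // context cancellation or deadline
        return ctx.Err()
    case <-heartbeat.C: // time.After here would restart the 30s wait after every message
        sendHeartbeat()
    }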
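}
The context package propagates cancellation and deadlines through call chains. Context-aware
stdlib APIs (net/http requests, database/sql's ...Context methods, net.Dialer.DialContext) accept a context.Context, making timeout enforcement a one-liner; plain os.File and net.Conn reads and writes take no context and rely on SetDeadline or Close instead.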
ctx, cancel := context.WithTimeout(r.Context(), 5*time.Second)
defer cancel()
rows, err := db.QueryContext(ctx, "SELECT ...") // cancelled automatically once the 5s deadline passes
The sync package provides lower-level primitives: sync.Mutex, sync.RWMutex,
sync.WaitGroup, sync.Once, sync.Pool (object reuse pool — reduces GC pressure),
and sync.Map (concurrent map with no global lock).
Go's concurrency advantage: The goroutine model is arguably the simplest mental model for
concurrent programming in a mainstream language. Spinning up thousands of goroutines
for I/O-bound work (HTTP handlers, DB queries) is idiomatic and performs well.
select is simpler than the Rust equivalents for the channel-multiplexing use case.
Go's concurrency limitation: Any value can be shared between goroutines — the language
cannot prevent data races at compile time. The -race flag detects races at runtime, but
only if the racing path is actually exercised during a test run.
Rust has no single concurrency model. Instead it provides building blocks that compose,
and the type system (Send/Sync) proves safety across all of them at compile time.
Send and Sync — the foundation of Rust concurrency:
Send means a type is safe to transfer to another thread. Sync means it is safe to
share a reference to it across threads. These are automatically derived or denied based on
a type's contents — you cannot accidentally send an Rc<T> (non-atomic ref count) to
another thread; it is a compile error. Data races between threads are therefore impossible
in safe Rust — not detected at runtime, but prevented from compiling.
let rc = std::rc::Rc::new(data);
std::thread::spawn(move || use_it(rc));
// compile error: Rc<T> cannot be sent between threads — use Arc<T> instead
// This bug is caught at build time, before any test run.
OS threads and scoped threads:
// Standard OS thread — data must be 'static or owned
let handle = std::thread::spawn(move || process(owned_data));
let result = handle.join().unwrap();
// Scoped threads (1.63+) — can borrow stack data; compiler proves lifetime
// ⚡ Perf: large datasets can be parallelised without cloning
let large_dataset = load_data();
let half = large_dataset.len() / 2; // split point for the two workers
std::thread::scope(|s| {
    s.spawn(|| process_left(&large_dataset[..half]));
    s.spawn(|| process_right(&large_dataset[half..]));
}); // both threads guaranteed done here; no clone needed
Async/await — zero-cost cooperative concurrency:
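Rust compiles async fn bodies into state machines that live on the stack or, once spawned onto an executor, on the heap.
There is no mandatory runtime; you choose your executor. An async task is not a goroutine:
it has no stack of its own and is a small struct containing the state of a paused computation.
A task typically occupies tens to a few hundred bytes plus the executor's per-task header, so a million idle tasks usually fit in well under a gigabyte, against roughly 2 GB for a million goroutines.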
// `Error` stands for the application's own error type; the `?` operator needs a Result return.
async fn handle_request(req: Request) -> Result<Response, Error> {
    let user = db.find_user(req.user_id).await?;  // suspends while waiting for DB
    let perms = cache.get_perms(user.id).await?;  // suspends while waiting for cache
    // No OS thread is blocked during either await; it is freed to run other tasks
    Ok(build_response(user, perms))
}
Async closures were stabilised in Rust 1.85, allowing async callbacks without boxing.
Tokio — the production async runtime:
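Tokio provides a work-stealing multi-threaded scheduler that maps async tasks onto OS
threads. Its default reactor is readiness-based, built on mio (epoll on Linux, kqueue on macOS/BSD, IOCP on Windows); io_uring is not used by default and is available through the separate tokio-uring crate.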
// Runtime configuration
let rt = tokio::runtime::Builder::new_multi_thread()
.worker_threads(4) // one per CPU core is typical
.max_blocking_threads(512) // pool for spawn_blocking
.enable_io()
.enable_time()
.build()?;
// Spawning async tasks — lightweight, not OS threads
tokio::spawn(async move { process(data).await });
// CPU-heavy work moved off async threads to prevent starving I/O
let result = tokio::task::spawn_blocking(move || {
    compress_large_buffer(data) // runs on a dedicated blocking thread pool
}).await?;
Tokio's channel suite — choosing the right tool:
// mpsc bounded — backpressure; use for pipeline stages, request queues
// .await on send() blocks the producer when the buffer is full — natural backpressure
let (tx, mut rx) = tokio::sync::mpsc::channel::<Job>(1024);
// oneshot — single value; request/response pattern
// ⚡ Perf: one small shared-state allocation at creation; no queue or buffer
let (tx, rx) = tokio::sync::oneshot::channel::<Response>();
tokio::spawn(async move { tx.send(compute().await).ok(); });
let response = rx.await?;
// broadcast — multi-producer (Sender is Clone); every subscribed receiver gets every message
// Use: pub/sub, config reload notification, cache invalidation
let (tx, _) = tokio::sync::broadcast::channel::<Config>(16);
let mut sub = tx.subscribe();
tx.send(updated_config)?;
// watch — one writer, many readers, always returns latest value
// ⚡ Perf: zero-copy read of latest value via borrow()
// Use: shared health state, feature flags, rate-limit config
let (tx, rx) = tokio::sync::watch::channel(initial_config);
let current = rx.borrow(); // no allocation; holds a read lock, so release it before the next .await
tokio::select! — async channel multiplexing:
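// ⚡ Perf: the task sleeps until one of the branches' wakers fires; no busy-wait
tokio::pin!(outbound_req); // a future polled by reference across iterations must be pinned
let mut req_done = false;  // a completed future must not be polled again
loop {
    tokio::select! {
        msg = work_rx.recv() => handle(msg?).await,
        _ = shutdown.recv() => { info!("shutting down"); break; },
        _ = interval.tick() => send_heartbeat().await,
        res = &mut outbound_req, if !req_done => { req_done = true; handle_response(res) },
    }
}
Crossbeam — synchronous high-performance channels: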
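When producer and consumer are both synchronous (not async), crossbeam-channel is the usual choice. Its adaptive
spin-then-park algorithms and cache-conscious queue layout once made it several times faster than std::sync::mpsc; since Rust 1.67
std's implementation is itself based on crossbeam-channel, so that gap has largely closed. What remains is the API: crossbeam supports MPMC (multiple producers, multiple consumers) and select!, which std::sync::mpsc does not.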
use crossbeam_channel::{bounded, unbounded, select};
use std::time::Duration;
let (tx, rx) = bounded::<Work>(256); // MPMC bounded — backpressure, fixed memory
let tx2 = tx.clone(); // any number of senders and receivers valid
let rx2 = rx.clone();
// select! for synchronous channels — the Go-style select in Rust
loop {
select! {
recv(work_rx) -> msg => process(msg?),
recv(shutdown_rx) -> _ => break,
default(Duration::from_millis(100)) => tick(),
}
}
SPSC — maximum throughput for single-producer/single-consumer pipelines:
When exactly one thread produces and exactly one thread consumes, dedicated SPSC data structures eliminate all the overhead of general MPMC:
- No CAS (compare-and-swap) loops — a single atomic load/store per operation suffices
- Producer head and consumer tail live on separate cache lines — zero false sharing
- The access pattern is predictable; the CPU prefetcher loads ahead automatically
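- Throughput: in microbenchmarks, on the order of 400–800 million items/second versus 100–200 million for general MPMC; the figures depend heavily on item size, core topology, and hardware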
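The properties in the list above can be seen in a small hand-rolled ring. This is a sketch under stated assumptions, not how rtrb is implemented: `SpscRing` and `transfer_sum` are hypothetical names, items are restricted to `u64` so the slots can be plain atomics and the code needs no `unsafe`, the single-producer/single-consumer rule is a convention here rather than enforced by the types, and cache-line padding of `head` and `tail` is left out for brevity.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
use std::thread;

pub struct SpscRing {
    slots: Vec<AtomicU64>,
    head: AtomicUsize, // next slot to read; stored only by the consumer
    tail: AtomicUsize, // next slot to write; stored only by the producer
}

impl SpscRing {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two(), "capacity must be a power of two");
        Self {
            slots: (0..capacity).map(|_| AtomicU64::new(0)).collect(),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    /// Producer side: no CAS loop, just loads and stores. Returns false when full.
    pub fn push(&self, value: u64) -> bool {
        let tail = self.tail.load(Ordering::Relaxed);
        let head = self.head.load(Ordering::Acquire);
        if tail.wrapping_sub(head) == self.slots.len() {
            return false;
        }
        self.slots[tail & (self.slots.len() - 1)].store(value, Ordering::Relaxed);
        self.tail.store(tail.wrapping_add(1), Ordering::Release); // publish the slot
        true
    }

    /// Consumer side. Returns None when empty.
    pub fn pop(&self) -> Option<u64> {
        let head = self.head.load(Ordering::Relaxed);
        let tail = self.tail.load(Ordering::Acquire);
        if head == tail {
            return None;
        }
        let value = self.slots[head & (self.slots.len() - 1)].load(Ordering::Relaxed);
        self.head.store(head.wrapping_add(1), Ordering::Release); // hand the slot back
        Some(value)
    }
}

/// Sends 1..=n through the ring from one thread to another and returns the sum received.
pub fn transfer_sum(n: u64) -> u64 {
    let ring = SpscRing::new(64);
    thread::scope(|s| {
        s.spawn(|| {
            for v in 1..=n {
                while !ring.push(v) {
                    thread::yield_now(); // ring full: let the consumer run
                }
            }
        });
        let consumer = s.spawn(|| {
            let (mut sum, mut received) = (0u64, 0u64);
            while received < n {
                match ring.pop() {
                    Some(v) => {
                        sum += v;
                        received += 1;
                    }
                    None => thread::yield_now(),
                }
            }
            sum
        });
        consumer.join().unwrap()
    })
}

fn main() {
    println!("sum = {}", transfer_sum(10_000));
}
```

Each operation is two atomic loads and two stores with Acquire/Release pairing, which is what lets a dedicated SPSC queue skip the retry loops a general MPMC queue needs.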
// rtrb: lock-free wait-free SPSC ring buffer
// ⚡ Perf: purpose-built for audio, trading systems, real-time sensor pipelines
use rtrb::RingBuffer;
let (mut producer, mut consumer) = RingBuffer::<f32>::new(4096);
// Real-time audio thread (producer) — must NEVER block or allocate
std::thread::spawn(move || {
loop {
let sample = read_from_microphone();
producer.push(sample).ok(); // wait-free; returns Err if buffer full
}
});
// DSP processing thread (consumer)
std::thread::spawn(move || {
loop {
if let Ok(sample) = consumer.pop() {
apply_reverb(sample);
}
}
});
crossbeam-queue — lock-free bounded and unbounded queues:
use crossbeam_queue::{ArrayQueue, SegQueue};
// ArrayQueue: bounded, lock-free MPMC ring buffer — all memory pre-allocated
// ⚡ Perf: no heap allocation per push/pop; good for object pools
let pool: ArrayQueue<Connection> = ArrayQueue::new(64);
pool.push(conn).ok();
if let Some(c) = pool.pop() { use_connection(c); }
// SegQueue: unbounded, lock-free — grows dynamically via linked segments
// ⚡ Perf: no lock contention between producer and consumer threads
let log_queue: SegQueue<LogEvent> = SegQueue::new();
Rayon — trivial data parallelism:
use rayon::prelude::*;
// Sequential:
let sum: u64 = data.iter().map(|x| expensive(*x)).sum();
// Parallel across all CPU cores — one word change:
let sum: u64 = data.par_iter().map(|x| expensive(*x)).sum();
// The borrow checker ensures data is not mutated while being read in parallel.
// ⚡ Perf: near-linear speedup for CPU-bound workloads; work-stealing balances load
Choosing the right tool:
| Scenario | Recommended |
|---|---|
| HTTP server, I/O-bound concurrency | Go goroutines or Tokio |
| Async producer → async consumer | tokio::sync::mpsc |
| Sync producer → sync consumer, high-throughput | crossbeam_channel::bounded |
| Audio / real-time SPSC, wait-free | rtrb::RingBuffer |
| Multiple sync producers + consumers | crossbeam_channel (MPMC) |
| Pub/sub: every subscriber gets every message | tokio::sync::broadcast |
| Shared state, many readers, latest-value-wins | tokio::sync::watch |
| One-shot request/response | tokio::sync::oneshot |
| CPU-bound parallel iteration | Rayon par_iter() |
| Object pool, lock-free | crossbeam_queue::ArrayQueue |
| Microcontroller / no OS | Embassy (Rust no_std async) |
The two models differ fundamentally in where the suspension point lives.
Go's runtime scheduler (G-M-P). A goroutine (G) is a struct holding a stack
(initially ~2 KB, grown by copying when it overflows a stack-check prologue), a program
counter, and scheduling state. An M is an OS thread. A P is a "processor" — a
scheduling context that owns a local run queue of runnable Gs; GOMAXPROCS sets the number
of Ps (default = core count). To run, an M must hold a P. The scheduler multiplexes many Gs
onto few Ms: when a G blocks on a channel or mutex, it is parked and the M grabs the next G
from the P's run queue — no OS context switch, just a register swap and stack pointer change
(~tens of nanoseconds). When a G makes a blocking syscall, the M detaches from its P and
blocks in the kernel; the runtime hands that P to another (possibly newly spawned) M so the
other Gs keep running — this is why blocking I/O in Go "just works" without poisoning the
pool. Network I/O is special-cased through the netpoller (epoll/kqueue/IOCP): a G doing a
socket read is parked and registered with the poller, and the syscall thread is never blocked
at all. Work-stealing balances load: an idle P steals half of another P's run queue.
Preemption is asynchronous since 1.14 — the runtime can interrupt a G at almost any
instruction via a signal, so a tight CPU loop can't starve the scheduler.
Rust's async (compiler-built state machines). There is no runtime in the language. An
async fn is rewritten by the compiler into an anonymous enum implementing Future —
each .await point becomes a variant capturing exactly the locals that are live across
that suspension. Calling poll() runs straight-line code until the next .await, where it
returns Poll::Pending and stores which state to resume into. The whole future for a
request is therefore a single, flat value whose size is known at compile
time (the sum-type of all suspension states). It needs no per-task 2 KB stack, is often just tens to
hundreds of bytes, and sits on the stack or, once spawned, in one heap allocation. A self-referential borrow held across an .await is what Pin exists to
make sound: once polled, the future must not move, because it may contain a pointer into its
own storage. The executor (Tokio, smol, embassy — your choice, not the language's)
owns the run queue, the Waker machinery, and the epoll/io_uring reactor; poll returning
Pending registers a Waker that the reactor invokes when the I/O is ready, re-queuing the
task. Tokio's multi-thread flavour is work-stealing like Go's; the difference is that
suspension points are explicit (.await) and the task object is a compile-time-sized
struct rather than a runtime-managed stack.
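To make the transformation concrete, here is a hand-written stand-in for the kind of state machine the compiler generates. It is an illustration under assumptions, not actual compiler output: `DoubleThenInc`, `NoopWake`, and `poll_to_completion` are hypothetical names, the future suspends exactly once, and the "executor" is a bare polling loop with a waker that does nothing.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// One variant per suspension state, each holding only the locals live at that point.
pub enum DoubleThenInc {
    Start { x: u32 },
    Suspended { y: u32 },
    Done,
}

impl Future for DoubleThenInc {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut(); // allowed: no self-references, so the type is Unpin
        match *this {
            DoubleThenInc::Start { x } => {
                *this = DoubleThenInc::Suspended { y: x * 2 }; // remember where to resume
                cx.waker().wake_by_ref(); // ask to be polled again
                Poll::Pending
            }
            DoubleThenInc::Suspended { y } => {
                *this = DoubleThenInc::Done;
                Poll::Ready(y + 1)
            }
            DoubleThenInc::Done => panic!("future polled after completion"),
        }
    }
}

/// A waker that does nothing; enough for a loop that re-polls unconditionally.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future until it completes; returns its output and the number of polls.
pub fn poll_to_completion<F: Future + Unpin>(mut fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (out, polls);
        }
    }
}

fn main() {
    let (out, polls) = poll_to_completion(DoubleThenInc::Start { x: 20 });
    println!("result={out} polls={polls} size={}B", std::mem::size_of::<DoubleThenInc>());
}
```

A real executor would park the task until the waker fires instead of looping; the state transitions inside `poll` are the part that mirrors what `.await` compiles to.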
Consequences that matter in production:
- Go: cheaper mental model (write blocking code, the runtime makes it concurrent), but every goroutine costs a real growable stack and the scheduler/GC must track it; millions of goroutines are feasible, but each is heavier than an async task.
- Rust: a suspended task is a tiny struct, so tens of millions of in-flight tasks fit in modest RAM, and there is no stack-growth copying; the cost is the "function colouring" problem (async infects signatures) and the need to pick/configure a runtime.
- Go cannot run without its scheduler+GC, so the mainline toolchain cannot target bare metal (TinyGo, a separate compiler, covers microcontrollers); Rust async runs on microcontrollers via embassy with no OS and no allocator.
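The "tiny struct" claim can be checked directly, since the size of a compiler-generated future is observable with `size_of_val`. The sketch below uses hypothetical functions (`step`, `pipeline`, `measure_pipeline`) and a do-nothing waker; the exact byte count depends on the compiler version, so the example prints it rather than promising a number.

```rust
use std::future::Future;
use std::mem::size_of_val;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

async fn step(x: u32) -> u32 {
    x + 1
}

/// Two suspension points; the generated future stores only the locals live across them.
async fn pipeline(x: u32) -> u32 {
    let a = step(x).await;
    let b = step(a).await;
    a + b
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns (size in bytes of the not-yet-polled future, its result).
pub fn measure_pipeline(x: u32) -> (usize, u32) {
    let fut = pipeline(x);
    let size = size_of_val(&fut); // the entire task state, known at compile time
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (size, out);
        }
    }
}

fn main() {
    let (size, out) = measure_pipeline(1);
    println!("future size = {size} bytes, result = {out}");
}
```

Whatever figure it prints on a given toolchain, it is the whole per-task state for this function, to be compared with the roughly 2 KB initial stack every goroutine starts with.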
Concurrency is the headline change in Zig 0.16 (released April 13, 2026). Its approach
targets the "function coloring" problem — the split of Rust and JavaScript code into
async and non-async worlds.
The core idea: Io is a parameter, not a keyword. There are no async/await keywords
in the Zig language (the design explicitly settled this). Instead, anything that can block
or introduce nondeterminism — file I/O, sockets, timers, sleeping, synchronization — now
takes an std.Io instance as a runtime parameter. Io is the same fat-pointer interface
struct from §1: a context pointer plus a vtable. You choose the implementation at startup,
and the identical library code runs on whichever backend you pass:
// The SAME function works synchronously, on a thread pool, or on an event loop —
// determined entirely by which Io implementation the caller injects. No coloring.
fn fetchAndParse(io: std.Io, url: []const u8) !Data {
const response = try httpGet(io, url); // may suspend; no `async` keyword
return parse(response);
}
Zig 0.16 ships Io.Threaded (backed by threads; feature-complete, well-tested, and the
implementation the default entry point selects), with experimental event-driven backends —
Io.Evented (M:N green threads / stackful coroutines), plus Io.Uring (io_uring) and
Io.Kqueue/Io.Dispatch (GCD on macOS) proof-of-concepts — informing the interface's
evolution. The build flags -fno-single-threaded / -fsingle-threaded select whether
task-level concurrency and cancellation are available. Programs obtain their io (and gpa,
and an arena) from 0.16's new "Juicy Main" entry point — pub fn main(init: std.process.Init)
— so the application's main constructs the I/O implementation once and threads it down, rather
than reaching for a global; library code takes an Io parameter the way it already takes an
Allocator.
Concurrency primitives — Future, Batch, Group. Rather than spawning goroutines or
tokio::spawn-ing tasks, you express concurrency through Io methods:
- Future: the ergonomic, function-level abstraction. Start an operation, get a future, await it later. Flexible, but allocates task memory and can surface error.ConcurrencyUnavailable on backends that can't honor it.
- Batch: the optimal primitive for "do N operations at once" without reinventing futures; preferred for reusable, allocation-conscious library code.
- Group: "spawn a bunch of work that should happen concurrently" and wait for all, the structured-concurrency building block (the closest analogue to a Go WaitGroup or Rust JoinSet).
// Structured concurrency via std.Io.Group (real-world pattern, e.g. parallel file processing)
var group: std.Io.Group = .{};
defer group.cancel(io); // cancellation is first-class
for (notes) |note| {
try group.async(io, processNote, .{ io, note });
}
try group.await(io); // wait for all; errors propagate
Cancellation is first-class. Almost any Io operation can return error.Canceled, and
cancellation propagates through the Io interface — something Rust's ecosystem reconstructs
per-runtime (Tokio CancellationToken, dropping a future) and Go does via context.Context
convention. In Zig it is built into the I/O layer itself.
The honest tradeoffs. This design solves coloring — a library author writes one
version of every function, and it composes into sync or async programs at the call site,
which neither Rust nor Go achieves. The costs: (1) it is brand-new and the event-loop
backends were still being finished as 0.16 shipped, so production users were advised to stay
on 0.15.2 until the final issues cleared; (2) Zig has no data-race protection — unlike
Rust's Send/Sync, the compiler will not stop you sharing mutable state across tasks, so
concurrency correctness is on you (one migration write-up noted needing far more care porting
concurrent code from Rust, where the borrow checker had caught the races for free).
⚡ Perf — zero-copy syscalls (sendfile/splice/io_uring) avoid user-space buffer copies; whether the API exposes them transparently or by hand differs sharply 🧹 DX — Rust: Read/Write/AsyncRead traits + Tokio; Go: io.Reader/io.Writer + net/os stdlib; Zig: std.Io reader/writer (0.16) 🔐 Safety — Rust encodes buffer ownership across async I/O in types (the "buffer must outlive the read" problem io_uring forces); Go and Zig manage it manually 🔍 Debug — abstraction layers can silently defeat the zero-copy fast path (Go's wrapper-type problem); knowing the call chain matters
I/O is where the three languages' abstraction philosophies meet the operating system. The key
questions: what are the core read/write abstractions, what do they cost, when does data avoid
being copied into user space, and how are the OS zero-copy primitives (sendfile, splice,
vmsplice, copy_file_range, io_uring) surfaced.
Go — io.Reader/io.Writer, the stdlib net/os, and a blocking-style API over a non-blocking core.
Go's entire I/O ecosystem is two one-method interfaces: io.Reader (Read(p []byte) (n int, err error)) and io.Writer (Write(p []byte) (n int, err error)). Files (*os.File),
sockets (net.Conn), buffers (bytes.Buffer), HTTP bodies — everything implements them, so
they compose universally (io.Copy, bufio.Scanner, io.MultiWriter, io.TeeReader). Code
is written in straightforward blocking style; underneath, the runtime registers the fd with
the netpoller (epoll/kqueue/IOCP) and parks the goroutine, so a "blocking" conn.Read does
not block an OS thread (see §5). The abstraction cost is near zero for the interface call
itself, but every Read/Write copies bytes between the kernel and the user-space []byte you
provide — unless a fast path applies (below).
// Idiomatic Go: blocking-looking, scheduler-backed, universal interfaces
n, err := conn.Read(buf) // goroutine parks in netpoller; no OS thread blocked
io.Copy(dst, src) // may transparently become sendfile/splice (see 6.2)
Rust — Read/Write (sync) and AsyncRead/AsyncWrite (async), split by runtime.
Rust's std::io mirrors Go: Read/Write traits implemented by File, TcpStream,
Vec<u8>, etc., with combinators (BufReader, copy, Read::chain). The split is that
async I/O lives outside std: Tokio (or smol/async-std) provides AsyncRead/AsyncWrite,
tokio::net::TcpStream, tokio::fs::File, and tokio::io::copy. The trait call is zero-cost
(monomorphised, inlined), and buffer ownership is tracked by the borrow checker. The
async-specific wrinkle is real: a future doing a read must keep the buffer alive and unmoved
until the read completes. With the poll-based AsyncRead this is readily expressed (the future
borrows the buffer across the await, and the kernel touches it only during a poll), but it
becomes central with io_uring (below), where the kernel writes into your buffer after the
call returns.
use tokio::io::{AsyncReadExt, AsyncWriteExt};
let n = stream.read(&mut buf).await?; // poll-based; buffer borrowed across the await
tokio::io::copy(&mut reader, &mut writer).await?; // copies through a user-space buffer; no sendfile (see 6.2)
Zig — std.Io reader/writer over the 0.16 interface, allocator-explicit, no hidden buffering.
Zig 0.16 reworked I/O around the std.Io interface (§5). Readers and writers are interface
structs (std.Io.Reader/std.Io.Writer — the "Writergate" 0.15 redesign plus the 0.16
std.Io integration), parameterised over an Io implementation you pass in. There is no hidden
buffering or hidden allocation — you wrap with a buffered reader/writer explicitly and pass an
allocator where one is needed. The same code runs sync (Io.Threaded) or event-driven by which
Io you inject. The abstraction cost is an indirect call through the vtable (like Go's
interface, like Rust's dyn), and the byte-copy semantics are the same: a read fills a
buffer you own.
// Zig 0.16: I/O takes an Io instance; buffering and allocation are explicit
var buf: [4096]u8 = undefined;
const n = try file.read(io, &buf); // you own buf; no hidden allocation
// std.Io.Writer / buffered writer wrappers are composed explicitly
The classic "read a file, write it to a socket" path copies bytes twice through user space
(kernel→user buffer on read, user→kernel on write). The OS primitives that avoid this:
- sendfile(out_fd, in_fd, …) — copies between two fds inside the kernel; in_fd must be a file (mmap-able), out_fd historically a socket. One syscall, no user-space buffer, data moves page-cache→socket.
- splice(fd_in, fd_out, …) — moves data between an fd and a pipe without a user-space round trip; generalises sendfile (in fact Linux's sendfile is a splice wrapper). Socket→socket proxying uses two splices through an intermediate pipe.
- vmsplice — maps user pages into a pipe (gift the pages to the kernel), the user-memory→pipe complement of splice.
- copy_file_range — file→file in-kernel copy (and reflink/CoW on filesystems that support it), no socket involved.
- kTLS + sendfile — with kernel-TLS offload, encryption happens in-kernel (or on the NIC), so even an HTTPS file send avoids user-space copies entirely.
How each language exposes these:
Go — transparent, via io.Copy and the ReadFrom/WriteTo fast paths. This is Go's
quiet strength: io.Copy(dst, src) checks whether dst implements io.ReaderFrom or src
implements io.WriterTo and dispatches to an optimised path. *net.TCPConn.ReadFrom tries
splice (for conn→conn) then sendfile (for file→conn) on Linux, falling back to a generic
buffered copy otherwise; *os.File.ReadFrom uses copy_file_range. So io.Copy(tcpConn, file) becomes a sendfile with no user-space buffer, and io.Copy(tcpConn, otherConn) becomes
splice — with no special API call. The crucial caveat, and a real debugging trap: wrapping
either side in a type that doesn't forward ReadFrom/WriteTo (a logging io.Writer, an HTTP
middleware writer, any struct that embeds only io.Reader) hides the concrete type and
silently drops you back to the copying path. One exception is built in: the net fast paths
unwrap *io.LimitedReader themselves, so io.CopyN and io.LimitReader keep sendfile/splice.
net/http's ServeContent/ServeFile are wired to preserve the fast path; custom middleware
often isn't.
// Both of these are zero-copy on Linux WITHOUT naming the syscall:
io.Copy(tcpConn, osFile) // → sendfile (page cache → socket)
io.Copy(tcpConn, otherConn) // → splice (socket → pipe → socket)
// …but this is NOT, because the wrapper hides *os.File from TCPConn.ReadFrom:
io.Copy(tcpConn, struct{ io.Reader }{osFile}) // generic buffered copy
Rust — explicit, via crates; the async path does not auto-sendfile. std::io::copy has a
Linux-specific specialisation that uses sendfile/copy_file_range when it can detect both fds,
but the async story is opt-in: Tokio's io::copy does not call sendfile (it copies through a
user buffer), so for true zero-copy you reach for crates — nix::sys::sendfile::sendfile,
tokio-splice/tokio-splice2 (socket↔socket via splice, blocking the file behind &mut to
guard against mid-flight modification), or the io-uring/tokio-uring/glommio stacks for a
fully completion-based design. The trade is Go's transparency-but-fragility versus Rust's
nothing-happens-implicitly: you call the primitive deliberately, and the type system makes you
handle the buffer-lifetime and fd ownership explicitly.
// Deliberate zero-copy file→socket (blocking; run on a blocking pool under async)
use nix::sys::sendfile::sendfile;
let sent = sendfile(socket_fd, file_fd, Some(&mut offset), count)?;
// Or socket→socket proxying with splice via a crate:
// tokio_splice2::zero_copy_bidirectional(&mut a, &mut b).await?;
Zig — direct syscalls, thinly wrapped; @cImport for the rest. Zig exposes OS calls
through std.posix/std.os (note 0.16 removed much of the old std.posix, moving toward the
std.Io abstraction). sendfile is available as a thin wrapper where the platform provides it;
for splice/vmsplice/copy_file_range you call the syscall directly (std.os.linux.*) or
@cImport the C headers — both zero-overhead, no binding layer. There is no transparent
io.Copy-style auto-fast-path in std; like Rust, you invoke the primitive explicitly, and the
explicit-allocator/no-hidden-IO philosophy means nothing happens behind your back. The
event-driven std.Io backends (io_uring on Linux) are the direction for making these async.
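The wrapper trap described for Go's io.Copy can be reproduced without a network. The sketch below is an illustration under stated assumptions: `fastDst` is a hypothetical stand-in for *net.TCPConn that only records whether its ReadFrom method ran, and `loggingWriter` is a hypothetical middleware type that forwards Write and nothing else.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// fastDst is a hypothetical destination that implements io.ReaderFrom,
// standing in for *net.TCPConn. It records whether the fast path ran.
type fastDst struct {
	usedReadFrom bool
	n            int64
}

func (d *fastDst) Write(p []byte) (int, error) {
	d.n += int64(len(p))
	return len(p), nil
}

func (d *fastDst) ReadFrom(r io.Reader) (int64, error) {
	d.usedReadFrom = true
	// A real implementation would issue sendfile/splice here.
	n, err := io.Copy(io.Discard, r)
	d.n += n
	return n, err
}

// loggingWriter is a typical middleware wrapper: it forwards Write only,
// so the ReadFrom method of the wrapped value is no longer visible.
type loggingWriter struct{ w io.Writer }

func (l loggingWriter) Write(p []byte) (int, error) { return l.w.Write(p) }

// fastPathTaken reports whether io.Copy reached dst.ReadFrom.
func fastPathTaken(wrap bool) bool {
	dst := &fastDst{}
	var w io.Writer = dst
	if wrap {
		w = loggingWriter{w: dst}
	}
	// struct{ io.Reader } hides strings.Reader's WriteTo, so only the
	// destination side decides which path io.Copy takes.
	src := struct{ io.Reader }{strings.NewReader("payload")}
	io.Copy(w, src)
	return dst.usedReadFrom
}

func main() {
	fmt.Println(fastPathTaken(false)) // true: io.Copy dispatched to ReadFrom
	fmt.Println(fastPathTaken(true))  // false: the wrapper hid ReadFrom
}
```

A wrapper that should keep the fast path can add its own ReadFrom method that delegates to the wrapped value; whether that is worth the extra surface depends on how hot the copy path is.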
io_uring is a fundamentally different interface: instead of one syscall per operation, you
write submission-queue entries into shared memory and read completions back — batching accept,
read, write, splice, even sendfile-equivalents with near-zero syscall overhead, fully
async including for file I/O (which epoll never handled).
- Rust has the most developed ecosystem: the io-uring crate (low-level, tokio-rs), plus completion-based runtimes tokio-uring, glommio (thread-per-core), and monoio. The borrow-checker tie-in is load-bearing here: because the kernel writes into your buffer after submission, the buffer must outlive the operation and not move — these runtimes use owned-buffer APIs (you hand the buffer to the runtime and get it back on completion) precisely so the type system enforces that invariant. This is a case where Rust's ownership model maps unusually cleanly onto a hard kernel-API constraint.
- Go deliberately has not adopted io_uring in its runtime netpoller (security-surface and portability concerns, and the netpoller already serves Go's blocking-style model well); community packages (iceber/iouring-go, pawelgaczynski/gain) exist for those who want it, but it is off the mainstream path. The result: Go gives you excellent ergonomics and transparent splice/sendfile, but not the batched-syscall ceiling that io_uring enables.
- Zig targets io_uring directly — std.os.linux.IoUring is a first-class low-level interface, and the 0.16 std.Io event-driven backend is built on it. Because Zig has no borrow checker, the buffer-lifetime invariant that Rust encodes in types is your manual responsibility, with runtime safety checks (in Debug/ReleaseSafe) as the backstop.
- Rust — futures::Stream/tokio_stream (async iterators of items), tokio_util::codec (Framed, LengthDelimitedCodec, line codecs) to turn a byte stream into a typed message stream, bytes::Bytes (refcounted, cheaply-cloneable, slice-able buffer that enables zero-copy within user space — clones share the backing allocation). For SSE/WebSocket: tokio-tungstenite, axum's SSE support, reqwest's streaming bodies.
- Go — bufio.Scanner/bufio.Reader for line/token streaming, channels as the idiomatic in-process event stream, net/http's Flusher for server-sent events, and nhooyr/websocket or gorilla/websocket. io.Pipe connects a writer to a reader in memory. Go's lack of a lazy iterator historically meant channels for streaming; range-over-func (1.23, §1.11) now offers a non-channel option.
- Zig — streaming is the std.Io.Reader/Writer plus the next()-style iterators (§1.11); buffered wrappers are explicit. Higher-level event-stream abstractions (SSE/WebSocket) come from community libraries (httpz, zap, websocket.zig) rather than std, which is younger and thinner here.
The abstraction-cost summary. All three pay one indirect call (or a monomorphised direct
call, in Rust's generic/impl Trait case) for the reader/writer abstraction, and all three copy
bytes into a user buffer on an ordinary Read. The differences are in the fast paths: Go makes
zero-copy sendfile/splice transparent through io.Copy (powerful but defeatable by
wrappers, and capped below io_uring); Rust makes every optimisation explicit and uses
ownership to make completion-based io_uring buffers safe; Zig exposes the raw kernel interfaces
with the least binding friction and the fewest guardrails. bytes::Bytes (Rust) and slice
re-slicing (Go/Zig) provide intra-user-space zero-copy (sharing a backing buffer without
copying), independent of the kernel primitives.
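The intra-user-space sharing mentioned above can be illustrated with Go slices. This is a small sketch with a hypothetical `header` helper; the three-index slice expression is one way to keep a shared view from growing into the bytes that follow it.

```go
package main

import "fmt"

// header returns the first n bytes of buf as a view, not a copy.
// The three-index form caps the view at n, so an append on the view
// reallocates instead of overwriting the rest of buf.
func header(buf []byte, n int) []byte {
	return buf[:n:n]
}

func main() {
	buf := []byte("HDRpayload")
	h := header(buf, 3)

	buf[0] = 'h' // a write through the original...
	fmt.Println(string(h)) // ...is visible through the view

	h = append(h, '!') // cap(h) == 3, so this allocates a new array
	fmt.Println(string(h), string(buf))
}
```

The same sharing is what makes re-slicing cheap and what makes it risky: any holder of a view can observe later writes to the backing array.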
⚡ Perf — HTTP/TLS/DB throughput is where these languages are most often deployed; the libraries that push the optimization ceiling differ in maturity and design 🧹 DX — Go ships production-grade net/http and crypto/tls in std; Rust assembles best-of-breed crates; Zig leans on community libs and C 🔐 Safety — rustls (memory-safe TLS) vs OpenSSL's CVE history is a concrete, oft-cited safety win; connection pools and bounded buffers are how each controls memory growth
This section covers the workhorse server-side libraries — HTTP servers, TLS, database drivers, and memory-mapped-file utilities — and, throughout, how each ecosystem keeps memory from growing without bound under load (the practical failure mode of high-throughput services).
Go — net/http (stdlib), the baseline everyone else is measured against. Go's standard
library ships a production-grade HTTP/1.1 and HTTP/2 server and client; net/http powers a
large fraction of the internet's Go services with zero third-party dependencies. Each request is
a goroutine, so the programming model is trivial and the server scales to high connection counts
on the netpoller. For extreme throughput, valyala/fasthttp trades the net/http API for a
lower-allocation design (it reuses request/response objects via pooling, avoiding per-request
allocation), at the cost of a non-standard API and some HTTP-correctness caveats; routers like
chi, gin, and echo sit on top of net/http. Memory-growth control: bounded
MaxHeaderBytes, ReadTimeout/WriteTimeout/IdleTimeout to reap slow and idle connections, and
http.MaxBytesReader to cap request-body size. MaxHeaderBytes defaults to 1 MB, but the
timeouts default to none and body reads are unbounded unless you cap them, which is the
classic Go OOM.
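A minimal sketch of those limits wired together, exercised in-process with httptest rather than a real listener. The handler name `upload`, the 16-byte cap, and the timeout values are illustrative assumptions, not recommendations from the source.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

const maxBody = 16 // bytes; deliberately tiny for the demonstration

// upload is a hypothetical handler that refuses bodies larger than maxBody.
func upload(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, maxBody)
	if _, err := io.ReadAll(r.Body); err != nil {
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			http.Error(w, "body too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusNoContent)
}

// newServer shows the connection-level limits; the values are illustrative.
func newServer() *http.Server {
	return &http.Server{
		Addr:           ":8080",
		Handler:        http.HandlerFunc(upload),
		MaxHeaderBytes: 64 << 10,
		ReadTimeout:    5 * time.Second,
		WriteTimeout:   10 * time.Second,
		IdleTimeout:    60 * time.Second,
	}
}

// statusFor runs the handler in-process and returns the response status.
func statusFor(body string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/upload", strings.NewReader(body))
	newServer().Handler.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("small"))                   // 204
	fmt.Println(statusFor(strings.Repeat("x", 1024))) // 413
}
```

The body cap is applied per request inside the handler, while the header and timeout limits live on the server; both are needed, because neither covers the other's failure mode.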
Rust — hyper (the foundation), axum/actix-web (the frameworks). hyper is the
low-level HTTP/1+HTTP/2 implementation; it consistently sits near the top of the TechEmpower
benchmarks. Most apps use a framework over it: axum (Tokio-native, tower middleware
ecosystem, minimal and composable — you bring sqlx, tower-http, etc.) is the common default;
actix-web is frequently the raw-throughput leader (it grew out of the actix actor framework,
though its HTTP core no longer depends on actors, and some published benchmarks put it
~10–15% above axum under heavy load); rocket, warp, and salvo round out the field. A thin
framework like axum can show measurably lower req/s and higher tail latency than raw hyper
(one community measurement put axum ~25% below hyper on req/s), so for the last increment you
drop toward hyper directly. Memory-growth control comes from tower's layered limits
(ConcurrencyLimit, RequestBodyLimit, Timeout, load-shedding) and bytes::Bytes reuse;
backpressure is explicit: a tower Service signals readiness through poll_ready, and the limit
layers queue or reject work once their bound is reached.
Zig — httpz, zap, http.zig; std.http for basics. std.http provides a basic client
and server, not production-hardened to net/http's level. The community libraries are where
real servers are built: httpz (a fast, allocator-aware HTTP/1.1 server with explicit
per-request arenas), and zap (a wrapper over the C facil.io library, so it inherits a
battle-tested event loop). Memory-growth control is explicit and structural: the idiomatic
pattern is a per-request ArenaAllocator (§4) that is reset/freed wholesale at the end of each
request, which by construction cannot leak across requests — arguably the cleanest "bounded
per-request memory" model of the three, at the cost of writing it yourself. The ecosystem is
young and pre-1.0-churning.
Go — crypto/tls (stdlib). A complete, memory-safe, pure-Go TLS 1.2/1.3 stack in the
standard library, maintained alongside the compiler, with no OpenSSL dependency. It is one of
Go's strongest batteries-included stories: TLS "just works" with net/http, cross-compiles
cleanly (no C), and has a solid security track record. Hardware AES-NI is used automatically.
Rust — rustls (the memory-safe TLS library). rustls is a from-scratch TLS 1.2/1.3
implementation in safe Rust, layered over a pluggable crypto provider — ring (fewer build
deps) or aws-lc-rs (more cipher suites, FIPS option). It is funded as critical infrastructure
(ISRG/Let's Encrypt, with Google/AWS/Microsoft money) precisely to replace OpenSSL's
memory-unsafe C in security-critical paths, and is now widely deployed (it backs reqwest,
hyper-based servers, sqlx, etc., via tokio-rustls). The concrete pitch: TLS is the most
exposed attack surface in a network service, and rustls removes the buffer-overflow/UAF CVE
class that has repeatedly hit OpenSSL. The alternative native-tls/openssl crates exist for
compatibility but reintroduce the C dependency. Feature-flag note (§14 supply-chain): pick
rustls over native-tls to keep the build pure-Rust.
Zig — std.crypto primitives + C OpenSSL/BoringSSL via @cImport. Zig's std.crypto is a
respected suite of primitives (AEADs, hashes, ECC, signatures), and there is ongoing work toward
TLS in std, but there is no production-hardened pure-Zig TLS stack at rustls/crypto/tls
maturity yet. Production Zig TLS today typically means @cImport-ing BoringSSL/OpenSSL (zero
binding overhead, but back to C's safety profile) or using a library like zap that wraps a C
stack. This is one of the clearer gaps versus Go and Rust.
mmap maps a file (or anonymous memory) into the address space so reads/writes hit the page
cache directly — the basis of zero-copy file access, large read-only datasets, and many embedded
databases.
- Rust — memmap2 is the de-facto crate (a maintained fork of the original memmap), exposing Mmap (read-only) and MmapMut (read-write) with safe-ish wrappers; the inherent unsafety (the file can change under the mapping, violating Rust's aliasing assumptions) is acknowledged in the API. Used by tantivy, polars, and embedded DBs for zero-copy access to on-disk structures.
- Go — golang.org/x/exp/mmap (read-only ReaderAt) for simple cases, or edsrzf/mmap-go for read-write; the GC does not manage mmap'd memory, so you control its lifetime explicitly with Munmap. bbolt and badger use mmap internally.
- Zig — 0.16 adds a portable std.Io.File.MemoryMap on the I/O interface (its contents are defined to synchronize only at explicit sync points, which lets evented backends fall back to file I/O), while the lowest level is now std.posix.system.mmap directly — the medium-level std.posix.mmap wrapper was among the functions trimmed in 0.16's std.posix cleanup, so you go either higher (std.Io) or lower (std.posix.system). Either way you get a []align(page) u8 slice over the mapping and manage it yourself, fitting Zig's explicit-memory model — no wrapper crate.
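To make the explicit-lifetime point concrete for Go, here is a sketch that uses the standard library's syscall.Mmap directly instead of the packages named above. It is Unix-only, the helper names `mmapFile` and `roundTrip` are hypothetical, and the file must be non-empty because a zero-length mapping is rejected by the kernel.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// mmapFile maps an entire file read-only and returns the mapping.
// The caller must call syscall.Munmap; the GC does not track this memory.
func mmapFile(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close() // the mapping stays valid after the fd is closed

	st, err := f.Stat()
	if err != nil {
		return nil, err
	}
	return syscall.Mmap(int(f.Fd()), 0, int(st.Size()),
		syscall.PROT_READ, syscall.MAP_SHARED)
}

// roundTrip writes s to a temp file, maps it, and copies the bytes back out.
func roundTrip(s string) (string, error) {
	f, err := os.CreateTemp("", "mmap-demo-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	if _, err := f.WriteString(s); err != nil {
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}

	data, err := mmapFile(f.Name())
	if err != nil {
		return "", err
	}
	defer syscall.Munmap(data)
	return string(data), nil // the conversion copies, so it outlives the mapping
}

func main() {
	got, err := roundTrip("mapped bytes")
	fmt.Println(got, err)
}
```

Any slice or string header that still points into the mapping after Munmap is a dangling reference, which is why the sketch copies the bytes out before unmapping.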
Across servers and drivers, the same levers recur, and the three languages expose them differently:
- Bounded pools and limits. Connection pools (database/sql pool, sqlx/deadpool, hand-rolled in Zig), request-body caps (MaxBytesReader, tower RequestBodyLimit), and concurrency limits cap the number of in-flight allocations. This is the first line of defense in all three.
- Buffer reuse vs allocation. Go's sync.Pool recycles per-request objects to reduce GC pressure (and fasthttp is built around it); Rust reuses bytes::Bytes/BytesMut and arena/bumpalo allocators; Zig's per-request ArenaAllocator is the structural answer — allocate freely during a request, free it all in O(1) at the end, so steady-state memory is bounded by the largest single request, not cumulative.
- GC tuning vs no GC. Go controls heap growth with GOGC (growth trigger) and GOMEMLIMIT (soft cap that makes the GC work harder rather than OOM) — the standard way to keep a Go service inside a container memory limit. Rust and Zig have no GC to tune; steady-state memory is whatever you allocate and hold, so growth is controlled by design (pools, arenas, bounded caches) rather than a runtime knob. The trade: Go gives you a dial to contain a leak-ish workload; Rust/Zig give you no dial but also no GC headroom and no pause, so a correctly bounded design has flatter, lower memory.
- Fragmentation. Long-running Rust/Zig services can suffer allocator fragmentation under certain allocation patterns; swapping in jemalloc/mimalloc (Rust #[global_allocator], Zig allocator choice) is the common fix. Go's allocator manages this internally: its GC does not move or compact heap objects, so it relies on size-class segregation to limit fragmentation.
The summary: Go bounds memory with pool knobs plus GOGC/GOMEMLIMIT and leans on a strong
stdlib (net/http, crypto/tls, database/sql); Rust bounds it by design with explicit
pools, bytes/arena reuse, and pluggable allocators, while offering stronger compile-time
guarantees (sqlx queries, rustls safety); Zig bounds it most explicitly via per-request
arenas and up-front static allocation, with the youngest library ecosystem and the most reliance
on C for TLS and non-SQLite databases.
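The buffer-reuse lever on the Go side can be sketched with sync.Pool. The `render` function and the 64 KiB retention cap are illustrative assumptions; the cap reflects a common precaution against one large request pinning a large buffer in the pool.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool recycles scratch buffers across requests so steady-state traffic
// does not allocate a fresh buffer each time.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

const maxPooledCap = 64 << 10 // illustrative cap on what is worth retaining

// render is a hypothetical per-request function that builds a response in a
// pooled buffer and returns a copy of the result.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // pooled buffers keep their old contents until reset
	defer func() {
		if buf.Cap() <= maxPooledCap {
			bufPool.Put(buf) // oversized buffers are left for the GC instead
		}
	}()

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies, so the buffer can be reused safely
}

func main() {
	fmt.Println(render("gopher"))
	fmt.Println(render("rustacean"))
}
```

The pool reduces allocation rate rather than bounding memory: the runtime may drop pooled objects at any GC cycle, so hard limits still come from the caps and GOMEMLIMIT described above.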
⚡ Perf — Rust proc macros generate struct-specific (de)serialization code at compile time, avoiding runtime reflection; the realistic speedup over Go json v2 is a modest single-digit factor on typical struct payloads, larger versus the older json v1, and workload-dependent either way 🧹 DX — Rust: one annotation replaces hundreds of lines of boilerplate at zero runtime cost 🔍 Debug — Rust/Zig: a renamed field is a compile error in generated/comptime code; Go: a json:"..." typo silently produces wrong JSON
macro_rules! defines hygienic syntax transformations in the language itself. Macro-introduced
variable names cannot clash with caller code. Expansion happens at compile time
with zero runtime overhead.
macro_rules! retry {
($n:expr, $body:expr) => {{
let mut result = Err("no attempts");
for attempt in 0..$n {
match $body {
Ok(v) => { result = Ok(v); break; }
Err(e) => { log::warn!("attempt {attempt} failed: {e}"); }
}
}
result
}};
}
let data = retry!(3, fetch_from_network())?;
Procedural macros (proc-macros) receive a TokenStream and produce a TokenStream at compile time. The most visible form is #[derive(...)]:
#[derive(Debug, Clone, PartialEq, Hash, Serialize, Deserialize)]
struct Config {
host: String,
port: u16,
#[serde(default = "default_timeout")]
timeout_ms: u64,
#[serde(skip_serializing_if = "Option::is_none")]
log_level: Option<String>,
}
Serialize and Deserialize generate a parser that knows exactly: Config has four
fields, host is a UTF-8 string, port is a u16, timeout_ms defaults to default_timeout(),
and log_level is omitted if None. This specialised code avoids runtime reflection. The
performance advantage over Go is real but modest on typical payloads — and Go 1.25's
encoding/json/v2 narrowed it considerably; bytedance/sonic (JIT + SIMD) is Go's
throughput leader. The structural win that survives benchmarking is type-safety: a renamed
field breaks compilation with serde, but is a silent runtime mismatch with a Go struct tag.
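Concretely, take a rename of host to hostname. With serde the JSON key is derived from the
field identifier by default (or pinned with an explicit #[serde(rename = "...")]), and a
misspelled serde attribute is rejected at compile time; in Go the tag (json:"host") is a
separate string literal that must be updated by hand, and a typo in it passes silently.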
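The silent-typo behaviour on the Go side is easy to demonstrate. In this sketch the `Config` type and its misspelled tag are made up for illustration; the point is only that the program builds and runs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config has a deliberate typo in its second tag ("prot" instead of "port").
// The compiler treats struct tags as opaque strings, so this builds cleanly.
type Config struct {
	Host string `json:"host"`
	Port uint16 `json:"prot"`
}

// encode marshals a fixed Config and returns the JSON text.
func encode() string {
	out, err := json.Marshal(Config{Host: "localhost", Port: 8080})
	if err != nil {
		return err.Error()
	}
	return string(out)
}

func main() {
	fmt.Println(encode()) // {"host":"localhost","prot":8080}
}
```

Nothing at build time flags the misspelled key, so mistakes of this kind tend to be caught by round-trip tests against a known-good document rather than by the toolchain.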
Go has no macro system. Code generation runs as an explicit build step via //go:generate
directives, which invoke external tools that produce .go source files committed to the
repository.
//go:generate stringer -type=Direction
//go:generate protoc --go_out=. proto/service.proto
//go:generate mockgen -source=service.go -destination=mock_service.go
The generated files appear in version control, show up in diffs, and require the generation tools to be installed. This is more transparent (the output is readable, not a black box) but slower and more brittle than Rust's compile-time approach.
Go's reflect package provides powerful runtime type inspection. You can enumerate struct
fields, read their tags, call methods by name, and create new values of arbitrary types.
This is the foundation of encoding/json, ORM libraries, and dependency injection
frameworks — all built without any codegen step.
t := reflect.TypeOf(cfg)
for i := 0; i < t.NumField(); i++ {
field := t.Field(i)
jsonTag := field.Tag.Get("json")
validate := field.Tag.Get("validate")
fmt.Println(field.Name, jsonTag, validate)
}
Zig has no macro system and no separate generics system, because comptime subsumes
both — and a good chunk of what Rust uses proc-macros and Go uses reflect for. comptime
is not a preprocessor or a token-substitution macro; it is ordinary Zig executed by the
compiler. The same language, the same functions, the same types — just evaluated at compile
time. This is the single most distinctive thing about Zig.
Types are comptime values, so "generics" are just functions (shown in §1). But comptime goes much further: you can run arbitrary logic, build lookup tables, validate invariants, and inspect types — covering Rust's derive macros and Go's reflection in one feature, with zero runtime cost because it all happens before codegen.
// Compile-time computation: a CRC table built at build time, baked into the binary
const crc_table: [256]u32 = blk: {
@setEvalBranchQuota(100000);
var table: [256]u32 = undefined;
for (&table, 0..) |*entry, i| {
var crc: u32 = @intCast(i);
for (0..8) |_| crc = if (crc & 1 != 0) (crc >> 1) ^ 0xEDB88320 else crc >> 1;
entry.* = crc;
}
break :blk table; // computed at comptime, stored as a constant
};
Compile-time type introspection replaces reflection — at compile time. @typeInfo
gives you a type's full structure (fields, their types, tags) as comptime data you can loop
over. This is how Zig writes a generic JSON serializer or an ORM mapper without a derive
macro and without runtime reflection — the field-walking happens in the compiler and emits
straight-line code:
// Generic field-walking serializer — Go does this with runtime reflect, Rust with a proc-macro,
// Zig with comptime introspection that compiles to direct field accesses.
fn serialize(writer: anytype, value: anytype) !void {
const T = @TypeOf(value);
inline for (@typeInfo(T).@"struct".fields) |field| { // unrolled at compile time
try writer.print("{s}={any} ", .{ field.name, @field(value, field.name) });
}
}
inline for and inline while are loops the compiler unrolls at comptime; combined with
@typeInfo, the serializer above compiles to exactly the sequence of print calls for the
concrete struct's fields — no reflection, no vtable, no allocation. A field rename is a
compile error, like serde and unlike a Go struct tag.
Type construction, not just introspection. The inverse of reading a type with @typeInfo
is building one at comptime. Zig 0.16 reshaped this (proposal #10710): the single, clunky
@Type builtin was replaced by a family of purpose-built ones — @Int(.unsigned, 10) for an
arbitrary-width integer, plus @Struct, @Union, @Enum, @Pointer, @Fn, @Tuple, and
@EnumLiteral (there is deliberately no @Array/@Optional/@ErrorUnion — you write [n]T,
?T, E!T). So a function can return a freshly synthesized type — e.g. build a struct whose
field names come from an enum — entirely in the compiler, which is how Zig expresses what Rust
needs a proc-macro for and Go cannot do at all without runtime reflect. Benefit: the
generated type is a normal type with zero runtime cost; use case: deriving a packed
register-map struct from a field description, or generating an SoA (struct-of-arrays) container
from an element type.
comptime parameters and duck typing. anytype parameters are resolved per call site
(structural/duck typing checked at compile time): if the passed value has the methods used,
it compiles; otherwise you get a compile error at the instantiation. This is how Zig gets
generic algorithms without trait bounds — the "bound" is simply whether the body compiles for
that type.
The tradeoff vs Rust macros and Go reflect:
- vs Rust proc-macros: comptime is the same language (no separate proc_macro crate, no TokenStream, no syn/quote), far easier to write and debug, and integrated into normal control flow. It cannot, however, generate new top-level declarations or custom syntax the way proc-macros can, and error messages from deep comptime can be hard to read.
- vs Go reflect: comptime does at compile time, with zero runtime cost and full type safety, what Go does at runtime with allocation and interface{}. Go's reflection applies where you need runtime dynamism (decode arbitrary JSON into map[string]any, plugin systems) — comptime is closed at the moment the binary is built.
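The runtime-dynamism case named in the last bullet looks like this in practice. The sketch decodes a document whose shape is not known at build time; `describe` is a hypothetical helper and the sample document is invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// describe decodes JSON whose shape is unknown until runtime and reports
// each top-level key with the dynamic Go type encoding/json chose for it.
func describe(doc string) ([]string, error) {
	var v map[string]any
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		return nil, err
	}
	out := make([]string, 0, len(v))
	for k, val := range v {
		out = append(out, fmt.Sprintf("%s:%T", k, val))
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out, nil
}

func main() {
	lines, err := describe(`{"name":"zig","stable":false,"version":0.16}`)
	fmt.Println(lines, err)
}
```

Every value arrives boxed in an interface and every number becomes float64, which is the allocation and type-safety cost the bullet refers to.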
⚡ Perf — inline assembly and SIMD intrinsics can deliver roughly 4–16x throughput on vectorizable code 🔐 Safety — Rust: unsafe is auditable and greppable; soundness of a core model of the safe subset has been formally proven (RustBelt), though not of the full language 🔍 Debug — miri detects UB in unsafe Rust; Go has the race detector and sanitizers
unsafe {} blocks are the opt-in escape hatch from Rust's safety guarantees. Every piece
of unsafe code is explicitly marked, greppable, and isolated. The compiler tracks that
you are in an unsafe context and allows raw pointer dereferences, calling unsafe functions,
implementing unsafe traits, and accessing mutable statics.
// A safe abstraction built on an unsafe foundation
pub fn split_at_mid(slice: &[u8], mid: usize) -> (&[u8], &[u8]) {
assert!(mid <= slice.len());
unsafe {
let ptr = slice.as_ptr();
(
std::slice::from_raw_parts(ptr, mid),
std::slice::from_raw_parts(ptr.add(mid), slice.len() - mid),
)
}
}
// The public API is safe; the unsafe is internal, documented, and bounded
asm! (stable since 1.59) provides inline assembly with structured register constraints,
preventing common mistakes (clobber omissions, aliasing) that C's __asm__ volatile misses.
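// x86_64 only; a and b are illustrative inputs
let (a, b): (u64, u64) = (6, 7);
let result: u64; // declared outside the block so it is usable afterwards
unsafe {
    std::arch::asm!(
        "imul {0}, {1}",
        inout(reg) a => result,
        in(reg) b,
        options(pure, nomem, nostack),
    );
}
// result now holds a * b
std::arch provides stable access to platform-specific SIMD intrinsics (SSE2, AVX2,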
AVX-512 on x86_64; NEON on AArch64). Portable SIMD (std::simd) is approaching
stabilization for cross-platform vectorization.
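#[target_feature(enable = "avx2")]
unsafe fn dot_product_avx2(a: &[f32; 8], b: &[f32; 8]) -> f32 {
    use std::arch::x86_64::*;
    let va = _mm256_loadu_ps(a.as_ptr());
    let vb = _mm256_loadu_ps(b.as_ptr());
    let vc = _mm256_mul_ps(va, vb); // 8 multiplications in one instruction
    // Horizontal sum: fold 256 → 128 bits, then reduce the remaining 4 lanes
    let sum4 = _mm_add_ps(_mm256_castps256_ps128(vc), _mm256_extractf128_ps::<1>(vc));
    let sum2 = _mm_add_ps(sum4, _mm_movehl_ps(sum4, sum4));
    let sum1 = _mm_add_ss(sum2, _mm_shuffle_ps::<0b01>(sum2, sum2));
    _mm_cvtss_f32(sum1)
}
// ⚡ Perf: up to 8 lanes per instruction versus scalar; compiler cannot always auto-vectorize complex kernels
miri is an interpreter for Rust's mid-level IR that detects undefined behaviour in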
unsafe code: out-of-bounds accesses, uninitialized memory reads, dangling pointers,
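aliasing rule violations. It checks code against the experimental aliasing models (Stacked
Borrows, Tree Borrows) and serves as the de facto reference for what counts as UB, since Rust
has no finalised memory model.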
cargo miri test # run test suite under the interpreter; catches UB LLVM might hide
Go's unsafe package provides unsafe.Pointer (a pointer that bypasses the type system),
unsafe.Sizeof/Alignof/Offsetof for layout inspection, and unsafe.SliceData /
unsafe.StringData for direct memory access.
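As a hedged illustration of those helpers, the sketch below pairs a zero-copy conversion built on unsafe.String and unsafe.SliceData (Go 1.20+) with the bounds-checked encoding/binary equivalent of a raw pointer read. The function names `bytesToString` and `readU32LE` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying.
// Assumption: the caller never mutates b afterwards, because strings are
// immutable and the compiler is free to rely on that.
func bytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// readU32LE is the safe counterpart of a raw pointer cast: it is
// bounds-checked and fixes the byte order instead of inheriting the CPU's.
func readU32LE(b []byte) uint32 {
	return binary.LittleEndian.Uint32(b)
}

func main() {
	raw := []byte{0x01, 0x02, 0x03, 0x04}
	fmt.Printf("%#x\n", readU32LE(raw))
	fmt.Println(bytesToString([]byte("zig")))
}
```

The safe version usually compiles to a single load on little-endian targets, so the unsafe form is mainly worth its risk when a copy, rather than a load, is what needs to be avoided.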
// Read a uint32 from a byte slice without a copy.
// Assumes len(b) >= 4, native byte order, and a target that tolerates unaligned loads.
func readU32(b []byte) uint32 {
_ = b[3] // bounds check: panic early instead of reading past the slice
return *(*uint32)(unsafe.Pointer(&b[0]))
}
Go assembly is written in separate .s files using Plan 9 assembly syntax. It is used
by the Go standard library for performance-critical paths (hash functions, AES, SHA) but
is less accessible than Rust's inline asm! for ad-hoc use.
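// src/sum_amd64.s
// Assumed Go declaration (hypothetical): func vectorSum(src *[16]float32, dst *[8]float32)
#include "textflag.h"

TEXT ·vectorSum(SB),NOSPLIT,$0-16
MOVQ src+0(FP), SI // load the two pointer arguments from the frame
MOVQ dst+8(FP), DI
VMOVUPS (SI), Y0
VMOVUPS 32(SI), Y1
VADDPS Y1, Y0, Y0 // 8 lane-wise float additions
VMOVUPS Y0, (DI)
VZEROUPPER // avoid AVX/SSE transition penalties in the caller
RET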
Go 1.26 added the experimental simd/archsimd package for SIMD access from Go source
files. The initial release covers AMD64 with a smaller API surface than std::arch.
Portable SIMD is not yet available. For production SIMD today, Go code typically calls
into C via CGO (with a ~30% reduced overhead after Go 1.26's improvement) or delegates to
pre-compiled assembly files.
Calling C is where "systems language" stops being abstract, and the two designs diverge sharply in cost and ergonomics.
Rust → C is nearly free. Rust has no runtime and uses the platform C ABI natively. An
extern "C" call is an ordinary call instruction — the same one the C compiler would emit;
the optimiser can even inline across the boundary when LTO sees the C code. There is no
marshalling, no stack switch, no scheduler interaction. #[repr(C)] makes a struct's layout
match C exactly, so structs cross the boundary with no conversion step. The work is correctness, not
performance: you wrap the unsafe extern declarations in a safe API; bindgen generates
the declarations from C headers, and cbindgen generates C headers from Rust.
The cost model: roughly a normal function call (single-digit nanoseconds).
Go → C (cgo) is expensive by design. A cgo call cannot be a plain call instruction
because a goroutine runs on a small, growable stack that the runtime may relocate, which C
code cannot use. So each cgo call must: switch from the goroutine stack to a system (thread)
stack, transition the calling goroutine into a state where the scheduler knows it is in C (so
the GC and preemption leave it alone), perform the call, then transition back. Historically
this cost ~50–100 ns of pure overhead per call; Go 1.26 cut cgo overhead ~30%, but it remains
roughly an order of magnitude more than a Rust FFI call. Pointers passed into C are subject to
strict pointer-passing rules (Go memory handed to C must not itself contain unpinned Go
pointers, and C must not keep Go pointers after the call returns, because the GC still owns
that memory), checked at runtime via GODEBUG=cgocheck. The practical guidance is identical in
both ecosystems but bites harder in Go: batch across the boundary, doing bulk work in one C
call rather than many small ones. cgo also disables some of Go's headline advantages: a
binary using cgo is no longer trivially cross-compiled (cgo is disabled by default when
cross-compiling, i.e. CGO_ENABLED=0, because it needs a C cross-toolchain for the target),
and it pulls a C toolchain into the build.
Calling into each language from C. Both can expose C-ABI entry points
(#[no_mangle] pub extern "C" in Rust; //export + cgo in Go), but with a crucial
asymmetry: a Rust cdylib is a clean shared library with no runtime baggage, suitable as a
drop-in .so/.dll for any language. A Go shared library (-buildmode=c-shared) must carry
the entire Go runtime (GC, scheduler) initialised inside it, which is heavier and has
sharp edges around fork/threading. This is the same root cause as Go's lack of a stable ABI
and its inability to target freestanding environments (what Rust calls no_std): the runtime is mandatory.
// Rust: zero-overhead, layout-matched C interop
#[repr(C)]
pub struct Vec3 { x: f32, y: f32, z: f32 }
extern "C" { fn normalize(v: *mut Vec3); } // declared unsafe to call
pub fn normalized(mut v: Vec3) -> Vec3 { // safe wrapper
unsafe { normalize(&mut v); } // plain call, ~ns
v
}
Recent FFI parity: Rust 1.93 (January 2026) stabilized declaring C-style variadic functions
for the system ABI, alongside the long-supported extern "C" declarations. Defining one in Rust
(unsafe extern "C" fn log(fmt: *const c_char, mut args: ...)) relies on the c_variadic feature;
where that is available, Rust can both call and expose printf-style variadic C interfaces,
closing a long-standing FFI gap where only calling them was possible.
Zig was designed as "a better C," so low-level control is not an unsafe escape hatch — it
is the normal mode of the language.
C interop with no FFI layer at all. Zig can @cImport a C header and call the functions
directly — no bindgen, no wrapper crate, no extern block to hand-write. The Zig compiler
is a C compiler (it ships clang), so it compiles C and Zig in one build and links them with
zero ABI friction. Calls are plain calls, like Rust's (~ns), with none of Go's cgo stack-switch
tax. (0.16 note: the @cImport builtin is deprecated; C translation moves to the build system
via b.addTranslateC(...) — you point it at a C header, link the system libraries, and import
the result as a normal module. The translated code and its zero-overhead nature are identical;
only the invocation site moves from the language into build.zig.)
const c = @cImport({
@cInclude("sqlite3.h"); // use a C library directly — no bindings crate
});
// c.sqlite3_open(...), c.sqlite3_exec(...) are callable immediately, zero overhead.
// Exposing Zig TO C is equally clean — and unlike Go, no runtime is dragged along:
export fn add(a: c_int, b: c_int) c_int { // a clean C-ABI symbol in a .so/.a
return a + b;
}
How libc inclusion actually works, and why it's a distinguishing feature. Zig bundles the source of multiple C libraries (musl libc, a curated glibc for many versions, mingw-w64 for Windows, wasi-libc) inside the toolchain and compiles the exact bits you need on demand for the target you ask for. The practical consequences:
- Opt-in libc. A pure-Zig program links no libc by default: it talks to the OS via std.os/syscalls and ships a freestanding static binary. You opt into libc with exe.linkLibC() in build.zig (or -lc), which you need when you @cImport a C library that itself depends on libc, or when a target effectively requires it (e.g. some macOS/Windows paths). This is the inverse of Go (always its own runtime) and of typical C (always libc).
- Pick the libc per target. Because Zig carries the sources, you choose x86_64-linux-musl (fully static, no host glibc) vs x86_64-linux-gnu (dynamic glibc, and you can even pin a minimum glibc version like -gnu.2.28 so the binary runs on older distros), from any host, with no sysroot. Cross-compiling a glibc binary from a macOS laptop "just works."
- Mixed Zig+C builds are one graph. Adding exe.addCSourceFile(...) compiles C/C++ files alongside Zig with shared optimisation flags and LTO; @cImport then exposes their headers. Vendoring a C dependency (SQLite, zlib, a codec) into a Zig project is routine and needs no separate build system.
// build.zig — link libc and vendor a C source file into the same binary
const exe = b.addExecutable(.{ .name = "app", .root_source_file = b.path("src/main.zig"),
.target = target, .optimize = optimize });
exe.linkLibC(); // opt into libc (musl/glibc per target)
exe.addCSourceFile(.{ .file = b.path("vendor/sqlite3.c"), .flags = &.{"-DSQLITE_THREADSAFE=0"} });
exe.addIncludePath(b.path("vendor")); // so @cImport finds sqlite3.h
b.installArtifact(exe);
A Zig static/shared library is as clean as a C one (no runtime, no GC, no init machinery),
which is why Zig is widely used as a cross-compilation toolchain for C/C++ projects even by
non-Zig codebases (zig cc is a drop-in cross-compiler). Rust, by contrast, reaches C via
bindgen/cc-crate and usually a system or vendored libc; Go reaches C via cgo (with the
per-call cost and the loss of CGO_ENABLED=0 static cross-compilation).
You can override libc functions from your own Zig project. The symbols LLVM codegen depends
on — memcpy, memmove, memset, math routines — are provided by Zig's compiler_rt as
weak exports, which means a strong export fn of the same name in your project (or a linked
libc) silently replaces them. You can go further and replace the whole malloc/realloc/free
family with your own implementation simply by exporting C-ABI symbols with those names:
// Override libc malloc with a Zig allocator — linked code (even C code) now calls THIS.
// 0.16: the allocator must be thread-safe without needing an Io instance — a lock-free
// general/arena allocator fits (ThreadSafeAllocator was removed in 0.16 as an anti-pattern).
var gpa: std.heap.GeneralPurposeAllocator(.{ .thread_safe = true }) = .init;
export fn malloc(size: usize) callconv(.c) ?[*]align(16) u8 {
const buf = gpa.allocator().alignedAlloc(u8, 16, size) catch return null;
return buf.ptr; // your allocator now backs every malloc() call
}
export fn free(ptr: ?[*]u8) callconv(.c) void { /* … recover len, gpa.free … */ }
This is unusually clean in Zig for a structural reason: Zig's own standard library does not
depend on libc, so when you override malloc, your replacement can use std (and even
allocate) without the recursion hazard that plagues malloc interposers in C (where calling a
libc-backed helper inside your malloc re-enters malloc). Real-world uses: dropping in a custom
or instrumented allocator under a C library you link, building an LD_PRELOAD-style shim,
providing the handful of libc symbols a freestanding target needs, or shrinking binaries by
supplying leaner memcpy/memset than the platform's. CPU/memory/IO angle: because the
override is a normal exported function compiled and inlined in your build (not a runtime hook),
there is no indirection cost; you can specialise the hot memcpy/allocator path for your
workload, and on freestanding/embedded targets you provide exactly the symbols you use and
nothing more. Rust can export #[no_mangle] symbols and set a #[global_allocator], but
overriding the libc codegen intrinsics is not a first-class, weak-symbol-by-default workflow the
way it is in Zig; Go does not expose this at all.
SIMD is a language feature, not a library. Zig has the built-in @Vector type. Arithmetic
operators, @reduce, @shuffle, @select, and @splat work on vectors directly, and the
compiler lowers them to SSE/AVX/AVX-512/NEON per target — portably, in safe code, with no
intrinsics crate and no unsafe. A @Vector(N, T) whose width exceeds the target's native
registers is legal and the compiler splits it across registers, so you can write to a logical
width and let the backend map it to the hardware.
// 1) Portable multiply-add dot product: AVX2 on x86_64, NEON on aarch64, etc.
fn dotProduct(a: []const f32, b: []const f32) f32 {
const V = @Vector(8, f32);
var acc: V = @splat(0.0);
var i: usize = 0;
while (i + 8 <= a.len) : (i += 8) {
const va: V = a[i..][0..8].*;
const vb: V = b[i..][0..8].*;
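acc += va * vb; // 8-wide multiply, then add; @mulAdd(V, va, vb, acc) requests a fused multiply-add
}
var sum = @reduce(.Add, acc); // horizontal sum across the lanes
while (i < a.len) : (i += 1) sum += a[i] * b[i]; // scalar tail for lengths not divisible by 8
return sum;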
}
// 2) Branchless select / clamp: no per-lane branches, uses a mask
fn clamp255(xs: @Vector(16, i16)) @Vector(16, u8) {
const lo: @Vector(16, i16) = @splat(0);
const hi: @Vector(16, i16) = @splat(255);
const clamped = @select(i16, xs < lo, lo, @select(i16, xs > hi, hi, xs));
return @intCast(clamped); // narrow to u8 lanes
}
// 3) SIMD comparison → bitmask, e.g. find which bytes equal a delimiter (parsing fast-path)
fn matchByte(chunk: @Vector(32, u8), needle: u8) u32 {
const mask = chunk == @as(@Vector(32, u8), @splat(needle)); // @Vector(32, bool)
const bits: u32 = @bitCast(mask); // pack the 32 lane-bools into a 32-bit mask
return bits; // bit i set ⇔ chunk[i] == needle
}
// 4) @shuffle to reverse / permute lanes (e.g. byte-swap, transpose building block)
fn reverse4(v: @Vector(4, u32)) @Vector(4, u32) {
return @shuffle(u32, v, undefined, @Vector(4, i32){ 3, 2, 1, 0 });
}
These cover the four idioms most SIMD code needs: reduction, branchless select, compare-to-mask
(the heart of SIMD parsing like zimdjson), and permute/shuffle. They are expressed in safe Zig
with no target-specific intrinsics. This is more ergonomic than Rust's std::arch intrinsics
(which need unsafe and per-arch #[cfg] code), more readily usable than Rust's portable std::simd
(still nightly-only), and ahead of Go's experimental simd/archsimd. The trade is that Zig gives you no
guarantee the compiler picks the optimal instruction sequence, so you verify the disassembly for
hot kernels, as in C.
Inline assembly and explicit layout. Zig has inline asm with named operand constraints,
packed struct for bit-exact layouts (including sub-byte integer fields like u3), align()
for cache-line control, and volatile/MMIO support for bare-metal — everything the kernel/
embedded/driver world needs, in the open language rather than behind unsafe.
The safety caveat (the recurring theme). Because all of this is normal Zig, there is no
unsafe keyword marking the dangerous parts — the whole language has C-like power, and the
borrow checker that would make Rust's equivalent code safe simply isn't there. Pointer
arithmetic, manual lifetime management, and @ptrCast are available everywhere; correctness
is guarded by runtime safety checks in Debug/ReleaseSafe (bounds checks, overflow checks,
alignment checks, undefined poisoning), and violations are undefined behaviour in ReleaseFast. So Zig gives you
the fewest ceremony layers for low-level work and FFI (no unsafe keyword, direct C import),
Rust gives compile-time memory-safety guarantees over the same operations, and Go keeps you
high-level by default while paying the cgo cost when you go low.
⚡ Perf — Rust serde: compile-time codegen vs Go reflection; modest realistic gap on typical payloads, and Go json v2 (1.25+) / sonic close most of it — type-safety is the more durable serde advantage 🔐 Safety — Rust: UTF-8 validity in types; Go and Zig: string is bytes + convention 🧹 DX — Go: reflect-based JSON works with no codegen step; serde requires a derive
Rust makes encoding assumptions explicit in the type system:
- String/&str: heap-owned / borrowed; guaranteed UTF-8
- OsString/OsStr: OS-native encoding, for paths and arguments (arbitrary bytes on Unix, potentially ill-formed UTF-16 on Windows)
- CString/CStr: null-terminated; for FFI to C functions
- [u8]/Vec<u8>: raw bytes; no encoding assumption
Passing &[u8] where &str is expected is a compile error. Passing a &CStr to a
function expecting &str is a compile error. Encoding bugs surface at conversion, not
deep in business logic.
Serde is the de-facto serialisation framework. #[derive(Serialize, Deserialize)] generates
a complete, struct-specific parser with no runtime reflection:
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct Order {
order_id: u64,
#[serde(skip_serializing_if = "Option::is_none")]
coupon_code: Option<String>,
#[serde(with = "chrono::serde::ts_milliseconds")]
created_at: DateTime<Utc>,
}
// The generated parser knows at compile time that order_id is a u64, that
// coupon_code may be absent, and that created_at is millisecond Unix timestamp.
// Go json v1 discovers this via reflect per call; json v2 (1.25+) and sonic narrow the gap.
Go's string is an immutable byte sequence. UTF-8 is the documentation convention, not
a type guarantee. The standard library is consistent about treating strings as UTF-8, but
user code can construct a string from arbitrary bytes without a compiler error.
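A short sketch of what "convention, not guarantee" means in practice, using only the standard library. The helper name describe is hypothetical; the point is that an invalid string compiles and runs, and validity has to be checked explicitly with unicode/utf8:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe reports the byte length, rune count, and UTF-8 validity of s.
func describe(s string) (byteLen, runes int, valid bool) {
	return len(s), utf8.RuneCountInString(s), utf8.ValidString(s)
}

func main() {
	good := "€"                      // one rune, three bytes
	bad := string([]byte{0xff, 'a'}) // compiles fine, but is not valid UTF-8

	fmt.Println(describe(good)) // 3 1 true
	fmt.Println(describe(bad))  // 2 2 false

	// Ranging over an invalid string yields U+FFFD for each bad byte.
	for _, r := range bad {
		fmt.Printf("%U ", r) // U+FFFD U+0061
	}
	fmt.Println()
}
```

In Rust the equivalent of bad cannot exist as a &str without unsafe; in Go the check is a library call the caller must remember to make.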
encoding/json uses runtime reflection to discover field names, types, and struct tags.
This means no codegen step, no build-time tool dependencies, and excellent flexibility —
you can unmarshal into map[string]any, use json.RawMessage for deferred parsing, and
handle arbitrary dynamic JSON structures without defining structs.
// Flexible dynamic JSON — no pre-defined struct needed
var result map[string]any
if err := json.Unmarshal(data, &result); err != nil { /* handle malformed input */ }
// Or structured:
type Order struct {
OrderID uint64 `json:"orderId"`
CouponCode *string `json:"couponCode,omitempty"`
CreatedAt int64 `json:"createdAt"`
}
The v1 performance cost is real: encoding/json v1 uses reflect.Value at runtime, with
interface allocation and per-field type switching. The 2025–2026 picture is materially
different from older comparisons, though:
- encoding/json/v2 (Go 1.25, behind GOEXPERIMENT=jsonv2; candidate for stabilization in/after 1.26) is a ground-up rewrite with streaming, stricter correctness (unique-key checks, no silent array trim, []/{} instead of null for nil slices/maps), and better diagnostics. Benchmarks put it roughly at parity with the fast third-party libs: v2 is about 1.4x faster to 1.2x slower than v1 depending on payload shape.
- bytedance/sonic uses JIT compilation plus SIMD and is Go's throughput leader; it skips some UTF-8 validation to go faster, a correctness/speed trade you opt into.
- goccy/go-json and mailru/easyjson (codegen) are drop-in faster alternatives.
// json v2 — explicit, streaming, stricter
import jsonv2 "encoding/json/v2" // Go 1.25+ with GOEXPERIMENT=jsonv2
var order Order
if err := jsonv2.Unmarshal(data, &order); err != nil { /* descriptive error */ }
For most HTTP handlers the bottleneck is network I/O, not JSON; v1 is fine. When throughput genuinely matters, v2/sonic/go-json close most of the historical gap to serde. The durable difference is not raw speed but compile-time checking: a malformed Go struct tag (a misspelled key, a stray space) is silently ignored at runtime unless go vet catches it, whereas a malformed serde attribute is a compile error. A changed key name, by contrast, is a silent wire-format change in both.
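To make the silent-failure mode on the Go side concrete, here is a sketch using encoding/json v1. The types Order and Drifted, the payload, and the helper names are all hypothetical; Drifted stands for the same struct after someone edited its tag while the producer kept sending the old key. It also shows one standard-library mitigation, Decoder.DisallowUnknownFields:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Order matches the producer's key. Drifted is the same struct after its tag
// was edited; both compile, and only the runtime behaviour differs.
type Order struct {
	OrderID uint64 `json:"orderId"`
}

type Drifted struct {
	OrderID uint64 `json:"order_id"`
}

const payload = `{"orderId": 42}`

// decodeLenient uses json.Unmarshal, which ignores keys it cannot match.
func decodeLenient() (uint64, error) {
	var d Drifted
	err := json.Unmarshal([]byte(payload), &d)
	return d.OrderID, err
}

// decodeStrict rejects unknown keys, turning the drift into an error.
func decodeStrict() error {
	dec := json.NewDecoder(strings.NewReader(payload))
	dec.DisallowUnknownFields()
	var d Drifted
	return dec.Decode(&d)
}

func main() {
	var o Order
	if err := json.Unmarshal([]byte(payload), &o); err != nil {
		fmt.Println("unexpected:", err)
		return
	}
	fmt.Println(o.OrderID) // 42

	id, err := decodeLenient()
	fmt.Println(id, err) // 0 <nil>: the value is silently lost

	fmt.Println(decodeStrict() != nil) // true: the unknown key is reported
}
```

Strict decoding is a runtime check, not a compile-time one, so it narrows the gap rather than closing it.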
Zig folds serialization into its comptime story (§8) and treats strings as what they physically are: byte slices.
Strings are []const u8: bytes, not a distinct type. Zig has no String/str
distinction and no UTF-8 guarantee in the type system. A string literal has type
*const [N:0]u8 (a pointer to a null-terminated array, for C compatibility) and coerces to
[]const u8. UTF-8 is a convention enforced by functions in std.unicode, not by the type, which
is closer to Go's string than to Rust's validated String/&str. This is more error-prone than
Rust but trivially zero-copy and C-interop-friendly: a []const u8 is just a pointer and a
length, and a literal's null terminator lets it pass to C unchanged.
std.json uses comptime to (de)serialize into real structs. No derive macro, no runtime
reflection — std.json.parseFromSlice(T, ...) uses @typeInfo(T) at compile time to generate
a parser specialized to T's fields, the same comptime-introspection pattern from §8:
const Order = struct { id: u64, total: f64, items: []const []const u8 };
const parsed = try std.json.parseFromSlice(Order, allocator, json_bytes, .{});
defer parsed.deinit(); // arena-backed; frees the whole parse at once
const order: Order = parsed.value; // a real typed struct, fields checked at comptime
Because the field-walking is comptime, a struct with a field type std.json cannot handle is rejected at compile time (like serde,
unlike Go's runtime reflection), and there is no interface{}/reflection allocation. std.json also offers a
streaming scanner (std.json.Scanner) for incremental parsing and a std.json.Value dynamic
tree for the "arbitrary JSON" case Go reaches with map[string]any.
Throughput: zimdjson. For raw speed, zimdjson is an actively-maintained Zig port
of simdjson advertising multi-gigabyte-per-second parsing using @Vector SIMD — the
ecosystem's answer to Rust's simd-json and Go's sonic. For typical payloads std.json is
fine; for bulk ingestion zimdjson is the tool.
Date/time is a useful litmus test for "batteries included," because it splits cleanly: a timestamp/duration/monotonic-clock layer (which all three have in stdlib) versus a civil calendar layer — human dates, time zones, parsing/formatting, arithmetic across DST — which only one of the three ships in its standard library.
Go — fully batteries-included; no external library needed. The stdlib time package is
complete and is the type everyone uses: time.Time (an instant, with location/zone),
time.Duration (typed nanoseconds with constants like time.Hour), time.Location (IANA
time-zone database), parsing/formatting via the (in)famous reference-layout strings
("2006-01-02 15:04:05"), arithmetic (Add, Sub, AddDate), comparison, and a monotonic
clock reading embedded in time.Time so interval measurement is correct across wall-clock
adjustments. Nothing third-party is required for normal date/time work; the common complaints are
the reference-layout format (rather than strftime) and that time.Time is a struct you usually
pass by value.
t, _ := time.Parse(time.RFC3339, "2026-06-12T09:30:00Z")
loc, _ := time.LoadLocation("Asia/Kolkata")
inIST := t.In(loc).Add(48 * time.Hour) // tz conversion + arithmetic, all stdlib
fmt.Println(inIST.Format("Mon 02 Jan 2006 15:04 MST"))
Rust — stdlib covers only instants/durations; civil dates need a crate. std::time provides
Instant (opaque monotonic clock, for measuring elapsed time), SystemTime (wall clock, but
no calendar operations — you cannot get "the year" from it), and Duration. There is
deliberately no civil date/time type in std — to do anything human-facing (parse an
RFC 3339 string, add a month, convert time zones, format a date) you add a crate:
- chrono: the long-standing default, with DateTime<Utc>/DateTime<Local>/NaiveDateTime, strftime-style formatting, and arithmetic. Note time-zone data is not bundled; you add chrono-tz (or tzfile) for the IANA database, a deliberate binary-size choice.
- jiff: the newer (2024+) library, explicitly modeled on the Temporal proposal, with built-in IANA tz support, DST-aware arithmetic, and an ergonomic API; increasingly recommended for new code.
- time: a lighter, no_std-friendly alternative.
use chrono::{DateTime, Duration, Utc};
let t: DateTime<Utc> = "2026-06-12T09:30:00Z".parse()?; // needs the chrono crate
let later = t + Duration::hours(48);
// time-zone conversion to Asia/Kolkata additionally requires the chrono-tz crate
Zig — stdlib has timestamps/timers only; civil calendar is community or hand-rolled.
For wall-clock and monotonic time, 0.16 folded the old std.time.Instant/std.time.Timer into
the I/O interface: you now read time through std.Io.Timestamp (std.Io.Timestamp.now,
durations via std.Io.Duration), which is the same "primitives only" story but routed through
std.Io so a green-threaded or io_uring backend can virtualize the clock. (Unix-epoch helpers
and the std.time.ns_per_s-style unit constants remain.) There is still no civil date/time
type, no time-zone handling, and no date parser in std — for human dates you reach for a
community library (zig-datetime, or the timezone-aware zeit/zdt) or compute civil dates from
a Unix timestamp yourself (the epoch-to-Y/M/D algorithm is short but is code you own), and IANA
time-zone support is largely DIY or via a C library. This reflects Zig's youth and
minimalist-stdlib stance: the primitives are there, the calendar layer is not.
const now = std.Io.Timestamp.now(io); // 0.16: time is read through std.Io
const start = std.Io.Timestamp.now(io); // monotonic interval timing via Timestamps (const: never mutated)
const elapsed = std.Io.Timestamp.now(io).since(start);
// Civil date (year/month/day), formatting, time zones → community lib or hand-rolled
🧹 DX — Go: first-party batteries (test, cover, fuzz, pprof, vet, generate) in one binary; Rust: more powerful but more decisions 🔒 SecOps — Go: no build-time code execution; Rust: build.rs risk mitigated by cargo-vet ⚡ Build — Go: fastest compile times in the industry; Rust: LLVM backend enables deeper optimization
Cargo.toml is a single declarative file covering dependencies, feature flags, build
profiles, workspace layout, targets, benchmarks, and examples.
Feature flags enable conditional compilation of dependency capabilities:
[dependencies]
tokio = { version = "1", features = ["net", "rt-multi-thread"] }
# timer wheel, signal handling, and fs are NOT compiled — smaller and faster to build
serde = { version = "1", features = ["derive"] }
sqlx = { version = "0.7", features = ["postgres", "runtime-tokio", "tls-rustls", "macros"] }
Build profiles control optimization per use case:
[profile.release]
opt-level = 3
lto = "thin" # link-time optimization across crates
codegen-units = 1 # max optimization at cost of parallelism
panic = "abort" # no unwinding machinery; ~5% smaller binary
strip = true
[profile.profiling] # release perf + debug symbols for flamegraph
inherits = "release"
debug = true
build.rs runs before compilation: compiles C/C++ extensions, generates Rust source
from Protobuf/FlatBuffer schemas, emits custom linker flags.
// build.rs
fn main() -> Result<(), Box<dyn std::error::Error>> {
    cc::Build::new().file("src/fast_hash.c").flag("-mavx2").compile("fast_hash");
    println!("cargo:rustc-link-lib=static=fast_hash");
    // Generates Rust bindings from a C header automatically
    let bindings = bindgen::Builder::default().header("include/fast_hash.h").generate()?;
    bindings.write_to_file("src/bindings.rs")?;
    Ok(()) // main returns Result so the ? operators above compile
}
Editions (2015 / 2018 / 2021 / 2024) let the language fix mistakes and improve ergonomics per-crate without breaking existing code. Old-edition crates keep compiling and link seamlessly with new-edition crates.
cargo clippy provides hundreds of semantic lints: correctness (real bugs), performance
(avoid unnecessary clone, prefer extend over repeated push), style, and complexity. Many
lints have machine-applicable fixes that cargo clippy --fix applies automatically.
Security tooling (third-party cargo subcommands, installed separately):
cargo audit # check Cargo.lock against RustSec advisory database
cargo deny check # enforce license policy, ban crates, check advisories in one pass
cargo vet # require human-reviewed audit records per crate version
cargo sbom # generate software bill of materials
Supply-chain risk: build.rs and proc-macros execute arbitrary code at compile time.
A malicious package could exfiltrate secrets or download additional payloads during cargo build.
cargo-vet and cargo-deny mitigate this but do not eliminate it.
Go ships a complete, first-party toolchain in a single binary:
go test ./... # test all packages
go test -race ./... # test with runtime race detector
go test -cover ./... # test with coverage
go test -fuzz FuzzXxx . # coverage-guided fuzzing (since 1.18)
go test -cpuprofile cpu.out ./... # CPU profile
go tool pprof cpu.out # interactive profiler
go vet ./... # static analysis
go fix ./... # automated code modernization (revamped in 1.26)
go generate ./... # run code generators
No tool-selection decisions, no configuration files, no third-party tool installs. Every tool is versioned together with the compiler and tested against the same stdlib.
Go 1.26 rebuilt go fix into the home of Go's modernizers: a push-button way to update a
codebase to current idioms and stdlib APIs (dozens of fixers — e.g. rewriting old loops to
range-over-int, adopting any, using new library functions). It is built on the same analysis
framework as go vet, so a vet diagnostic can carry a machine-applicable fix, and it adds a
source-level inliner driven by //go:fix inline directives that lets library authors ship
automatic call-site migrations to their users. (The old, obsolete go fix rewriters were
removed.) This narrows the gap with Rust's cargo fix/Clippy autofix, with the difference that
Go's is purely first-party.
gofmt produces one canonical style with zero configuration. The entire Go ecosystem
formats identically. No rustfmt.toml, no style debates, no review comments about whitespace.
go.mod is deliberately minimal — module path, Go version, direct dependencies. No
feature flags, no profiles, no build scripts. Complex build needs end up in a Makefile
next to go.mod. This is a conscious design choice: simplicity over expressiveness.
Go's module system prevents arbitrary code execution at build time. go build downloads
and compiles code; it does not run it. go generate requires explicit invocation and is
never triggered automatically. This is a meaningful supply-chain security advantage.
govulncheck ./... # official vulnerability scanner — reachability-aware
# Reports only CVEs in code paths you actually call; not just packages you import
# "GO-2024-2687: only affects pkg.Foo() which is not called (informational)"
# vs: "GO-2024-2688: reachable via your/service.Handle → third/party.Parse (HIGH)"
Rust and Go support Profile-Guided Optimization (Zig, via LLVM, can use PGO but without first-party tooling). Rust's PGO is LLVM-backed and deep
(affects inlining, branch layout, register allocation; 10–30% gains on suitable workloads).
Go's PGO (generally available since 1.21) uses pprof profiles and is simpler to apply (drop a
default.pgo file in the main package directory; 2–14% gains reported). Go 1.25 expanded the set of
optimisations PGO influences.
🔐 Safety — linters catch whole classes of latent bugs (truncating casts, swallowed errors, incorrect comparisons) that compile cleanly but misbehave at runtime 🧹 DX — they encode team conventions as enforced rules, moving style debates out of code review ⚡ Perf — both linters flag performance anti-patterns (needless clones/allocations) at lint time 🔍 Debug — a CI lint gate prevents an entire category of "how did this reach production" incidents
A compiler answers "is this program valid?" A linter answers a harder, more valuable question: "is this program defensible?" The gap between those two is where most maintainability rot lives — code that compiles and even passes tests but quietly truncates an integer, ignores an error, clones in a hot loop, or expresses a condition in a way a reviewer will misread six months later. Linters convert that judgment into automation. Run in CI with warnings-as-errors, they turn "we should really be more careful about X" from a wiki page nobody reads into a build that fails until X is fixed. That is the actual discipline mechanism: not the suggestions, but the gate.
Rust — Clippy:
Clippy is the official linter, shipped via rustup, with 800+ lints organized into categories you opt into by level:
- correctness (deny by default): code that is almost certainly a bug, such as comparing a value to itself, a loop that never iterates, mem::swap with identical arguments, an Iterator::nth(0) that should be next(). These are not style; they are defects.
- suspicious / complexity / style (warn by default): idiom and clarity, such as collapsible ifs, needless return, manual implementations of standard combinators, redundant clones.
- perf (warn by default): allocation and copy anti-patterns, such as cloning where a borrow suffices, Vec push-in-a-loop where extend/collect is better, unnecessary boxing, format! where a direct write would do.
- pedantic (allow by default; opt in): opinionated checks for power users; expect to sprinkle #[allow(...)] for intentional exceptions.
- nursery (allow by default): newer lints that may have false positives.
- restriction (allow by default, cherry-pick only): bans specific language features for high-assurance codebases: forbid unwrap()/expect(), forbid panic!, require every unsafe block to carry a // SAFETY: comment (undocumented_unsafe_blocks), forbid dbg! and stray print! from reaching production.
- cargo: manifest hygiene, such as wildcard dependencies and missing metadata.
Crucially, lint policy lives in the manifest and is versioned with the code, so every developer and CI runner enforces the identical ruleset:
# Cargo.toml — lint policy as code, applied to the whole crate
[lints.rust]
unsafe_code = "warn"
missing_docs = "warn"
[lints.clippy]
# Opt the whole crate into the pedantic group, then carve out intentional exceptions
pedantic = { level = "warn", priority = -1 }
module_name_repetitions = "allow"
must_use_candidate = "allow"
# Cherry-picked restriction lints that enforce real discipline
unwrap_used = "deny" # force explicit error handling; use expect() with a reason
undocumented_unsafe_blocks = "deny" # every unsafe block must justify itself in a // SAFETY: note
dbg_macro = "deny" # no stray dbg!() reaches main
cargo clippy --all-targets --all-features -- -D warnings # CI gate: any lint fails the build
cargo clippy --fix # auto-apply machine-applicable fixes
What makes Clippy a discipline tool rather than a nag:
- Machine-applicable fixes. A large fraction of lints carry an exact rewrite that cargo clippy --fix applies, so adopting a stricter ruleset is often a one-command migration.
- Per-item escape hatches. #[allow(clippy::some_lint)] on a function or block documents an intentional exception in the code, visible at the point of deviation, so the exception becomes self-documenting rather than invisible.
- It composes with the type system. Clippy assumes ownership/borrowing already prevents memory bugs, so its lints target the layer above: idiom, clarity, and the narrow set of logic mistakes the borrow checker cannot see.
Go — go vet, staticcheck, and golangci-lint:
Go's linting is layered and partly first-party:
- go vet (first-party, ships with the toolchain): a conservative set of checks for definitely-wrong constructs: Printf format-string mismatches, struct tags that won't parse, locks copied by value, unreachable code. Low false-positive rate by design.
- staticcheck (Dominik Honnef, the de-facto gold standard): 150+ checks across correctness, simplifications, and dead-code analysis: detecting impossible nil-error patterns, ineffective assignments, incorrect time.Duration math, and more.
- golangci-lint: the meta-linter that bundles and runs 50+ linters in parallel under one config and one command. This is what most production Go teams actually gate CI on. Notable members: errcheck (flags swallowed errors, the single most valuable Go lint, since data, _ := f() is legal and silent), gosec (security: hardcoded creds, weak crypto, SQL string-building), gocyclo (cyclomatic complexity ceiling), ineffassign, unparam, revive (configurable style), misspell.
# .golangci.yml: versioned with the repo so every dev and CI runs the same checks
# (golangci-lint v1 schema; v2 adds version: "2" and moves settings under linters.settings)
linters:
  enable:
    - errcheck # catch unchecked errors; Go's biggest silent-failure source
    - staticcheck # 150+ deep checks
    - govet # first-party correctness
    - gosec # security issues
    - revive # configurable style rules
    - gocyclo # complexity ceiling
    - ineffassign # assignments that are never used
    - unparam # unused function parameters
    - misspell
linters-settings:
  errcheck:
    check-type-assertions: true
    check-blank: true # flag `x, _ := f()` blank-discard of errors
  gocyclo:
    min-complexity: 15
golangci-lint run ./... # run the configured bundle
golangci-lint run --fix ./... # apply auto-fixes where available
errcheck deserves special mention as a discipline mechanism: because Go makes
result, _ := mayFail() both legal and idiomatic-looking, swallowed errors are the
language's most common latent-failure class. A linter that fails the build on a blank-discard
error is, in practice, the closest Go gets to Rust's compiler-enforced "you must handle the
Result." It is opt-in rather than built into the language, which is exactly the point:
the discipline that Rust bakes into the type system, Go reconstructs at the lint layer.
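The blank-discard pattern described above can be sketched in a few lines. This is a minimal illustration, not taken from any real codebase: parsePort, swallowed, and handled are hypothetical names invented here. Both variants compile without complaint from the Go compiler; only a linter such as errcheck (with check-blank enabled) distinguishes them.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parsePort is a hypothetical helper that can fail.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	if n < 1 || n > 65535 {
		return 0, errors.New("port out of range")
	}
	return n, nil
}

// swallowed compiles cleanly: the blank identifier discards the error,
// so a bad input silently becomes port 0. This is the line errcheck flags.
func swallowed(s string) int {
	port, _ := parsePort(s)
	return port
}

// handled propagates the failure, which is what the lint pushes you toward.
func handled(s string) (int, error) {
	port, err := parsePort(s)
	if err != nil {
		return 0, err
	}
	return port, nil
}

func main() {
	fmt.Println(swallowed("80x")) // 0, with no sign that anything went wrong
	_, err := handled("80x")
	fmt.Println(err != nil) // true
}
```

The design point is that nothing in the language separates these two functions; the distinction exists only once the lint runs in CI.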
| Aspect | Rust (Clippy) | Go (golangci-lint stack) |
|---|---|---|
| Official / bundled | ✅ Clippy ships with rustup | Partly: go vet first-party; staticcheck/golangci-lint external |
| Lint count | 800+ in one tool | 150+ (staticcheck) + 50+ bundled linters |
| Policy as code | [lints] in Cargo.toml | .golangci.yml |
| Auto-fix | cargo clippy --fix (broad) | golangci-lint run --fix (partial) |
| Error-handling discipline | Mostly enforced by the type system already; unwrap_used lint adds more | Reconstructed at lint layer (errcheck), its most important lint |
| Security linting | cargo-audit/cargo-deny (deps) + restriction lints | gosec (code) + govulncheck (deps, reachability-aware) |
| Setup cost | Near zero (ships with toolchain) | Install golangci-lint; write .golangci.yml |
The deeper point: a linter is how a team encodes its definition of "good code" as an
executable contract. In Rust, the type system already enforces the highest-stakes rules
(memory safety, data-race freedom, error handling), so Clippy operates one level up on
idiom and clarity. In Go, the language deliberately enforces less at compile time, which
makes the linter layer load-bearing — errcheck and staticcheck are not optional polish
but the primary defense against the silent-failure classes the compiler permits. Either way,
the lesson for production is the same: run the linter in CI with failures gating merges, keep
the config in the repo, and treat a new lint the way you treat a failing test.
Zig's build story is unusual: the build script is a Zig program. There is no separate
build DSL (Cargo.toml, go.mod, Makefile) — build.zig is real Zig code that constructs a
build graph using std.Build, and build.zig.zon (ZON = Zig Object Notation) declares
dependencies. Because the build script is the language itself, comptime, loops, and helper
functions are all available to express arbitrarily complex builds without a meta-language.
// build.zig: the build script IS Zig code
const std = @import("std");
pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{}); // Debug/ReleaseSafe/ReleaseFast/ReleaseSmall
    const exe = b.addExecutable(.{
        .name = "myservice",
        // 0.15+ takes an explicit root module; the older top-level
        // .root_source_file/.target/.optimize fields are gone
        .root_module = b.createModule(.{
            .root_source_file = b.path("src/main.zig"),
            .target = target,
            .optimize = optimize,
        }),
    });
    // Pull a dependency declared in build.zig.zon
    const httpz = b.dependency("httpz", .{ .target = target, .optimize = optimize });
    exe.root_module.addImport("httpz", httpz.module("httpz"));
    b.installArtifact(exe);
    const run = b.addRunArtifact(exe);
    b.step("run", "Run the app").dependOn(&run.step);
}
Optimize modes are first-class, not profiles. Zig has four build modes baked into the
language: Debug (all safety checks, fast compile), ReleaseSafe (optimized with safety
checks kept — a mode neither Rust nor Go offers as a default tier), ReleaseFast (max speed,
safety checks off, UB on misuse), and ReleaseSmall (size-optimized). ReleaseSafe is
notable: it's "optimized but still bounds-checked and overflow-checked," a production sweet
spot for code that wants speed without surrendering memory-safety checks.
zig cc — a C/C++ cross-compiler that ships in the toolchain. Because Zig bundles clang
and a cross-platform libc collection, zig cc is a drop-in, hermetic C/C++ cross-compiler.
Many non-Zig projects adopt it purely for this — it cross-compiles C to any target from any
host with one command, something that is painful with stock GCC/clang and a major reason Zig
shows up in build pipelines for Go and Rust projects too.
First-party tooling, like Go. zig fmt is the canonical formatter (zero config, like
gofmt). zig build test runs tests. zig build handles the whole graph. The 0.15 cycle
added a local zig-pkg/-style package cache and a global compressed cache; 0.16 refined
package workflows further and debuted a new from-scratch ELF linker (-fnew-linker, still
opt-in) aimed at removing the LLD dependency and enabling incremental linking. Compile speed
is a Zig priority: the 0.15 line made debug builds ~5× faster by defaulting to Zig's own x86
backend instead of LLVM.
Linting is not yet first-party, but the community fills it: zlint and KurtWagner/zlinter
(the latter integrates into build.zig) provide style and correctness checks, and zwanzig
adds CFG-based static analysis for leaks, double-frees, optional-unwrap mistakes, and stack
escapes — partly recovering, as opt-in tooling, the bug classes Rust's compiler rejects outright.
For performance work, andrewrk/poop (a CLI perf observer) and zBench (a benchmarking library)
are the common choices, and kubkon/bold is a drop-in faster replacement for Apple's ld.
🔍 Debug — Rust: miri catches UB in unsafe code; compiler error messages are best in class 🔍 Debug — Go: first-party race detector, fuzz testing, and pprof profiler in one toolchain 🧹 DX — Go: test, coverage, fuzz, and profiling work out of the box; Rust: each requires a crate choice
- cargo test: runs unit tests, integration tests, and doctests in one command. Rustdoc code examples in /// comments are compiled and run, so stale docs that no longer compile are caught in CI. Go's Example functions are similar but live in _test.go files rather than beside the documented item.
- cargo bench + Criterion: statistical benchmarking with a t-test for significance, outlier detection, and HTML reports. Tells you if a change is statistically significant or noise.
- miri: runs the program under an interpreter that detects undefined behaviour in unsafe code on the paths the tests actually execute, checking against Rust's experimental aliasing models (Stacked Borrows, Tree Borrows). It is the main tool for validating unsafe code this way.
- Compiler error messages: spans, did-you-mean suggestions, rustc --explain E0382 for a full essay on each error code, and machine-applicable fix suggestions. Widely considered among the best diagnostic output of any compiled language.
- cargo-flamegraph, perf, heaptrack: the profiling ecosystem is third-party but deep.
cargo test # unit + integration + doctests
cargo test -- --nocapture # show println! output
cargo miri test # UB detection under interpreter
cargo bench # statistical benchmarks with Criterion
cargo flamegraph # CPU flame graph
RUSTFLAGS="-C instrument-coverage" cargo test # LLVM coverage
- go test with subtests, table-driven tests, and TestMain for test harness setup; -parallel N for concurrent test execution.
- go test -race: dynamic race detection. The instrumentation is expensive (the Go documentation cites roughly 2–20× execution time and 5–10× memory), so it belongs in tests and CI rather than production. It detects only races that actually occur during a run; it does not prevent them.
- go test -fuzz: built-in coverage-guided fuzzing (since 1.18). No external dependency, no configuration: add func FuzzXxx(f *testing.F) and run.
- go test -cover, with -coverpkg to choose which packages are instrumented. For whole-program coverage across integration tests, go build -cover (since 1.20) produces an instrumented binary that writes coverage data to the directory named by GOCOVERDIR.
- go tool pprof: CPU, heap, goroutine, block, and mutex profiles. Produced by runtime/pprof or the net/http/pprof endpoint; visualised with go tool pprof -http (which now defaults to the flame-graph view as of 1.26).
- goroutineleak profile (experimental, 1.26): a new profile that detects leaked goroutines (blocked forever on a channel/mutex/sync.Cond that can never be unblocked) by letting the GC find concurrency primitives unreachable from any runnable goroutine. Enabled with GOEXPERIMENT=goroutineleakprofile; it adds no runtime overhead unless actively in use, and is slated to be on by default in 1.27. It is a direct answer to one of Go's classic production bugs, which previously needed manual goroutine-dump inspection.
- runtime/metrics scheduler counters (1.26): new /sched/goroutines state counts, /sched/threads, and total-goroutines-created metrics, useful for spotting runaway goroutine growth before it becomes a leak.
- govulncheck: call-graph-aware vulnerability scanning; reports only reachable vulnerabilities.
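To make the race-detector bullet concrete, here is a minimal sketch of the kind of bug it reports and one conventional fix. The function names countRacy and countSafe are invented for this illustration. The racy variant runs only when the program is started with the argument "racy", so the default run is race-free; running `go run -race . racy` should produce a data-race report for the unsynchronized increment.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// countRacy increments a shared counter from n goroutines with no
// synchronization. This is a data race: the result may be less than n,
// and the race detector reports the conflicting accesses.
func countRacy(n int) int {
	var wg sync.WaitGroup
	total := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			total++ // unsynchronized read-modify-write
		}()
	}
	wg.Wait()
	return total
}

// countSafe is the same loop with the increment guarded by a mutex.
func countSafe(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			total++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(countSafe(1000)) // always 1000
	if len(os.Args) > 1 && os.Args[1] == "racy" {
		fmt.Println(countRacy(1000)) // nondeterministic
	}
}
```

Both functions compile identically as far as the compiler is concerned, which is the point of the bullet: the detector finds the bug only if a test actually exercises the racy path.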
go test ./...
go test -race ./...
go test -fuzz FuzzParseConfig -fuzztime 30s ./...
go test -cpuprofile=cpu.out ./... && go tool pprof cpu.out
go test -cover -coverprofile=c.out ./... && go tool cover -html=c.out
govulncheck ./...
Zig builds testing into the language, like Go but more deeply: test is a keyword. You
write test "name" { ... } blocks inline next to the code they cover, and zig build test
(or zig test file.zig) runs them.
const std = @import("std");
fn add(a: i32, b: i32) i32 {
    return a + b;
}
test "add basics" {
    try std.testing.expectEqual(@as(i32, 5), add(2, 3));
}
test "allocation is leak-checked automatically" {
    const a = std.testing.allocator; // a GPA that FAILS the test on leak
    const buf = try a.alloc(u8, 64);
    defer a.free(buf); // forget this → test fails with a leak report
    try std.testing.expect(buf.len == 64);
}
Memory-safety checks are part of testing. std.testing.allocator is a
GeneralPurposeAllocator that detects leaks, double-frees, and use-after-free during the test
run and fails the test with the offending allocation's stack trace. This is Zig's substitute
for what Rust's borrow checker proves statically and what Go's GC sidesteps — and it is
remarkably effective in practice, catching the bug classes that motivate Rust's ownership
model, just at test time rather than compile time. Combined with Debug-mode bounds/overflow
checks, zig build test exercises a lot of the safety surface.
Built-in fuzzer. Zig is building an integrated fuzzer (zig build test --fuzz) with the
stated goal of being competitive with AFL — a first-party fuzzing story like go test -fuzz,
versioned with the toolchain rather than bolted on like Rust's cargo-fuzz.
comptime tests and assertions. Because comptime runs real code at build time, you can
assert invariants that fail the build, not the test run — comptime assert(...) and
@compileError give compile-time test-like guarantees (e.g. "this lookup table is the right
size," "this type has the expected layout") that have no direct Go/Rust equivalent without
macros.
What's missing. No miri-equivalent formal UB interpreter (Zig leans on runtime safety
checks instead), no statistical-benchmark framework as polished as Criterion (you write
timing loops by hand or use community packages such as zBench), and the profiling/observability ecosystem is
thinner — you typically reach for perf, Tracy (via ztracy), or platform tools rather than
an integrated pprof. Net: Zig's test + leak-detection integration rivals Go's
batteries-included ergonomics and covers much of what Rust needs miri for, but its
benchmarking and observability tooling is the least mature of the three.
📦 Binary — Go: GC shapes produce smaller generic binaries; Rust: monomorphization can bloat ⚡ Perf — Rust: 1.5–3x faster on CPU-bound; <10% difference on I/O-bound server workloads 🧹 DX — Go: CGO_ENABLED=0 static binary, trivial cross-compilation; Rust: needs sysroot for cross
Rust monomorphizes generic code — each concrete instantiation of a generic function gets
its own compiled copy. A heavily generic codebase (serde, regex, async state machines)
produces large binaries. Mitigation: opt-level = "z", strip = true, lto = "thin",
and codegen-units = 1 can reduce a 40 MB binary to roughly 5–8 MB, though the saving depends heavily on the dependency graph.
Go uses GC shapes: types with the same memory layout share one compiled copy (all pointer
types share a single generic function implementation, differentiated by a dictionary at
runtime). Binary size scales better for generic-heavy code. A typical Go service binary is
10–20 MB. Zig monomorphizes comptime instantiations like Rust, so it can see similar
duplication, but it has no runtime to link in; with ReleaseSmall and stripping, Zig
produces some of the smallest binaries of the three, comparable to optimized C.
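As a rough illustration of what "GC shape" grouping means for Go source, the sketch below defines two small generic functions and instantiates them with different types. The names Number, Sum, First, and user are invented for this example. Under the shape-based scheme described above, int and float64 have different underlying types and so get separate compiled bodies, while *user and *int are both pointers and are expected to share one body plus a runtime dictionary. The grouping is a compiler implementation detail, not something visible in program output.

```go
package main

import "fmt"

// Number is a hypothetical constraint for this sketch.
type Number interface {
	~int | ~int64 | ~float64
}

// Sum is written once and instantiated per GC shape.
func Sum[T Number](xs []T) T {
	var total T
	for _, x := range xs {
		total += x
	}
	return total
}

// First returns the first element, or false for an empty slice.
func First[T any](xs []T) (T, bool) {
	var zero T
	if len(xs) == 0 {
		return zero, false
	}
	return xs[0], true
}

type user struct{ name string }

func main() {
	fmt.Println(Sum([]int{1, 2, 3}))      // int shape
	fmt.Println(Sum([]float64{0.5, 1.5})) // float64 shape

	// Both element types below are pointers, so they fall into one shape.
	u, _ := First([]*user{{name: "alice"}})
	n := 7
	p, _ := First([]*int{&n})
	fmt.Println(u.name, *p)
}
```

In Rust or Zig, each of these four instantiations would typically be compiled as its own specialized copy, which is the binary-size trade-off the paragraph describes.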
# Go — one command, any target, no extra tools (only when CGO is disabled)
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -o myservice ./cmd/server
# Rust — requires rustup target, often a C cross-compiler, sometimes Docker
rustup target add aarch64-unknown-linux-musl
CARGO_TARGET_AARCH64_UNKNOWN_LINUX_MUSL_LINKER=aarch64-linux-musl-gcc cargo build --release --target aarch64-unknown-linux-musl
# Zig — bundles libc sources and builds them on demand; cross-compiles C too
zig build -Dtarget=aarch64-linux-musl
Go's static binary (with CGO_ENABLED=0) has no dynamic dependencies and produces trivial
FROM scratch Docker images; enabling cgo removes that property. Rust cross-compilation needs
the target's std plus, for many targets, a C cross-linker. Zig ships libc sources for many
targets and builds them on demand, so zig build -Dtarget=... cross-compiles with no extra
toolchain — and because the compiler bundles clang, zig cc (§11) cross-compiles C/C++ as
well, which is why some non-Zig projects use it purely as a cross-compiler.
All three start fast (< 20 ms in most cases). Go carries a GC runtime and goroutine scheduler; Rust and Zig have no GC, giving lower and more predictable peak memory for allocation-heavy workloads (no GC headroom), which matters most in memory-constrained environments. Zig's absence of any runtime means startup is immediate (no runtime init).
For CPU-bound workloads (compression, crypto, parsing, ML inference), Rust and Zig run in the same tier as C — no GC interruptions, LLVM optimization, monomorphized type-specialized loops, SIMD. Go is typically slower on this class of work (one benchmarked range is ~1.5–3× for hot numeric loops, but the figure varies widely with workload): GC pauses during hot loops, GC-shape generics rather than full monomorphization, and weaker auto-vectorization.
For I/O-bound workloads (HTTP APIs, database queries, message brokers), the gap narrows to under ~10% — most time is spent waiting on I/O — and the dominant factor becomes programming model and iteration speed rather than raw language throughput.
Rust with #![no_std] removes the standard library, enabling operation on bare metal with no
OS, no allocator, no thread system; the embassy async runtime runs on microcontrollers with
16 KB flash. Rust is used in the Linux kernel, Windows kernel drivers, Android system
components, and embedded firmware. Zig targets freestanding the same way — no separate
no_std ceremony, because the OS-dependent parts of std simply aren't pulled in, and you
supply an allocator (or a FixedBufferAllocator) — and is used in embedded, kernels, and
bootloaders. Go ships with its runtime (GC, scheduler, heap) and has no no_std/freestanding
mode, so targeting bare metal would mean replacing that runtime — which is the niche TinyGo
fills with a separate compiler and a cut-down runtime for microcontrollers and WASM, at the cost
of some reflect/stdlib compatibility.
Zig has no mandatory runtime, so a hello-world links to a tiny static binary and
ReleaseSmall + stripping yields among the smallest outputs of the three. Its C ABI
(extern struct, export) is the stable interop contract, so shipping a Zig .so/.a for
other languages is clean and runtime-free — unlike Go's c-shared libraries, which embed the
whole runtime. Zig's own inter-version ABI is not formally stable pre-1.0 (the language is
still changing). The flagship production proof point for Zig's no-GC, deterministic profile is
TigerBeetle, a financial database that performs zero runtime allocation after startup. As
elsewhere, this control comes without Rust's compile-time memory-safety guarantees.
WASM is a deployment target where the three diverge sharply, driven by how much runtime each must carry.
- Rust targets wasm32-unknown-unknown and wasm32-wasip1 (WASI) natively. wasm-bindgen generates the JS glue and TypeScript types; wasm-pack produces npm packages. Because there is no GC or runtime, a compute-focused module is tens to low-hundreds of KB, and there are no GC pauses to disturb frame timing in the browser. This is the most mature browser-WASM story of the three and is widely used in production (image/video processing, crypto, game logic).
- Go compiles to WASM (GOOS=js GOARCH=wasm, and GOOS=wasip1 for WASI), but the output embeds the Go runtime (GC + scheduler), historically a multi-MB baseline; Go 1.26 reduced small-heap WASM memory use, but size remains a constraint for browser delivery. TinyGo is the common alternative: a separate compiler producing far smaller WASM (and the usual choice for embedded/WASI), at the cost of some reflect/stdlib compatibility.
- Zig treats wasm32-freestanding and wasm32-wasi as ordinary targets. There is no runtime to embed, so modules are small like Rust's, and you control allocation explicitly (often a FixedBufferAllocator over WASM linear memory). There is no wasm-bindgen-class JS-glue generator in std; you write the host/JS boundary by hand or use a community helper. Zig is also frequently used to compile C/C++ to WASM via zig cc.
🔒 SecOps — Go: go build executes no arbitrary code (no build-time code-execution vector); Rust build.rs/proc-macros do; Zig build.zig does 🔒 SecOps — Rust: broad auditing toolchain (cargo-audit, cargo-deny, cargo-vet, cargo-sbom); Go: govulncheck (reachability-aware); Zig: no advisory scanner yet 🔐 Safety — Rust: compile-time memory safety eliminates buffer-overflow/UAF CVE classes; Go: GC memory safety (data races still possible); Zig: runtime-checked in safe build modes only
cargo audit # check against RustSec advisory database
cargo deny check # enforce license policy, bans, and advisories
cargo vet # human-reviewed audit records per crate version (Mozilla-origin)
cargo sbom # generate software bill of materials (SPDX or CycloneDX)
Build-time risk: build.rs and proc-macros execute arbitrary code at cargo build.
A compromised dependency can exfiltrate environment variables, download payloads, or
modify source files during compilation. This is a real and actively exploited attack
surface. cargo-vet (requiring manual audit sign-off per crate version) is the primary
mitigation.
Memory safety in safe Rust eliminates entire CVE classes at the language level: buffer overflows, use-after-free, double-free, and dangling pointers cannot exist in safe code. The NSA, CISA, and multiple government agencies now recommend Rust-class memory-safe languages for new systems software specifically for this reason.
govulncheck ./... # reachability-aware — only reports CVEs in code you actually call
go list -json -deps ./... | nancy sleuth # alternative advisory scanner (Sonatype Nancy)
govulncheck performs call-graph analysis: if you import a vulnerable package but never
call the vulnerable function, it reports it as informational rather than actionable. This
produces dramatically fewer false positives than crate-level scanners.
Build-time safety: go build runs no arbitrary code. A malicious Go package can
contain malicious runnable code but cannot execute it at compile time. go generate
requires explicit invocation. Among the three, Go is the only one whose default build runs
no arbitrary code.
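Go's GC provides memory safety (no dangling pointers, no double-free) but does not prevent
data races, concurrent map writes, or nil-pointer dereferences (including the typed-nil-in-interface
trap). These fail differently: a nil dereference is a well-defined panic; an unsynchronized
concurrent map write is detected on a best-effort basis and aborts the process with a fatal
error; and a data race on a multi-word value (interface, slice, string) can corrupt memory
even without the unsafe package.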
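The nil-related failure mode mentioned above is easy to reproduce. The following sketch uses invented names (notFound, lookupBuggy, lookupFixed, derefSafely) and shows two things: a nil pointer stored in an interface makes the interface itself non-nil, and dereferencing a nil pointer raises a runtime panic that recover can intercept.

```go
package main

import "fmt"

type notFound struct{ key string }

func (e *notFound) Error() string { return "not found: " + e.key }

// lookupBuggy returns a *notFound through the error interface even on
// success. The interface then holds (type=*notFound, value=nil), so the
// caller's `err != nil` check is true although nothing failed.
func lookupBuggy(found bool) error {
	var e *notFound
	if !found {
		e = &notFound{key: "k"}
	}
	return e
}

// lookupFixed returns a literal nil on success.
func lookupFixed(found bool) error {
	if !found {
		return &notFound{key: "k"}
	}
	return nil
}

// derefSafely dereferences p and converts the resulting runtime panic,
// if any, into an ordinary error.
func derefSafely(p *notFound) (key string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return p.key, nil // panics when p is nil
}

func main() {
	fmt.Println(lookupBuggy(true) != nil) // true: the interface is non-nil
	fmt.Println(lookupFixed(true) != nil) // false
	_, err := derefSafely(nil)
	fmt.Println(err != nil) // true: the panic was recovered
}
```

Neither behaviour involves memory corruption; the risk is a logic error or an unhandled panic taking down the process, which is why linters and tests carry the load here.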
Zig's security posture is the most nuanced of the three, and honesty requires stating both sides plainly.
Memory safety is checked, not proven. Zig has no borrow checker. In Debug and
ReleaseSafe it inserts runtime checks — bounds checks on slice/array access, integer-overflow
traps, null-unwrap checks on optionals, alignment checks, undefined-value poisoning, and
allocator-level leak/double-free/use-after-free detection (via the GeneralPurposeAllocator).
These catch a large fraction of the bugs that motivate Rust, but only on code paths that
actually execute, and only in safe build modes. In ReleaseFast these checks are off and the
same mistakes are undefined behaviour — the C failure mode. So Zig is meaningfully safer than
C (the checks are on by default in Debug/Safe, and the allocator catches the classic heap bugs)
but categorically weaker than Rust, which proves spatial and temporal memory safety and
data-race freedom at compile time for all builds. Zig also has no Send/Sync analogue: data
races are neither prevented nor detected by the toolchain.
Supply chain: small surface, young ecosystem. Fetching a Zig dependency runs no code, but
build.zig is arbitrary Zig that executes at build time, so a malicious dependency's build
script is an execution vector comparable to Rust's build.rs; of the three, only Go's default
build avoids it. In practice the small, often-vendored dependency culture limits the exposure.
There is no cargo-audit/govulncheck-equivalent advisory-database scanner yet; the ecosystem
is too young to have one. The flip side of that youth is a much smaller transitive-dependency
surface — Zig projects, like Go's, tend to have a handful of dependencies and frequently
vendor C libraries directly rather than pulling deep crate trees.
🧹 DX — Go: write a production HTTP server, query a DB, and parse JSON with zero external imports 🔒 SecOps — fewer dependencies = smaller attack surface; Go stdlib is battle-tested and maintained by Google ⚡ Perf — Rust crates (serde, tokio, axum) are often faster; the cost is more decisions upfront
Go's philosophy: the standard library covers 80% of server-side needs. A new project starts productive without a single external dependency.
Networking and HTTP:
import (
	"encoding/json"
	"log"
	"net/http"
	"time"
)
// Production-grade HTTP/1.1 + HTTP/2 server with zero external deps
// (db is the application's own data layer, not a stdlib package)
mux := http.NewServeMux()
mux.HandleFunc("GET /users/{id}", func(w http.ResponseWriter, r *http.Request) {
	id := r.PathValue("id") // path parameters (1.22+)
	ctx := r.Context()      // carries cancellation and deadlines
	user, err := db.GetUser(ctx, id)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(user); err != nil {
		log.Printf("encode user %s: %v", id, err)
	}
})
srv := &http.Server{
	Addr:         ":8443",
	Handler:      mux,
	ReadTimeout:  5 * time.Second,
	WriteTimeout: 10 * time.Second,
	IdleTimeout:  120 * time.Second,
}
log.Fatal(srv.ListenAndServeTLS("cert.pem", "key.pem")) // TLS 1.3 built in
JSON encoding/decoding:
import "encoding/json"
type Order struct {
	ID           uint64    `json:"id"`
	UserID       uint64    `json:"userId"`
	Items        []Item    `json:"items"`
	CreatedAt    time.Time `json:"createdAt"`
	Total        float64   `json:"total,string"` // encode float as JSON string
	InternalNote string    `json:"-"`            // never serialised
}
// Marshal
data, err := json.Marshal(order)
// Unmarshal
var o Order
err = json.Unmarshal(data, &o)
// Streaming decode: memory-efficient for large payloads
dec := json.NewDecoder(r.Body)
dec.DisallowUnknownFields()
err = dec.Decode(&o)
Database (driver-agnostic interface):
import (
	"database/sql"
	_ "github.com/jackc/pgx/v5/stdlib" // import driver for side effects only
)
db, err := sql.Open("pgx", os.Getenv("DATABASE_URL"))
if err != nil { log.Fatal(err) } // Open validates its arguments; it does not connect yet
db.SetMaxOpenConns(25)
db.SetConnMaxIdleTime(5 * time.Minute)
// Prepared statement: values are bound as parameters, never spliced into the SQL text
stmt, err := db.PrepareContext(ctx, `
	SELECT id, name, email FROM users WHERE status = $1 LIMIT $2
`)
if err != nil { log.Fatal(err) }
defer stmt.Close()
rows, err := stmt.QueryContext(ctx, "active", 100)
if err != nil { log.Fatal(err) }
defer rows.Close()
for rows.Next() {
	var u User
	if err := rows.Scan(&u.ID, &u.Name, &u.Email); err != nil { log.Fatal(err) }
}
if err := rows.Err(); err != nil { log.Fatal(err) }
Structured logging (slog, 1.21+):
import "log/slog"
logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
	Level: slog.LevelInfo,
}))
logger.Info("order created",
	slog.Int64("order_id", 42),
	slog.String("user", "alice"),
	slog.Float64("total", 99.95),
)
// {"time":"2026-06-11T10:00:00Z","level":"INFO","msg":"order created","order_id":42,...}
Other key stdlib packages:
"sync"          // Mutex, RWMutex, WaitGroup, Once, Pool, Map
"context"       // cancellation, deadlines, request-scoped values
"crypto/tls"    // TLS 1.3, certificate management
"crypto/rand"   // cryptographically secure random
"encoding/xml"  // XML marshal/unmarshal
"text/template" // text templating
"html/template" // auto-escaping HTML templates
"net/url"       // URL parsing, query encoding
"regexp"        // regular expressions (RE2 syntax, no backtracking)
"strconv"       // string ↔ numeric conversions
"strings"       // string manipulation (Builder, Split, Contains, etc.)
"bytes"         // byte slice manipulation (Buffer, Equal, Split, etc.)
"io"            // Reader, Writer, Closer, Pipe, LimitReader
"bufio"         // buffered I/O (Scanner, ReadLine)
"os"            // file I/O, env vars, process management
"path/filepath" // cross-platform path manipulation
"time"          // time, duration, timers, tickers, timezone
"math/rand/v2"  // pseudo-random (1.22+ with PCG and ChaCha8 sources)
"testing"       // unit tests, benchmarks, fuzzing, examples
"flag"          // command-line flag parsing
"embed"         // compile-time file embedding
Rust's stdlib is deliberately lean: collections, IO traits, threading primitives, sync types, file system, networking. No HTTP, JSON, SQL, or regex in stdlib. Each concern is a crate choice, with mature third-party implementations.
Core stdlib modules:
// Collections
use std::collections::{HashMap, HashSet, BTreeMap, BTreeSet, VecDeque, BinaryHeap, LinkedList};
// io traits
use std::io::{Read, Write, BufRead, Seek, BufReader, BufWriter};
// Sync primitives
use std::sync::{Mutex, RwLock, Arc, Condvar, Barrier, OnceLock, LazyLock};
use std::sync::atomic::{AtomicU64, AtomicBool, Ordering};
// Threading
use std::thread;
// Time
use std::time::{Duration, Instant, SystemTime};
// Env, process, file
use std::env;
use std::process;
use std::fs::{self, File};
use std::path::{Path, PathBuf};
The essential crate ecosystem. The de-facto crates for each concern — tokio (async),
serde/serde_json (serialization), axum/actix-web + reqwest (HTTP), sqlx/diesel
(database), thiserror/anyhow (errors), tracing (observability), rayon/crossbeam
(parallelism), clap (CLI), regex, config — are catalogued three-way in §16.0. A typical
Cargo.toml for a web service pulls in tokio, serde, axum, sqlx, thiserror/anyhow,
tracing, and clap. Two representative usages follow.
use axum::{
    Router,
    routing::{get, post},
    extract::{State, Path, Json},
    http::StatusCode,
    response::IntoResponse,
};
use serde::{Deserialize, Serialize};
#[derive(Debug, Serialize, sqlx::FromRow)]
struct User { id: i64, name: String, email: String }
#[derive(Deserialize)]
struct CreateUser { name: String, email: String }
async fn get_user(
    State(pool): State<sqlx::PgPool>,
    Path(id): Path<i64>,
) -> impl IntoResponse {
    match sqlx::query_as!(User, "SELECT id, name, email FROM users WHERE id = $1", id)
        .fetch_optional(&pool)
        .await
    {
        Ok(Some(user)) => (StatusCode::OK, Json(user)).into_response(),
        Ok(None) => StatusCode::NOT_FOUND.into_response(),
        Err(e) => (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()).into_response(),
    }
}
// `?` needs a Result return type, so the sqlx error is mapped into a response tuple
async fn create_user(
    State(pool): State<sqlx::PgPool>,
    Json(body): Json<CreateUser>,
) -> Result<impl IntoResponse, (StatusCode, String)> {
    let user = sqlx::query_as!(
        User,
        "INSERT INTO users (name, email) VALUES ($1, $2) RETURNING id, name, email",
        body.name, body.email
    )
    .fetch_one(&pool)
    .await
    .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;
    Ok((StatusCode::CREATED, Json(user)))
}
#[tokio::main]
async fn main() {
    let pool = sqlx::PgPool::connect(&std::env::var("DATABASE_URL").unwrap()).await.unwrap();
    let app = Router::new()
        .route("/users/{id}", get(get_user)) // axum 0.8+ path syntax; 0.7 used "/users/:id"
        .route("/users", post(create_user))
        .with_state(pool);
    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}
Structured tracing:
use tracing::{info, instrument};
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt, EnvFilter};
// Initialise JSON tracing (structlog equivalent)
tracing_subscriber::registry()
    .with(EnvFilter::from_default_env())
    .with(tracing_subscriber::fmt::layer().json())
    .init();
// #[instrument] opens a span per call and records the non-skipped arguments as span fields
#[instrument(skip(pool), fields(user_id = %id))]
async fn get_user(pool: &sqlx::PgPool, id: i64) -> Result<User, sqlx::Error> {
    info!("fetching user");
    let user = sqlx::query_as!(User, "SELECT * FROM users WHERE id = $1", id)
        .fetch_one(pool)
        .await?;
    info!(name = %user.name, "user found");
    Ok(user)
}
// Output shape (abridged): {"timestamp":"...","level":"INFO","fields":{"message":"user found","name":"Alice"},"target":"my_crate","span":{"name":"get_user","user_id":"42",...}}
A full, current, three-way library reference (async runtime, HTTP, TLS, database, CLI,
logging, testing, numerics, and more) is consolidated in §16.0 (The Foundational Libraries
That Define Each Ecosystem) to avoid duplication. The one-line summary for picking a library:
where Go has a stdlib option (net/http, encoding/json, database/sql, log/slog, time,
regexp, html/template, testing+fuzzing) it is usually production-grade and the default;
Rust and Zig reach for a crate/package for the same concern, trading more decisions and deeper
dependency trees for best-of-breed implementations.
The Rust (150,000+ crates) and Go (comparable module count) ecosystems are large; Zig's is far smaller and younger. Comparing the two largest, their philosophies differ in a way that shows up in production.
Go's approach: Add a dependency only when the stdlib falls short. Many Go services
run with fewer than 10 external dependencies. The module proxy (GOPROXY) provides
immutable download URLs. go mod vendor copies all deps into the repository.
govulncheck scans for reachable CVEs without noise.
Rust's approach: The ecosystem is the stdlib. Fundamental concerns (HTTP, JSON, async
runtime) are crate decisions. This yields mature libraries but more decisions,
more transitive dependencies, and more cargo audit noise. The Cargo.lock file is
comprehensive; cargo-vet requires manual audit sign-offs for each version.
The practical consequence: a Go team can deploy a service with a handful of external packages and keep full audit visibility. A Rust team building an equivalent service commonly has 200–400 transitive crate dependencies even for a modest HTTP API server.
Zig's standard library sits philosophically between Go's batteries-included breadth and
Rust's deliberate minimalism — but with two distinguishing traits: everything that
allocates takes an allocator, and everything that does I/O now takes an std.Io (the
0.16 change). It is broader than Rust's std (it includes crypto, compression, JSON, a
basic HTTP client/server, hashing, many data structures) but narrower and far less stable
than Go's — the stdlib churns hard release-to-release (0.15's "Writergate" rewrote all I/O
around buffered writers; 0.16 rewrote it again around std.Io and removed most of
std.posix).
What's in the box (a sampling):
- std.mem, std.heap: allocators (GPA, arena, fixed-buffer, page, c) and slice utilities
- std.Io (0.16): the unified I/O interface for files, sockets, timers, and async primitives
- std.json: comptime (de)serialization + streaming scanner + dynamic Value
- std.crypto: a respected, audited-in-parts suite (AEADs incl. AES-GCM-SIV and Ascon as of 0.16, hashes, ECC, signatures)
- std.compress: gzip, zstd, flate; 0.16 added a from-scratch deflate compressor (history-window + chained-hash matching), reaching within ~1% of zlib's ratio
- std.ArrayList, std.HashMap, std.AutoHashMap, std.MultiArrayList (SoA!). Note: 0.16 continued the move to "Unmanaged" containers as the default flavor, and std.BoundedArray was removed in 0.15 in favor of an ArrayList over a fixed buffer
- std.http: a basic client and server (not production-hardened like Go's net/http)
- std.Build: the build system, itself part of std
- std.testing, std.fmt, std.unicode, std.sort, std.Thread
// std.MultiArrayList — struct-of-arrays layout from one declaration: a stdlib feature
// neither Rust std nor Go stdlib offers. Great for cache-friendly ECS / columnar data.
const Entities = std.MultiArrayList(struct { x: f32, y: f32, hp: u32 });
// stored internally as separate x[], y[], hp[] arrays — SIMD- and cache-friendly
std.MultiArrayList deserves a callout: it generates a struct-of-arrays representation from
an ordinary struct definition via comptime, giving columnar memory layout (better cache
behavior and vectorization) with array-of-structs ergonomics — a data-oriented-design tool
that Rust needs a crate (soa_derive) for and Go cannot express ergonomically at all.
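To make the layout difference concrete, here is a minimal hand-written sketch in Rust of what a struct-of-arrays container looks like next to its array-of-structs counterpart. The `Entity` and `Entities` types are hypothetical names for this illustration only; a derive crate would generate something of this shape rather than have you write it out.

```rust
/// Array-of-structs element: fields of one entity sit next to each other.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Entity {
    x: f32,
    y: f32,
    hp: u32,
}

/// Hand-written struct-of-arrays: one contiguous column per field.
/// (Hypothetical type for illustration, not a library API.)
#[derive(Default)]
struct Entities {
    x: Vec<f32>,
    y: Vec<f32>,
    hp: Vec<u32>,
}

impl Entities {
    /// Split one logical row across the three columns.
    fn push(&mut self, e: Entity) {
        self.x.push(e.x);
        self.y.push(e.y);
        self.hp.push(e.hp);
    }

    fn len(&self) -> usize {
        self.hp.len()
    }

    /// Reassemble one logical row from the columns.
    fn get(&self, i: usize) -> Option<Entity> {
        Some(Entity {
            x: *self.x.get(i)?,
            y: *self.y.get(i)?,
            hp: *self.hp.get(i)?,
        })
    }

    /// Reads only the `hp` column, so `x` and `y` never enter the cache.
    fn total_hp(&self) -> u64 {
        self.hp.iter().map(|&h| u64::from(h)).sum()
    }

    /// Column-wise update over two dense f32 slices.
    fn translate(&mut self, dx: f32, dy: f32) {
        for x in &mut self.x {
            *x += dx;
        }
        for y in &mut self.y {
            *y += dy;
        }
    }
}
```

The boilerplate in `push` and `get` is exactly what the comptime-generated Zig container (or a derive macro in Rust) removes; the payoff is in methods like `total_hp`, which walk a single dense column.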
Ecosystem reality. Outside the stdlib, the third-party ecosystem is small and young relative to crates.io and Go modules — covered honestly in §16. The dependency culture mirrors Go's (few deps, frequent vendoring, lots of direct C usage via Zig's frictionless C interop), so a Zig service often has a tiny dependency surface. The headline risk is instability: the language and stdlib are pre-1.0 and break across releases, which is the single biggest adoption barrier — production users commonly pin a version (e.g. stayed on 0.15.2 while 0.16 settled) rather than track latest.
⚡ Perf — Rust and Zig call into native C/C++ libraries with low/zero-overhead FFI, while Go pays a per-call cgo cost; Rust adds borrow-check safety, Zig adds zero-friction @cImport
🧹 DX — quality of the library binding matters as much as the language; this section covers the best available
🔐 Safety — Rust wraps unsafe FFI in safe abstractions; Go relies on CGO discipline; Zig embeds C directly but without a borrow checker
🔒 SecOps — fewer native dependencies = simpler auditing; pure-language implementations preferred where performance allows
Before the specialised domains, this is the load-bearing set — the libraries a large fraction of real projects depend on, organised by concern. These are the de-facto standards as of 2026 (versions/standing verified against current releases). Where one library is the clear default, it is named first; respected alternatives follow.
A caveat that applies to every Zig entry below: Zig 0.16 (April 2026) landed two large breaking
changes — the new concrete std.Io interface and the near-complete removal of std.posix —
following the 0.15 Reader/Writer redesign ("Writergate"). Any library that touches I/O, sockets,
the filesystem, timers, or threads must be reconciled with std.Io, so a meaningful fraction of
the third-party ecosystem is mid-migration: some libraries named here target 0.15.x and need a
version bump, and a few I/O abstractions (e.g. event loops) overlap with std.Io and are being
repackaged as std.Io implementations rather than used beside it. Treat Zig library names as
"the project that exists for this concern," and check its pinned Zig version before depending on
it — pure-data libraries (refcounting, collections, date math) port easily; I/O-touching ones may
lag a release.
Async runtime / concurrency
- Rust: tokio is the dominant async runtime (the foundation under most of the ecosystem; async-std is effectively deprecated), with futures, tokio-util, and tokio-stream. rayon is the standard for CPU data-parallelism; crossbeam for lock-free structures and scoped threads; tokio-console for live async debugging. Thread-per-core alternatives: glommio, monoio.
- Go: concurrency is the language (goroutines/channels); the additions are golang.org/x/sync (errgroup, semaphore, singleflight) and sourcegraph/conc for structured concurrency.
- Zig: the 0.16 std.Io interface is itself the intended runtime — Io.Threaded (feature-complete, the 0.15.x-equivalent path) plus the experimental Io.Evented (M:N green threads) and an Io.Uring proof-of-concept. This reshapes the third-party landscape: pre-0.16 event loops like libxev (io_uring/epoll/kqueue) and actor libraries like thespian predate std.Io, and the migration path the ecosystem is discussing is to repackage them as std.Io implementations rather than use them alongside it. There is no tokio-scale runtime; for 0.16 the idiomatic answer is to write against std.Io and inject a backend.
Error handling
- Rust: thiserror (library error enums) and anyhow (application errors); eyre/color-eyre for richer reports; miette for diagnostic-quality errors in tooling.
- Go: stdlib errors (Is/As/Join) + fmt.Errorf("%w", …); pkg/errors is legacy.
- Zig: built-in error unions + error-return-traces (no library).
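The split between "library error enums" and "application errors" is easier to see in code. The sketch below uses only the Rust standard library to write by hand roughly what a derive-based error crate saves you from typing, then erases the type at the application boundary. `ConfigError`, `parse_port`, and `load` are hypothetical names for this illustration.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

/// Library-style error: a closed enum callers can match on exhaustively.
#[derive(Debug)]
enum ConfigError {
    MissingKey(&'static str),
    BadPort { raw: String, source: ParseIntError },
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey(k) => write!(f, "missing key `{k}`"),
            ConfigError::BadPort { raw, .. } => write!(f, "invalid port `{raw}`"),
        }
    }
}

impl Error for ConfigError {
    /// Expose the underlying cause so reporters can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ConfigError::MissingKey(_) => None,
            ConfigError::BadPort { source, .. } => Some(source),
        }
    }
}

/// Library function: returns the precise, matchable error type.
fn parse_port(raw: Option<&str>) -> Result<u16, ConfigError> {
    let raw = raw.ok_or(ConfigError::MissingKey("port"))?;
    raw.trim().parse::<u16>().map_err(|source| ConfigError::BadPort {
        raw: raw.to_string(),
        source,
    })
}

/// Application function: erases the concrete type but keeps the cause chain.
fn load(raw: Option<&str>) -> Result<u16, Box<dyn Error>> {
    Ok(parse_port(raw)?)
}
```

Library code returns the enum so callers can branch on variants; application code usually only needs to report the failure, which is why a boxed or dynamically typed error is the common choice there.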
Serialization
- Rust: serde + serde_json is near-universal; bincode/postcard/rmp-serde/prost (protobuf) for binary; toml/serde_yaml/ron for config; simd-json for throughput.
- Go: stdlib encoding/json (v2 emerging), bytedance/sonic and goccy/go-json for speed; google.golang.org/protobuf; gopkg.in/yaml.v3; pelletier/go-toml.
- Zig: stdlib std.json; zimdjson (SIMD) for throughput; zig-toml.
HTTP / web frameworks
- Rust: axum (0.8.x, the common default — Tokio team, Tower middleware) atop hyper; actix-web (4.12.x, the performance leader, TechEmpower-topping); rocket, salvo, warp, poem as alternatives. reqwest is the standard client; tower/tower-http the middleware layer; tonic for gRPC.
- Go: stdlib net/http is genuinely production-grade and many ship on it alone; chi (idiomatic, stdlib-compatible router), gin and echo (popular full frameworks), fiber (on fasthttp); grpc-go + protobuf; resty for an ergonomic client.
- Zig: httpz (fast HTTP/1.1 server), zap (facil.io wrapper), tokamak (framework on httpz); stdlib std.http is basic.
TLS / crypto
- Rust: rustls (pure-Rust TLS, increasingly the default over OpenSSL), ring/aws-lc-rs (primitives), sha2/blake3, ed25519-dalek/ed25519, argon2, rcgen (cert gen).
- Go: stdlib crypto/* and crypto/tls cover most needs first-party; golang.org/x/crypto for extras (argon2, chacha20poly1305, ssh).
- Zig: stdlib std.crypto is broad and partly audited (AEADs, hashes, ECC, signatures); TLS is still maturing.
Database / data access
- Rust: sqlx (async, compile-time-checked SQL), diesel (sync ORM/query-builder), sea-orm (async ORM); drivers tokio-postgres, redis, mongodb; deadpool/bb8 pools.
- Go: stdlib database/sql + jackc/pgx (the PostgreSQL standard), go-sql-driver/mysql, modernc.org/sqlite (pure-Go) or mattn/go-sqlite3 (cgo); sqlc (codegen from SQL), gorm/ent (ORMs), redis/go-redis.
- Zig: pg.zig (native PostgreSQL), zqlite or vrischmann/zig-sqlite (SQLite), zuckdb.zig (DuckDB), myzql (native MySQL/MariaDB), okredis (zero-allocation Redis client); otherwise @cImport C clients. (The network drivers here are among the libraries most affected by the 0.16 std.Io/std.posix change, since they speak sockets directly — check the driver's Zig-version support; the SQLite/DuckDB options, being @cImport wrappers around C engines, are less exposed.)
CLI / configuration
- Rust: clap (the CLI standard, derive-based, 75M+ downloads), argh/lexopt (lightweight); config/figment for layered config; dialoguer/indicatif for interactive UIs and progress bars.
- Go: spf13/cobra + viper (the Kubernetes/Docker-era standard), urfave/cli, alecthomas/kong.
- Zig: zig-clap, zig-cli.
Logging / observability
- Rust: tracing (structured spans — the observability standard) + tracing-subscriber; log/env_logger for simple cases; opentelemetry + tracing-opentelemetry; metrics.
- Go: stdlib log/slog (structured logging, 1.21+) is now the default; uber-go/zap and rs/zerolog for high-performance logging; OpenTelemetry-Go and prometheus/client_golang — Go's observability story is among the most mature anywhere (most of the CNCF stack is Go).
- Zig: stdlib std.log; tracing via ztracy (Tracy) or manual.
Date/time, IDs, randomness, regex, utilities
- Rust: chrono/time (dates), uuid/ulid, rand, regex (the high-quality stdlib-adjacent engine), itertools, bytes (zero-copy buffers), dashmap (concurrent map), parking_lot (faster locks).
- Go: stdlib time, regexp, math/rand/v2; google/uuid, samber/lo (generics helpers), puzpuzpuz/xsync (concurrent maps).
- Zig: stdlib covers time/random/hashing; regex and rich date libraries are community/young.
Testing
- Rust: stdlib #[test] + cargo test; proptest/quickcheck (property testing), criterion (benchmarks), insta (snapshot), mockall (mocks), rstest (fixtures/parameterised).
- Go: stdlib testing (+ fuzzing, benchmarks, coverage); stretchr/testify (assertions/mocks — near-ubiquitous), uber-go/mock (the maintained fork of the archived golang/mock), testcontainers-go (integration).
- Zig: stdlib std.testing (+ leak detection) and the integrated fuzzer; little third-party tooling.
Numerics / scientific / linear algebra
- Rust: ndarray, nalgebra/glam (linear algebra/graphics math), polars (DataFrames), num/num-bigint, statrs.
- Go: gonum (the numerical-computing suite: matrices, stats, optimisation), gomlx.
- Zig: built-in @Vector-based math; zmath (zig-gamedev); scientific stack is thin.
Parsing
- Rust: nom/winnow (parser combinators), pest (PEG), logos (lexers), syn (Rust-token parsing for macros).
- Go: stdlib text/template, go/parser; participle, goyacc for custom grammars.
- Zig: hand-written parsers are idiomatic; comptime aids table generation.
Cloud-native / infrastructure — worth calling out because it shapes the status quo:
- Go is the cloud-native language: Kubernetes, Docker/containerd, etcd, Terraform, Prometheus, and Consul are written in Go, so their client libraries (kubernetes/client-go, aws-sdk-go-v2, cloud SDKs) are first-party and battle-tested. This is Go's single biggest ecosystem moat.
- Rust is strong in the adjacent "fast infrastructure component" niche (proxies, databases, CLI tools): tokio/tower services, plus flagship apps like the deno/rustls/ripgrep lineage.
- Zig has no cloud-native ecosystem; its weight comes from flagship applications (TigerBeetle, Bun, Ghostty) rather than libraries.
⚡ Perf — GPU compute can deliver 10–1000x throughput for data-parallel workloads (ML inference, image processing, physics simulation), depending on workload and hardware
🔐 Safety — Rust's wgpu uses typed pipeline objects and validates shader/binding mismatches at pipeline-creation time, before any dispatch
🧹 DX — Go's GPU story is thin; most teams CGO into CUDA/Vulkan C libraries directly
wgpu — cross-platform GPU compute (WebGPU standard):
wgpu is the primary idiomatic Rust GPU library. It targets Vulkan, Metal, DX12, and
WebGPU from one API. Compute shaders are written in WGSL (or SPIR-V) and dispatched
as typed pipeline objects. The binding model is checked at pipeline creation time.
use wgpu::util::DeviceExt;
async fn gpu_matrix_add(a: &[f32], b: &[f32]) -> Vec<f32> {
let instance = wgpu::Instance::default();
let adapter = instance.request_adapter(&Default::default()).await.unwrap();
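// Signature varies by wgpu release: older versions take a second trace-path
// argument (None, as here); newer ones take only the descriptor
let (device, queue) = adapter.request_device(&Default::default(), None).await.unwrap();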
// Upload input buffers to GPU
let buf_a = device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
label: Some("A"),
contents: bytemuck::cast_slice(a),
usage: wgpu::BufferUsages::STORAGE,
});
let buf_b = device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
label: Some("B"),
contents: bytemuck::cast_slice(b),
usage: wgpu::BufferUsages::STORAGE,
});
let buf_out = device.create_buffer(&wgpu::BufferDescriptor {
size: (a.len() * 4) as u64,
usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_SRC,
mapped_at_creation: false,
label: Some("out"),
});
// Compile WGSL compute shader
let shader = device.create_shader_module(wgpu::ShaderModuleDescriptor {
label: Some("add"),
source: wgpu::ShaderSource::Wgsl(include_str!("add.wgsl").into()),
});
// add.wgsl:
// @group(0) @binding(0) var<storage, read> a: array<f32>;
// @group(0) @binding(1) var<storage, read> b: array<f32>;
// @group(0) @binding(2) var<storage, read_write> out: array<f32>;
// @compute @workgroup_size(64)
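// fn main(@builtin(global_invocation_id) id: vec3<u32>) {
//     // Guard: the last workgroup may extend past the end of the arrays
//     if (id.x < arrayLength(&out)) { out[id.x] = a[id.x] + b[id.x]; }
// }
// Build pipeline, bind group, dispatch ceil(len / 64) workgroups, copy buf_out into a
// MAP_READ staging buffer and read it back (omitted for brevity)
vec![] // placeholder so the sketch type-checks; real code returns the mapped result
}
cust — safe CUDA bindings: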
cust wraps the CUDA driver API. Kernels are typically written in CUDA C/C++ and compiled
to PTX with nvcc; cust handles device selection, memory allocation, and kernel launch from Rust
with safe abstractions over the unsafe FFI (the launch itself remains an unsafe call).
use cust::prelude::*;
static PTX: &str = include_str!(concat!(env!("OUT_DIR"), "/kernel.ptx"));
fn run_cuda_kernel(data: &[f32]) -> CudaResult<Vec<f32>> {
let _ctx = cust::quick_init()?;
let module = Module::from_ptx(PTX, &[])?;
let stream = Stream::new(StreamFlags::NON_BLOCKING, None)?;
let kernel = module.get_function("my_kernel")?;
// Copy host → device
let d_input: DeviceBuffer<f32> = data.as_dbuf()?;
let mut d_out: DeviceBuffer<f32> = DeviceBuffer::zeroed(data.len())?;
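// Launch enough 256-thread blocks to cover every element (ceiling division);
// the kernel must still bounds-check its index against the length argument
let block = 256u32;
let grid = (data.len() as u32).div_ceil(block).max(1);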
unsafe {
launch!(kernel<<<grid, block, 0, stream>>>(
d_input.as_device_ptr(),
d_out.as_device_ptr(),
data.len()
))?;
}
stream.synchronize()?;
Ok(d_out.as_host_vec()?)
}
candle — ML inference on CPU and GPU (Hugging Face):
candle is Hugging Face's Rust ML framework. It runs tensor operations on CPU, CUDA,
and Metal. Pre-trained model weights (safetensors, GGUF) load directly from the
Hugging Face Hub. No Python runtime needed.
use candle_core::{Device, Tensor, DType};
use candle_nn::VarBuilder;
use candle_transformers::models::llama::{Cache, Config, Llama};
// Run Llama inference entirely in Rust (a fragment: `weight_path` and `tokenizer`
// are assumed to be set up by the caller, inside a function that returns a Result)
let device = Device::new_cuda(0)?; // or Device::Cpu, Device::new_metal(0)?
let dtype = DType::BF16;
let config = Config::config_7b_v2(false);
// Memory-mapping is unsafe: the weight file must not change while it is mapped
let vb = unsafe { VarBuilder::from_mmaped_safetensors(&[weight_path], dtype, &device)? };
let model = Llama::load(vb, &config)?;
let tokens = tokenizer.encode("Hello, world", true)?;
let input = Tensor::new(tokens.get_ids(), &device)?.unsqueeze(0)?;
// Recent candle releases thread a KV cache through forward(); older ones differ
let mut cache = Cache::new(true, dtype, &config, &device)?;
let logits = model.forward(&input, 0, &mut cache)?;
// Sample next token from logits...
burn — training and inference framework with pluggable backends:
use burn::backend::{Autodiff, Wgpu};
use burn::nn::Linear;
use burn::prelude::*;
type MyBackend = Autodiff<Wgpu>; // swap to NdArray for CPU-only, Cuda for CUDA
#[derive(Module, Debug)]
struct MLP<B: Backend> {
linear1: Linear<B>,
linear2: Linear<B>,
}
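// Forward pass, loss computation, and backward pass are backend-agnostic
CPU SIMD via std::arch and std::simd:
// Portable SIMD (std::simd) is not yet stable: it needs a nightly toolchain and this feature gate
#![feature(portable_simd)]
use std::simd::prelude::*; // brings f32x8 and the reduce_sum() trait into scope
fn dot_portable(a: &[f32], b: &[f32]) -> f32 {
    // chunks_exact skips a trailing remainder of fewer than 8 elements;
    // handle it separately if inputs are not a multiple of 8 long
    a.chunks_exact(8).zip(b.chunks_exact(8))
        .map(|(av, bv)| (f32x8::from_slice(av) * f32x8::from_slice(bv)).reduce_sum())
        .sum()
}
// Lowers to the widest vector ISA enabled for the target (e.g. AVX2 when that
// target feature is on, NEON on AArch64) and to scalar code otherwise
Go has no native GPU compute API. GPU workloads require CGO into CUDA, OpenCL, or Vulkan C libraries. This works but loses the zero-CGO cross-compilation advantage.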
// CGO to CUDA — requires nvcc, the CUDA toolkit, and a C wrapper.
package main

/*
#cgo LDFLAGS: -lcuda -lcudart
#include "my_kernel_wrapper.h"
*/
import "C" // the cgo preamble must be the comment directly above this import

import "unsafe"

func runCudaKernel(data []float32) ([]float32, error) {
	out := make([]float32, len(data))
	if len(data) == 0 {
		return out, nil // &data[0] would panic on an empty slice
	}
	C.launch_kernel(
		(*C.float)(unsafe.Pointer(&data[0])),
		(*C.float)(unsafe.Pointer(&out[0])),
		C.int(len(data)),
	)
	return out, nil
}
// The CGO boundary costs on the order of 100 ns per call (it varies with Go version and hardware); batch work to amortise the cost
gonum — numerical computing (CPU only):
import (
"gonum.org/v1/gonum/mat"
"gonum.org/v1/gonum/stat"
)
// Dense matrix multiplication — BLAS-backed, CPU only
a := mat.NewDense(3, 3, []float64{1, 2, 3, 4, 5, 6, 7, 8, 9})
b := mat.NewDense(3, 3, []float64{9, 8, 7, 6, 5, 4, 3, 2, 1})
var c mat.Dense
c.Mul(a, b)
// Statistics
xs := []float64{1, 2, 3, 4, 5}
mean := stat.Mean(xs, nil)
std := stat.StdDev(xs, nil)
gomlx — ML framework from Google (CPU/XLA backend):
import "github.com/gomlx/gomlx/graph"
g := graph.NewGraph()
x := g.Parameter("x", shapes.Make(dtypes.Float32, 3, 3))
w := g.Parameter("w", shapes.Make(dtypes.Float32, 3, 3))
y := graph.MatMul(x, w)
// Compiles to XLA; GPU requires XLA GPU backend setup
Zig's GPU story rides on its zero-friction C interop and the zig-gamedev/Mach ecosystems:
- zgpu (zig-gamedev) — a helper layer over Dawn, Google's native WebGPU implementation, cross-compiled with Zig into a single static library (mach-gpu-dawn). This gives Zig the same class of cross-platform GPU-compute substrate Rust gets from wgpu: both expose the WebGPU API, zgpu by wrapping Dawn and wgpu through its own implementation (also available to C as wgpu-native).
- Mach engine's gpu — Mach (the Zig game engine) exposes a WebGPU-class GPU interface built on the same Dawn foundation, used for both graphics and compute.
- CUDA / Vulkan / Metal — via @cImport of the C headers directly, with zero binding overhead and no separate bindings crate. Calling cudaMalloc/kernel launches is a plain C call (contrast Go's cgo tax).
// CUDA directly via @cImport — no bindings library, plain C calls
const cuda = @cImport({
@cInclude("cuda_runtime.h");
});
var d_ptr: ?*anyopaque = null;
_ = cuda.cudaMalloc(&d_ptr, n * @sizeOf(f32)); // direct, zero-overhead
defer _ = cuda.cudaFree(d_ptr);
For CPU SIMD, Zig needs no library at all: the built-in @Vector (§9) is portable across
SSE/AVX/AVX-512/NEON in ordinary safe code (Rust needs unsafe std::arch or the
not-yet-stable std::simd; Go has only the experimental simd/archsimd).
zmath (zig-gamedev) provides a SIMD-accelerated game-math library on top of @Vector. There
is even an early vllm-zig exploring LLM serving with hand-written SIMD matmul kernels.
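For comparison with the portable-SIMD options discussed above, the following is a minimal sketch of a dot product on stable Rust that relies on the optimizer rather than explicit vector types. It keeps eight independent accumulators so the compiler is free to map them onto vector lanes, and it handles the leftover elements that do not fill a group of eight. Whether vector instructions are actually emitted depends on the target and compiler flags, so treat this as an illustration of the pattern, not a performance guarantee; `dot_chunked` is a hypothetical name.

```rust
/// Dot product in groups of 8 using only stable Rust.
/// The fixed-size accumulator array gives the optimizer independent lanes
/// that it may (or may not) turn into SIMD instructions.
fn dot_chunked(a: &[f32], b: &[f32]) -> f32 {
    const LANES: usize = 8;
    // Only the common prefix of the two slices is multiplied.
    let n = a.len().min(b.len());
    let (a, b) = (&a[..n], &b[..n]);

    let mut acc = [0.0f32; LANES];
    let mut a_chunks = a.chunks_exact(LANES);
    let mut b_chunks = b.chunks_exact(LANES);
    for (ca, cb) in a_chunks.by_ref().zip(b_chunks.by_ref()) {
        for i in 0..LANES {
            acc[i] += ca[i] * cb[i];
        }
    }

    // Scalar tail: the elements that did not fill a whole group of LANES.
    let tail: f32 = a_chunks
        .remainder()
        .iter()
        .zip(b_chunks.remainder())
        .map(|(x, y)| x * y)
        .sum();

    acc.iter().sum::<f32>() + tail
}
```

Note that summing per lane changes the order of floating-point additions relative to a plain left-to-right loop, so results can differ in the last bits for inputs that are not exactly representable.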
⚡ Perf — real-time audio demands wait-free, allocation-free callbacks; Rust's SPSC (rtrb) and zero-allocation iterator pipelines are a natural fit
🧹 DX — Go's pion ecosystem is excellent for WebRTC; Rust's symphonia handles decoding without FFmpeg
🔐 Safety — Rust's type system keeps sample formats (f32 vs i16) from being mixed silently, and sample rates too when they are encoded as types
The Rust audio ecosystem is unusually deep because the language's real-time guarantees (no GC, no hidden allocation, wait-free SPSC) line up with the constraints of audio callbacks. The layers practitioners actually ship:
- cpal — low-level cross-platform device I/O (ALSA/WASAPI/CoreAudio/JACK/WASM). The callback runs on the audio thread and must be real-time safe. Why it matters: as long as the callback never allocates, and because the compiler enforces Send/Sync on anything it touches, the audio thread's worst-case latency is bounded by the OS, not by a GC — the property a software synth or live-effects rig needs to run at a 64–128 frame buffer without dropouts.
- symphonia — pure-Rust, 100%-safe decoding for AAC/MP3/FLAC/Vorbis/WAV/ALAC and more; no FFmpeg, no CGO. The default decoder behind most of the stack. Why it matters: no C dependency means trivial cross-compilation and one static binary (an IO/deployment win — no shipping libavcodec), and safe decoding removes the memory-corruption CVE class that plagues C codec libraries — relevant when decoding untrusted files (a media server, a browser).
- kira — high-level audio engine for games and apps (mixing, tweens, clocks, spatial audio); built on cpal + symphonia. This is the "just play and sequence sound" layer most app developers want — real-world home in indie games and interactive apps.
- rodio — simpler high-level playback (also cpal + symphonia).
- fundsp and dasp — DSP: fundsp is a composable graph-notation synthesis/effects library (supports no_std); dasp provides low-level sample/frame primitives. Why it matters: no_std + allocation-free graphs mean the same DSP code runs on a desktop plugin and on a microcontroller-based effects pedal (an embedded/CPU use case Go cannot reach).
- nih-plug — the de-facto Rust framework for shipping VST3 and CLAP audio plugins; used for real commercial and open-source plugins. Real-world use: this is the layer that makes Rust a credible choice for pro-audio plugin vendors, not just app developers.
- creek — real-time-safe streaming of audio to/from disk; basedrop — RT-safe memory reclamation; rtrb — wait-free SPSC ring buffer for the audio↔worker handoff. Why it matters: rtrb is wait-free, so the producer (a decode/worker thread) and the consumer (the audio callback) hand off samples with no lock and no syscall — the callback can never block waiting on a mutex the OS hasn't scheduled, which is the exact failure the Go CRing/blocking-interface workarounds exist to avoid. creek lets you stream a multi-gigabyte audio file from disk in a DAW without loading it into RAM (a memory/IO win) while keeping the callback allocation-free.
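To show what "wait-free SPSC" means mechanically, here is a minimal sketch of a single-producer, single-consumer ring buffer built only from standard-library atomics. It is not rtrb's implementation: the names (`ring`, `Producer`, `Consumer`) are hypothetical, samples are stored as `f32` bit patterns in `AtomicU32` slots so the sketch needs no `unsafe`, and the capacity is rounded up to a power of two. Both `push` and `pop` finish in a bounded number of steps and never lock or allocate.

```rust
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};
use std::sync::Arc;

struct Shared {
    slots: Box<[AtomicU32]>, // f32 samples stored as raw bits
    mask: usize,             // capacity - 1 (capacity is a power of two)
    head: AtomicUsize,       // next index to read; written only by the consumer
    tail: AtomicUsize,       // next index to write; written only by the producer
}

struct Producer {
    shared: Arc<Shared>,
}

struct Consumer {
    shared: Arc<Shared>,
}

/// All allocation happens here, before the real-time thread starts.
fn ring(capacity: usize) -> (Producer, Consumer) {
    let cap = capacity.max(1).next_power_of_two();
    let shared = Arc::new(Shared {
        slots: (0..cap).map(|_| AtomicU32::new(0)).collect(),
        mask: cap - 1,
        head: AtomicUsize::new(0),
        tail: AtomicUsize::new(0),
    });
    (Producer { shared: Arc::clone(&shared) }, Consumer { shared })
}

impl Producer {
    /// Never blocks: hands the sample back as Err when the buffer is full.
    fn push(&mut self, sample: f32) -> Result<(), f32> {
        let s = &*self.shared;
        let tail = s.tail.load(Ordering::Relaxed);
        let head = s.head.load(Ordering::Acquire);
        if tail.wrapping_sub(head) == s.slots.len() {
            return Err(sample);
        }
        s.slots[tail & s.mask].store(sample.to_bits(), Ordering::Relaxed);
        // Release publishes the slot write to the consumer's Acquire load.
        s.tail.store(tail.wrapping_add(1), Ordering::Release);
        Ok(())
    }
}

impl Consumer {
    /// Never blocks: returns None when the buffer is empty.
    fn pop(&mut self) -> Option<f32> {
        let s = &*self.shared;
        let head = s.head.load(Ordering::Relaxed);
        let tail = s.tail.load(Ordering::Acquire);
        if head == tail {
            return None;
        }
        let bits = s.slots[head & s.mask].load(Ordering::Relaxed);
        // Release tells the producer this slot may be overwritten.
        s.head.store(head.wrapping_add(1), Ordering::Release);
        Some(f32::from_bits(bits))
    }
}
```

Because `push` and `pop` take `&mut self` and each half is a separate value, the single-producer/single-consumer rule is enforced by ownership: each half can be moved to exactly one thread.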
This combination (no GC, RT-safe memory reclamation, wait-free SPSC, and ownership rules that
keep non-Send state out of the callback) is the concrete reason audio practitioners increasingly
choose Rust. The borrow checker does not itself detect allocation inside a callback; avoiding it
remains a matter of discipline and tooling. To reach the same hard-real-time guarantees in Go you
would move the callback's hot path out of the GC's reach yourself — a C-allocated ring buffer and
a C-level callback, as the go-portaudio binding does — which is why Go's audio story centers on
playback (oto, beep) and CGO bindings to miniaudio (malgo) rather than native DSP or plugin development.
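One of the safety points made for this section is that sample formats and rates should not be mixed silently. A minimal sketch of how that can be encoded in types follows; `Buffer`, `Hz44100`, and `Hz48000` are hypothetical names for this illustration, not types from cpal, dasp, or any other crate mentioned here.

```rust
use std::marker::PhantomData;

/// Zero-sized marker types standing in for sample rates.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Hz44100;
#[derive(Debug, Clone, Copy, PartialEq)]
struct Hz48000;

/// A mono buffer tagged with its sample type `S` and rate marker `R`.
#[derive(Debug, Clone, PartialEq)]
struct Buffer<S, R> {
    samples: Vec<S>,
    _rate: PhantomData<R>,
}

impl<S, R> Buffer<S, R> {
    fn new(samples: Vec<S>) -> Self {
        Buffer { samples, _rate: PhantomData }
    }
}

impl<R> Buffer<i16, R> {
    /// Explicit, named conversion: i16 PCM to f32 in [-1.0, 1.0).
    fn to_f32(&self) -> Buffer<f32, R> {
        Buffer::new(self.samples.iter().map(|&s| f32::from(s) / 32768.0).collect())
    }
}

impl<R> Buffer<f32, R> {
    /// Mixing type-checks only when both the format (f32) and the rate `R` agree.
    fn mix(&self, other: &Buffer<f32, R>) -> Buffer<f32, R> {
        Buffer::new(
            self.samples
                .iter()
                .zip(&other.samples)
                .map(|(a, b)| a + b)
                .collect(),
        )
    }
}
```

With these definitions, passing a `Buffer<i16, Hz48000>` or a `Buffer<f32, Hz44100>` to `mix` on a `Buffer<f32, Hz48000>` is a compile error, so the conversion or resampling step has to appear explicitly in the code.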
cpal — cross-platform audio I/O (microphone and speakers):
cpal is the lowest-level audio I/O library. It opens streams to audio hardware on
ALSA (Linux), WASAPI (Windows), CoreAudio (macOS/iOS), JACK, and WASM. The callback
is called from the audio thread — it must be real-time safe (no allocation, no locking).
use cpal::traits::{DeviceTrait, HostTrait, StreamTrait};
fn record_audio() -> (cpal::Stream, rtrb::Consumer<f32>) {
    let host = cpal::default_host();
    let device = host.default_input_device().expect("no input device");
    // Assumes the default input format is f32; check config.sample_format() in real code
    let config = device.default_input_config().unwrap();
    // Audio thread callback — must NEVER allocate or block
    // Use rtrb (SPSC ring buffer) to send samples to a processing thread
    let (mut producer, consumer) = rtrb::RingBuffer::<f32>::new(48_000);
    let stream = device.build_input_stream(
        &config.into(),
        move |data: &[f32], _| {
            for &sample in data {
                producer.push(sample).ok(); // wait-free; drops the sample if the buffer is full
            }
        },
        |err| eprintln!("stream error: {err}"),
        None,
    ).unwrap();
    stream.play().unwrap();
    // Return both halves: dropping the stream stops capture, and the consumer
    // is the only way for a worker thread to read the samples
    (stream, consumer)
}
symphonia — pure Rust audio decoding (no FFmpeg, no CGO):
Symphonia decodes MP3, AAC, FLAC, OGG/Vorbis, WAV, AIFF, and more in pure Rust. No system library dependency — it links into your binary statically.
use symphonia::core::{
audio::SampleBuffer,
codecs::DecoderOptions,
formats::FormatOptions,
io::MediaSourceStream,
meta::MetadataOptions,
probe::Hint,
};
fn decode_audio_file(path: &str) -> Vec<f32> {
let src = std::fs::File::open(path).unwrap();
let mss = MediaSourceStream::new(Box::new(src), Default::default());
let probed = symphonia::default::get_probe()
.format(&Hint::new(), mss, &FormatOptions::default(), &MetadataOptions::default())
.unwrap();
let mut format = probed.format;
let track = format.default_track().unwrap();
let track_id = track.id;
let mut decoder = symphonia::default::get_codecs()
    .make(&track.codec_params, &DecoderOptions::default())
    .unwrap();
let mut samples = Vec::new();
loop {
    let packet = match format.next_packet() {
        Ok(p) => p,
        // Symphonia reports end of stream as an I/O error (unexpected EOF)
        Err(symphonia::core::errors::Error::IoError(_)) => break,
        Err(e) => panic!("{e}"),
    };
    // Containers can interleave several tracks; decode only the selected one
    if packet.track_id() != track_id {
        continue;
    }
    if let Ok(decoded) = decoder.decode(&packet) {
        let spec = *decoded.spec();
        let mut buf = SampleBuffer::<f32>::new(decoded.capacity() as u64, spec);
        buf.copy_interleaved_ref(decoded);
        samples.extend_from_slice(buf.samples());
    }
}
samples
}
rodio — high-level audio playback built on cpal:
use rodio::{Decoder, OutputStream, Sink};
use std::fs::File;
use std::io::BufReader;
// rodio 0.20-style API; later releases reworked output-stream setup
let (_stream, stream_handle) = OutputStream::try_default().unwrap();
let sink = Sink::try_new(&stream_handle).unwrap();
let file = BufReader::new(File::open("music.mp3").unwrap());
let source = Decoder::new(file).unwrap();
sink.append(source); // plays asynchronously
sink.sleep_until_end(); // block until playback finishes
ffmpeg-next — FFI bindings to FFmpeg (video encode/decode/transcode):
For full video pipeline work (H.264, H.265, AV1, VP9, remuxing, transcoding), ffmpeg-next
provides safe Rust wrappers over FFmpeg's C API. It requires a system FFmpeg installation
or a vendored build.
use ffmpeg_next as ffmpeg;
ffmpeg::init().unwrap();
// Open video file
let mut ictx = ffmpeg::format::input("input.mp4").unwrap();
let input = ictx.streams().best(ffmpeg::media::Type::Video).unwrap();
let idx = input.index();
let ctx = ffmpeg::codec::context::Context::from_parameters(input.parameters()).unwrap();
let mut decoder = ctx.decoder().video().unwrap();
// Decode frames
for (stream, packet) in ictx.packets() {
if stream.index() == idx {
decoder.send_packet(&packet).unwrap();
let mut frame = ffmpeg::frame::Video::empty();
while decoder.receive_frame(&mut frame).is_ok() {
// frame.data(0) = raw pixel data; frame.width(), frame.height()
println!("frame {}x{}", frame.width(), frame.height());
}
}
}
gstreamer (gst-rs) — full media pipeline framework:
use gstreamer::prelude::*;
gstreamer::init().unwrap();
// Build a playback pipeline from a URI description string
let pipeline = gstreamer::parse::launch(
"uridecodebin uri=file:///video.mp4 ! videoconvert ! autovideosink"
).unwrap();
pipeline.set_state(gstreamer::State::Playing).unwrap();
// Or build a custom pipeline programmatically
let src = gstreamer::ElementFactory::make("filesrc").build().unwrap();
let demux = gstreamer::ElementFactory::make("qtdemux").build().unwrap();
let decoder = gstreamer::ElementFactory::make("avdec_h264").build().unwrap();
let convert = gstreamer::ElementFactory::make("videoconvert").build().unwrap();
let sink = gstreamer::ElementFactory::make("appsink").build().unwrap();
malgo (miniaudio bindings) — cross-platform audio I/O:
import (
	"time"

	"github.com/gen2brain/malgo"
)

func recordAudio() error {
	ctx, err := malgo.InitContext(nil, malgo.ContextConfig{}, nil)
	if err != nil {
		return err
	}
	defer func() { _ = ctx.Uninit(); ctx.Free() }()
	deviceConfig := malgo.DefaultDeviceConfig(malgo.Capture)
	deviceConfig.Capture.Format = malgo.FormatF32
	deviceConfig.Capture.Channels = 1
	deviceConfig.SampleRate = 44100
	device, err := malgo.InitDevice(ctx.Context, deviceConfig, malgo.DeviceCallbacks{
		Data: func(out, in []byte, frameCount uint32) {
			// in contains interleaved float32 samples as raw bytes
			// Sending to a channel from here is not real-time safe; use it carefully
		},
	})
	if err != nil {
		return err
	}
	defer device.Uninit()
	if err := device.Start(); err != nil {
		return err
	}
	time.Sleep(5 * time.Second) // record for five seconds
	return device.Stop()
}
beep — higher-level audio playback:
Note on the library's status: the original faiface/beep is archived and no longer
maintained; active development moved to gopxl/beep (the import paths below). Both are
built on oto for playback and expose a Streamer interface (an io.Reader for audio
samples) — a clean design, but one that inherits the runtime characteristics discussed next.
import (
	"os"
	"time"
	"github.com/gopxl/beep"
	"github.com/gopxl/beep/mp3"
	"github.com/gopxl/beep/speaker"
)
func playMP3(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	stream, format, err := mp3.Decode(f) // closing stream also closes f
	if err != nil {
		f.Close()
		return err
	}
	defer stream.Close()
	// Buffer 1/10 s of audio: larger buffers hide jitter at the cost of latency
	if err := speaker.Init(format.SampleRate, format.SampleRate.N(time.Second/10)); err != nil {
		return err
	}
	done := make(chan struct{})
	speaker.Play(beep.Seq(stream, beep.Callback(func() { close(done) })))
	<-done
	return nil
}
Technical background: why Go audio is solid for playback but fights you for low-latency real-time — GC pauses, the dual-scheduler problem, and memory growth.
Go's audio libraries
(oto, malgo, beep/gopxl) are reliable for playback and decoding, where a buffer of
tens of milliseconds hides timing jitter, but the language's two runtime mechanisms — the
garbage collector and the goroutine scheduler — actively work against hard real-time audio
(small buffers, glitch-free callbacks at high sample rates). The evidence is in the libraries'
own designs and issue trackers:
- GC stop-the-world pauses cause audible distortion at high sample rates. An audio callback
must complete within the buffer period (at 48 kHz with a 128-frame buffer, 128 / 48,000 ≈ 2.7 ms); a GC
stop-the-world pause landing inside that window starves the callback and produces a click or
dropout. This is concrete enough that the drgolem/go-portaudio binding ships a C-allocated SPSC
ring buffer (CRing) whose callback, by its own documentation, "never enters the Go runtime, making
it immune to GC stop-the-world pauses" that "cause audio distortion at high sample rates." The fix
is telling: to get reliable audio you move the hot path out of Go entirely, into C memory the
collector never scans.
  - CPU/latency benefit of the workaround: keeping the callback in C means zero GC scanning of the audio buffer and no STW interference, so worst-case callback latency is bounded by the OS audio thread alone — the difference between "occasional clicks under load" and glitch-free output with small buffers.
- The dual-scheduler problem starves callbacks under load. Go multiplexes goroutines onto a
fixed set of OS threads (GOMAXPROCS, §5). As a PortAudio maintainer described it: because the
Go runtime ensures only N threads run user code at once, if N goroutines are already running
when the audio callback fires, the callback waits for a timeslice — directly causing
underruns. The documented workaround is to use PortAudio's blocking interface (which keeps
the callback at the C level and avoids invoking Go's scheduler at all).
  - Real-world consequence: on a busy server or game also doing decode/render work, the audio goroutine competes with everything else for a P, so tail-latency spikes show up as crackle exactly when the system is under load.
- Real-time audio wants a thread priority Go won't give it. OS audio stacks run their
callback on a dedicated real-time thread — AAudio uses SCHED_FIFO, Apple CoreAudio uses a
real-time thread class — to minimize scheduling jitter and allow small buffers. Go's runtime owns
thread scheduling and does not expose SCHED_FIFO-style priorities for goroutines, so a pure-Go
callback can't get the priority the platform's own audio engine assumes.
- Allocation in the audio path triggers the GC, and is easy to commit by accident. The
long-standing advice (going back to golang-dev discussions in 2013) is "do not allocate in the
audio main loop" — any allocation can trigger a GC assist or a future collection. In Go this is
easy to violate unintentionally: a []byte conversion, an interface boxing, a closure capture, or a
channel send can each allocate. beep's own issue tracker shows the failure modes — choppy audio on
Linux attributed to timing (issue #85, a gones emulator) and a runtime out-of-memory when the
library is loaded as a CGO .so into another runtime (issue #51). The standard diagnostic is
GODEBUG=gctrace=1, which prints each GC's pause and heap sizes so you can correlate audio glitches
with collections.
  - Memory-growth angle: oto players each hold an internal buffer between the io.Reader and the device (data flows io.Reader → internal buffer → device), so spawning many players, or feeding them faster than playback drains, grows resident memory; there is no backpressure beyond the reader's own pace, and a leaked/never-closed streamer keeps its decode state and buffers alive. The mitigation is bounding the number of players and reusing buffers, but Go gives you no compile-time guarantee against the leak the way ownership would.
The net engineering reality: Go audio is a good choice for media playback, decoding,
soundboards, and especially WebRTC (where pion is best-in-class), because those tolerate
tens of milliseconds of buffering. It is a poor fit for low-latency DSP, software synthesizers,
or pro-audio plugins, where the GC, the scheduler, and the lack of real-time thread priority
combine to make worst-case latency unpredictable — which is precisely why the serious Go audio
bindings push the hot path into C. This is the inverse of the Rust and Zig stories below, where
the real-time path is the native one.
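The deadline arithmetic behind that argument is simple enough to write down. The sketch below computes the buffer period from a frame count and sample rate and checks whether a given pause, added to the callback's own work, would overrun it. The function names and the example figures in the usage note are illustrative assumptions, not measurements of any runtime.

```rust
use std::time::Duration;

/// Time available to fill one buffer before the device underruns:
/// frames / sample_rate seconds.
fn buffer_period(frames: u32, sample_rate_hz: u32) -> Duration {
    assert!(sample_rate_hz > 0, "sample rate must be non-zero");
    Duration::from_secs_f64(f64::from(frames) / f64::from(sample_rate_hz))
}

/// True when the callback's own work plus an interruption of length `pause`
/// exceeds the buffer period, i.e. the callback misses its deadline.
fn misses_deadline(
    frames: u32,
    sample_rate_hz: u32,
    callback_work: Duration,
    pause: Duration,
) -> bool {
    callback_work + pause > buffer_period(frames, sample_rate_hz)
}
```

For a 128-frame buffer at 48 kHz the period is about 2.67 ms, so a hypothetical 1 ms of callback work plus a 2 ms interruption overruns it, while the same interruption disappears inside a 100 ms playback buffer (4,800 frames at 48 kHz). That is the quantitative reason buffered playback tolerates pauses that small-buffer real-time audio does not.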
ffmpeg-go — fluent wrapper that builds and runs ffmpeg command lines (needs the ffmpeg binary installed; no CGO):
import ffmpeg "github.com/u2takey/ffmpeg-go"
// Transcode video to H.264 MP4
err := ffmpeg.Input("input.mov").
Output("output.mp4", ffmpeg.KwArgs{
"c:v": "libx264",
"crf": "23",
"c:a": "aac",
}).
OverWriteOutput().
Run()
pion/webrtc — pure Go WebRTC (no CGO):
Pion is the most mature pure-Go media stack: WebRTC, RTP/RTCP, SRTP, DTLS, ICE, SCTP. Widely used for video conferencing, live streaming, and real-time data channels.
import "github.com/pion/webrtc/v4"
// Create a WebRTC peer connection
pc, _ := webrtc.NewPeerConnection(webrtc.Configuration{
ICEServers: []webrtc.ICEServer{{URLs: []string{"stun:stun.l.google.com:19302"}}},
})
// Add a video track
track, _ := webrtc.NewTrackLocalStaticRTP(
webrtc.RTPCodecCapability{MimeType: webrtc.MimeTypeH264},
"video", "stream",
)
pc.AddTrack(track)
// Handle incoming tracks
pc.OnTrack(func(track *webrtc.TrackRemote, receiver *webrtc.RTPReceiver) {
for {
pkt, _, err := track.ReadRTP()
if err != nil { return }
_ = pkt // decode H264 RTP packets
}
})
Zig's audio story leans on the zig-gamedev ecosystem and its frictionless C interop:
zaudio(zig-gamedev) — a fully-featured audio library that wraps miniaudio, the same single-file C library Go reaches throughmalgo. It covers device capture/playback, decoding (WAV/FLAC/MP3), a node-graph engine, spatialization, and effects. Because Zig compiles the C directly, there is no cgo-style call tax and no separate build step — the miniaudio source is built into your binary.zxaudio2(zig-gamedev) — a helper over Windows XAudio2 for low-latency Windows audio.- Raw backends via
@cImport— ALSA, PulseAudio/PipeWire, CoreAudio, JACK, and WASAPI headers can be imported directly when you want to talk to the hardware API without a wrapper. - DSP — there is no
fundsp/dasp-class native DSP framework yet; the idiom is to write DSP kernels by hand using the built-in@Vector(§9), which is well-suited to the tight, allocation-free inner loops audio demands. The explicit-allocator model (§4) also helps: you give the audio callback aFixedBufferAllocator(or none at all) so it provably never touches the heap — the same real-time-safety property Rust gets from discipline +rtrb, but enforced by not handing the callback a general allocator rather than by a borrow check.
// zaudio: device playback via the miniaudio engine (C built directly into the binary)
const za = @import("zaudio");
za.init(allocator);
defer za.deinit();
const engine = try za.Engine.create(null);
defer engine.destroy();
try engine.startSound("kick.wav", null, null);
Rather than survey database libraries generically, this subsection dissects one flagship open-source system per language and the code-level architecture and optimizations that make it fast. Each was chosen because it is the showpiece of what its language enables: DataFusion (Rust) for vectorized analytical query processing, nats-server (Go) for high-throughput messaging, and TigerBeetle (Zig) for deterministic OLTP. The point is not the products but the techniques — and how each leans on its language's strengths.
DataFusion is an embeddable SQL/DataFrame query engine using Apache Arrow as its in-memory model. It is the substrate under many Rust data systems (InfluxDB 3.0, several commercial engines) and the SIGMOD 2024 paper shows it matching DuckDB on benchmarks while remaining a modular library.
Columnar, Arrow-native data flow. Data moves between operators as Arrow RecordBatches —
column-oriented chunks defaulting to 8192 rows. Columnar layout is what makes vectorization
possible: a filter or arithmetic kernel runs a tight loop over a contiguous primitive array
(&[i32]), which the compiler auto-vectorizes to SIMD, and Arrow's validity bitmaps handle nulls
without branching per row. Because Arrow is a language-agnostic memory standard, batches cross
FFI boundaries (to Python, C++) with zero copy and zero deserialization.
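To make the "validity bitmap without per-row branching" idea concrete, here is a small sketch of a columnar filter. It is written in Go only to keep this guide's added examples in one language; the Go compiler does not auto-vectorize, so it illustrates the data layout rather than SIMD throughput. `int32Column` is a hypothetical stand-in for an Arrow primitive array, and it packs validity into `uint64` words where Arrow itself uses a byte-addressed bitmap.

```go
package main

import "fmt"

// int32Column is a simplified stand-in for an Arrow primitive array:
// a contiguous values slice plus a validity bitmap (bit i set = row i is non-null).
type int32Column struct {
	values []int32
	valid  []uint64
}

// greaterThan evaluates `col > threshold` over the whole batch and returns
// a selection bitmap. Null rows never match.
func greaterThan(c int32Column, threshold int32) []uint64 {
	sel := make([]uint64, len(c.valid))
	// Pass 1: compare every value, ignoring nulls entirely.
	for i, v := range c.values {
		if v > threshold {
			sel[i/64] |= 1 << (uint(i) % 64)
		}
	}
	// Pass 2: apply validity one 64-row word at a time instead of per row.
	for w := range sel {
		sel[w] &= c.valid[w]
	}
	return sel
}

func main() {
	col := int32Column{
		values: []int32{5, 12, 7, 30},
		valid:  []uint64{0b1011}, // row 2 is null
	}
	sel := greaterThan(col, 6)
	fmt.Printf("%04b\n", sel[0]) // prints 1010: rows 1 and 3 selected
}
```

Row 2 holds the value 7, which passes the comparison, but the AND with the validity word removes it because the row is null.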
Volcano pull + exchange execution, built on Rust async. Each physical operator implements
ExecutionPlan, producing a SendableRecordBatchStream — a pinned Stream of RecordBatches.
Execution is pull-based: calling .next().await on the root drives the tree, each operator
pulling batches from its children, computing, and yielding the next batch incrementally. This is
the classic Volcano model, but at batch granularity (not row-at-a-time) and expressed as Rust
async streams driven by Tokio:
// Simplified sketch of a filter operator's stream; not DataFusion's actual source
impl Stream for FilterExec {
    type Item = Result<RecordBatch>;
    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        // Pull a batch from the child stream; ready! returns Poll::Pending if none is available yet
        match ready!(self.input.poll_next_unpin(cx)) {
            Some(Ok(batch)) => {
                let mask = self.predicate.evaluate(&batch)?.into_array(batch.num_rows())?;
                Poll::Ready(Some(filter_record_batch(&batch, &mask))) // vectorized filter kernel
            }
            other => Poll::Ready(other),
        }
    }
}
The DataFusion team explicitly tried a push-based morsel-driven scheduler (the design DuckDB
uses) and found no significant benefit over Tokio's work-stealing runtime, so it stayed with
pull + exchange. Parallelism comes from RepartitionExec — a Volcano "exchange" operator
that fans batches across partitions (round-robin or hash) so downstream operators run on multiple
Tokio tasks across cores. I/O and CPU are deliberately kept on separate thread pools so a slow
scan never starves compute.
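The exchange idea (fan rows out by key hash so each downstream worker owns a disjoint set of keys) can be sketched with channels. This is an illustration in Go, not DataFusion code: `row`, `partitionOf`, `repartition`, and `collect` are hypothetical names, and it moves single rows where RepartitionExec moves whole RecordBatches.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

type row struct {
	key   string
	value int
}

// partitionOf maps a key to one of n partitions by hash.
func partitionOf(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(n))
}

// repartition fans rows from one input out to n output channels, so all
// rows with the same key land on the same downstream consumer.
func repartition(in <-chan row, n int) []<-chan row {
	outs := make([]chan row, n)
	readOnly := make([]<-chan row, n)
	for i := range outs {
		outs[i] = make(chan row, 16)
		readOnly[i] = outs[i]
	}
	go func() {
		for r := range in {
			outs[partitionOf(r.key, n)] <- r
		}
		for _, o := range outs {
			close(o)
		}
	}()
	return readOnly
}

// collect drains every partition concurrently, one goroutine per partition.
func collect(parts []<-chan row) [][]row {
	got := make([][]row, len(parts))
	var wg sync.WaitGroup
	for i, p := range parts {
		wg.Add(1)
		go func(i int, p <-chan row) {
			defer wg.Done()
			for r := range p {
				got[i] = append(got[i], r)
			}
		}(i, p)
	}
	wg.Wait()
	return got
}

func main() {
	in := make(chan row)
	go func() {
		for _, k := range []string{"a", "b", "a", "c", "b", "a"} {
			in <- row{key: k, value: 1}
		}
		close(in)
	}()
	for i, part := range collect(repartition(in, 3)) {
		fmt.Println("partition", i, "rows:", len(part))
	}
}
```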
The optimizations that matter at code level:
- Two-phase partitioned hash aggregation — each partition aggregates locally, then results merge, avoiding a global lock on the hash table; group keys and accumulator state live in columnar buffers.
- Vectorized expression evaluation — expressions compile to a tree of kernels operating on whole arrays; filter, take, and comparison kernels are SIMD-friendly loops over Arrow buffers.
- A rich optimizer — logical rewrites (projection/filter/limit pushdown, common-subexpression elimination, subquery flattening) and physical rewrites (removing unnecessary sorts, choosing Hash vs Merge join, maximizing partitioning).
- Spillable, memory-budgeted operators — sorts and joins track memory and spill to disk under a budget rather than OOM-ing.
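The first item in the list above, two-phase partitioned aggregation, reduces to "aggregate privately, then merge". A minimal sketch, assuming a simple sum per key and using Go maps where DataFusion keeps group keys and accumulator state in columnar buffers; `twoPhaseSum` is a hypothetical name and `workers` is assumed to be at least 1.

```go
package main

import (
	"fmt"
	"sync"
)

type row struct {
	key   string
	value int
}

// twoPhaseSum computes sum(value) grouped by key.
// Phase 1: each worker aggregates its own chunk into a private map (no locks).
// Phase 2: the partial maps are merged into the final result.
func twoPhaseSum(rows []row, workers int) map[string]int {
	partials := make([]map[string]int, workers)
	chunk := (len(rows) + workers - 1) / workers
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo, hi := w*chunk, (w+1)*chunk
		if lo > len(rows) {
			lo = len(rows)
		}
		if hi > len(rows) {
			hi = len(rows)
		}
		wg.Add(1)
		go func(w int, part []row) {
			defer wg.Done()
			local := make(map[string]int)
			for _, r := range part {
				local[r.key] += r.value
			}
			partials[w] = local // each worker writes only its own slot
		}(w, rows[lo:hi])
	}
	wg.Wait()

	final := make(map[string]int)
	for _, p := range partials {
		for k, v := range p {
			final[k] += v
		}
	}
	return final
}

func main() {
	rows := []row{{"a", 1}, {"b", 2}, {"a", 3}, {"c", 4}, {"b", 5}}
	fmt.Println(twoPhaseSum(rows, 3))
}
```

No shared hash table is ever written concurrently, which is what removes the global lock.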
Why Rust specifically: the borrow checker makes sharing immutable Arrow buffers across many
parallel Tokio tasks safe without a GC; async/Stream gives backpressure-aware streaming for
free; and monomorphized kernels over &[T] hit C-level SIMD throughput. DataFusion is the
clearest demonstration of "fearless concurrency + zero-cost abstractions" applied to data.
nats-server is the core of the NATS messaging system: a pub/sub broker handling millions of messages per second with a tiny (~15 MB) binary, plus JetStream for persistence. Its design is a tour of how to write low-allocation, highly-concurrent network software in Go.
Goroutine-per-connection with dedicated read and write loops. Each client connection is
served by a readLoop goroutine that reads into a dynamically sized buffer (starts at 512 B,
grows to 64 KiB under load, and can shrink to a 64 B floor after short reads, adapting buffer
size to traffic to balance memory against syscall count). A separate writeLoop goroutine sleeps on
a sync.Cond and wakes when outbound data is queued. Because of this read/write split, a slow
consumer's writes never block the reader, and the scheduler (§5) parks blocked goroutines so
thousands of idle connections do not each tie up an OS thread.
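The adaptive buffer policy can be expressed as a pure function of the current buffer size and the size of the last read. The three sizes come from the text above; the grow and shrink triggers (double on a full read, halve when less than half was used) are assumptions made for this sketch and are not copied from the nats-server source.

```go
package main

import "fmt"

const (
	startBufSize = 512       // initial read buffer
	minBufSize   = 64        // floor after repeated short reads
	maxBufSize   = 64 * 1024 // ceiling under load
)

// nextBufSize returns the buffer size to use for the next read, given the
// current size and how many bytes the last read returned.
func nextBufSize(cur, n int) int {
	switch {
	case n == cur && cur < maxBufSize:
		// The read filled the buffer: more data is probably waiting, so grow.
		next := cur * 2
		if next > maxBufSize {
			next = maxBufSize
		}
		return next
	case n < cur/2 && cur > minBufSize:
		// The read used less than half the buffer: give memory back.
		next := cur / 2
		if next < minBufSize {
			next = minBufSize
		}
		return next
	}
	return cur
}

func main() {
	size := startBufSize
	for _, n := range []int{512, 1024, 2048, 100, 100, 100} {
		size = nextBufSize(size, n)
		fmt.Println("read", n, "bytes, next buffer:", size)
	}
}
```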
Zero-allocation protocol parser. The NATS wire protocol is parsed by a hand-written byte
state machine that operates directly on the read buffer's []byte without allocating — subject,
reply, and payload are slices into the buffer, not copies. Combined with the adaptive buffer,
the hot path from socket bytes to a routed message does essentially no heap allocation.
// Sketch of the zero-alloc parse loop: state machine over the raw buffer, no allocation
func (c *client) parse(buf []byte) error {
    for i := 0; i < len(buf); i++ {
        b := buf[i]
        switch c.state {
        case OP_START:
            switch b {
            case 'P':
                c.state = OP_P // PUB / PING / PONG
            case 'S':
                c.state = OP_S // SUB
                // ...
            }
        case MSG_PAYLOAD:
            // c.pa.subject etc. are sub-slices of buf — no copy, no alloc
            if c.processInboundMsg(buf[c.as:i]) { /* ... */ }
        }
    }
    return nil // sketch only: the real parser also reports protocol errors
}
Subject routing: a trie with a hot-subject cache. Subscriptions are matched by the
Sublist — a trie keyed on subject tokens (orders.created, with */> wildcards). To avoid
re-walking the trie for hot subjects, a 1024-entry result cache (map[string]*SublistResult,
drained to 512 when full) memoizes "which subscribers match this exact subject." For JetStream's
literal-subject indexing there is a second structure — an Adaptive Radix Trie (SubjectTree,
path-compressed) — that minimizes memory for millions of subjects. A notable micro-optimization:
subject tokenization uses a stack-allocated [32]string array, so subjects up to 32 tokens
tokenize with no heap allocation (only deeper subjects escape to the heap).
Scatter-gather writes and a buffer pool. flushOutbound coalesces queued messages and writes
them with net.Buffers.WriteTo, which lowers to the writev syscall — one syscall sends many
buffers (scatter-gather I/O), avoiding both copies and per-message syscalls. Outbound buffers come
from a three-tier sized pool (nbPoolGet) to keep the write path allocation-free.
JetStream persistence via Raft. Durable streams use a NATS-optimized Raft quorum for replication with linearizable writes; the file store indexes messages by subject using the ART above. So the same server is a zero-alloc in-memory router and a replicated log, both in Go.
Why Go specifically: goroutines + the netpoller make goroutine-per-connection with separate
read/write loops simple and scalable; []byte slicing enables a zero-copy parser; and sync.Pool/
custom pools plus net.Buffers/writev claw back the allocations and syscalls that a naive Go
server would pay. nats-server shows how far careful Go can be pushed toward C-like efficiency while
staying idiomatic and readable.
TigerBeetle is a financial transactions database (double-entry accounting) built for mission-critical safety and throughput — up to ~8,189 transactions per batched request where a general DB does one transaction per several queries. Its architecture is the strongest argument for what Zig's control buys you.
Static memory allocation — zero malloc after startup. TigerBeetle calculates its maximum
memory needs at startup, allocates one large contiguous block, and carves all buffer pools
from it. After initialization there is no runtime allocation at all. Benefits that fall out:
no allocation failures under load, no fragmentation, no GC, and fully predictable memory and
latency. This is only ergonomic because Zig makes allocation explicit (§4) — every component is
handed its memory up front, and the absence of hidden allocation is enforceable.
// TigerBeetle pattern: size everything up front, then never allocate on the hot path
const Forest = struct {
grid: *Grid,
transfers: TransfersGroove, // an LSM tree (groove) — pre-sized at init
accounts: AccountsGroove,
// every buffer below is a slice into the one startup allocation
pub fn init(allocator: std.mem.Allocator, options: Options) !Forest {
// allocate the maximum the cluster config permits — once
// runtime hot paths receive slices into this; they never call the allocator
}
};
LSM-Forest storage engine, written from scratch. Rather than embed RocksDB, TigerBeetle implements its own LSM-Forest: ~20+ LSM trees ("grooves"), one per object type plus secondary index trees, all sharing a common block grid. Transfers are stored in a tree sorted by a unique timestamp for fast lookup; auxiliary index trees accelerate other queries. Writing its own LSM lets it pipeline compaction, control read/write amplification, and keep snapshots that survive crashes — none of which an off-the-shelf engine would expose.
Co-designed consensus and storage (VSR). Global consensus uses Viewstamped Replication (chosen over Raft partly because view changes are deterministic), co-designed with the local storage engine so the cluster can perform protocol-aware recovery — if a disk sector corrupts on one replica, it self-heals from the cluster rather than re-replicating an entire tree. The state machine is deterministic: every replica applies the same ordered batch of transfers and arrives at identical state, which reduces replication to synchronizing an append-only, hash-chained log.
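An append-only, hash-chained log is simple to sketch: each entry stores the checksum of its predecessor, so a damaged or reordered entry is detectable by walking the chain. The example below is a generic illustration in Go. SHA-256 is a stand-in chosen because it is in the standard library, and `entry`, `appendEntry`, and `verify` are hypothetical names that do not reflect TigerBeetle's on-disk format.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type entry struct {
	parent  [32]byte // checksum of the previous entry (zero for the first)
	payload []byte
	sum     [32]byte // checksum over parent || payload
}

func checksum(parent [32]byte, payload []byte) [32]byte {
	h := sha256.New()
	h.Write(parent[:])
	h.Write(payload)
	var sum [32]byte
	copy(sum[:], h.Sum(nil))
	return sum
}

// appendEntry links the new payload to the current tail of the log.
func appendEntry(log []entry, payload []byte) []entry {
	var parent [32]byte
	if len(log) > 0 {
		parent = log[len(log)-1].sum
	}
	return append(log, entry{
		parent:  parent,
		payload: payload,
		sum:     checksum(parent, payload),
	})
}

// verify walks the chain and returns the index of the first entry that
// fails its checksum or does not link to its predecessor, or -1 if intact.
func verify(log []entry) int {
	var parent [32]byte
	for i, e := range log {
		if e.parent != parent || e.sum != checksum(e.parent, e.payload) {
			return i
		}
		parent = e.sum
	}
	return -1
}

func main() {
	var log []entry
	for _, p := range []string{"transfer 1", "transfer 2", "transfer 3"} {
		log = appendEntry(log, []byte(p))
	}
	fmt.Println("intact log, first bad index:", verify(log))
	log[1].payload = []byte("tampered")
	fmt.Println("after corruption, first bad index:", verify(log))
}
```

Two replicas that agree on the checksum of the latest entry agree on the whole prefix, which is why synchronizing such a log is cheap.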
Determinism as the master principle, and cache-aware layout. Everything is deterministic — same input, same logical result via the same physical path — and the hot path is built for the CPU: cache-line-aligned, fixed-size data structures, zero-copy and zero-deserialization (data on disk matches data in memory), Direct I/O bypassing the page cache, and io_uring for batched async I/O. Control flow is bounded: no recursion, bounded loops, and a minimum of two assertions per function kept enabled in production.
Deterministic Simulation Testing (DST). Because the whole system is deterministic, TigerBeetle
runs in a simulator that injects network, clock, and storage faults (corrupt/misdirected reads
and writes) and replays failures from a seed. This is how a small team validates a consensus +
storage engine to a level Jepsen independently confirmed — a methodology that essentially requires
the determinism Zig's explicit, allocation-free style makes practical. The technique is now being
generalized into reusable form: marionette is a community DST library built on a std.Io
implementation, letting any Zig program inject faults and replay failures from a seed — the same
discipline, packaged.
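The core property DST relies on is that a seed fully determines every injected fault, so a failing run can be replayed exactly. A toy sketch in Go, under the assumption that all randomness flows through one seeded generator; `simulate` and its fault rates are hypothetical and far simpler than a real simulator, which also models clocks and storage.

```go
package main

import (
	"fmt"
	"math/rand"
)

// simulate pushes msgs through a faulty network whose behavior is driven
// entirely by seed: each message is dropped, duplicated, or delivered once.
// It returns the trace of events so that runs can be compared or replayed.
func simulate(seed int64, msgs []string, dropRate, dupRate float64) []string {
	rng := rand.New(rand.NewSource(seed)) // the only source of randomness
	var trace []string
	for _, m := range msgs {
		roll := rng.Float64()
		switch {
		case roll < dropRate:
			trace = append(trace, "drop "+m)
		case roll < dropRate+dupRate:
			trace = append(trace, "deliver "+m, "deliver "+m)
		default:
			trace = append(trace, "deliver "+m)
		}
	}
	return trace
}

func main() {
	msgs := []string{"prepare", "prepare_ok", "commit"}
	first := simulate(42, msgs, 0.3, 0.1)
	replay := simulate(42, msgs, 0.3, 0.1)
	fmt.Println(first)
	fmt.Println(replay) // identical to the line above: same seed, same faults
}
```

Anything that reads the wall clock, iterates a Go map, or races goroutines would break this property, which is why the text ties DST to a deterministic codebase.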
Why Zig specifically: explicit allocators make static allocation natural; no hidden control
flow or GC makes determinism achievable; comptime sizes data structures and assertions; and
@cImport-free Direct I/O + io_uring access sits right at the syscall layer. TigerBeetle has
zero dependencies except the Zig toolchain — a single self-contained binary.
Different domains, but the same engineering moves recur: batch work (DataFusion's 8192-row
RecordBatches, NATS's coalesced writev, TigerBeetle's 8k-transfer requests); avoid
allocation on the hot path (Arrow buffer reuse, NATS's zero-alloc parser and pools,
TigerBeetle's static allocation); exploit memory layout (columnar SIMD, cache-line alignment,
slice-not-copy); and lean on the language's defining strength — Rust's fearless parallel
sharing of immutable buffers, Go's goroutine-per-connection concurrency, Zig's explicit-allocation
determinism. Each system is the clearest evidence of what its language was built to do.
Comparison (the broader embedded-DB ecosystem): beyond these flagships, for embedded
analytics Rust has datafusion, polars, and duckdb-rs (in-process vectorized SQL), which Go
and Zig lack natively, plus compile-time-verified SQL via sqlx. For operational simplicity,
Go's pure-Go modernc/sqlite and bbolt cross-compile with no C toolchain. Zig embeds C engines
(SQLite, DuckDB) via @cImport with no call overhead and is the language teams reach for to
build a new engine (TigerBeetle). In short: Rust to query columnar data in-process, Go to
ship a service with an embedded store and no C, Zig to build a storage engine or embed a C one.
🧹 DX — Rust and Go have official SDKs for Anthropic/OpenAI; Zig has none (DIY over HTTP); Go has the ergonomic edge for simple request/response agents ⚡ Perf — Rust: local model inference via candle/llama.cpp FFI without Python overhead; Go: network-bound LLM calls — raw throughput matters less than latency 🔐 Safety — MCP tool handlers that process user input need careful validation; in Rust the tool's input schema is derived from the argument type, so a handler only runs on input that deserialized into that type
rig — agent framework with tool use (Anthropic, OpenAI, Cohere, Gemini):
rig is the idiomatic Rust agent framework. It provides a unified client abstraction
over multiple LLM providers, RAG pipeline building blocks, and tool-use orchestration.
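use rig::{completion::{Prompt, ToolDefinition}, providers::anthropic, tool::Tool};
use serde::Deserialize;
use serde_json::json;
// Arguments the model supplies when it calls the tool
#[derive(Deserialize)]
struct WebSearchArgs {
    query: String,
    max_results: Option<u32>,
}
// Sketch against rig's Tool trait; exact paths and signatures vary between rig-core releases
struct WebSearch;
impl Tool for WebSearch {
    const NAME: &'static str = "web_search";
    type Error = std::convert::Infallible; // replace with a real error type
    type Args = WebSearchArgs;
    type Output = String;
    async fn definition(&self, _prompt: String) -> ToolDefinition {
        ToolDefinition {
            name: Self::NAME.to_string(),
            description: "Search the web for current information".to_string(),
            parameters: json!({
                "type": "object",
                "properties": { "query": { "type": "string" }, "max_results": { "type": "integer" } },
                "required": ["query"]
            }),
        }
    }
    async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> {
        // Call your search API here
        Ok(format!("Top {} results for '{}': ...", args.max_results.unwrap_or(5), args.query))
    }
}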
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let client = anthropic::ClientBuilder::default().build()?;
// Build an agent with tools and system prompt
let agent = client
.agent(anthropic::CLAUDE_SONNET_4_5)
.preamble("You are a research assistant. Use web search to find current information.")
.max_tokens(4096)
.tool(WebSearch)
.build();
// Multi-turn conversation with automatic tool dispatch
let response = agent.prompt("What are the latest developments in Rust async?").await?;
println!("{response}");
Ok(())
}
async-openai — typed OpenAI API client:
use async_openai::{
Client,
types::{
ChatCompletionRequestSystemMessageArgs,
ChatCompletionRequestUserMessageArgs,
CreateChatCompletionRequestArgs,
},
};
async fn chat_example() -> anyhow::Result<()> {
let client = Client::new(); // reads OPENAI_API_KEY from env
let request = CreateChatCompletionRequestArgs::default()
.model("gpt-4o")
.messages([
ChatCompletionRequestSystemMessageArgs::default()
.content("You are a concise assistant.")
.build()?.into(),
ChatCompletionRequestUserMessageArgs::default()
.content("Explain Rust lifetimes in one paragraph.")
.build()?.into(),
])
.build()?;
let response = client.chat().create(request).await?;
let text = response.choices[0].message.content.as_deref().unwrap_or("");
println!("{text}");
Ok(())
}
rmcp — the official Rust MCP SDK (modelcontextprotocol/rust-sdk):
MCP lets language models call tools and read resources over a standard JSON-RPC protocol.
rmcp is the official, Anthropic-maintained Rust SDK (~v0.16 in mid-2026). It is async,
built on Tokio, and uses a pluggable transport layer (stdio for Claude Desktop, Streamable
HTTP/SSE for web). Tools are defined with the #[tool] / #[tool_router] macros on an
impl block; the input struct's schema is derived from schemars.
use rmcp::{
ServerHandler, ServiceExt,
handler::server::{router::tool::ToolRouter, tool::Parameters},
model::{ServerInfo, ServerCapabilities, CallToolResult, Content},
tool, tool_router, tool_handler,
transport::stdio,
};
use schemars::JsonSchema;
use serde::Deserialize;
#[derive(Debug, Deserialize, JsonSchema)]
struct QueryArgs {
/// SQL SELECT query to execute
sql: String,
/// Maximum rows to return
#[serde(default = "default_limit")]
limit: u32,
}
fn default_limit() -> u32 { 100 }
#[derive(Clone)]
struct DataWarehouse { pool: sqlx::PgPool, tool_router: ToolRouter<Self> }
#[tool_router]
impl DataWarehouse {
#[tool(description = "Run a read-only SQL query against the data warehouse")]
async fn query_database(
&self,
Parameters(QueryArgs { sql, limit }): Parameters<QueryArgs>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let rows = run_readonly_query(&self.pool, &sql, limit).await
.map_err(|e| rmcp::ErrorData::internal_error(e.to_string(), None))?;
Ok(CallToolResult::success(vec![Content::text(
serde_json::to_string(&rows).unwrap_or_default(),
)]))
}
}
#[tool_handler]
impl ServerHandler for DataWarehouse {
fn get_info(&self) -> ServerInfo {
ServerInfo {
capabilities: ServerCapabilities::builder().enable_tools().build(),
instructions: Some("Query the analytics warehouse with SQL.".into()),
..Default::default()
}
}
}
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let pool = sqlx::PgPool::connect(&std::env::var("DATABASE_URL")?).await?;
let server = DataWarehouse { pool, tool_router: DataWarehouse::tool_router() };
// Serve over stdio (Claude Desktop). For web, use streamable HTTP / SSE transports.
let running = server.serve(stdio()).await?;
running.waiting().await?;
Ok(())
}
Local LLM inference with llama.cpp via FFI:
use llama_cpp_rs::{LlamaModel, LlamaParams, SessionParams};
fn run_local_llm(prompt: &str) -> anyhow::Result<String> {
let model = LlamaModel::load_from_file(
"models/llama-3.2-3b-q4_k_m.gguf",
LlamaParams { n_gpu_layers: 35, ..Default::default() },
)?;
let mut session = model.create_session(SessionParams {
n_ctx: 2048,
..Default::default()
})?;
session.advance_context(prompt)?;
let mut output = String::new();
while let Some(token) = session.next_token() {
output.push_str(&token);
}
Ok(output)
}
// ⚡ Perf: llama.cpp uses AVX2/NEON SIMD + GPU offload; no Python or network needed
langchaingo — LangChain port for Go:
import (
    "context"
    "fmt"

    "github.com/tmc/langchaingo/agents"
    "github.com/tmc/langchaingo/chains"
    "github.com/tmc/langchaingo/llms/anthropic"
    "github.com/tmc/langchaingo/tools"
)
func langchainAgent(ctx context.Context) error {
    llm, err := anthropic.New(
        anthropic.WithModel("claude-sonnet-4-5"),
    )
    if err != nil {
        return err
    }
    // Built-in tools: calculator, Wikipedia, DuckDuckGo search, SQL DB, etc.
    agentTools := []tools.Tool{
        tools.Calculator{},
        // tools.NewSerpAPITool(os.Getenv("SERP_API_KEY")),
    }
    executor, err := agents.Initialize(
        llm,
        agentTools,
        agents.ZeroShotReactDescription,
        agents.WithMaxIterations(5),
    )
    if err != nil {
        return err
    }
    response, err := chains.Run(ctx, executor,
        "What is 15% of the current population of Canada?")
    if err != nil {
        return err // do not print a partial response on failure
    }
    fmt.Println(response)
    return nil
}
Official Anthropic Go SDK:
import anthropic "github.com/anthropics/anthropic-sdk-go"
func claudeExample(ctx context.Context) error {
client := anthropic.NewClient() // reads ANTHROPIC_API_KEY
message, err := client.Messages.New(ctx, anthropic.MessageNewParams{
Model: anthropic.F(anthropic.ModelClaudeSonnet4_5),
MaxTokens: anthropic.F(int64(1024)),
Messages: anthropic.F([]anthropic.MessageParam{
anthropic.UserMessageParam(anthropic.NewTextBlock("What is the capital of France?")),
}),
})
if err != nil { return err }
fmt.Println(message.Content[0].AsUnion().(anthropic.TextBlock).Text)
return nil
}
Streaming with tool use:
func streamWithTools(ctx context.Context) error {
client := anthropic.NewClient()
tools := []anthropic.ToolParam{{
Name: anthropic.F("get_weather"),
Description: anthropic.F("Get the current weather for a city"),
InputSchema: anthropic.F[any](map[string]any{
"type": "object",
"properties": map[string]any{
"city": map[string]any{"type": "string", "description": "City name"},
},
"required": []string{"city"},
}),
}}
stream := client.Messages.NewStreaming(ctx, anthropic.MessageNewParams{
Model: anthropic.F(anthropic.ModelClaudeSonnet4_5),
MaxTokens: anthropic.F(int64(1024)),
Tools: anthropic.F(tools),
Messages: anthropic.F([]anthropic.MessageParam{
anthropic.UserMessageParam(anthropic.NewTextBlock("What is the weather in Paris?")),
}),
})
for stream.Next() {
event := stream.Current()
switch delta := event.Delta.(type) {
case anthropic.ContentBlockDeltaEventDelta:
if delta.Type == anthropic.ContentBlockDeltaEventDeltaTypeTextDelta {
fmt.Print(delta.Text)
}
}
}
return stream.Err()
}
Official Go MCP SDK (modelcontextprotocol/go-sdk) — MCP server in Go:
The official Go SDK reached a stable v1.0.0 in late 2025 (maintained with Google) and
superseded the community mark3labs/mcp-go as the recommended choice — though mcp-go
remains viable and inspired the official design. Tools are plain typed functions; the
SDK derives the JSON Schema from the input struct's jsonschema tags.
import (
"context"
"log"
"github.com/modelcontextprotocol/go-sdk/mcp"
)
type QueryInput struct {
SQL string `json:"sql" jsonschema:"read-only SQL SELECT statement to execute"`
Limit int `json:"limit" jsonschema:"maximum rows to return"`
}
// A tool is a typed function: (ctx, request, typed args) -> (result, output, error)
func RunQuery(ctx context.Context, req *mcp.CallToolRequest, in QueryInput) (
*mcp.CallToolResult, any, error,
) {
if in.Limit == 0 { in.Limit = 100 }
rows, err := executeQuery(ctx, in.SQL, in.Limit)
if err != nil {
return &mcp.CallToolResult{
IsError: true,
Content: []mcp.Content{&mcp.TextContent{Text: err.Error()}},
}, nil, nil
}
return &mcp.CallToolResult{
Content: []mcp.Content{&mcp.TextContent{Text: formatRows(rows)}},
}, nil, nil
}
func main() {
server := mcp.NewServer(
&mcp.Implementation{Name: "data-warehouse", Version: "v1.0.0"},
nil,
)
// Schema is generated from QueryInput's jsonschema tags
mcp.AddTool(server, &mcp.Tool{
Name: "run_query",
Description: "Execute a read-only SQL query against the warehouse",
}, RunQuery)
// Run over stdio (Claude Desktop) until the client disconnects.
// For web, use mcp.StreamableHTTPHandler.
if err := server.Run(context.Background(), &mcp.StdioTransport{}); err != nil {
log.Fatal(err)
}
}
Rust's edge is narrowly about what the MCP server fronts: for local inference, ort (the
ONNX Runtime crate) is the production workhorse — benchmarked around 3–5× faster than Python
ONNX with 60–80% less memory — while candle (Hugging Face) and mistral.rs (built on
candle) run transformer models on CPU/CUDA/Metal with no Python runtime, and llama.cpp FFI
gives quantized local inference with SIMD/GPU offload. For classical ML, linfa is the
scikit-learn-style toolkit; tch-rs binds LibTorch when you must load PyTorch models
directly. If your MCP server wraps a latency-critical local-inference or vector-search backend
(e.g. qdrant-client), Rust's zero-overhead FFI and async model help. If it wraps a hosted LLM
API and some I/O, Go ships faster with less ceremony.
Zig is the least-served of the three here, and honesty matters: there is no official Anthropic or OpenAI SDK for Zig, and no mature first-party MCP SDK. What exists:
- LLM API access — you call the HTTP/JSON APIs directly with std.http.Client (or a community HTTP client like zig-fetch), constructing requests and parsing responses with std.json. There's no typed-tool agent framework equivalent to Rust's rig or Go's langchaingo; you write the request/stream/tool-dispatch loop yourself.
- MCP — community MCP server/client implementations exist on GitHub, but none is official or at the maturity of rmcp/go-sdk. For production you'd either contribute to one or implement the JSON-RPC protocol over stdio directly (straightforward, but DIY).
- Local inference — Zig's real strength: @cImport binds llama.cpp (itself heavily optimized C/C++) with zero overhead, and there is genuine interest in Zig for ML kernels (e.g. the experimental vllm-zig writing RoPE/GQA/matmul with @Vector SIMD). A more complete effort is zml, a high-performance ML stack for Zig; ggml-zig/zgml reimplement the ggml tensor library; and onnxruntime.zig wraps ONNX Runtime. Zig is also used as the build/cross-compile toolchain for ML C++ projects via zig cc.
// Hosted LLM call: construct the request and parse the response yourself with std.json
var client = std.http.Client{ .allocator = allocator };
defer client.deinit();
// ... POST to api.anthropic.com/v1/messages with std.json-encoded body,
// read the response, std.json.parseFromSlice into your Response struct ...
| Domain | Rust | Go | Zig 0.16 | Edge |
|---|---|---|---|---|
| GPU — cross-platform compute | wgpu (WebGPU) | CGO to Vulkan/CUDA | zgpu/Mach (WebGPU/Dawn) | 🦀 Rust ≈ ⚡ Zig |
| GPU — CUDA | cust (safe wrappers) | CGO to CUDA C | @cImport CUDA (zero-overhead) | 🦀 Rust (safe API) |
| GPU — ML inference | candle, mistral.rs, ort, burn | onnxruntime_go, gomlx | nascent (vllm-zig exp.) | 🦀 Rust (largest ecosystem) |
| CPU SIMD | std::arch (stable), std::simd (stabilising) | experimental simd/archsimd | ✅ built-in @Vector (portable, safe) | ⚡ Zig (ergonomics) |
| Audio I/O | cpal (RT-safe by contract) | malgo, oto (miniaudio, CGO) | zaudio (miniaudio, no cgo tax) | 🦀 Rust ≈ ⚡ Zig |
| Audio decode / engine / DSP | symphonia, kira, fundsp/dasp, nih-plug (VST3/CLAP) | beep, oto | zaudio decode; DSP via @Vector (no framework) | 🦀 Rust (deepest stack) |
| Video pipelines | ffmpeg-next, gstreamer | ffmpeg-go | @cImport FFmpeg | 🐹 Go (ergonomic API) |
| WebRTC | str0m, webrtc-rs | pion/webrtc (battle-tested) | bind C library | 🐹 Go |
| SQLite | rusqlite (+bundled) | modernc/sqlite (pure Go) | zqlite/@cImport (engine built-in) | 🐹 Go (pure-Go default) |
| SQL compile-time verified | sqlx (build-time check) | database/sql (runtime) | — | 🦀 Rust |
| OLAP / Parquet / columnar | datafusion, polars, duckdb-rs | duckdb-go (CGO) | @cImport DuckDB | 🦀 Rust (only native engine) |
| Full-text search | tantivy (Lucene-class) | bleve (pure Go) | — | 🦀 Rust |
| Embedded KV store | redb (active) | bbolt, badger | LMDB/RocksDB via @cImport | 🦀 Rust ≈ 🐹 Go |
| Build a new DB engine | strong (control + safety) | weak (GC) | ✅ TigerBeetle proves it | ⚡ Zig ≈ 🦀 Rust |
| LLM API (hosted) | rig, async-openai, official SDK | official anthropic-sdk-go, openai-go, langchaingo | DIY over std.http (no SDK) | 🐹 Go (ergonomics) |
| Local LLM inference | ort, candle, mistral.rs, llama.cpp FFI | CGO to llama.cpp/onnxruntime | @cImport llama.cpp; @Vector kernels | 🦀 Rust |
| Classical ML | linfa, tch-rs, ndarray | gonum, gomlx | — | 🦀 Rust |
| MCP server | rmcp (official) | modelcontextprotocol/go-sdk (official) | community/DIY (no official) | 🦀 Rust ≈ 🐹 Go |
| AI agents | rig (typed tools) | langchaingo | DIY | ≈ Rust/Go tie |
Reading the table: Rust has the broadest native coverage — the only in-process analytics
engines (DataFusion/Polars), the largest ML stack (ort/candle/mistral.rs), and the deepest
audio/DSP libraries. Go leads where a specific deployed library or operational simplicity
dominates: WebRTC (pion), pure-Go SQLite, hosted-LLM SDKs, and FFmpeg wrapping. Zig has
built-in portable SIMD (@Vector) and embeds any C engine via @cImport with no call overhead,
and is used to build databases and inference kernels; its client-library ecosystem is the
youngest and it has no official AI SDKs.
🧹 DX — Go: readable in an afternoon; one mental model; fast feedback loop 🧹 DX — Rust: steep initial curve; pays back in fewer production bugs, better diagnostics 🔍 Debug — Rust compiler error messages with --explain are the best diagnostics in any compiled language
The learning curve is real and well-documented. The borrow checker requires a mental model shift that takes most developers two to six weeks to internalise. Traits, lifetimes, async/await, macros, and the type system each require dedicated study. Teams typically budget 1–3 months to reach production velocity on their first Rust project.
What you get on the other side:
- Compiler error messages with precise source spans, did-you-mean suggestions, and rustc --explain E0502, which opens a full essay on the specific error type
- rust-analyzer LSP with real-time borrow-check feedback, type inference display, and macro expansion in the editor
- A toolchain where the hardest classes of bugs (memory safety, data races, unhandled errors) are caught before the code runs
- Editions that let the language improve over time without breaking your existing codebase
The Go specification is short enough to read in an afternoon. The language has ~25 keywords. There are no lifetimes, no borrow checker, no trait bounds, no const generics, no editions. A developer familiar with C, Java, or Python can be writing idiomatic Go in a day.
gofmt eliminates all style decisions. The built-in toolchain handles testing, profiling,
race detection, fuzzing, and code generation with zero configuration. The standard library
covers the majority of server-side use cases. A new team can be productive in Go faster
than in any other compiled systems language.
The tradeoff: Go's productivity advantage is front-loaded. Rust's is back-loaded. Go lets you ship fast today; Rust's compile-time guarantees reduce the debugging and production incident work over the lifetime of a system.
Zig's learning curve is conceptually the smallest of the three — one mechanism (comptime)
instead of traits + lifetimes + generics + macros, no borrow checker to fight, no async
coloring, a tiny keyword set. A C programmer feels at home immediately, and the "no hidden
control flow / no hidden allocations" principle makes code read literally — what you see is
what executes. zig fmt ends style debates like gofmt, and the built-in test/build tooling
needs zero configuration.
The friction is different and real: (1) pre-1.0 instability — the language and stdlib
break across releases (Writergate in 0.15, the std.Io rewrite in 0.16), so you pin versions
and budget migration time; this is the dominant practical cost. (2) Manual memory management
— no borrow checker means the discipline Rust's compiler enforces is on you, caught at runtime
by allocator checks rather than at compile time; productive, but you carry the cognitive load.
(3) Young ecosystem and docs — fewer libraries, thinner tutorials, smaller Stack Overflow
corpus than Go or Rust. (4) comptime error messages can be cryptic when metaprogramming
goes deep.
Where Zig lands: faster to start than Rust (no borrow checker, one core concept), with more control than Go (explicit allocators, no GC, C-level interop). But Go is still faster to reach production for typical services (mature ecosystem, stable language), and Rust pays back its steeper curve with compile-time guarantees Zig does not provide. Zig applies to systems code that wants C-level control and cross-compilation with a smaller language surface, where a moving pre-1.0 target is acceptable — the profile of its current adopters (TigerBeetle, Bun, Ghostty).
Three-way, current as of June 2026 (Rust 1.95 · Go 1.26 · Zig 0.16). "✅/
| Concern | Rust 1.95 | Go 1.26 | Zig 0.16 |
|---|---|---|---|
| Abstraction model | Traits + generics + lifetimes + macros | Interfaces + generics + reflection | One mechanism: comptime |
| Sum types / ADTs | ✅ enums | ❌ (iota + struct) | ✅ tagged unions |
| Exhaustive match | ✅ match | ❌ switch | ✅ switch |
| Generics | ✅ monomorphized, rich bounds | | ✅ comptime fn → type |
| Const generics | ✅ | ❌ | ✅ (comptime params) |
| Generic methods | ✅ (always) | | ✅ via comptime |
| Type-level programming | ✅ typestate, PhantomData, const generics, GATs | ❌ type sets only; value-level checks | @compileError (imperative) |
| Polymorphism (open) | ✅ dyn Trait | ✅ interfaces (always dynamic) | |
| Reflection | ❌ (proc-macro at comptime) | ✅ runtime reflect | ✅ comptime @typeInfo |
| Memory-layout control | ✅ #[repr] + auto-pack | ❌ (manual, lint-assisted) | ✅ extern/packed/align |
| Niche / null opt | ✅ Option<&T> 1 word | ❌ pointer + box | ✅ ?*T 1 word |
| Iteration | lazy zero-cost Iterator trait (~70 adapters) | range-over-func (1.23+), eager | next()→?T convention, no protocol |
| map/filter/reduce chains | ✅ std, fused to one alloc-free loop (+itertools) | ❌ not in std (by design); samber/lo (eager) / lo/it (lazy) | ❌ none in std; explicit loop (or comptime) |
| Operator overloading | ✅ via traits (Add, Index, Deref…) | ❌ (named methods only) | ❌ (named methods only) |
| Slices / arrays | [T;N] vs &[T] (fat ptr) vs Vec | [N]T (value) vs []T (ptr,len,cap) + append | [N]T vs []T (ptr,len) + ArrayList |
| Slice aliasing safety | ✅ borrow checker forbids aliased mutation | ❌ shared backing array, append footguns | |
| Pointer model | &T/&mut T/Box/Rc/Arc/RefCell/raw | single *T, GC-managed, no arithmetic | *T/[*]T/[]T/[*:0]T/[*c]T/?*T |
| Pointer shapes in type | partial (NonNull, niches) | ❌ | ✅ richest (one/many/sentinel/optional) |
| Address-stability / pin | ✅ Pin<P> | n/a (GC) | manual |
| Interior mutability | ✅ Cell/RefCell/Mutex | implicit (unsafe re: races) | manual |
| Shared ownership | ✅ Rc/Arc + Weak | GC | manual refcount if needed |
| Weak references | ✅ Weak<T> (breaks Rc/Arc cycles) | weak pkg (1.24) for caches; GC handles cycles | ❌ none built-in (manual ?*T) |
| Date/time — instants/durations | ✅ std::time (Instant/SystemTime/Duration) | ✅ time (stdlib) | ✅ std.time (timestamp/Timer) |
| Date/time — civil calendar/tz | ❌ crate: chrono+chrono-tz or jiff | ✅ full time stdlib (zones, parse, format) | ❌ community (zig-datetime, tz-aware zeit/zdt) |
| Error handling | Result + ? (must-handle) | if err != nil (_ legal) | error unions + try (must-handle) |
| Error cleanup | RAII Drop | defer | defer + errdefer |
| Error traces | via anyhow | manual %w | ✅ built-in error-return-trace |
| Memory | Ownership + borrow checker | GC (Green Tea, 1.26) | Explicit allocators, no GC |
| Safety guarantee | ✅ compile-time proof | ✅ GC (runtime) | |
| Deterministic free | ✅ | ❌ | ✅ |
| Arena / region free | ✅ (allocator API) | ❌ (sync.Pool only) | ✅ idiomatic ArenaAllocator |
| GC pauses | none | sub-ms | none |
| Concurrency | async/await, Tokio, Rayon, crossbeam | goroutines + channels + select | std.Io (no coloring), Threaded/event |
| Data-race prevention | ✅ Send/Sync compile-time | ❌ runtime -race only | ❌ none |
| Function coloring | async infects | none (goroutines) | ✅ solved (Io is a param) |
| Cancellation | per-runtime token | context.Context | ✅ first-class error.Canceled |
| Bare-metal concurrency | ✅ embassy (no_std) | ❌ | ✅ (freestanding) |
| I/O abstraction | Read/Write + async AsyncRead (Tokio) | io.Reader/io.Writer (stdlib, universal) | std.Io reader/writer (0.16) |
| Zero-copy sendfile/splice | explicit (nix, tokio-splice) | ✅ transparent via io.Copy (defeatable by wrappers) | explicit (syscall / @cImport) |
| io_uring | ✅ richest (tokio-uring, glommio, monoio) | | ✅ std.os.linux.IoUring, 0.16 backend |
| io_uring buffer safety | ✅ ownership-enforced (owned-buffer APIs) | n/a | |
| Intra-userspace zero-copy | bytes::Bytes (refcounted) | slice re-slicing | slice re-slicing |
| Event streams / SSE | Stream, tokio-stream, tungstenite | channels, bufio, Flusher, gorilla/ws | community (httpz, zap, websocket.zig) |
| Metaprogramming | macros + proc-macros | go generate + reflect | comptime (subsumes all) |
| Compile-time eval | ✅ const fn | ❌ | ✅ full comptime |
| Compile-time codegen | ✅ derive (type-safe) | | ✅ comptime introspection |
| Low-level / FFI | unsafe + asm! + std::arch | unsafe + .s + cgo | normal mode + @Vector + @cImport |
| C interop cost | ~ns (zero overhead) | ~50–100 ns (cgo, −30% in 1.26) | ~ns (compiler is a C compiler) |
| Portable SIMD | std::simd (stabilising) | ❌ experimental (1.26) | ✅ built-in @Vector |
| UB detection | ✅ miri | ❌ | |
| Build / toolchain | Cargo (features, profiles, build.rs) | go build (minimal, fast) | build.zig (build script is Zig) |
| Compile speed | ❌ slow (LLVM) | ✅ fastest | |
| Cross-compilation | | ✅ if CGO_ENABLED=0 | ✅ bundles libc + cross-compiles C (zig cc) |
| Formatter | rustfmt (configurable) | ✅ gofmt | ✅ zig fmt |
| Linters | ✅ Clippy (800+) | | |
| Optimize-with-safety mode | ❌ | ❌ | ✅ ReleaseSafe |
| Testing | miri + Criterion + cargo test | ✅ go test (race/fuzz/cover/pprof) | ✅ test blocks + leak-check + fuzzer |
| Leak detection | borrow checker (compile) | GC | ✅ allocator (runtime, test) |
| Stdlib | minimal (crates for HTTP/JSON/SQL) | batteries-included | lean-but-broad, churning pre-1.0 |
| HTTP server | axum/actix (crate), hyper base | ✅ net/http (stdlib), fasthttp | std.http / httpz, zap |
| TLS | rustls (memory-safe, pluggable crypto) | ✅ crypto/tls (stdlib, pure-Go) | std.crypto primitives; C OpenSSL for full TLS |
| SQL driver | sqlx (compile-time checked), diesel, sea-orm | database/sql+pgx, sqlc | zqlite/@cImport C clients |
| Connection pool | sqlx built-in, deadpool/bb8 | ✅ database/sql built-in pool | ❌ hand-rolled |
| mmap | memmap2 | x/exp/mmap, edsrzf/mmap-go | std.Io.File.MemoryMap / std.posix.system.mmap |
| Memory-growth control | by design: pools, bytes, arenas, jemalloc | GOGC/GOMEMLIMIT + pools + sync.Pool | per-request arenas, static up-front alloc |
| Deployment | static, monomorphized, larger | runtime required, GC | ✅ tiny static, no runtime |
| Embedded / kernel | ✅ no_std + embassy | ❌ | ✅ freestanding |
| Binary size | larger (monomorphization) | medium (GC shapes) | ✅ smallest (ReleaseSmall) |
| CPU-bound perf | ✅ C-tier | adequate (GC) | ✅ C-tier |
| Security | compile-time mem-safety; build.rs risk | GC safety; no build-exec; govulncheck | runtime-checked safety; small surface |
| Supply-chain scanner | | ✅ govulncheck (reachability) | ❌ none yet |
| Maturity | stable, 2015 edition lineage | very stable, 1.0 since 2012 | |
| Onboarding | 1–3 months (borrow checker) | days | weeks (small concepts, manual memory) |
| Flagship production users | Linux kernel, Cloudflare, AWS, Discord | Kubernetes, Docker, Go itself | TigerBeetle, Bun, Ghostty |
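
Several Rust-column entries in the table above (sum types, exhaustive match, niche optimisation, must-handle errors) can be checked directly. The following is a minimal sketch under stated assumptions: `Conn`, `describe`, and `parse_port` are hypothetical names introduced for illustration, not taken from any library.

```rust
use std::mem::size_of;
use std::num::ParseIntError;

// Sum type: a connection is in exactly one of these states.
enum Conn {
    Idle,
    Active { requests: u32 },
    Closed(String),
}

// Exhaustive match with no wildcard arm: adding a variant to `Conn`
// makes this function fail to compile until the new case is handled.
fn describe(c: &Conn) -> String {
    match c {
        Conn::Idle => "idle".to_string(),
        Conn::Active { requests } => format!("active ({requests} requests)"),
        Conn::Closed(reason) => format!("closed: {reason}"),
    }
}

// Result + ?: the error path is part of the signature and is propagated
// explicitly instead of being silently dropped.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    let port = s.trim().parse::<u16>()?;
    Ok(port)
}

fn main() {
    // Niche optimisation: `None` reuses the null pointer value, so the
    // Option wrapper costs no extra space for references.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());

    let states = [
        Conn::Idle,
        Conn::Active { requests: 3 },
        Conn::Closed("timeout".to_string()),
    ];
    for s in &states {
        println!("{}", describe(s));
    }

    println!("{:?}", parse_port(" 8080 "));
    println!("{}", parse_port("http").is_err());
}
```

The Go and Zig columns would need their own sketches; this one only grounds the Rust claims.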