Skip to content

Instantly share code, notes, and snippets.

@kazuho
Last active June 18, 2026 02:15
Show Gist options
  • Select an option

  • Save kazuho/fb36ec78ccd062f341d235e0211330c0 to your computer and use it in GitHub Desktop.

Select an option

Save kazuho/fb36ec78ccd062f341d235e0211330c0 to your computer and use it in GitHub Desktop.
#!/usr/bin/env ruby
# frozen_string_literal: true
#
# qpack-compression-sim.rb
#
# Offline simulator + design notes for how h2o should drive the QPACK dynamic
# table on the encode side. Today h2o uses the static table only (write_response
# passes encoder_buf == NULL); this explores how to turn the dynamic table on.
#
# Usage: ruby misc/qpack-compression-sim.rb <file.qif> [capacity_bytes]
# Knobs (env): DHL (decay half-life), SWAPM (swap margin), REPT (repeat gate).
# Byte accounting mirrors h2o's encoder (lib/http3/qpack.c): field-line/insert
# encodings and HPACK varints. The QPACK static table (RFC 9204 App. A) and the
# HPACK Huffman code lengths (RFC 7541 App. B) are embedded as literals.
#
# The file carries the full exploration (numbered strategies 1-15); the strategies
# that matter, and the conclusions, are below.
#
# ===========================================================================
# WHAT WE'RE OPTIMIZING (the objective)
# ===========================================================================
# Get GOOD COMPRESSION while RELYING LITTLE ON THE ENCODER STREAM. The encoder
# stream carries only table-maintenance metadata (inserts / duplicates), never
# response payload -- so every encoder-stream byte is pure overhead that competes
# with real data for the connection's bandwidth / congestion window. Achieving a
# given ratio with a SMALLER encoder stream is generally a win, for several
# reasons (we claim no strict ranking among them):
# * Latency: a bloated encoder stream displaces response bytes in the cwnd
# (worst early, under slow start), delaying delivery even with zero loss --
# e.g. front-loading inserts for 10 responses pushes all 10 responses later.
# * Head-of-line blocking: the encoder stream is a single ordered dependency
# stream; a lost encoder-stream packet stalls every dependent header block.
# * Cost: table churn is CPU/memory on both ends.
# So the aim is most of the compression with a small, mostly front-loaded, low-
# churn encoder stream -- read the `encoder` column ALONGSIDE the total.
#
# Consequences:
# * Short-lived connections dominate, so first-flight usage matters and the table
# is small. TARGET CAPACITY IS 4 KB - 16 KB (at 64 KB the table never fills, the
# eager fill spreads inserts across the whole connection, and strategies
# converge -- more ratio but a much larger encoder stream, a trade we'd rather
# not force).
# * Referencing not-yet-acked entries is fine: for a single response the inserts
# ride the SAME packet flight as their references (completing the header block
# requires the packet(s) that also carry the inserts), so there's no added
# blocking. We deliberately do NOT gate on the ack signal.
# * This sim measures BYTES only -- no cwnd/ack/RTT/loss model -- so read BOTH the
# total (compression) and the `encoder` column (the encoder-stream overhead).
#
# ===========================================================================
# THE STRATEGIES THAT MATTER (baseline, reference, and the recommended design)
# ===========================================================================
# BASELINE -- static+huffman (sim strategy 1): static table + Huffman only, what
# h2o does today (encoder_buf == NULL). ~60% of raw. The thing to beat.
#
# REFERENCE -- oracle / Belady (sim strategy 3): offline perfect-foresight +
# farthest-next-use eviction. NOT A CEILING: optimal for MISS COUNT, not bytes
# (ignores entry size/value, uses no Duplicate), so a byte-aware policy can and
# does beat it. The true byte-optimal under a fixed table is an NP-hard knapsack-
# over-time; we don't compute it. Use it as a sanity reference only.
#
# RECOMMENDED DESIGN -- ONE mechanism with two modes; both are equally "ours":
#
# no-eviction mode == fill-till-full (sim strategy 2): insert every pair until
# the table is full, then stop. The right answer whenever you don't evict, and
# near-optimal at 4-16 KB. Captures the bulk win (~3x over static), and its
# encoder stream is small by construction -- front-loaded (the table fills in
# the first few responses), each insert rides the same flight as its first
# reference, then the encoder stream is silent for the rest of the connection.
#
# eviction mode == gain-weighted decayed-frequency shadow-rank + hysteretic swap
# (sim strategy 16): a STRICT SUPERSET of fill-till-full -- the fill phase IS
# fill-till-full; eviction only engages once the table is full, so it reduces
# exactly to the no-eviction mode for any connection that never fills. Mechanics:
# * Shadow table: a fixed-slot hash -> decayed-frequency-counter sketch
# (no strings stored) ranks ALL observed pairs. The only extra state; in
# prod, a bounded approximate sketch.
# * Rank = decayed-frequency * savedPerHit(value) / entry_size -- bytes saved
# per byte held, so longer-valued pairs (more bytes saved per reference) are
# preferred. Byte-correct and ~free (one multiply); it matters only when the
# table is small relative to the working-set bytes.
# * QPACK table = the materialized TOP-FIT SUBSET of that rank; it has no
# eviction policy of its own.
# * While there is room: insert+reference every pair (== the fill mode).
# * Once full, at a pair's occurrence SWAP it in (insert+reference) iff
# (a) repeat-evidence: decayed score > the current increment, i.e. it
# recurred recently (not a one-shot), AND
# (b) it beats the weakest resident by hysteresis margin M.
# Residents dug past at the FIFO tail are kept via Duplicate (cheap,
# index-only). Never a speculative / unreferenced insert.
# On fb-resp-hq: beats fill-till-full at every cap, within ~1-2 points of
# (miss-optimal) Belady, with tiny churn -- i.e. a small encoder stream.
#
# ===========================================================================
# RESULTS (fb-resp-hq; default knobs DHL=64, SWAPM=2.0, REPT=1.1)
# ===========================================================================
# total = % of the static-only baseline (lower is better); enc = encoder bytes.
#
# 4 KB 16 KB 64 KB (*)
# static+huffman 100.0% / 0 100.0% / 0 100.0% / 0
# fill-till-full 33.0% / 1.7K 27.4% / 7.1K 21.8% / 27K
# oracle (ref, Belady) 29.7% / 1.9K 19.3% / 6.1K 19.3% / 6.1K
# shadow-rank + swap 24.6% / 3.9K 20.9% / 11K 21.4% / 27K
# swaps / reinserts 43 / 372 90 / 45 6 / 9
# gain shadow-rank 24.3% / 4.5K 21.0% / 11K 21.4% / 27K (recommended)
# swaps / reinserts 35 / 333 96 / 51 6 / 9
#
# Reading: in the target range (4-16 KB), shadow-rank+swap gives the best
# ATTAINABLE ratio -- it beats fill-till-full at both, beats the (miss-optimal)
# oracle reference at 4 KB and is within ~1.5 pts of it at 16 KB -- for a modest
# increase in encoder bytes over fill-till-full, with churn (swaps/reinserts over
# 383 responses) kept tiny, i.e. the extra encoder is legitimate inserts, not
# reinsert thrash. fill-till-full remains the smallest-encoder option if you
# choose not to evict.
# (*) 64 KB out of scope: the table never fills, the eager fill alone dominates
# the encoder stream, and the strategies converge -- shown for context only.
#
# ===========================================================================
# WHY THESE ARE THE RIGHT CHOICES (principles validated along the way)
# ===========================================================================
# * LFU, NOT LRU. Under QPACK's FIFO ring, LRU's promote-on-access needs a
# Duplicate per reference (encoder churn); frequency references are free. Rank
# by frequency.
# * But pure LFU under FIFO mass-reinserts (the low-frequency victim is often not
# at the tail, so you Duplicate everything ahead of it). So frequency drives
# the RANK and the swap decision, materialized via FIFO+Duplicate with
# hysteresis -- not as a direct eviction order.
# * THE REPEAT-EVIDENCE GATE IS THE KEY FIX. Admitting once-seen pairs thrashes:
# a fresh one-shot outranks a stale one-shot on recency and swaps in, endlessly.
# Gating swaps on "recurred recently" removed the mid-cap (table ~= working
# set) blow-up entirely -- it was the single biggest improvement.
# * HYSTERESIS MARGIN M keeps the top-fit subset stable (no boundary flapping),
# which keeps swaps/reinserts -- the HoLB surface -- bounded.
# * DECAY is the LFU<->LRU dial. It demotes once-hot-now-dead entries under
# distribution shift (multi-origin / coalesced connections). It is ~free
# (folded into the counter), so it's included as insurance even though a
# single stationary application can't reward it.
# * ONE decayed counter per tracked pair serves BOTH the rank and the repeat
# gate (repeat-evidence = score > current increment), so no separate count.
# * COUNTER = float32 growing-add: add a globally-growing increment on each hit
# (recent hits weigh more == exponential decay), rare rescale on overflow.
# No periodic sweep, decay only touches hit slots, and float add is ~free on
# server CPUs -- simplest and fastest. (int+halving is the memory-thrifty
# alternative if per-connection footprint ever binds.)
#
# ===========================================================================
# KNOBS (only one really matters)
# ===========================================================================
# * M (SWAPM): hysteresis / efficiency dial (churn vs responsiveness). Understood.
# * T (REPT): repeat-evidence threshold; robust -- anything slightly above 1.0.
# * DHL: decay half-life -- the ONE knob that genuinely matters, and the only one
# a single-application trace CANNOT validate (a stationary distribution never
# exercises decay). Its right value depends on real multi-origin traffic.
# On fb-resp-hq, DHL is flat for >= ~64; only too-fast decay (<= 16) hurts.
#
# ===========================================================================
# CAVEATS / NOT VALIDATED
# ===========================================================================
# * One trace, one application (fb-resp-hq) -> stationary header distribution ->
# decay's value and DHL are UNTESTED; needs multi-origin / coalesced traffic.
# * "Oracle" is miss-optimal, not a byte ceiling (see above).
# * The sim scores BYTES; HoLB itself is not modeled (no ack/RTT/loss). The
# encoder-bytes column is the HoLB proxy.
# * Reproduce the headline result (target range) e.g.:
# for c in 4096 16384; do ruby misc/qpack-compression-sim.rb \
# deps/qifs/qifs/fb-resp-hq.qif $c | grep -E 'static|insert-till|oracle|shadow'; done
# (strategy 2 "insert-till-full" is fill-till-full; strategy 15 is our approach.)
# ---------------------------------------------------------------------------
# Tunables for "our approach" (strategies 3 & 4)
# ---------------------------------------------------------------------------
PRIOR_ALPHA = (ENV["ALPHA"] || 1.0).to_f # pseudo-hits (benefit of the doubt for unseen names)
PRIOR_BETA = (ENV["BETA"] || 2.0).to_f # pseudo-novels
BAR_T = (ENV["T"] || 0.20).to_f # score bar above 1/2 capacity / admission floor
HYSTERESIS = (ENV["HYST"] || 0.0).to_f # (strat 4) margin a candidate must beat the victim by
REFRESH_BAR = (ENV["RBAR"] || 0.20).to_f # (strat 5) min score to Duplicate-refresh a tail entry
MAX_REFRESH = (ENV["MAXREF"] || 8).to_i # (strat 5) cap on Duplicates per insertion (progress guard)
MIN_SECTIONS = (ENV["MINSEC"] || 32).to_i # admit freely until this many sections sampled (warm-up)
Z = (ENV["Z"] || 1.0).to_f # (strat 7) SEs the estimate must clear BAR_T/w to reject
PERIOD = (ENV["PERIOD"] || 16).to_i # (strat 11) sections per batched table-update window
DECAY_HL = (ENV["DHL"] || 64).to_f # (strat 13) decay half-life, in responses (the LFU<->LRU dial)
DECAY_FACTOR = 2.0**(1.0 / DECAY_HL) # per-response growth of the add value = 2^(1/HL)
PROT_DIV = (ENV["PROTDIV"] || 4).to_i # (strat 13) bottom 1/PROT_DIV of the table is eviction-eligible
SWAP_MARGIN = (ENV["SWAPM"] || 2.0).to_f # (strat 15) hysteresis: riser must beat resident's rank by this factor
REP_T = (ENV["REPT"] || 1.1).to_f # (strat 15) repeat-evidence gate: freq must exceed add*REP_T (>1.0) to swap
ENTRY_OVERHEAD = 32 # QPACK per-entry size overhead (RFC 9204)
# ---------------------------------------------------------------------------
# HPACK Huffman code lengths in bits, per byte value 0..255 (RFC 7541 App. B)
# ---------------------------------------------------------------------------
HUFF_NBITS = [
13,23,28,28,28,28,28,28,28,24,30,28,28,30,28,28,
28,28,28,28,28,28,30,28,28,28,28,28,28,28,28,28,
6,10,10,12,13,6,8,11,10,10,8,11,8,6,6,6,
5,5,5,6,6,6,6,6,6,6,7,8,15,6,12,10,
13,6,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
7,7,7,7,7,7,7,7,8,7,8,13,19,13,14,6,
15,5,6,5,6,5,6,6,6,5,7,7,6,6,6,5,
6,7,6,5,5,6,7,7,7,7,7,15,11,14,13,28,
20,22,20,20,22,22,22,23,22,23,23,23,23,23,24,23,
24,24,22,23,24,23,23,23,23,21,22,23,22,23,23,24,
22,21,20,22,22,23,23,21,23,22,22,24,21,22,23,23,
21,21,22,21,23,22,23,23,20,22,22,22,23,22,22,23,
26,26,20,19,22,23,22,25,26,26,26,27,27,26,24,25,
19,21,26,27,27,26,27,24,21,21,26,26,28,27,27,27,
20,24,20,21,22,21,21,23,22,22,25,25,24,24,26,23,
26,27,26,26,27,27,27,27,27,28,27,27,27,27,27,26,
].freeze
raise "huffman table size" unless HUFF_NBITS.size == 256
def huff_len(str)
bits = 0
str.each_byte { |b| bits += HUFF_NBITS[b] }
(bits + 7) / 8
end
# encoded length of a string literal payload: min(raw, huffman)
def enc_str_payload(str)
raw = str.bytesize
h = huff_len(str)
h < raw ? h : raw
end
# ---------------------------------------------------------------------------
# QPACK static table, index 0..98 (RFC 9204 Appendix A), as [name, value]
# ---------------------------------------------------------------------------
STATIC = [
[":authority", ""], [":path", "/"], ["age", "0"], ["content-disposition", ""],
["content-length", "0"], ["cookie", ""], ["date", ""], ["etag", ""],
["if-modified-since", ""], ["if-none-match", ""], ["last-modified", ""], ["link", ""],
["location", ""], ["referer", ""], ["set-cookie", ""],
[":method", "CONNECT"], [":method", "DELETE"], [":method", "GET"], [":method", "HEAD"],
[":method", "OPTIONS"], [":method", "POST"], [":method", "PUT"],
[":scheme", "http"], [":scheme", "https"],
[":status", "103"], [":status", "200"], [":status", "304"], [":status", "404"], [":status", "503"],
["accept", "*/*"], ["accept", "application/dns-message"],
["accept-encoding", "gzip, deflate, br"], ["accept-ranges", "bytes"],
["access-control-allow-headers", "cache-control"], ["access-control-allow-headers", "content-type"],
["access-control-allow-origin", "*"],
["cache-control", "max-age=0"], ["cache-control", "max-age=2592000"], ["cache-control", "max-age=604800"],
["cache-control", "no-cache"], ["cache-control", "no-store"], ["cache-control", "public, max-age=31536000"],
["content-encoding", "br"], ["content-encoding", "gzip"],
["content-type", "application/dns-message"], ["content-type", "application/javascript"],
["content-type", "application/json"], ["content-type", "application/x-www-form-urlencoded"],
["content-type", "image/gif"], ["content-type", "image/jpeg"], ["content-type", "image/png"],
["content-type", "text/css"], ["content-type", "text/html; charset=utf-8"],
["content-type", "text/plain"], ["content-type", "text/plain;charset=utf-8"],
["range", "bytes=0-"],
["strict-transport-security", "max-age=31536000"],
["strict-transport-security", "max-age=31536000; includesubdomains"],
["strict-transport-security", "max-age=31536000; includesubdomains; preload"],
["vary", "accept-encoding"], ["vary", "origin"],
["x-content-type-options", "nosniff"], ["x-xss-protection", "1; mode=block"],
[":status", "100"], [":status", "204"], [":status", "206"], [":status", "302"], [":status", "400"],
[":status", "403"], [":status", "421"], [":status", "425"], [":status", "500"],
["accept-language", ""],
["access-control-allow-credentials", "FALSE"], ["access-control-allow-credentials", "TRUE"],
["access-control-allow-headers", "*"],
["access-control-allow-methods", "get"], ["access-control-allow-methods", "get, post, options"],
["access-control-allow-methods", "options"],
["access-control-expose-headers", "content-length"],
["access-control-request-headers", "content-type"],
["access-control-request-method", "get"], ["access-control-request-method", "post"],
["alt-svc", "clear"], ["authorization", ""],
["content-security-policy", "script-src 'none'; object-src 'none'; base-uri 'none'"],
["early-data", "1"], ["expect-ct", ""], ["forwarded", ""], ["if-range", ""], ["origin", ""],
["purpose", "prefetch"], ["server", ""], ["timing-allow-origin", "*"],
["upgrade-insecure-requests", "1"], ["user-agent", ""], ["x-forwarded-for", ""],
["x-frame-options", "deny"], ["x-frame-options", "sameorigin"],
].freeze
raise "static table size" unless STATIC.size == 99
STATIC_PAIR = {} # "name\0value" -> index
STATIC_NAME = {} # name -> first index
STATIC.each_with_index do |(n, v), i|
STATIC_PAIR["#{n}\0#{v}"] ||= i
STATIC_NAME[n] ||= i
end
# ---------------------------------------------------------------------------
# Encoding-size primitives (mirroring lib/http3/qpack.c)
# ---------------------------------------------------------------------------
def int_len(value, prefix_bits)
maxv = (1 << prefix_bits) - 1
return 1 if value < maxv
value -= maxv
n = 1
while value >= 128
value >>= 7
n += 1
end
n + 1
end
def str_len(str, prefix_bits)
l = enc_str_payload(str)
int_len(l, prefix_bits) + l
end
# field-line encodings (header block)
def cost_static_indexed(idx) = int_len(idx, 6)
def cost_dyn_indexed(rel) = int_len(rel, 6)
def cost_static_nameref(idx, v) = int_len(idx, 4) + str_len(v, 7)
def cost_dyn_nameref(rel, v) = int_len(rel, 4) + str_len(v, 7)
def cost_literal(n, v) = str_len(n, 3) + str_len(v, 7)
# insert encodings (encoder stream)
def cost_insert_nameref(idx, v) = int_len(idx, 6) + str_len(v, 7)
def cost_insert_literal(n, v) = str_len(n, 5) + str_len(v, 7)
def cost_duplicate(rel) = int_len(rel, 5) # Duplicate instruction (RFC 9204 4.3.4)
def entry_size(name, value) = name.bytesize + value.bytesize + ENTRY_OVERHEAD
# ---------------------------------------------------------------------------
# Dynamic table model
# ---------------------------------------------------------------------------
class DynTable
Entry = Struct.new(:name, :value, :size, :seqno, :hits, :next_use, :referenced)
attr_reader :bytes, :entries # @entries is in insertion order (oldest first = FIFO tail)
def initialize(cap)
@cap = cap
@bytes = 0
@inserts = 0
@by_pair = {} # "name\0value" -> Entry
@by_name = Hash.new { |h, k| h[k] = [] } # name -> [Entry]
@entries = []
end
def cap = @cap
def lookup_pair(n, v) = @by_pair["#{n}\0#{v}"]
def lookup_name(n) = @by_name[n].last
def rel(e) = @inserts - 1 - e.seqno
def room?(size) = @bytes + size <= @cap
def oldest = @entries.first
def insert(n, v, next_use = nil)
e = Entry.new(n, v, entry_size(n, v), @inserts, 0, next_use, false)
@inserts += 1
@bytes += e.size
@by_pair["#{n}\0#{v}"] = e
@by_name[n] << e
@entries << e
e
end
def evict(e)
@bytes -= e.size
@by_pair.delete("#{e.name}\0#{e.value}")
arr = @by_name[e.name]
arr.delete(e)
@by_name.delete(e.name) if arr.empty?
@entries.delete(e)
end
end
# ---------------------------------------------------------------------------
# Reference / insert cost helpers shared by the dynamic strategies
# ---------------------------------------------------------------------------
# cost to emit a field, given the table, WITHOUT inserting
def emit_ref_cost(table, n, v)
if (i = STATIC_PAIR["#{n}\0#{v}"])
return cost_static_indexed(i)
end
if table && (e = table.lookup_pair(n, v))
return cost_dyn_indexed(table.rel(e))
end
if (i = STATIC_NAME[n])
return cost_static_nameref(i, v)
end
if table && (e = table.lookup_name(n))
return cost_dyn_nameref(table.rel(e), v)
end
cost_literal(n, v)
end
# cost of the insert instruction on the encoder stream
def insert_cost(table, n, v)
if (i = STATIC_NAME[n])
cost_insert_nameref(i, v)
elsif table && (e = table.lookup_name(n))
cost_insert_nameref(table.rel(e), v)
else
cost_insert_literal(n, v)
end
end
# ---------------------------------------------------------------------------
# Strategy 1: static table + Huffman only
# ---------------------------------------------------------------------------
def run_static(blocks)
hdr = 0
blocks.each { |b| b.each { |n, v| hdr += emit_ref_cost(nil, n, v) } }
{ header: hdr, encoder: 0 }
end
# ---------------------------------------------------------------------------
# Insert-until-full: insert every not-yet-present pair, no policy, never evict.
# The minimal "just turn the dynamic table on" baseline.
# ---------------------------------------------------------------------------
def run_insert_until_full(blocks)
cap = $cap
table = DynTable.new(cap)
hdr = 0
enc = 0
blocks.each do |b|
b.each do |n, v|
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
next
end
size = entry_size(n, v)
if size <= cap && table.room?(size)
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc }
end
# ---------------------------------------------------------------------------
# Strategy 2: offline oracle (perfect foresight + Belady eviction)
# ---------------------------------------------------------------------------
def run_oracle(blocks, cap)
# flatten to occurrence list, precompute next-use index per occurrence
occ = []
blocks.each { |b| b.each { |nv| occ << nv } }
nextidx = Array.new(occ.size, Float::INFINITY)
last = {}
(occ.size - 1).downto(0) do |i|
key = "#{occ[i][0]}\0#{occ[i][1]}"
nextidx[i] = last.fetch(key, Float::INFINITY)
last[key] = i
end
table = DynTable.new(cap)
hdr = 0
enc = 0
occ.each_with_index do |(n, v), i|
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
e.next_use = nextidx[i]
next
end
size = entry_size(n, v)
want_insert = nextidx[i] != Float::INFINITY && size <= cap
if want_insert
# Belady: victim = farthest next-use among residents (and the candidate)
until table.room?(size)
victim = table.entries.max_by(&:next_use)
if victim.next_use >= nextidx[i]
want_insert = false # candidate itself is used farthest -> don't cache
break
end
table.evict(victim)
end
end
if want_insert
enc += insert_cost(table, n, v)
e = table.insert(n, v, nextidx[i])
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
{ header: hdr, encoder: enc }
end
# ---------------------------------------------------------------------------
# Our score model (strategies 3 & 4)
# ---------------------------------------------------------------------------
class NameStats
def initialize
@hits = Hash.new(0)
@novel = Hash.new(0)
end
def f_hat(name)
(@hits[name] + PRIOR_ALPHA) / (@novel[name] + PRIOR_BETA)
end
def hit!(name) = @hits[name] += 1
def novel!(name) = @novel[name] += 1
# posterior over the name's expected references-per-value (rate lambda),
# Gamma-Poisson conjugate: Gamma(alpha+hits, beta+novel). mean = (a+h)/(b+m)
# is exactly f_hat; var = (a+h)/(b+m)^2. returns [mean, standard error].
def posterior(name)
a = PRIOR_ALPHA + @hits[name]
b = PRIOR_BETA + @novel[name]
[a / b, Math.sqrt(a) / b]
end
end
# score = f_hat(name) * savedPerHit(value) / size -- bytes saved per byte held
def score(stats, n, v)
saved = enc_str_payload(v) # ~ bytes saved on each future reference
stats.f_hat(n) * saved / entry_size(n, v).to_f
end
# Confidence-bound admission (strat 7): the gate admits when p*w >= BAR_T, i.e.
# p >= p* = BAR_T/w (w = savedPerHit/size). Admit-by-default until we are Z SEs
# certain the name's repeat-fraction is BELOW p* -- so a fresh/uncertain name is
# explored, and only the confidently-not-worth-it names get rejected. Replaces
# the MIN_SECTIONS warm-up and the 1/2-capacity bar with one knob (Z).
def confident_admit?(stats, n, v)
saved = enc_str_payload(v)
return false if saved <= 0
pstar = BAR_T / (saved.to_f / entry_size(n, v))
mean, se = stats.posterior(n)
(mean + Z * se) >= pstar
end
# Strategy 3: never evict, 1/2-capacity exploration gate (after warm-up)
def run_ours_noevict(blocks)
cap = $cap
table = DynTable.new(cap)
stats = NameStats.new
hdr = 0
enc = 0
sec = 0
blocks.each do |b|
sec += 1
b.each do |n, v|
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
stats.hit!(n)
next
end
stats.novel!(n)
size = entry_size(n, v)
bar = (sec <= MIN_SECTIONS || table.bytes < cap / 2) ? 0.0 : BAR_T
if size <= cap && table.room?(size) && score(stats, n, v) >= bar
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc }
end
# Strategy 4: idealized NON-FIFO eviction -- evict the lowest-score entry from
# anywhere, full re-insert on recurrence (no Duplicate). Not how QPACK works;
# kept as an optimistic eviction bound (between the oracle and faithful FIFO).
def run_ours_ideal_evict(blocks)
cap = $cap
table = DynTable.new(cap)
stats = NameStats.new
hdr = 0
enc = 0
blocks.each do |b|
b.each do |n, v|
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
stats.hit!(n)
next
end
stats.novel!(n)
size = entry_size(n, v)
cand = score(stats, n, v)
do_insert = false
if size <= cap
if table.room?(size)
do_insert = true # free slot
else
# try to evict the weakest residents the candidate beats
freed = table.bytes
victims = table.entries.sort_by { |e| score(stats, e.name, e.value) }
need = size - (cap - table.bytes)
chosen = []
got = 0
victims.each do |e|
break if got >= need
break unless cand > score(stats, e.name, e.value) + HYSTERESIS
chosen << e
got += e.size
end
if got >= need
chosen.each { |e| table.evict(e) }
do_insert = true
end
end
end
if do_insert
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc }
end
# Strategy 5: faithful QPACK eviction -- FIFO ring (evict oldest), with the
# Duplicate instruction used to refresh a still-hot tail entry to the head
# instead of dropping it. This is how a real encoder manages the table.
def fifo_make_room(table, stats, need)
enc = 0
freed = 0
refreshes = 0
until freed >= need || table.entries.empty?
e = table.oldest
hot = e.referenced && score(stats, e.name, e.value) >= REFRESH_BAR && refreshes < MAX_REFRESH
if hot
rel = table.rel(e) # Duplicate references the entry before it leaves
table.evict(e)
enc += cost_duplicate(rel)
table.insert(e.name, e.value) # copy re-added at head, referenced flag reset
refreshes += 1
else
freed += e.size
table.evict(e) # genuinely dropped
end
end
enc
end
def run_ours_fifo_dup(blocks)
cap = $cap
table = DynTable.new(cap)
stats = NameStats.new
hdr = 0
enc = 0
sec = 0
blocks.each do |b|
sec += 1
b.each do |n, v|
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
e.referenced = true
stats.hit!(n)
next
end
stats.novel!(n)
size = entry_size(n, v)
bar = (sec <= MIN_SECTIONS || table.bytes < cap / 2) ? 0.0 : BAR_T
if size <= cap && score(stats, n, v) >= bar
need = table.bytes + size - cap
enc += fifo_make_room(table, stats, need) if need > 0
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc }
end
# Strategy 7: confidence-bound admission + FIFO+Duplicate eviction. No warm-up
# constant and no 1/2 bar -- the BAR_T threshold and the per-name posterior do
# the gating, self-adapting to how much evidence each name has.
def run_ours_ucb(blocks)
cap = $cap
table = DynTable.new(cap)
stats = NameStats.new
hdr = 0
enc = 0
blocks.each do |b|
b.each do |n, v|
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
e.referenced = true
stats.hit!(n)
next
end
stats.novel!(n)
size = entry_size(n, v)
# always gate: only insert pairs we believe will repeat (don't pollute the
# table / add dynamic-table dependencies for one-shots, even if room is free)
if size <= cap && confident_admit?(stats, n, v)
need = table.bytes + size - cap
enc += fifo_make_room(table, stats, need) if need > 0
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc }
end
# Strategy 8: confidence-bound admission, NO eviction (insert until full, stop).
def run_ours_ucb_noevict(blocks)
cap = $cap
table = DynTable.new(cap)
stats = NameStats.new
hdr = 0
enc = 0
blocks.each do |b|
b.each do |n, v|
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
e.referenced = true
stats.hit!(n)
next
end
stats.novel!(n)
size = entry_size(n, v)
if size <= cap && table.room?(size) && confident_admit?(stats, n, v)
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc }
end
# Strategy 9: confidence admission backed by a SEPARATE reuse tracker. Stats are
# updated from every occurrence (whether or not the pair is in the dyn table),
# so reuse of declined pairs is still observed -- removing the cold-start bias.
# (FIFO+Duplicate eviction.) Tracker = unbounded "seen pairs" set; fine offline.
def run_ours_tracker(blocks)
cap = $cap
table = DynTable.new(cap)
stats = NameStats.new
seen = {}
hdr = 0
enc = 0
blocks.each do |b|
b.each do |n, v|
key = "#{n}\0#{v}"
if seen[key] # tracker: unbiased reuse signal, independent of the dyn table
stats.hit!(n)
else
seen[key] = true
stats.novel!(n)
end
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
e.referenced = true
next
end
size = entry_size(n, v)
if size <= cap && confident_admit?(stats, n, v)
need = table.bytes + size - cap
enc += fifo_make_room(table, stats, need) if need > 0
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc }
end
# CLOCK / second-chance make-room: walk the FIFO tail; if an entry's touched bit
# is set, give it a second chance (Duplicate to head, clear the bit) instead of
# dropping it. Pure touched-bit, no score, no threshold. Terminates: at most one
# reprieve per entry, then forced evictions free space.
def fifo_make_room_clock(table, need)
enc = 0
freed = 0
dups = 0
limit = table.entries.size
until freed >= need || table.entries.empty?
e = table.oldest
if e.referenced && dups < limit
rel = table.rel(e)
table.evict(e)
enc += cost_duplicate(rel)
table.insert(e.name, e.value) # second chance; touched bit reset
dups += 1
else
freed += e.size
table.evict(e)
end
end
enc
end
# Strategy 10: insert everything (no admission policy) + touched-bit CLOCK
# eviction. Parameter-free: the only "smartness" is the second-chance bit.
def run_insert_all_clock(blocks)
cap = $cap
table = DynTable.new(cap)
hdr = 0
enc = 0
blocks.each do |b|
b.each do |n, v|
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
e.referenced = true
next
end
size = entry_size(n, v)
if size <= cap
need = table.bytes + size - cap
enc += fifo_make_room_clock(table, need) if need > 0
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc }
end
# Strategy 11: periodic batched update. Between period boundaries the table is
# frozen (pure references/literals, zero encoder traffic). Every PERIOD sections
# we promote pairs that recurred (>=2x) during the window -- skipping one-shots
# -- and CLOCK-evict to fit. Concentrates encoder traffic into bursts with stable
# gaps (HoLB-benign) and skips the one-shot inserts that bloat insert-all.
def run_periodic_update(blocks)
cap = $cap
table = DynTable.new(cap)
hdr = 0
enc = 0
window = Hash.new(0) # pair -> occurrences in current window
sec = 0
blocks.each do |b|
sec += 1
b.each do |n, v|
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
e.referenced = true
next
end
hdr += emit_ref_cost(table, n, v) # not cached -> literal/nameref for now
window["#{n}\x00#{v}"] += 1
end
next unless sec % PERIOD == 0
window.each do |key, cnt| # period boundary: promote recurring pairs
next if cnt < 2
n, v = key.split("\x00", 2)
next if table.lookup_pair(n, v)
size = entry_size(n, v)
next if size > cap
need = table.bytes + size - cap
enc += fifo_make_room_clock(table, need) if need > 0
enc += insert_cost(table, n, v)
table.insert(n, v)
end
window.clear
end
{ header: hdr, encoder: enc }
end
# Strategy 12: eager fill, then per-PAIR frequency (LFU) replacement, at-occurrence.
# A per-pair occurrence counter (incremented on every sighting, cached or not) is
# the reuse signal -- so resident entries' counts keep growing as they're used and
# stay resident, while a candidate only displaces entries it is reused MORE than.
# While there's room: insert every uncached pair. Once full: at a pair's occurrence
# evict the least-frequent entries and insert+reference iff the candidate's count
# exceeds theirs; otherwise literal. No per-name confidence, no speculative insert.
def run_fill_then_replace(blocks)
cap = $cap
table = DynTable.new(cap)
count = Hash.new(0) # per-pair occurrence count (all pairs, cached or not)
hdr = 0
enc = 0
blocks.each do |b|
b.each do |n, v|
key = "#{n}\x00#{v}"
count[key] += 1
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
next
end
size = entry_size(n, v)
if size > cap
hdr += emit_ref_cost(table, n, v)
elsif table.room?(size)
enc += insert_cost(table, n, v) # fill: eager, insert+reference
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
cand = count[key] # full: LFU replacement
need = size - (cap - table.bytes)
victims = []
freed = 0
table.entries.sort_by { |e| count["#{e.name}\x00#{e.value}"] }.each do |e|
break if freed >= need
break unless cand > count["#{e.name}\x00#{e.value}"] # only displace less-reused
victims << e
freed += e.size
end
if freed >= need
victims.each { |e| table.evict(e) }
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
end
end
{ header: hdr, encoder: enc }
end
# Strategy 13: decayed-frequency (W-TinyLFU-ish), faithful FIFO + Duplicate.
# Score = exponentially-decayed frequency, implemented with a growing `add`
# (add *= 2^(1/HL) per response; `score += add` on every occurrence). No per-entry
# timestamp; rare rescale prevents overflow. Eager fill while there's room; once
# full, only the bottom 1/PROT_DIV by decayed score is eviction-eligible -- a
# candidate is admitted (at its occurrence, insert+reference) iff it outranks the
# evictable entries it would displace, and protected entries at the tail are kept
# via Duplicate (counted). All inserts are referenced; reinserts are index-cheap.
def run_decay_fifo(blocks)
cap = $cap
table = DynTable.new(cap)
freq = Hash.new(0.0) # per-pair decayed score (the tracker; unbounded in-sim)
add = 1.0
hdr = 0
enc = 0
reinserts = 0
blocks.each do |b|
add *= DECAY_FACTOR
if add > 1e18 # rescale to keep things in range (rare: ~every HL*60 responses)
freq.transform_values! { |x| x / add }
add = 1.0
end
b.each do |n, v|
key = "#{n}\x00#{v}"
freq[key] += add # the "hit" -- recent hits weigh more, so this is decay
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
next
end
size = entry_size(n, v)
if size > cap
hdr += emit_ref_cost(table, n, v)
next
end
if table.room?(size)
enc += insert_cost(table, n, v) # eager fill: insert + reference
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
next
end
# full: bottom 1/PROT_DIV by score is evictable; candidate must outrank what
# it displaces. pre-check droppable bytes to avoid emitting wasted Duplicates.
fc = freq[key]
ranked = table.entries.map { |e| freq["#{e.name}\x00#{e.value}"] }.sort
threshold = ranked[ranked.size / PROT_DIV]
droppable = table.entries.sum do |e|
f = freq["#{e.name}\x00#{e.value}"]
(f <= threshold && f < fc) ? e.size : 0
end
need = size - (cap - table.bytes)
if droppable >= need
freed = 0
until freed >= need
e = table.oldest
f = freq["#{e.name}\x00#{e.value}"]
if f <= threshold && f < fc
freed += e.size
table.evict(e) # drop a bottom-tier, outranked entry
else
rel = table.rel(e) # protect: Duplicate to head (cheap, index-only)
table.evict(e)
enc += cost_duplicate(rel)
table.insert(e.name, e.value)
reinserts += 1
end
end
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc, reinserts: reinserts }
end
# CLOCK make-room returning [num_duplicates, duplicate_bytes]. Reprieve (Duplicate)
# a touched tail entry and CLEAR its bit (so it gets at most one reprieve per lap
# -> reinserts bounded); drop untouched ones.
def clock_make_room2(table, need)
bytes = 0
freed = 0
dups = 0
limit = table.entries.size
until freed >= need || table.entries.empty?
e = table.oldest
if e.referenced && dups < limit
rel = table.rel(e)
table.evict(e)
bytes += cost_duplicate(rel)
table.insert(e.name, e.value) # bit reset on reinsert == cleared
dups += 1
else
freed += e.size
table.evict(e)
end
end
[dups, bytes]
end
# Strategy 14: W-TinyLFU split -- recency (CLOCK bit, cleared on reprieve) decides
# the eviction ORDER (so reinserts are bounded), decayed frequency decides ADMISSION
# (candidate admitted only if it outranks the least-frequent resident). Eager fill
# while room. All at-occurrence (insert+reference).
def run_clock_freq(blocks)
cap = $cap
table = DynTable.new(cap)
freq = Hash.new(0.0)
add = 1.0
hdr = 0
enc = 0
reinserts = 0
blocks.each do |b|
add *= DECAY_FACTOR
if add > 1e18
freq.transform_values! { |x| x / add }
add = 1.0
end
b.each do |n, v|
key = "#{n}\x00#{v}"
freq[key] += add
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e))
e.referenced = true
next
end
size = entry_size(n, v)
if size > cap
hdr += emit_ref_cost(table, n, v)
next
end
if table.room?(size)
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
next
end
ranked = table.entries.map { |e| freq["#{e.name}\x00#{e.value}"] }.sort
bar = ranked[ranked.size / PROT_DIV] # admission: candidate must beat the bottom-1/PROT_DIV cutoff
if freq[key] > bar
d, db = clock_make_room2(table, size - (cap - table.bytes))
reinserts += d
enc += db
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc, reinserts: reinserts }
end
# Strategy 15: shadow-rank + QPACK-as-mirror. A shadow table (freq, decayed) ranks
# all observed pairs; the QPACK table is the materialized top-fit subset. No policy
# on the QPACK side -- the only decision is the SWAP: a riser is admitted (evicting
# residents it beats by >= SWAP_MARGIN, Duplicating the rest dug past) only when it
# clearly outranks them. The margin is the hysteresis that keeps the subset (and so
# the encoder stream) stable. All at-occurrence. Reports reinserts and swaps (churn).
def run_shadow_rank(blocks)
cap = $cap
table = DynTable.new(cap)
freq = Hash.new(0.0) # the larger shadow table: decayed rank for all pairs
add = 1.0
hdr = 0
enc = 0
reinserts = 0
swaps = 0
blocks.each do |b|
add *= DECAY_FACTOR
if add > 1e18
freq.transform_values! { |x| x / add }
add = 1.0
end
b.each do |n, v|
key = "#{n}\x00#{v}"
freq[key] += add
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e)) # in the subset -> reference
next
end
size = entry_size(n, v)
if size > cap
hdr += emit_ref_cost(table, n, v)
next
end
if table.room?(size)
enc += insert_cost(table, n, v) # room -> materialize (fill)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
next
end
fc = freq[key]
need = size - (cap - table.bytes)
droppable = table.entries.sum { |e| fc > freq["#{e.name}\x00#{e.value}"] * SWAP_MARGIN ? e.size : 0 }
if fc >= add * REP_T && droppable >= need # repeat-evidence (recent recurrence) + clearly beats residents
freed = 0
until freed >= need
e = table.oldest
if fc > freq["#{e.name}\x00#{e.value}"] * SWAP_MARGIN
freed += e.size
table.evict(e) # outranked by margin -> drop (falls out of subset)
else
rel = table.rel(e) # still ranks -> keep via Duplicate
table.evict(e)
enc += cost_duplicate(rel)
table.insert(e.name, e.value)
reinserts += 1
end
end
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
swaps += 1
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc, reinserts: reinserts, swaps: swaps }
end
# Strategy 16: the recommended design (see header) -- shadow-rank + swap with the
# rank gain-weighted: freq * savedPerHit(value) / size (bytes-saved-per-byte-held,
# the `score` model of strategies 3-9) instead of bare decayed frequency, so a
# large-payload pair can outrank a more frequent but tiny-payload one. The repeat-
# evidence gate stays on RAW frequency (it asks "did this recur recently", a
# question about counts, not bytes).
def gain_weight(n, v)
saved = enc_str_payload(v) # bytes saved per future reference (value payload)
saved <= 0 ? 0.0 : saved.to_f / entry_size(n, v)
end
def run_shadow_rank_gain(blocks)
cap = $cap
table = DynTable.new(cap)
freq = Hash.new(0.0) # raw decayed frequency (drives the repeat-evidence gate)
add = 1.0
hdr = 0
enc = 0
reinserts = 0
swaps = 0
rank = ->(key, n, v) { freq[key] * gain_weight(n, v) } # gain-weighted shadow rank
blocks.each do |b|
add *= DECAY_FACTOR
if add > 1e18
freq.transform_values! { |x| x / add }
add = 1.0
end
b.each do |n, v|
key = "#{n}\x00#{v}"
freq[key] += add
if (e = table.lookup_pair(n, v))
hdr += cost_dyn_indexed(table.rel(e)) # in the subset -> reference
next
end
size = entry_size(n, v)
if size > cap
hdr += emit_ref_cost(table, n, v)
next
end
if table.room?(size)
enc += insert_cost(table, n, v) # room -> materialize (fill)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
next
end
rc = rank.call(key, n, v) # candidate's gain-weighted rank
need = size - (cap - table.bytes)
droppable = table.entries.sum { |e| rc > rank.call("#{e.name}\x00#{e.value}", e.name, e.value) * SWAP_MARGIN ? e.size : 0 }
if freq[key] >= add * REP_T && droppable >= need # repeat-evidence (raw freq) + clearly beats residents
freed = 0
until freed >= need
e = table.oldest
if rc > rank.call("#{e.name}\x00#{e.value}", e.name, e.value) * SWAP_MARGIN
freed += e.size
table.evict(e) # outranked by margin -> drop (falls out of subset)
else
rel = table.rel(e) # still ranks -> keep via Duplicate
table.evict(e)
enc += cost_duplicate(rel)
table.insert(e.name, e.value)
reinserts += 1
end
end
enc += insert_cost(table, n, v)
e = table.insert(n, v)
hdr += cost_dyn_indexed(table.rel(e))
swaps += 1
else
hdr += emit_ref_cost(table, n, v)
end
end
end
{ header: hdr, encoder: enc, reinserts: reinserts, swaps: swaps }
end
# ---------------------------------------------------------------------------
# QIF parsing + main
# ---------------------------------------------------------------------------
def parse_qif(path)
blocks = []
cur = []
File.foreach(path) do |line|
line = line.chomp
if line.empty?
blocks << cur unless cur.empty?
cur = []
next
end
next if line.start_with?("#")
name, value = line.split("\t", 2)
cur << [name, value || ""]
end
blocks << cur unless cur.empty?
blocks
end
return unless __FILE__ == $PROGRAM_NAME # allow `require`/`load` for offline analysis
path = ARGV[0] or abort "usage: ruby #{$0} <file.qif> [capacity_bytes]"
$cap = (ARGV[1] || 4096).to_i
blocks = parse_qif(path)
nfields = blocks.sum(&:size)
raw = blocks.sum { |b| b.sum { |n, v| n.bytesize + v.bytesize } }
results = {
"1. static+huffman" => run_static(blocks),
"2. insert-till-full" => run_insert_until_full(blocks),
"3. oracle (Belady)" => run_oracle(blocks, $cap),
"4. ours no-evict" => run_ours_noevict(blocks),
"5. ours evict ideal" => run_ours_ideal_evict(blocks),
"6. ours FIFO+dup" => run_ours_fifo_dup(blocks),
"7. ours UCB+FIFO" => run_ours_ucb(blocks),
"8. ours UCB no-evict" => run_ours_ucb_noevict(blocks),
"9. ours UCB+tracker" => run_ours_tracker(blocks),
"10. insert-all CLOCK" => run_insert_all_clock(blocks),
"11. periodic update" => run_periodic_update(blocks),
"12. fill-then-replace" => run_fill_then_replace(blocks),
"13. decayed-LFU FIFO" => run_decay_fifo(blocks),
"14. CLOCK+freq-admit" => run_clock_freq(blocks),
"15. shadow-rank+swap" => run_shadow_rank(blocks),
"16. gain shadow-rank" => run_shadow_rank_gain(blocks),
}
puts "file: #{path}"
puts "responses: #{blocks.size}"
puts "fields: #{nfields}"
puts "raw n+v: #{raw} bytes"
puts "capacity: #{$cap} bytes (priors a=#{PRIOR_ALPHA} b=#{PRIOR_BETA}, T=#{BAR_T})"
puts
printf "%-20s %10s %10s %10s %8s %8s\n", "strategy", "header", "encoder", "total", "vs raw", "vs (1)"
base = results["1. static+huffman"].then { |r| r[:header] + r[:encoder] }
results.each do |name, r|
total = r[:header] + r[:encoder]
extra = ""
extra += " reinserts=#{r[:reinserts]}" if r[:reinserts]
extra += " swaps=#{r[:swaps]}" if r[:swaps]
printf "%-20s %10d %10d %10d %7.1f%% %7.1f%%%s\n",
name, r[:header], r[:encoder], total, 100.0 * total / raw, 100.0 * total / base, extra
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment