exp(targeting): bench six exposure-storage shapes for fcap_keys[] #106

Open

bokelley wants to merge 1 commit into main from bokelley/expbench-fcap-storage

Conversation

@bokelley
Contributor

Summary

Empirical comparison of six storage shapes for the per-user exposure log under the fcap_keys[] data model proposed in #104. Sanity test verifies all rolling-window variants compute identical eligibility answers; bucket variants implement different rule semantics (calendar bucket vs. rolling) and are tested separately.

Run against real valkey 7.2 (docker): six variants, a single local client, 200 iterations per measurement.

Variants

  • binary: generalized fixed-stride byte slab — current targeting/exposure_binary.go shape, extended to carry up to 8 fcap_key hashes per impression
  • zset-array: one ZSET per user, member encodes [imp_hash, fcap_key_hashes[]]
  • zset-perkey: one ZSET per user, K members per impression (one per fcap_key)
  • zset-perkeyed: one ZSET per (user, fcap_key) pair — single-key reads only fetch the relevant data
  • bucket-day: SET per (user, day) — singleton-per-period rule shape (the AppNexus model)
  • bucket-count: HASH per (user, day) — count-per-period rule shape
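To make the variant list concrete, here is a minimal key-naming sketch in Go. The key schemas (prefixes, `{userID}` hash tags, separators) are illustrative assumptions, not the bench's actual layout:

```go
package main

import "fmt"

// binary / zset-array / zset-perkey: one key per user holds the whole log.
func userLogKey(variant, userID string) string {
	return fmt.Sprintf("exp:%s:{%s}", variant, userID)
}

// zset-perkeyed: one key per (user, fcap_key) pair, so a single-key read
// fetches only the rows relevant to one cap rule.
func perKeyedKey(userID, fcapKeyHash string) string {
	return fmt.Sprintf("exp:zpk:{%s}:%s", userID, fcapKeyHash)
}

// bucket-day / bucket-count: one key per (user, calendar day).
func bucketKey(variant, userID, day string) string {
	return fmt.Sprintf("exp:%s:{%s}:%s", variant, userID, day)
}

func main() {
	fmt.Println(userLogKey("zarr", "u1"))
	fmt.Println(perKeyedKey("u1", "a9f3"))
	fmt.Println(bucketKey("bday", "u1", "2025-06-01"))
}
```

The `{userID}` braces follow the Redis Cluster hash-tag convention so all of one user's keys land on the same slot; whether the bench actually shards this way is not stated.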

Headline finding

The fcap rule shape matters more than the storage shape. Splitting rules into bucket-per-period (90% of real rules) vs. rolling-window (10%) collapses most of #104's complexity:

| Rule shape | Storage | Heavy 30K perf |
| --- | --- | --- |
| `singleton:{day,week,month,lifetime}` | SET per (user, period), TTL | read1k 482 µs, mem 572 KB, free cleanup |
| `count:{day,week,month}` | HASH per (user, period), HINCRBY | read1k 536 µs, mem 249 KB, free cleanup |
| `rolling:N:Ns` | ZSET per (user, fcap_key) | read1 305 µs |

Binary log loses on writes by 100+× — the read-modify-write cost on every impression is structural, not optimizable. It goes away entirely in the proposed split.

ZSET wire-size advantage shrinks at 30d rules — the 24h rule benchmarks understated the worst case by ~10×; 30d caps pull the entire log because no server-side window filter lets reads skip older entries.

Full results table and analysis posted on #104.

What this is — and isn't

Running

```
docker run -d -p 6380:6379 valkey/valkey:7.2
VALKEY_ADDR=localhost:6380 go test -count=1 -run TestBench$ -v ./targeting/expbench/
```

Test plan

  • All six variants compile and pass equivalence sanity check (TestSanity_Equivalence)
  • Bench runs to completion against real valkey
  • Numbers reproducible across multiple runs (within ~20% noise band)
  • Reviewer sanity-check on whether the bench fairly represents production workload

Decision pending on #104

Issue comment proposes the AppNexus-style split as the new scope: schema declares rule_shape per fcap_key, two thin stores (BucketStore + RollingStore), engine dispatches by shape, binary log apparatus deleted entirely. This PR is the data behind that proposal.
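The dispatch-by-shape design described above could look roughly like the following. `BucketStore` and `RollingStore` are the names the proposal uses; everything else here (the `Rule` struct, method signatures, shape strings) is a hypothetical sketch, not the proposed API:

```go
package main

import "fmt"

// Rule carries the rule_shape the schema declares per fcap_key.
type Rule struct {
	Shape string // "singleton", "count", or "rolling"
	N     int    // cap for count/rolling shapes
}

// Two thin stores, per the proposal; signatures are assumptions.
type BucketStore interface {
	CountInBucket(user, fcapKey string) int
}

type RollingStore interface {
	CountInWindow(user, fcapKey string) int
}

type Engine struct {
	buckets BucketStore
	rolling RollingStore
}

// Eligible dispatches on the rule shape declared in the schema.
func (e *Engine) Eligible(r Rule, user, fcapKey string) bool {
	switch r.Shape {
	case "singleton":
		return e.buckets.CountInBucket(user, fcapKey) == 0
	case "count":
		return e.buckets.CountInBucket(user, fcapKey) < r.N
	default: // "rolling"
		return e.rolling.CountInWindow(user, fcapKey) < r.N
	}
}

// Fixed-count stubs stand in for the real valkey-backed stores.
type stubBucket int

func (s stubBucket) CountInBucket(_, _ string) int { return int(s) }

type stubRolling int

func (s stubRolling) CountInWindow(_, _ string) int { return int(s) }

func main() {
	e := &Engine{buckets: stubBucket(1), rolling: stubRolling(2)}
	fmt.Println(e.Eligible(Rule{Shape: "singleton"}, "u1", "k1")) // seen this period already
	fmt.Println(e.Eligible(Rule{Shape: "count", N: 3}, "u1", "k1"))
}
```

The point of the split is that the engine never touches a unified log: ~90% of rules resolve through the cheap bucket path, the rolling path handles the rest, and the binary-log apparatus has no remaining caller.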

Related: #104, adcontextprotocol/adcp#3359

🤖 Generated with Claude Code

…a model

Six storage variants benched end-to-end against real valkey:
  - binary: generalized fixed-stride byte slab (current shape, fcap_keys-extended)
  - zset-array: one ZSET per user, member encodes [imp_hash, fcap_keys[]]
  - zset-perkey: one ZSET per user, K members per impression
  - zset-perkeyed: one ZSET per (user, fcap_key) pair
  - bucket-day: SET per (user, day) — singleton-per-period rule shape
  - bucket-count: HASH per (user, day) — count-per-period rule shape

Sanity test verifies all rolling-window variants compute identical answers
under the same workload; bucket variants are excluded from that test
because they answer a different rule semantic (calendar bucket vs. rolling).

Findings (heavy 30K-impression user, 30-day retention):
  - Binary log loses on writes by 100+× due to read-modify-write per impression
  - ZSET wire-size advantage shrinks at 30d rules (no server-side window filter benefit)
  - ZSET-perkeyed wins single-key reads; ZSET-array wins large-batch
  - bucket-day/bucket-count win every dimension when rules fit calendar buckets:
      reads flat across batch sizes (single SMEMBERS/HGETALL), 6-21× smaller memory,
      free TTL-driven cleanup

Bench package is its own go module to keep redis client deps out of the main
module. Run with `VALKEY_ADDR=host:port go test ./targeting/expbench/`.

Exploratory work for the fcap_keys generalization (issue #104). Not intended
to be production code; kept in-tree as durable comparison artifact.
