Wraith release notes and API twin conformance progress
v0.9.0 - 2026-05-26
Overlays. A consumer team can layer their own routes, variants, fixtures, and fault profiles onto a provider-owned base twin without forking the base, and ship that layer as its own .wraith artifact. Pre-existing root twins are completely unaffected: overlays are inert without a [base] section in wraith.toml.
Why you might want this
- You need a behavior the provider hasn’t recorded (a webhook replay path, an error scenario, a specific edge case in CI).
- Your test environment needs different fixture data than the base ships.
- You want to add fault injection or latency profiles without touching the shared twin.
If you’d otherwise vendor and edit a copy of someone else’s twin, you want an overlay.
    wraith init checkout-billing --base billing-api@sha256:abc --owner checkout
    wraith record checkout-billing --tag happy-path
    wraith synth checkout-billing   # --delta is the default
    wraith compose --base billing-api.wraith \
      --overlay checkout-billing.wraith \
      --output composite
    wraith serve composite

See Overlays for the full workflow, configuration reference, and v0 scope notes.
New commands
- wraith compose: merge a base plus one or more overlays into a materialized composite twin (a workspace or .wraith archive). Deterministic: same inputs in the same order produce byte-identical outputs.
- wraith rebase-check: when the base advances, classify whether your overlay still applies cleanly against the new digest without having to re-record. Emits compatible, additive-safe, or conflict with evidence.
- wraith promote: gated publication of an overlay artifact. Requires policy pass plus evidence sufficiency. Evidence-light overlays can still be checked, but they can’t be promoted.
New flags
- wraith init --base <ref> --owner <team>: initialize a twin as an overlay against a digest-pinned base.
- wraith synth --delta | --full | --base-path <path>: --delta (the default for overlay twins) synthesizes only the routes that diverge from the base; --full synthesizes the entire twin. Root twins always synth full.
- wraith serve --overlay <ovl.wraith> [--keep-composite] [--fixture <name>]: convenience for “compose then serve” without writing a composite to disk first. --keep-composite retains the materialized workspace for debugging.
- wraith check --fixture <name>: pick which overlay’s fixture set seeds the default namespace during conformance.
- wraith pack --include-diagnostics: ship compose-phase diagnostics inside the packed archive’s reports/ tree.
Safety posture
Overlay policy uses the existing exit-code discipline:
- 0 — composes cleanly.
- 1 — user error (bad config, missing artifact).
- 3 — policy violation (weaker scrub posture, base-route deletion, Lua handler shadowing, etc.).
- 4 — runtime error during composition.
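A CI wrapper might translate these exit codes into readable outcomes. The following is a minimal Python sketch: the helper name `classify_exit` and the table name are hypothetical, and only the codes 0, 1, 3, and 4 and their meanings come from the list above.

```python
# Meanings copied from the exit-code list above; the names here are illustrative.
EXIT_MEANINGS = {
    0: "composes cleanly",
    1: "user error",
    3: "policy violation",
    4: "runtime error during composition",
}


def classify_exit(code: int) -> str:
    """Return a label for an exit code; codes not listed above are reported as unknown."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")
```

A pipeline could, for example, fail hard on 3 while treating 1 as a configuration problem to surface to the overlay owner.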
Other improvements
- compose output is fully self-contained. Composite workspaces now carry the merged state/fixtures/ and recordings rather than referring back to the input artifacts.
- compose rejects archive entries with traversal-shaped paths (.., absolute paths, symlinks pointing outside the input root). Defense-in-depth on the unpack stage.
- Same inputs to compose produce byte-identical artifacts. Set SOURCE_DATE_EPOCH to pin timestamps further. Useful for CI that diffs .wraith archives.
- wraith synth --delta writes build/delta-report.json with a per-route breakdown of covered_by_base / delta / unreplayable and structured advice (overlay-is-redundant, many-unreplayable, base-route-missing). This gives a clear signal about whether an overlay is doing anything new or whether you should re-record.
- wraith lint catches overlay misconfigurations. Missing [overlay].owner, invalid base digest, mismatched capability flags, and overlay twins that try to enable passthrough are flagged with the same surface wraith doctor already used.
v0.8.4 - 2026-05-21
Patch. Search and query POST routes no longer mint phantom entities in state.
POSTs like POST /v1/assets/actions/search were being classified as resource Create operations, so every search call left a junk entity behind in the per-session state store. After v0.8.3 wired seeded fixtures through serve, this caused three visible problems: search-shaped fixture-name collisions, faster-than-expected exhaustion of serve.limits.max_entities_per_type, and state snapshots polluted with synthetic search responses.
Action POSTs are now detected by two signals — the last URL segment (search, query, count, aggregate, summarize, lookup, and the Stripe-style /actions/<verb> shape) and the response body shape (single-array bodies or paginated {results: [], next_cursor: …} shapes). Routes matching either signal dispatch without state mutation.
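The two signals can be sketched in Python as follows. This is an illustration under stated assumptions, not Wraith's implementation: the function names are hypothetical, "single-array body" is read here as a top-level JSON array, and the paginated shape is matched by the presence of a list-valued `results` key plus a `next_cursor` key.

```python
# Last-segment verbs listed in the release note.
ACTION_SEGMENTS = {"search", "query", "count", "aggregate", "summarize", "lookup"}


def path_signal(path: str) -> bool:
    """True when the URL looks like an action: a known verb or /actions/<verb>."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        return False
    if segments[-1] in ACTION_SEGMENTS:
        return True
    # Stripe-style shape: "actions" is the second-to-last segment.
    return len(segments) >= 2 and segments[-2] == "actions"


def body_signal(body: object) -> bool:
    """True when the response body looks like a result set (assumed interpretation)."""
    if isinstance(body, list):
        return True
    if isinstance(body, dict):
        return isinstance(body.get("results"), list) and "next_cursor" in body
    return False


def is_action_post(path: str, body: object) -> bool:
    """Either signal is enough to dispatch without state mutation."""
    return path_signal(path) or body_signal(body)
```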
If you have a twin where this heuristic now applies (e.g. POST /v1/customers/search), re-run wraith synth to pick up the fix.
v0.8.3 - 2026-05-21
Patch. state/fixtures/ is now actually loaded at serve time.
The state/fixtures/<entity>.json shape has been documented since v0.1, but wraith serve never read those files — every state-backed Read or List started with an empty store regardless of what was on disk. This is now wired through end-to-end.
- Per-session seeding. state/fixtures/<entity_type>.json is loaded once per X-Wraith-Session namespace on first use. A delete then persists for the rest of the session; no re-seeding mid-session.
- Default namespace too. Requests without an X-Wraith-Session header still get seeded.
- state/schema.json declarations merge into the route-derived schema. Route-derived wins on conflict, so an empty entity_types: {} (the wraith init default) is fully inert.
- Fail-safe. Missing or malformed files warn-log and proceed rather than crash serve. A twin with no state/ directory behaves exactly as it did pre-v0.8.3.
Use case. Multi-twin demos and shared-entity test scenarios — e.g. customer cus_123 referenced consistently across a CRM twin, a billing twin, and an orders twin — can now be set up by authoring one fixture per twin rather than driving a POST sequence at the start of every session.
Heads up — outbound scrub still runs on seeded fixtures. A fixture entity with a name field will be tokenized on the wire by the default PII rules ("alpha" → "name_<base62>"). This is the v0.6.0 PII behavior, not a regression — but it surprises fixture authors. Workarounds: add the field to your [pii] allowlist in scrub.toml, or set [pii] detect = false for twins where seeded values aren’t real PII.
v0.8.2 - 2026-05-19
Patch. Closes the remaining $arr_N placeholder leak on List and Read routes.
v0.7.1 fixed Create dispatch, but nested array placeholders — e.g. {"data":{"items":["$arr_0"]}} or a sibling meta / facets array — still leaked through List and Read because those handlers only rewrote the top-level collection array. They’re now expanded everywhere variants surface a body, so no literal $arr_N markers reach the wire.
If your synthesized responses include nested arrays and you saw ["$arr_0"] in serve output before, upgrade and they’re gone. No re-synth required.
v0.8.1 - 2026-05-18
Patch. Completes the v0.8.0 CORS preflight fix under the default config.
v0.8.0 correctly synthesized access-control-allow-{methods,headers} and vary on the OPTIONS variant, but the default strip_headers = true config (created by wraith init) then stripped those exact headers on the way out because they weren’t in the response-header allowlist. The v0.8.0 fix was therefore inert for most twins.
access-control-allow-methods, access-control-allow-headers, access-control-max-age, and vary are now in the default allowlist. Cross-origin clients hitting a synth twin behave correctly under strip_headers = true. Conformance scoring is unaffected.
v0.8.0 - 2026-05-18
Feature release. Closes three rough edges that came up in real-corpus use: dropped CORS preflight headers, repetitive array elements, and routes whose response depends on a request field. Every new behavior is opt-in or a strict bugfix; pre-existing twins keep their current bytes unless you opt in.
CORS preflights actually work in browsers
wraith serve --fidelity synth returned a bare 204 for cross-origin OPTIONS preflights, dropping access-control-allow-{origin,methods,headers} and vary. Every browser request was therefore blocked at the preflight stage. The synthesized OPTIONS variant now carries the recorded CORS headers, body-less status groups (204 / 304) included. Strict-mode replay was already correct; synth now matches.
Configurable array-element variety
array_length = "p90" (v0.7.2) recovered a ~500-long array, but anti-unification still capped the distinct elements at 8 and tiled them to length, so list UIs showed 8 rows repeated ~62×.

    [generate.anti_unification]
    max_array_representatives = "all"   # or a bound like 200

Default stays at 8 so existing twins are byte-unchanged. Catalog or search APIs whose recordings carry many distinct rows are the main beneficiaries.
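The "~62×" figure follows from tiling 8 representatives cyclically across 500 slots. A small Python sketch of that arithmetic (the `tile` helper is hypothetical and only models the cycling, not the synthesis engine):

```python
def tile(representatives: list, length: int) -> list:
    """Fill `length` slots by cycling through the representatives in order."""
    return [representatives[i % len(representatives)] for i in range(length)]


# 8 distinct rows stretched to a 500-element array.
rows = tile(list(range(8)), 500)
```

Since 500 = 62 × 8 + 4, the first four representatives appear 63 times and the rest 62 times, which is why a list UI shows the same 8 rows over and over.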
Request-keyed response bucketing
Some routes return different bodies depending on a request field: a parent id, a useCase scope, a search filter. Without help, synth collapses every input to one global representative, and every variation in the request returns the same canned response. The new request-keying machinery synthesizes one response per request-field bucket and routes the right one back.

    [generate.request_keying]
    mode = "manual"   # or "auto" for conservative auto-detection

    [[generate.request_keying.route]]
    route = "POST /v1/assets/actions/search"
    fields = ["$.input.filter.parentId"]

Default is mode = "off", fully inert. Use manual to declare keys per-route, or auto to let synth try to detect a key for unruled routes when one strongly predicts the response.
Recommended config
For catalog / search-shaped APIs that combine bimodal arrays with request-keyed responses:

    [generate.anti_unification]
    array_length = "p90"
    drop_empty_array_responses = true
    max_array_representatives = "all"

    [generate.request_keying]
    mode = "manual"

v0.7.2 - 2026-05-15
Feature release. Adds two knobs so synth handles bimodal / search corpora correctly. Both default to pre-v0.7.2 behavior exactly, so existing twins are byte-unchanged unless you opt in.
A debounced search endpoint records a flood of empty no-match responses interleaved with a few real catalog loads. Synth’s default median-length array policy then collapsed such routes to ~1-element arrays even though the data was right there in the recordings.
Two new knobs, both under [generate.anti_unification]:
- array_length: "median" (default), "p75", "p90", or "max". Pick the length statistic that matches your corpus shape.
- drop_empty_array_responses: false (default). When true, all-empty responses are excluded from anti-unification per status group, but only when at least one non-empty response exists for that group, so error variants and scalar responses are never dropped.
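To see how the two knobs interact on a bimodal corpus, here is a Python sketch for a single status group. It is an assumption-laden illustration: `pick_length` is a hypothetical helper, and the nearest-rank percentile used here may differ from the statistic synth actually computes.

```python
# Percentile policies named in the release note; "max" is handled separately.
PERCENTILES = {"median": 50, "p75": 75, "p90": 90}


def pick_length(lengths: list[int], policy: str = "median", drop_empty: bool = False) -> int:
    """Choose a synthesized array length from the observed lengths of one status group."""
    if not lengths:
        raise ValueError("no observed array lengths")
    # Drop empties only when at least one non-empty response exists.
    if drop_empty and any(n > 0 for n in lengths):
        lengths = [n for n in lengths if n > 0]
    ordered = sorted(lengths)
    if policy == "max":
        return ordered[-1]
    pct = PERCENTILES[policy]
    rank = (pct * len(ordered) + 99) // 100  # nearest-rank, integer ceiling
    return ordered[rank - 1]


# Nine empty no-match responses and one real 500-row catalog load.
observed = [0] * 9 + [500]
```

With this corpus the median collapses to an empty array, and even p90 alone still lands on an empty response; combining a high percentile with drop_empty recovers the 500-row shape.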
wraith synth now prints the active policy in its fidelity warning and, on collapse-prone defaults, suggests the exact stanza to add.
Recommended config for bimodal / search APIs
    [generate.anti_unification]
    array_length = "p90"   # or "max"
    drop_empty_array_responses = true

v0.7.1 - 2026-05-14
Patch. Fixes a placeholder leak in synth-mode Create responses.
wraith serve --fidelity synth was returning literal ["$arr_0"] strings in POST responses for routes classified as Create whose variants used variable-length array placeholders. The same string was also being persisted into state, so subsequent Read / List requests for that entity kept emitting it indefinitely.
Fixed at write time — expanded entities go into state, and expanded bodies go to clients. Re-pack any twin whose recordings include Create routes with variable-length arrays to flush the bad state from earlier serve runs.
The related nested-placeholder leak on List / Read routes is fixed in v0.8.2.
v0.7.0 - 2026-05-13
wraith generate hardening release. Four review passes on generate alone surfaced 11 fixable bugs: budgets that didn’t enforce, audits that didn’t write, scores that disagreed with wraith check, rejection reasons that hid the real cause. All fixed. The agentic and single-shot loops are now trustworthy enough to drive in CI.
Hard budgets
- --time-budget cancels in-flight LLM calls. Was advisory: a stalled call ran until external SIGKILL. Now each provider call is wrapped against the run-level deadline; on expiry the in-flight HTTP future is dropped and the process exits within time_budget + 5s grace. Covers ollama, openai, openrouter, and command providers, in agentic and single-shot modes.
- --token-budget enforced per-call. The LLM’s completion is capped at min(8192, tokens_remaining) so a single response can’t push wildly over budget. Prompt tokens are also accounted now: estimate_prompt_tokens() (chars/4) subtracts from the budget before max_tokens is computed, and the call is skipped entirely when the prompt alone would exceed the budget. Overshoot on Stripe-sized prompts (~28k tokens) dropped from ~22% to ~0%.
Conformance and audit fidelity
- Generate’s score matches wraith check. Previously generate called the conformance engine with lua_dir=None, so on twins with Lua handlers (orderledger has 7) the engine returned 501s the diff engine saw as 233 phantom divergences. The Lua directory is now threaded through every call site; generate’s reported score equals wraith check --in-memory.
- generate-audit-*.json written on every run. Previously the audit directory was empty after every run (wrong write path). A new RAII writer atomically rewrites the file at start, after each round, on success, on error, and on panic-unwind. Schema: timestamps, twin/provider/model, budgets, initial + final conformance, per-route patches with reasons, per-round agentic transcripts, token spend, exhaustion reason.
- SIGKILL-safe audits. A new started exhaustion-reason marker is written at construction so SIGKILL’d runs leave a meaningful marker on disk; readers can distinguish “still running” from “completed cleanly” instead of seeing null.
Envelope honesty
- Unified exhaustion_reason across envelope and audit. Was two separate enums with different precedence: the same run could report iterations in the envelope and budget_exhausted in the audit. Now a single enum with documented precedence (error > panic > killed > time_exhausted > budget_exhausted > iterations_exhausted > completed); the two surfaces always agree.
- Token-vs-time precedence is honest. A pre-call gate previously set a generic budget_hit flag that always mapped to time_exhausted, so a token-budget run reported time_exhausted. A typed BudgetHitCause carries the specific cause and routes each variant to the right ExhaustionReason.
- Real rejection reasons. Rejected patches no longer all report "no edits made". Each rejection site emits a specific rejection_reason: budget-exhausted | parse-failure | regression-rejected | empty-edits | protocol-failure | llm-error | user-declined.
Working --interactive
- --interactive now actually prompts. Was declared and documented but never read. Now: before applying each accepted patch, a unified diff of {status, headers, template} is printed to stderr followed by apply this patch? [y/N]:. y / yes accepts; anything else (including EOF / empty line) rejects with rejection_reason: user-declined. Stdout JSON envelope stays clean. Works in both agentic and --no-agentic modes.
- Lib tests: 2890 → 2953 (+63 across the release).
- 11 generate-related bones closed across 4 review passes; zero open bugs at cut.
v0.6.0 - 2026-05-11
Brutal-review shakedown. 14 review passes, 70+ fixes, zero open bugs at cut. New wire-mode conformance, new wraith install, principled PII machinery.
New commands
- wraith install <pack.wraith>: inverse of wraith pack. Extracts a packaged twin into a usable workspace. Verifies per-artifact digests before writing any files. Defense-in-depth PII rescrub on extraction. --name, --into, --force, --no-verify, --rescrub.
- wraith check --wire: wire-mode conformance. Spawns the real serve on a loopback port and replays recorded requests through it. Catches protocol-level bugs the in-memory check is blind to (header stripping, scrub layer mismatch, status code drift). Emits a separate wire_fidelity_bp score with the same partial-credit formula as the replay score.
- wraith check --upstream without --target or --in-memory now defaults to in-memory replay (previously silently no-op’d). Emits info advice noting the implicit choice.
- wraith reduce strategies are distinct. coverage uses greedy set cover; diversity uses farthest-point-first by Jaccard distance; recency ranks by timestamp. Invalid --target-size (e.g. abc, bare 50 without %) now exits non-zero with a hint instead of silently no-op’ing.
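For readers unfamiliar with greedy set cover, the idea behind the coverage strategy can be sketched in Python. This is a textbook version under assumptions: the notes do not say what wraith reduce treats as the covered universe, so routes are used here, and `greedy_cover` and the sample recording names are hypothetical.

```python
def greedy_cover(recordings: dict[str, set[str]], target_size: int) -> list[str]:
    """Pick up to target_size recordings, each time taking the one that adds the most new routes."""
    remaining = dict(recordings)
    chosen: list[str] = []
    covered: set[str] = set()
    while remaining and len(chosen) < target_size:
        # Sorting first makes ties resolve alphabetically, so the result is deterministic.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # nothing left adds coverage
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen


sample = {
    "a": {"GET /x", "GET /y"},
    "b": {"GET /y"},
    "c": {"GET /z"},
}
```

Here recording b is never selected because everything it exercises is already covered by a.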
Conformance honesty
- Error-severity divergences count against the score. wraith check no longer reports 100% conformance while emitting thousands of severity=error divergences. Any error-severity divergence on an exchange zeros the affected component score.
- drift_type classifier refined. New numeric_drift, host_rewrite, url_drift, value_drift. enum_expansion reserved for real string-enum cases.
- upstream_fidelity_bp: separate score answering “does the twin look like the live upstream right now?” Network failures degrade gracefully.
State engine fidelity
- 404 on unknown IDs for Read endpoints when both 2xx and 4xx variants are present. GET /v1/customers/cus_FAKE → 404 instead of 200 with empty body.
- POST /:id classified as Update, not Create. Matches Stripe convention. Sub-resource POSTs (/cancel, /capture) still classify as Action.
- DELETE preserves pre-mutation membership: first delete returns 200, second returns 404. Was: first delete returned 404 with deleted:true body (status/body mismatch).
- List endpoints honor pagination: ?limit, ?offset, ?page + per_page, ?starting_after, ?ending_before, ?cursor. has_more is set when the template carries the field. Stripe, PostgREST, page-style, and Google-style conventions covered.
- List handler is O(limit), not O(N). ?limit=10 against 10k entities: 70ms → 7ms. 1000 parallel ?limit=10: 66s → 0.7s.
- Idempotency-Key honored on POST (opt-in via [serve.idempotency]). Per-namespace (route, key) → cached response.
- REST and GraphQL malformed bodies return 400. Empty body, primitives, shape-mismatched arrays all rejected with a structured invalid_request_error envelope. Default fallback when no recorded 4xx variant exists.
- URL normalization at request entry. /v1/customers/. and /v1/customers/.. are rejected with 400; /v1/customers// collapses to the list route. RFC 3986 dot-segment handling.
- Seen IDs serve recordings verbatim. When the request path matches a recorded URL exactly, serve the recorded body bit-for-bit. The new hash-based variation only fires for unseen IDs.
Synthesis fidelity
- Path collapser preserves collection roots. /v1/balance, /v1/charges, /v1/payment_intents, etc. stay as specific routes; only ID-shaped segments become :param. No more spurious /v1/:param catch-alls.
- Numeric path segments collapse to :param after N distinct values (was N=∞). /pokemon/{1,4,25} → /pokemon/:param. Was: 3 separate routes; unseen IDs returned 501.
:paramafter N distinct values (was N=∞)./pokemon/{1,4,25}→/pokemon/:param. Was: 3 separate routes; unseen IDs returned 501. - Array length distribution preserved. Synthesized responses render arrays at the median observed length, cycling through up to 8 representative elements, instead of folding to a single placeholder.
- Cardinality-detected per-twin enum_paths. A new synth-time analyzer marks low-cardinality high-repetition kebab/snake-case fields as enum. The PII walker skips them. No more hardcoded list of “pokeapi.ability.name” / etc. entries in source — a new API (Discord, Salesforce, anything) gets the same treatment automatically.
- Per-request hash-seeded representative selection. Same path → same response (deterministic). Different paths → different response content drawn from observed representatives.
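Hash-seeded selection can be illustrated with a short Python sketch. The choice of SHA-256 and the helper name `pick_representative` are assumptions made for the example; the notes only state that the same path yields the same response and that content is drawn from observed representatives.

```python
import hashlib


def pick_representative(path: str, representatives: list[str]) -> str:
    """Deterministically map a request path to one of the observed representatives."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(representatives)
    return representatives[index]


reps = ["bulbasaur", "ivysaur", "venusaur"]
first = pick_representative("/pokemon/9001", reps)
second = pick_representative("/pokemon/9001", reps)
```

Because the index depends only on the path, repeated requests are stable across restarts without any stored state.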
Runtime fidelity
- Lua handler errors return 500. Previously silently fell through to template rendering with a random muxemwxu-shaped id, making test failures invisible.
- Lua handlers resolve by filename convention when no explicit hook is set in the model. Was: synthesis never populated vm.lua_hook, so handlers loaded but never ran; template rendering clobbered computed values (total: 134.34 template constant).
- Form-encoded numeric scalars coerce to recorded type. Stripe amount=8888 now renders as Value::Number(8888) (was "8888").
- Clock holes resolve per-request. New [serve.clock] mode = real | deterministic | fixed. Default is real wallclock; deterministic uses a seeded monotonic counter.
- URL rewrite on outbound responses. Absolute URLs at the recorded upstream host are rewritten to point at the twin. Third-party URLs (GitHub raw, CDNs) preserved verbatim (was being replaced with UUID placeholders).
- Vendor headers stripped on serve by default (Cf-Ray, X-Cache, Server, etc.). Configurable via [serve] strip_headers.
Scrub and PII
- Default scrub rules cover email, phone, name, SSN, git author blobs. Git commit metadata in GitHub recordings is tokenized at write time.
- Doctor scans recordings + model bodies for PII. New --allow-pii flag downgrades findings to info. wraith export openapi github and wraith pack both re-scrub before emit so legacy twins don’t ship raw PII.
- [pii] scrub.toml section. detect toggle, allowlist for legitimate non-PII paths, default_action, fields.always for explicit overrides. Suffix-matching on *_name / *_email catches customer_name, employee_name, author_email.
- pseudonymize scrub action: deterministic user_<base62> replacement keyed by HMAC. Stable across recordings/exports/packs for the same input.
- wraith pack archives are byte-stable with [serve.clock] mode = "deterministic". Two consecutive packs produce identical sha256 hashes.
- wraith verify-pack reports PII findings alongside the digest check. --strict flips warnings to failures.
- Confidence-based outbound scrub on live serve. Enum values (bulbasaur, grass, razor-wind) preserved; real person names (including short ones like bob) tokenized. Cardinality detection distinguishes thing-with-a-label-name entities (preserve .name) from person-with-a-personal-name entities (scrub).
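The pseudonymize action's shape (a deterministic user_<base62> token keyed by HMAC) can be sketched in Python. Details here are assumptions for illustration: HMAC-SHA256, truncation to 8 bytes, the alphabet ordering, and the names `base62` and `pseudonymize` are not taken from Wraith.

```python
import hashlib
import hmac
import string

# Assumed base62 alphabet: digits, then uppercase, then lowercase.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62(number: int) -> str:
    """Encode a non-negative integer in base 62."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def pseudonymize(value: str, key: bytes, prefix: str = "user") -> str:
    """Replace a value with a stable token derived from a keyed hash of it."""
    mac = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return f"{prefix}_{base62(int.from_bytes(mac[:8], 'big'))}"
```

The same input and key always produce the same token, which is what keeps recordings, exports, and packs consistent with each other; a different key yields an unrelated token.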
- grpc-status in HTTP/2 trailers for non-empty bodies. Was in initial headers, a spec violation that grpcurl, tonic, gRPC-Go, gRPC-Java, and official Python gRPC all reject. Empty-body errors still use the spec-permitted Trailers-Only form.
Reliability
- UTF-8-safe common_prefix. Synthesis no longer panics on multi-byte UTF-8 (Japanese, Cyrillic, accented Latin, emoji). API twins for internationalized APIs (anything with localized strings) build successfully.
- Lib test count: 2403 → 2890 (+487).
- 14 brutal-review passes; ended with zero open bugs.
- 70+ feature/fix commits since v0.5.2.
v0.5.2 - 2026-05-01
Streaming and capture fidelity. Three new fixture twins.
Streaming + recording
- wraith record survives SIGTERM mid-stream. Long SSE/gRPC streams cut by SIGTERM (or wraith record stop, vessel, systemd) now persist their WREC and session manifest with truncated=true instead of vanishing silently. The forward proxy now also handles SIGTERM; previously only Ctrl-C was caught.
- In-flight streams pin sessions against the idle timeout. A long SSE stream (e.g. an LLM streaming for >30s on CPU) no longer fragments surrounding exchanges into separate sessions in wraith inspect. Sessions close when the activity actually stops, not when the next exchange happens to start.
- gRPC replay is byte-faithful for fixed-length arrays. Fixed-position event slots in a recorded stream now render with the correct per-slot template instead of position 0’s. No more ghost proto3 default values on the wire.
- Synthesized 429 bodies match the route’s recorded 4xx shape. Stripe gets {error: {type, code, message}}, GitHub gets {message, documentation_url}, Twilio and GraphQL likewise. Fallback when no 4xx is recorded is a structured {status, code, message, retry_after} - friendlier to clients deserializing into typed error structs.
- Volatile response headers freshly emitted at serve time. Date, Server, X-Request-Id, Cf-Ray, Etag are dropped at synth time and synthesized at serve time so 200s and 429s carry the same wallclock Date source - important for HMAC signers and freshness checks.
Variant routing
- Header presence as a guard. When a single route records both authed (200) and unauthed (401) shapes, wraith synth infers HeaderPresent / HeaderAbsent guards on the discriminating header (e.g. Authorization). At serve and check time, requests route to the matching variant. Header-name-agnostic - any consistently-present-vs-absent header qualifies.
wraith.toml artifact completeness
twin.wir.json is the documented portable twin artifact. It used to silently drop several pieces of metadata that wraith serve already supported via the in-memory model. Now round-tripped:
- Per-route binary content type and body (HTML, plain text, opaque binary endpoints)
- Per-route gRPC marker
- Per-variant Lua hook handler
- Per-route symbol table
- Per-variant header programs and optional-field lists
All additions are backward-compatible - existing twin.wir.json files load unchanged.
- Exercise scripts force a session boundary (POST /__wraith/new-session) between recording iterations. Multi-session runs now produce real session boundaries instead of one giant session.
- wraith inspect surfaces refresh probe recordings (recordings/refresh/<run_id>/sessions/) alongside regular ones.
New twins (podman fixtures)
Three streaming-fixture twins for contributors to replay end to end:
- mercure - pure SSE hub. Infinite-stream regression target.
- caddy-sse - minimal controlled SSE fixture with configurable event count, cadence, and payload shape.
- qdrant - vector DB gRPC twin. Validates the unary gRPC + protobuf-descriptor pipeline.
v0.5.1 - 2026-04-30
v0.4 shakedown follow-ups. Twin-quality fixes + lifecycle commands.
Twin-quality fixes
- DELETE replay matches recorded shape. wraith serve now renders the variant body template on DELETE instead of substituting a hardcoded {deleted, id} body. Literal fields like object: "coupon" survive.
- Numeric epoch fields stay numeric. Fields like Stripe’s created (Unix epoch seconds, integer) are no longer overlaid with ISO 8601 strings. The classified clock unit (epoch_sec / epoch_ms / iso_string) drives output, not the field name.
- No more $hole_* placeholder leaks. Unfilled holes can never reach the wire under any classification. The hole classifier learns ID shape from observations: prefix, length, and character class. Stripe-shaped IDs (cus_<14 base62>) and short token fields (e.g. 7-char uppercase alnum) are generated correctly.
- /__wraith/ready returns 200 once the listener is bound. Previously it returned 503 forever, breaking wraith up’s ready poll and wraith status’s ready probe.
- wraith coverage reports real session counts. Previously every route showed sessions=0.
- Trace ring buffer captures non-200 responses. --trace now records 429s, fault-injected 5xx, throttle, drop, and timeout responses - exactly the responses you want with --chaos-seed --trace.
New commands
- wraith down: stops twins started by wraith up. SIGTERM with SIGKILL escalation. Idempotent.
- wraith status: per-twin alive + ready report. Polls /__wraith/ready for each running twin.
- wraith env: emits WRAITH_<NAME>_PORT and WRAITH_<NAME>_BASE_URL for each twin in the project manifest. Pasteable into a shell or consumed via --format json.
Manifest plumbs simulation flags through wraith up
Project manifests can now drive the v0.4 simulation layers per twin:

    [twins.stripe]
    path = "twins/stripe"
    port = 8181
    chaos_seed = 42
    latency_mode = "auto"
    trace = true
    trace_capacity = 500
    rate_limit = true
    rate_limit_override = ["GET /v1/foo=5/1sec"]
    debug = false
    listen = "0.0.0.0:8181"
    fidelity = "synth"

All fields optional; existing manifests parse unchanged.
v0.5.0 - 2026-04-29
SSE and gRPC server-streaming. Record, synthesize, serve, and conformance-check streaming APIs end to end. See the Streaming guide.
Streaming protocols
- SSE (text/event-stream): wraith record captures live without buffering - long-lived streams no longer deadlock the recorder. wraith serve emits realistic streams with per-event timing and rotating per-event content (an LLM twin emits the recorded token sequence, not one repeated character).
- gRPC server-streaming: wraith record forwards frames live with HTTP/2 trailers preserved. wraith serve emits frame-correct length-prefixed protobuf with grpc-status trailers - gRPC clients connect and stream without Internal: missing trailers.
- Long-lived bidi streams (cancelled by client deadline, no trailers received) classify as truncated; replay matches.
Conformance for streaming exchanges
wraith check now scores streaming exchanges under dedicated PASS criteria:
- Event count must match the recording.
- Per-event structural shape (keys, types, constants) must match.
- Hole-marked fields (variable LLM token text, etcd event keys) tolerate value variance.
- Termination shape and gRPC trailers must match.
Previously, streaming exchanges rolled up into the unary scorer where streaming-specific divergences could be diluted into a passing score. New behavior: a streaming Error-severity divergence fails the session.
[[diff.suppress]] now affects the score
Suppression rules in wraith.toml are applied before scoring, so a suppressed divergence no longer counts against the conformance score. Previously [[diff.suppress]] filtered the report only.
Variant routing
wraith synth infers body-field guards on routes whose variants are discriminated by request-body string fields. Glob paths like messages[*].content are supported. At serve time, when multiple variants’ guards match a request, wraith serve picks the most-specific variant - so a request that matches both a loose 200 catch-all and a tight 404 error variant routes to the 404.
A single route can mix streaming and non-streaming variants. The 200 SSE variant serves a stream; the sibling 404 invalid-model JSON variant serves a normal response.
New twins
- ollama - twins the OpenAI-compat /v1/chat/completions endpoint with stream: true for any local Ollama model.
- etcd-streaming - extends the etcd twin with KV.Watch, the canonical server-streaming RPC.
Both ship with podman fixtures so contributors can replay end-to-end.
v0.4.0 - 2026-04-21
Faulty-service simulation + OpenAPI seed + trace endpoints. Six orphan subsystems wired into the CLI.
See the Simulation guide for the fault/latency/rate-limit story end to end.
Realistic simulation in wraith serve
- Fault injection (--fault-profile <path>, --chaos-seed <u64>): six fault types (Error / Delay / Timeout / Drop / Throttle / Partial), deterministic seeded RNG, route globs, header matching, percentage rolls, per-rule trigger caps. generate_chaos_profile builds a realistic mix from the loaded WIR when given just a seed.
- Latency simulation (--latency-mode <fixed|uniform|recorded|normal|percentile> + aux flags): per-route overrides, seeded ChaCha RNG for deterministic replay. When a fault Delay rule fires, it replaces the latency simulator’s contribution for that request (no compounding).
- Rate-limit simulation (--rate-limit, --rate-limit-override "METHOD /path=N/Wsec"): FixedWindow and SlidingWindow algorithms, standard X-RateLimit-* + Retry-After headers, shared 429-response builder for fault Throttle and the rate-limit gate.
- Evaluation order: rate-limit -> fault -> latency -> dispatch. All three layers are Option<Arc<...>> - zero overhead when their flags are absent.
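The evaluation order and the "Delay replaces latency" rule can be sketched in a few lines of Python. Everything here is illustrative: the `serve` function, the rule dictionaries, and the "first rule that fires wins" policy are assumptions, and only the Error and Delay fault types are modeled.

```python
import random


def serve(request, rate_limiter=None, fault_rules=None, latency_ms=None, rng=None):
    """Apply rate-limit -> fault -> latency -> dispatch; a layer left as None is skipped."""
    rng = rng or random.Random(0)  # seeded, so a given seed replays the same rolls

    # 1. Rate-limit gate.
    if rate_limiter is not None and not rate_limiter(request):
        return {"status": 429, "delay_ms": 0}

    # 2. Fault injection: percentage roll per rule.
    fault_delay = None
    for rule in fault_rules or []:
        if rng.random() * 100 < rule["percent"]:
            if rule["kind"] == "Error":
                return {"status": rule["status"], "delay_ms": 0}
            if rule["kind"] == "Delay":
                fault_delay = rule["ms"]
            break  # assumption of this sketch: the first rule that fires wins

    # 3. Latency: a fired Delay rule replaces the simulated latency (no compounding).
    if fault_delay is not None:
        delay_ms = fault_delay
    else:
        delay_ms = latency_ms or 0

    # 4. Dispatch (stubbed as a plain 200).
    return {"status": 200, "delay_ms": delay_ms}
```

For example, a 100% Delay rule of 500 ms combined with a 40 ms latency setting yields a 500 ms delay, not 540 ms, and a rejecting rate limiter returns 429 before any fault rule is rolled.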
Trace endpoints (--trace [--trace-capacity N])
- GET /__wraith/trace/log returns the ring buffer in reverse-chronological order.
- GET /__wraith/trace/<id> fetches a single trace by id.
- POST /__wraith/trace/reset clears the buffer.
- Bounded ring buffer with FIFO eviction. Same control-plane auth policy as the existing /__wraith/* surface. Disabled by default.
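The buffer semantics behind these endpoints (bounded, FIFO eviction, newest-first listing) can be sketched in Python as follows. The `TraceBuffer` class and its method names are hypothetical and only mirror the three endpoints; they say nothing about how Wraith stores traces internally.

```python
from collections import deque


class TraceBuffer:
    """Bounded buffer: once full, recording a new trace evicts the oldest one."""

    def __init__(self, capacity):
        self._entries = deque(maxlen=capacity)
        self._next_id = 1

    def record(self, summary):
        trace_id = self._next_id
        self._next_id += 1
        self._entries.append({"id": trace_id, "summary": summary})
        return trace_id

    def log(self):
        """Newest first, like the trace log endpoint."""
        return list(reversed(self._entries))

    def get(self, trace_id):
        """Single trace by id, or None if it was evicted or never existed."""
        return next((e for e in self._entries if e["id"] == trace_id), None)

    def reset(self):
        self._entries.clear()
```

With a capacity of 2, recording three traces leaves only the last two in the log, and a lookup of the first id comes back empty.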
Drift classification in wraith check
- Each divergence gets a stable drift_id (fingerprint) and a DriftType classification (schema-change / field-removed / status-shift / etc.).
- JSON envelope adds a drifts[] summary grouping divergences by drift_id, and per-divergence drift_id + drift_type. Additive only - existing consumers see the old shape when skip_serializing_if suppresses empty fields.
- twins/<name>/drift.toml (sibling of scrub.toml) supports [[suppress]] and [[reclassify]] rules matched by glob on drift_id / route / path / drift_type. Absent file is a silent no-op.
- Refresh integration deferred until refresh’s probe-execution path lands.
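One way to picture a stable fingerprint and the grouped summary is the Python sketch below. The inputs to the fingerprint (route, path, drift type), the SHA-256 truncation, and the summary fields are assumptions for illustration; the release notes do not specify how drift_id is computed.

```python
import hashlib
from collections import defaultdict


def drift_id(route, path, drift_type):
    """Stable fingerprint: the same inputs always produce the same id."""
    key = "\x1f".join((route, path, drift_type))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


def summarize(divergences):
    """Group divergences by drift_id into a drifts[]-style summary."""
    groups = defaultdict(list)
    for d in divergences:
        groups[drift_id(d["route"], d["path"], d["drift_type"])].append(d)
    return [
        {"drift_id": did, "drift_type": items[0]["drift_type"], "count": len(items)}
        for did, items in groups.items()
    ]


# Example data (made up): the same drift seen in two sessions, plus another one.
DIVERGENCES = [
    {"route": "GET /users/{id}", "path": "body.email", "drift_type": "field-removed", "session": 1},
    {"route": "GET /users/{id}", "path": "body.email", "drift_type": "field-removed", "session": 2},
    {"route": "POST /users", "path": "status", "drift_type": "status-shift", "session": 1},
]
```

Because the id does not depend on the session, the two field-removed divergences collapse into one summary entry with a count of 2.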
OpenAPI seed mode
- New wraith explore --from-openapi <spec.yaml> [--against <url>]: parses OpenAPI 3.x (YAML or JSON), generates scenario plans, optionally executes them against a live URL and reports per-step match/mismatch/error counts. Auth via repeated --header flags.
- wraith coverage --openapi <spec> extends coverage to report spec-vs-recordings gaps (covered_count, total_count, uncovered_operations).
- Additive JSON envelope fields - no breaking changes to existing coverage consumers.
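The three coverage fields can be derived from a parsed spec roughly as in the Python sketch below. It assumes the spec is already loaded into a dictionary and that recorded routes are expressed with the spec's templated paths; how Wraith actually matches recordings to operations is not described here.

```python
# Operation keys allowed on an OpenAPI 3.0 Path Item Object.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def spec_operations(spec):
    """List 'METHOD /path' for every operation in the spec, skipping non-method keys."""
    return sorted(
        f"{method.upper()} {path}"
        for path, item in spec.get("paths", {}).items()
        for method in item
        if method in HTTP_METHODS
    )


def coverage(spec, recorded):
    """Report which spec operations have no recording."""
    operations = spec_operations(spec)
    uncovered = [op for op in operations if op not in recorded]
    return {
        "covered_count": len(operations) - len(uncovered),
        "total_count": len(operations),
        "uncovered_operations": uncovered,
    }


# Example data (made up).
SPEC = {
    "paths": {
        "/pets": {"get": {}, "post": {}},
        "/pets/{id}": {"get": {}, "parameters": []},
    }
}
RECORDED = {"GET /pets", "GET /pets/{id}"}
```

For this spec, `coverage(SPEC, RECORDED)` reports 2 of 3 operations covered, with `POST /pets` as the gap.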
Post-v0.3.0 bug-hunt round
- Router backtracking: literal subtrees with wrong-method no longer block backtracking to param subtrees.
- Scrub null handling: null JSON values no longer get tokenized.
- Header allowlist: user with_extra_compare_headers opt-ins no longer overridden by blanket x-* filter.
- Sync conformance replay: query params now carried through.
- VCR base64 handling: case-insensitive base64 detection.
- Async CRUD handlers: error-variant short-circuit restored across Update / Delete.
- Async handle_list: array-key detection + totalItems / totalPages parity with sync path.
- Async/sync drift eliminated: async CRUD handlers now delegate to sync dispatch (-561 LOC of duplicate logic).
- Clock holes carry unit info: ClockUnit::{EpochSec, EpochMs, IsoString} with serde-compatible migration.
- 1991 lib tests passing (+43 vs v0.3.0). 40+ new integration tests across e2e_fault, e2e_latency, e2e_rate_limit, e2e_serve trace suite, explore_openapi.
- cli/up.rs, cli/refresh.rs, synth-side rate-limit / latency auto-population remain TODO for v0.4.x or v0.5.
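As an aside on the clock-hole fix in the list above, the Python sketch below shows why a clock hole needs a unit: the same instant renders very differently per unit. The enum mirrors the three variant names from the notes, but the `render_clock` helper and the output formats are assumptions of this sketch.

```python
from datetime import datetime, timezone
from enum import Enum


class ClockUnit(Enum):
    EPOCH_SEC = "epoch_sec"
    EPOCH_MS = "epoch_ms"
    ISO_STRING = "iso_string"


def render_clock(instant, unit):
    """Render one instant in the unit recorded for the hole."""
    if unit is ClockUnit.EPOCH_SEC:
        return int(instant.timestamp())
    if unit is ClockUnit.EPOCH_MS:
        return round(instant.timestamp() * 1000)
    return instant.isoformat()


# One second after the Unix epoch, in UTC.
INSTANT = datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc)
```

Without the unit, a replayed value of 1 and a recorded value of 1000 would look like a divergence even though both describe the same instant.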
v0.3.0 - 2026-03-30
18 twins (REST + GraphQL + gRPC). All PASS. Honest conformance with granular suppression.
gRPC support (full pipeline)
- Protobuf codec: decode (wire->JSON) and encode (JSON->wire) via prost-reflect. 14 tests.
- gRPC framing: detect, parse, encode length-prefixed frames, extract trailers. 21 tests.
- HTTP/2 proxy: h2c listener (auto-detects h1/h2), hyper-based upstream client with trailer forwarding, GrpcProxyBody for proper trailer delivery.
- Synth detection: is_grpc_endpoint(), method-name state op inference (Create/Get/List/Update/Delete), grpc flag on RouteModel. 22 tests.
- Serve handler: GrpcConfig loads proto descriptors, decodes protobuf requests, encodes protobuf responses. Trailers-only format for unary RPCs.
- Codec wired into pipeline: synth decodes protobuf bodies to JSON before anti-unification; check decodes recorded protobuf before diffing. Real templates, not echo fallback.
- X-Wraith-Format: json: debug header bypasses protobuf encoding, returns raw JSON from synth handler.
- X-Wraith-* headers stripped before forwarding to upstream during recording.
- Go test service: 6 RPCs (CRUD + streaming), all proto types (nested, enum, oneof, map, repeated, timestamps). Dockerfile for podman.
- Validated on etcd: real-world gRPC KV service, 3 routes, 0 divergences.
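For readers unfamiliar with gRPC framing, each message on the wire is prefixed by a 1-byte compressed flag and a 4-byte big-endian length. The Python sketch below encodes and parses such frames; the function names are hypothetical and it is not Wraith's framing module.

```python
import struct

HEADER = struct.Struct(">BI")  # 1-byte compressed flag + 4-byte big-endian length


def encode_frame(message, compressed=False):
    """Prefix a serialized message with the 5-byte gRPC frame header."""
    return HEADER.pack(1 if compressed else 0, len(message)) + message


def parse_frames(buffer):
    """Return (complete frames, leftover bytes that do not yet form a full frame)."""
    frames, offset = [], 0
    while offset + HEADER.size <= len(buffer):
        flag, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # incomplete frame: wait for more bytes
        frames.append((bool(flag), buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

A streaming response is simply several such frames back to back, which is why the parser returns a list plus any trailing partial frame.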
Conformance engine improvements
- Granular list-body suppression: suppress only array contents, not entire envelope. Scalar envelope fields (count, summary, pagination) compared normally.
- Numeric value comparison: 50 and 50.0 treated as equal (f64 comparison).
- Empty-string ID mapping fix: prevented path corruption during conformance replay. Fixed Stripe (95->0) and PocketBase (168->0, FAIL->PASS).
- User field classifications override all auto-detection, including list-body suppression.
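The first two items can be illustrated with a short Python sketch: numbers compare through a float conversion, and a list envelope is diffed field by field while only the array contents are skipped. The helper names and the envelope layout are assumptions; this is not the conformance engine's code.

```python
def values_equal(a, b):
    """Numbers compare as floats, so 50 equals 50.0; booleans are not numbers here."""
    if isinstance(a, bool) or isinstance(b, bool):
        return type(a) is type(b) and a == b
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return float(a) == float(b)
    return type(a) is type(b) and a == b


def diff_list_envelope(recorded, replayed, list_key):
    """Compare envelope fields normally; skip only the contents under list_key."""
    return sorted(
        key
        for key in recorded.keys() | replayed.keys()
        if key != list_key and not values_equal(recorded.get(key), replayed.get(key))
    )


# Example data (made up): item contents differ, count differs only in numeric form.
RECORDED = {"count": 2, "items": [{"id": 1}, {"id": 2}], "page": 1}
REPLAYED = {"count": 2.0, "items": [{"id": 9}], "page": 2}
```

In this example the differing `items` are ignored and `count` is treated as equal, so the only reported divergence is `page`.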
Lua handlers
- check --in-memory loads Lua with state: handle_request_sync now calls invoke_handler_with_state. Lua handlers get full state.* and clock.* access.
- OrderLedger stress test: 5 patterns (computed totals, conditional shapes, list aggregates, state machine, cross-entity joins). 7 handlers. 2 divergences with Lua vs 185 without.
- POST /__wraith/new-session: force recording session boundary without restarting proxy.
- Cross-session re-recording: Cloudflare, GitHub, Odoo, Stripe, Linear re-recorded with 2+ sessions each.
- GitHub GraphQL v4: 16 operations (fragments, anonymous queries, inline fragments, deep nesting, mutations).
- Updated docs: twin-lifecycle.md rewritten, configuration.md expanded, quickstart updated.
v0.2.0 - 2026-03-27
15 APIs at zero divergences. 53/53 sessions passing.
REST (13): Cloudflare, Forgejo, Gitea, GitHub, GitLab, Keycloak, Mattermost, Notion, Odoo, PocketBase, Stripe, Supabase, Twilio. GraphQL (2): Linear (19 ops), Saleor (16 ops, anonymous queries).
Highlights
- GraphQL operation routing: Detects GraphQL endpoints, splits single POST /graphql route into per-operation variants with guards. Handles both named operations (operationName field) and anonymous queries (parsed root field). New QueryRootField guard predicate.
- Header allowlist: Replaced 40+ entry blocklist with 3-entry allowlist (content-type, www-authenticate, proxy-authenticate). Opt-in via with_extra_compare_headers().
- Divergence suppression: [[diff.suppress]] in wraith.toml for user-declared suppression rules with glob patterns. --show-suppressed flag lists distinct suppressed paths with reasons.
- Transparent heuristics: Hex color normalization, search/list-like body classification, scalar clobber guard - all reported as suppressed, not hidden.
- Session tagging: wraith record --tag + wraith synth --tag for selective synthesis.
- Recording control plane: /__wraith/health, /__wraith/ready, /__wraith/info endpoints during recording.
- Agentic route fixer: 5 modules, 12 tools, text-based TOOL_CALL protocol. Verified end-to-end.
- Lua handler sandbox: Full state API (get/put/delete/list/query/count/counter + clock), hot reload, doctor validation.
- Synth default changed to synth fidelity (was strict).
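To make the GraphQL routing item in the list above concrete, the Python sketch below derives a per-operation key from a request body: the operationName when present, otherwise the first root field of the query. The regex is a deliberate simplification (it would report an alias instead of the field, and it ignores braces inside variable defaults), and the `operation_key` helper is hypothetical, not Wraith's parser.

```python
import re

# Optional "query|mutation|subscription ..." prefix, then the first identifier after "{".
ROOT_FIELD = re.compile(
    r"^\s*(?:(?:query|mutation|subscription)\b[^{]*)?\{\s*([_A-Za-z][_0-9A-Za-z]*)"
)


def operation_key(body):
    """Pick the routing key for a POST /graphql request body."""
    name = body.get("operationName")
    if name:
        return ("operation", name)
    match = ROOT_FIELD.match(body.get("query", ""))
    return ("root_field", match.group(1)) if match else None
```

A named request routes on its operation name, while an anonymous `{ viewer { id } }` routes on the root field `viewer`.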
Engine fixes (0.1.x -> 0.2.0)
- Scalar clobber guard: don’t overlay entity scalar onto template compound type
- Search/list-like classification: POST search + bare array -> Generated body
- Hex color heuristic: #e11d48 vs e11d48 suppressed
- Variant routing guards (PathSegmentEquals, PathSegmentPrefix, FieldEquals, QueryRootField)
- Dynamic-key object map suppression
- Order-independent array matching
- Heuristic timestamp/counter suppression
- Empty-body response handling
- Non-JSON content echo (binary/HTML/text strict replay)
- Gzip decompression in conformance normalizer
- 30+ additional deterministic fixes across 5 days
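Two of the fixes above, the hex color heuristic and order-independent array matching, are easy to picture in Python. The sketch below is a simplified stand-in: the six-digit pattern, the lowercase folding, and the JSON-based canonical form are assumptions rather than the engine's actual rules.

```python
import json
import re
from collections import Counter

HEX_COLOR = re.compile(r"#?([0-9a-fA-F]{6})")


def normalize_hex_color(value):
    """Treat '#e11d48' and 'e11d48' as the same color; leave other values untouched."""
    if isinstance(value, str):
        match = HEX_COLOR.fullmatch(value)
        if match:
            return match.group(1).lower()
    return value


def arrays_match_unordered(a, b):
    """Compare arrays as multisets of canonicalized elements, ignoring order."""
    def canonical(item):
        return json.dumps(item, sort_keys=True)

    return Counter(canonical(x) for x in a) == Counter(canonical(x) for x in b)
```

Being a heuristic, the color rule also fires on any six-character hex-looking string, which is presumably why such matches are reported as suppressed rather than silently dropped.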