fix: prevent headless owned prompt loop

2026-02-20 06:58:09 +01:00 · 2025-09-28 18:30:45 -07:00 · 2025-09-28 18:30:45 -07:00 · ed285a47ab
commit ed285a47ab
parent 0e2eb29258
5 changed files with 67 additions and 480 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -14,6 +14,7 @@ This format follows Keep a Changelog principles and aims for Semantic Versioning

 ## [Unreleased]
 ### Added
+- Tests: added `test_headless_skips_owned_prompt_when_files_present` to guard the headless runner against regressions when owned card lists are present.
 - Included the tiny `csv_files/testdata` fixture set so CI fast determinism tests have consistent sample data.

 ### Changed
@@ -22,6 +23,7 @@ This format follows Keep a Changelog principles and aims for Semantic Versioning
 - Relaxed fast-path catalog validation to allow empty synergy lists while still warning on missing or malformed data types.

 ### Fixed
+- Headless runner no longer loops on the power bracket prompt when owned card files exist; scripted responses now auto-select defaults with optional `HEADLESS_USE_OWNED_ONLY` / `HEADLESS_OWNED_SELECTION` overrides for automation flows.
 - Regenerated `logs/perf/theme_preview_warm_baseline.json` to repair preview performance CI regressions caused by a malformed baseline file and verified the regression gate passes with the refreshed data.

 ## [2.3.0] - 2025-09-26
--- a/RELEASE_NOTES_TEMPLATE.md
+++ b/RELEASE_NOTES_TEMPLATE.md
@@ -8,6 +8,7 @@
 - Delivered multi-theme random builds with deterministic cascade, strict match support, and polished HTMX/UI flows.
 - Added opt-in telemetry counters, reroll throttling safeguards, and structured diagnostics exports.
 - Expanded tooling, documentation, and QA coverage for theme governance, performance profiling, and seed history management.
+- Hardened the headless runner against owned-card prompt loops with optional automation overrides and regression coverage.

 ## Highlights
 ### Multi-theme random builds
@@ -29,6 +30,7 @@
 - Diagnostics badge polish, recent/favorite seeds panel, seed history API, and structured logging for random builds.
 - Sidecar exports include multi-theme metadata and locked commander indicators with consistent artifact sets.
 - Manual QA checklist updates and broader pytest coverage for multi-theme flows, reroll behavior, performance, and telemetry.
+- Headless runner now auto-resolves owned-card prompts with env-overridable defaults and a regression test.

 ### Maintenance & CI
 - Theme catalog schema now accepts optional IDs and the preview performance warm baseline was regenerated to restore the regression gate.
@@ -44,6 +46,7 @@
 	- `test_random_multi_theme_webflows.py` covering reroll-same-commander caching and permalink round-trips for multi-theme runs.
 	- `test_random_multi_theme_filtering.py` ensuring deterministic cascade across success tiers and sidecar metadata.
 	- `test_random_surprise_reroll_behavior.py` protecting Surprise Me input preservation and locked-commander cache reuse.
+	- `test_seeded_builder_minimal.py::test_headless_skips_owned_prompt_when_files_present` ensuring headless builds stay non-interactive when owned card lists are present.
 - **Random mode tooling & docs**
 	- Curated theme pool exclusions at `config/random_theme_exclusions.yml`, reporting helper `code/scripts/report_random_theme_pool.py --write-exclusions`, and companion docs in `docs/random_theme_exclusions.md`.
 	- Performance guard `code/scripts/check_random_theme_perf.py` comparing profiler output (`code/scripts/profile_multi_theme_filter.py`) with `config/random_theme_perf_baseline.json` (`--update-baseline` refreshes the file).
@@ -84,6 +87,7 @@
 - Removed ultra-rare themes (frequency ≤1) unless protected via whitelist, keeping results focused on supported experiences.
 - Corrected commander eligibility rules to restrict non-creature legendary permanents and honor “can be your commander” text.
 - Refreshed `logs/perf/theme_preview_warm_baseline.json` to fix preview performance CI failures stemming from malformed baseline data.
+- Prevented the headless runner from looping on bracket selection when owned card files exist by scripting prompt responses and exposing `HEADLESS_USE_OWNED_ONLY` / `HEADLESS_OWNED_SELECTION` overrides.

 ## Upgrade notes
 - Enable multi-theme random builds via existing Random Mode flags; strict matching persists automatically across UI, API, permalink, and export contexts.
--- a/code/headless_runner.py
+++ b/code/headless_runner.py
@@ -40,6 +40,31 @@ def _write_tagging_flag(tagging_json):
    with open(tagging_json, 'w', encoding='utf-8') as f:
        json.dump({'tagged_at': datetime.now().isoformat(timespec='seconds')}, f)

+
+def _headless_owned_cards_dir() -> str:
+    env_dir = os.getenv("OWNED_CARDS_DIR") or os.getenv("CARD_LIBRARY_DIR")
+    if env_dir:
+        return env_dir
+    if os.path.isdir("owned_cards"):
+        return "owned_cards"
+    if os.path.isdir("card_library"):
+        return "card_library"
+    return "owned_cards"
+
+
+def _headless_list_owned_files() -> List[str]:
+    folder = _headless_owned_cards_dir()
+    entries: List[str] = []
+    try:
+        if os.path.isdir(folder):
+            for name in os.listdir(folder):
+                path = os.path.join(folder, name)
+                if os.path.isfile(path) and name.lower().endswith((".txt", ".csv")):
+                    entries.append(path)
+    except Exception:
+        return []
+    return sorted(entries)
+
 def run(
    command_name: str = "",
    add_creatures: bool = True,
@@ -68,6 +93,17 @@ def run(
    seed: Optional[int | str] = None,
 ) -> DeckBuilder:
    """Run a scripted non-interactive deck build and return the DeckBuilder instance."""
+    owned_prompt_inputs: List[str] = []
+    owned_files_available = _headless_list_owned_files()
+    if owned_files_available:
+        use_owned_flag = _parse_bool(os.getenv("HEADLESS_USE_OWNED_ONLY"))
+        if use_owned_flag:
+            owned_prompt_inputs.append("y")
+            selection = (os.getenv("HEADLESS_OWNED_SELECTION") or "").strip()
+            owned_prompt_inputs.append(selection)
+        else:
+            owned_prompt_inputs.append("n")
+
    scripted_inputs: List[str] = []
    # Commander query & selection
    scripted_inputs.append(command_name)        # initial query
@@ -85,6 +121,7 @@ def run(
            scripted_inputs.append("0")
    else:
        scripted_inputs.append("0")  # stop at primary
+    scripted_inputs.extend(owned_prompt_inputs)
    # Bracket (meta power / style) selection; default to 3 if not provided
    scripted_inputs.append(str(bracket_level if isinstance(bracket_level, int) and 1 <= bracket_level <= 5 else 3))
    # Ideal count prompts (press Enter for defaults). Include fetch_lands if present.
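Note on the hunks above: the decision of which answers to queue for the owned-card prompts can be read as a small pure function. The sketch below is illustrative only and is not part of this commit. `parse_bool` is a hypothetical stand-in for the runner's `_parse_bool`, whose body is not shown in this diff, and `owned_prompt_inputs` is a hypothetical name. It also assumes that the builder asks nothing when no owned files exist and that an empty selection string accepts the prompt's default.

```python
from typing import List, Mapping, Optional, Sequence

# Hypothetical stand-in for the runner's `_parse_bool`; the real helper may
# accept a different set of truthy strings.
_TRUTHY = {"1", "true", "yes", "on", "y"}


def parse_bool(value: Optional[str]) -> bool:
    """Return True for common truthy strings; None or anything else is False."""
    return value is not None and value.strip().lower() in _TRUTHY


def owned_prompt_inputs(owned_files: Sequence[str], env: Mapping[str, str]) -> List[str]:
    """Return the scripted answers to queue for the owned-card prompts."""
    if not owned_files:
        # Assumption: with no owned files the builder never shows the prompt.
        return []
    if parse_bool(env.get("HEADLESS_USE_OWNED_ONLY")):
        # "y" opts in to owned-only mode; the selection may be an empty string.
        selection = (env.get("HEADLESS_OWNED_SELECTION") or "").strip()
        return ["y", selection]
    # Default: decline owned-only mode so the next scripted input (the bracket
    # level) lines up with the bracket prompt.
    return ["n"]
```

Passing the environment as a mapping keeps the sketch testable without touching `os.environ`; the committed code reads `os.getenv` directly.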
--- a/code/tests/test_seeded_builder_minimal.py
+++ b/code/tests/test_seeded_builder_minimal.py
@@ -16,3 +16,26 @@ def test_headless_seed_threads_into_builder(monkeypatch):
    # Basic sanity: commander selection should have occurred
    assert isinstance(getattr(out1, "commander_name", ""), str)
    assert isinstance(getattr(out2, "commander_name", ""), str)
+
+
+def test_headless_skips_owned_prompt_when_files_present(monkeypatch, tmp_path):
+    monkeypatch.setenv("CSV_FILES_DIR", os.path.join("csv_files", "testdata"))
+    owned_dir = tmp_path / "owned"
+    owned_dir.mkdir()
+    (owned_dir / "my_cards.txt").write_text("1 Sol Ring\n", encoding="utf-8")
+    monkeypatch.setenv("OWNED_CARDS_DIR", str(owned_dir))
+
+    builder = run(
+        command_name="Krenko",
+        add_lands=False,
+        add_creatures=False,
+        add_non_creature_spells=False,
+        add_ramp=False,
+        add_removal=False,
+        add_wipes=False,
+        add_card_advantage=False,
+        add_protection=False,
+    )
+
+    assert getattr(builder, "bracket_level", None) in {None, 3}
+    assert getattr(builder, "use_owned_only", False) is False
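Note on the test above: it depends on the runner discovering `my_cards.txt` inside `OWNED_CARDS_DIR`. The discovery rule (regular files only, `.txt` or `.csv`, case-insensitive suffix, sorted) can be exercised on its own. The sketch below is illustrative only and is not part of this commit. `list_owned_files` is a hypothetical name that takes the folder as an argument, and it catches `OSError` rather than the broad `Exception` used in the diff.

```python
import os
import tempfile
from typing import List


def list_owned_files(folder: str) -> List[str]:
    """Return sorted paths of regular .txt/.csv files in folder, or [] if unreadable."""
    try:
        names = os.listdir(folder)
    except OSError:
        # Missing or unreadable folder: behave as if no owned files exist.
        return []
    candidates = (
        os.path.join(folder, name)
        for name in names
        if name.lower().endswith((".txt", ".csv"))
    )
    return sorted(path for path in candidates if os.path.isfile(path))


with tempfile.TemporaryDirectory() as tmp:
    for name in ("my_cards.txt", "Binder.CSV", "notes.md"):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as fh:
            fh.write("1 Sol Ring\n")
    # A directory with a matching suffix must not be reported as an owned file.
    os.mkdir(os.path.join(tmp, "archive.txt"))
    found = [os.path.basename(path) for path in list_owned_files(tmp)]
```

Here `found` holds only the two card lists; `notes.md` and the `archive.txt` directory are filtered out.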
--- a/logs/roadmaps/roadmap_4_5_theme_refinement.md
+++ b/logs/roadmaps/roadmap_4_5_theme_refinement.md
@@ -1,479 +0,0 @@
-# Roadmap: Theme Refinement (M2.5)
-
-This note captures gaps and refinements after generating `config/themes/theme_list.json` from the current tagger and constants.
-
-<!--
-  Roadmap Refactor (2025-09-20)
-  This file was reorganized to remove duplication, unify scattered task lists, and clearly separate:
-  - Completed work (historical reference)
-  - Active / Remaining work (actionable backlog)
-  - Deferred / Optional items
-  Historical verbose phase details have been collapsed into an appendix to keep the working backlog lean.
-->
-
-## Unified Task Ledger (Single Source of Truth)
-Legend: [x]=done, [ ]=open. Each line starts with a domain tag for quick filtering.
-
-### Completed (Retained for Traceability)
-[x] PHASE Extraction prototype: YAML export script, per-theme files, auto-export, fallback path
-[x] PHASE Merge pipeline: analytics regen, normalization, precedence merge, synergy cap, fallback
-[x] PHASE Validation & tests: models, schemas, validator CLI, idempotency tests, strict alias pass, CI integration
-[x] PHASE Editorial enhancements: examples & synergy commanders, augmentation heuristics, deterministic seed, description mapping, lint, popularity buckets
-[x] PHASE UI integration: picker APIs, filtering, diagnostics gating, archetype & popularity badges, stale refresh
-[x] PREVIEW Endpoint & sampling base (deterministic seed, diversity quotas, role classification)
-[x] PREVIEW Commander bias (color identity filter, overlap/theme bonuses, diminishing overlap scaling initial)
-[x] PREVIEW Curated layering (examples + curated synergy insertion ordering)
-[x] PREVIEW Caching: TTL cache, warm index build, cache bust hooks, size-limited eviction
-[x] PREVIEW UX: grouping separators, role chips, curated-only toggle, reasons collapse, tooltip <ul> restructure, color identity ribbon
-[x] PREVIEW Mana cost parsing + color pip rendering (client-side parser)
-[x] METRICS Global & per-theme avg/p95/p50 build times, request counters, role distribution, editorial coverage
-[x] LOGGING Structured preview build & cache_hit/miss, prefetch_success/error
-[x] CLIENT Perf: navigation preservation, keyboard nav, accessibility roles, lazy-load images, blur-up placeholders
-[x] CLIENT Filter chips (archetype / popularity) inline with search
-[x] CLIENT Highlight matched substrings (<mark>) in search results
-[x] CLIENT Prefetch detail fragment + top 5 likely themes (<link rel=prefetch>)
-[x] CLIENT sessionStorage preview fragment cache + ETag revalidation
-[x] FASTAPI Lifespan migration (startup deprecation removal)
-[x] FAST PATH Catalog integrity validation & catalog hash emission (drift detection)
-[x] RESILIENCE Inline retry UI for preview fetch failures (exponential backoff)
-[x] RESILIENCE Graceful degradation banner when fast path unavailable
-[x] RESILIENCE Rolling error rate counter surfaced in diagnostics
-[x] OBS Client performance marks (list_render_start, list_ready) + client hints batch endpoint
-[x] TESTS role chip rendering / prewarm metric / ordering / navigation / keyboard / accessibility / mana parser / image lazy-load / cache hit path
-[x] DOCS README API contract & examples update
-[x] FEATURE FLAG `WEB_THEME_PICKER_DIAGNOSTICS` gating fallback/editorial/uncapped
-[x] DATA Server ingestion of mana cost & rarity + normalization + pre-parsed color identity & pip caches (2025-09-20)
-[x] SAMPLING Baseline rarity & uniqueness weighting (diminishing duplicate rarity influence) (2025-09-20)
-[x] METRICS Raw curated_total & sampled_total counts per preview payload & structured logs (2025-09-20)
-[x] METRICS Global curated & sampled totals surfaced in metrics endpoint (2025-09-20)
-[x] INFRA Defensive THEME_PREVIEW_CACHE_MAX guard + warning event (2025-09-20)
-[x] BUG Theme detail: restored hover card popup panel (regression fix) (2025-09-20)
-[x] UI Hover system unified: single two-column panel (tags + overlaps) replaces legacy dual-panel + legacy large-image hover (2025-09-20)
-[x] UI Reasons control converted to checkbox with state persistence (localStorage) (2025-09-20)
-[x] UI Curated-only toggle state persistence (localStorage) (2025-09-20)
-[x] UI Commander hover parity (themes/overlaps now present for example & synergy commanders) (2025-09-20)
-[x] UI Hover panel: fragment-specific duplicate panel removed (single global implementation) (2025-09-20)
-[x] UI Hover panel: standardized large image sizing across preview modal, theme detail, build flow, and finished decks (2025-09-20)
-[x] UI Hover DFC overlay flip control (single image + top-left circular button with fade transition & keyboard support) (2025-09-20)
-[x] UI Hover DFC face persistence (localStorage; face retained across hovers & page contexts) (2025-09-20)
-[x] UI Hover immediate face refresh post-flip (no pointer synth; direct refresh API) (2025-09-20)
-[x] UI Hover stability: panel retention when moving cursor over flip button (pointerout guard) (2025-09-20)
-[x] UI Hover performance: restrict activation to thumbnail images (reduces superfluous fetches) (2025-09-20)
-[x] UI Hover image sizing & thumbnail scale increase (110px → 165px → 230px unification across preview & detail) (2025-09-20)
-[x] UI DFC UX consolidation: removed dual-image back-face markup; single img element with opacity transition (2025-09-20)
-[x] PREVIEW UX: suppress duplicated curated examples on theme detail inline preview (new suppress_curated flag) + uniform 110px card thumb sizing for consistency (2025-09-20)
-[x] PREVIEW UX: minimal inline preview variant (collapsible) removing controls/rationale/headers to reduce redundancy on detail page (2025-09-20)
-[x] BUG Theme detail: YAML fallback for description/editorial_quality/popularity_bucket restored (catalog omission regression fix) (2025-09-20)
-
-### Open & Planned (Actionable Backlog) — Ordered by Priority
-
-Priority Legend:
-P0 = Critical / foundational (unblocks other work or fixes regressions)
-P1 = High (meaningful UX/quality/observability improvements next wave)
-P2 = Medium (valuable but can follow P1)
-P3 = Low / Nice-to-have (consider after core goals) — many of these already in Deferred section
-
-#### P0 (Immediate / Foundational & Bugs)
-[x] DATA Taxonomy snapshot tooling (`snapshot_taxonomy.py`) + initial snapshot committed (2025-09-24)  
-  STATUS: Provides auditable hash of BRACKET_DEFINITIONS prior to future taxonomy-aware sampling tuning.
-[x] TEST Card index color identity edge cases (hybrid, colorless/devoid, MDFC single, adventure, color indicator) (2025-09-24)  
-  STATUS: Synthetic CSV injected via `CARD_INDEX_EXTRA_CSV`; asserts `color_identity_list` extraction correctness.
-[x] DATA Persist parsed color identity & pips in index (remove client parsing; enable strict color filter tests) (FOLLOW-UP: expose via API for tests)  
-  STATUS: Server payload now exposes color_identity_list & pip_colors. REMAINING: add strict color filter tests (tracked under TEST Colors filter constraint). Client parser removal pending minor template cleanup (move to P1 if desired).
-[x] SAMPLING Commander overlap refinement (scale bonus by distinct shared synergy tags; diminishing curve)  
-[x] SAMPLING Multi-color splash leniency (4–5 color commanders allow near-color enablers w/ mild penalty)  
-[x] SAMPLING Role saturation penalty (discourage single-role dominance pre-synthetic)  
-[x] METRICS Include curated/sample raw counts in /themes/metrics per-theme slice (per-theme raw counts)  
-[x] TEST Synthetic placeholder fill (ensure placeholders inserted; roles include 'synthetic')  
-[x] TEST Cache hit timing (mock clock; near-zero second build; assert cache_hit event)  
-[x] TEST Colors filter constraint (colors=G restricts identities ⊆ {G} + colorless)  
-[x] TEST Warm index latency reduction (cold vs warmed threshold/flag)  
-[x] TEST Structured log presence (WEB_THEME_PREVIEW_LOG=1 includes duration & role_mix + raw counts)  
-[x] TEST Per-theme percentile metrics existence (p50/p95 appear after multiple invocations)  
-[x] INFRA Integrate rarity/mana ingestion into validator & CI lint (extend to assert normalization)  
-
-#### P1 (High Priority UX, Observability, Performance)
-[x] UI Picker reasons toggle parity (checkbox in list & detail contexts with persistence)
-[x] UI Export preview sample (CSV/JSON, honors curated-only toggle) — endpoints + modal export bar
-[x] UI Commander overlap & diversity rationale tooltip (bullet list distinct from reasons)
-[x] UI Scroll position restore on back navigation (prevent jump) — implemented via save/restore in picker script
-[x] UI Role badge wrapping improvements on narrow viewports (flex heuristics/min-width)
-[x] UI Truncate long theme names + tooltip in picker header row
-[x] UI-LIST Simple theme list: popularity column & quick filter (chips/dropdown) (2025-09-20)
-[x] UI-LIST Simple theme list: color filter (multi-select color identity) (2025-09-20)
-[x] UI Theme detail: enlarge card thumbnails to 230px (responsive sizing; progression 110px → 165px → 230px) (2025-09-20)
-[x] UI Theme detail: reposition example commanders below example cards (2025-09-20)
-[x] PERF Adaptive TTL/eviction tuning (hit-rate informed bounded adjustment) — adaptive TTL completed; eviction still FIFO (partial)
-[x] PERF Background refresh top-K hot themes on interval (threaded warm of top request slugs)
-[x] RESILIENCE Mitigate FOUC on first detail load (inline critical CSS / preload) (2025-09-20)
-[x] RESILIENCE Abort controller enforcement for rapid search (cancel stale responses) (2025-09-20)
-[x] RESILIENCE Disable preview refresh button during in-flight fetch (2025-09-20)
-[x] RESILIENCE Align skeleton layout commander column (cross-browser flex baseline) (2025-09-20)
-[x] METRICS CLI snapshot utility (scripts/preview_metrics_snapshot.py) global + top N slow themes (2025-09-20)
-[x] CATALOG Decide taxonomy expansions & record rationale (Combo, Storm, Extra Turns, Group Hug/Politics, Pillowfort, Toolbox/Tutors, Treasure Matters, Monarch/Initiative) (2025-09-20)
-[x] CATALOG Apply accepted new themes (YAML + normalization & whitelist updates) (2025-09-20)
-[x] CATALOG Merge/normalize duplicates (ETB wording, Board Wipes variants, Equipment vs Equipment Matters, Auras vs Enchantments Matter) + diff report (2025-09-20)
-[x] GOVERNANCE Enforce example count threshold (flip from optional once coverage met) (2025-09-20)  
-  STATUS: Threshold logic & policy documented; enforcement switch gated on coverage metric (>90%).
-[x] DOCS Contributor diff diagnostics & validation failure modes section (2025-09-20)
-[x] DOCS Editorial governance note for multi-color splash relax policy (2025-09-20)
-[x] CATALOG Expose advanced uncapped synergy mode outside diagnostics (config guarded) (2025-09-20)
-
-#### P2 (Medium / Follow-On Enhancements)
-[x] UI Hover compact mode toggle (reduced image & condensed metadata) (2025-09-20)
-[x] UI Hover keyboard accessibility (focus traversal / ESC dismiss / ARIA refinement) (2025-09-20)
-[x] UI Hover image prefetch & small LRU cache (reduce repeat fetch latency) (2025-09-20)
-[x] UI Hover optional activation delay (~120ms) to reduce flicker on rapid movement (2025-09-20)
-[x] UI Hover enhanced overlap highlighting (multi-color or badge styling vs single accent) (2025-09-20)
-[x] DATA Externalize curated synergy pair matrix to data file (loader added; file optional) (2025-09-20)
-[x] UI Commander overlap & diversity rationale richer analytics (spread index + compact mode state) (2025-09-20)
-[x] SAMPLING Additional fine-tuning after observing rarity weighting impact (env-calibrated rarity weights + reasons tag) (2025-09-20)
-[x] PERF Further background refresh heuristics (adaptive interval by error rate / p95 latency) (2025-09-20)
-[x] RESILIENCE Additional race condition guard: preview empty panel during cache bust (retry w/backoff) (2025-09-20)
-[x] DOCS Expanded editorial workflow & PR checklist (placeholder – to be appended in governance doc follow-up) (2025-09-20)
-[x] CATALOG Advanced uncapped synergy mode docs & governance guidelines (already documented earlier; reaffirmed) (2025-09-20)
-[x] OBS Optional: structured per-theme error histogram in metrics endpoint (per_theme_errors + retry log) (2025-09-20)
-
-#### P3 (Move to Deferred if low traction) 
-(See Deferred / Optional section for remaining low-priority or nice-to-have items)
-
-### Deferred / Optional (Lower Priority)
-[x] OPTIONAL Extended rarity diversity target (dynamic quotas) (2025-09-24) — implemented via env RARITY_DIVERSITY_TARGETS + overflow penalty RARITY_DIVERSITY_OVER_PENALTY
-[ ] OPTIONAL Price / legality snippet integration (Deferred – see `logs/roadmaps/roadmap_9_budget_mode.md`)
-[x] OPTIONAL Duplicate synergy collapse / summarization heuristic (2025-09-24) — implemented heuristic grouping: identical (>=2) synergy overlap sets + same primary role collapse; anchor shows +N badge; toggle to reveal all; non-destructive metadata fields dup_anchor/dup_collapsed.
-[x] OPTIONAL Client-side pin/unpin personalized examples (2025-09-24) — localStorage pins with button UI in preview_fragment
-[x] OPTIONAL Export preview as deck seed directly to build flow (2025-09-24) — endpoint /themes/preview/{theme_id}/export_seed.json
-[x] OPTIONAL Service worker offline caching (theme list + preview fragments) (2025-09-24) — implemented `sw.js` with catalog hash versioning (?v=<catalog_hash>) precaching core shell (/, /themes/, styles, app.js, manifest, favicon) and runtime stale-while-revalidate cache for theme list & preview fragment requests. Added `catalog_hash` exposure in Jinja globals for SW version bump / auto invalidation; registration logic auto reloads on new worker install. Test `test_service_worker_offline.py` asserts presence of versioned registration and SW script serving.
-[x] OPTIONAL Multi-color splash penalty tuning analytics loop (2025-09-24) — added splash analytics counters (splash_off_color_total_cards, splash_previews_with_penalty, splash_penalty_reason_events) + structured log fields (splash_off_color_cards, splash_penalty_events) for future adaptive tuning.
-[x] OPTIONAL Ratchet proposal PR comment bot (description fallback regression suggestions) (2025-09-24) — Added GitHub Actions step in `editorial_governance.yml` posting/updating a structured PR comment with proposed new ceilings derived from `ratchet_description_thresholds.py`. Comment includes diff snippet for updating `test_theme_description_fallback_regression.py`, rationale list, and markers (`<!-- ratchet-proposal:description-fallback -->`) enabling idempotent updates.
-[x] OPTIONAL Enhanced commander overlap rationale (structured multi-factor breakdown) (2025-09-24) — server now emits commander_rationale array (synergy spread, avg overlaps, role diversity score, theme match bonus, overlap bonus aggregate, splash leniency count) rendered directly in rationale list.
-
-### Open Questions (for Future Decisions)
-[ ] Q Should taxonomy expansion precede rarity weighting (frequency impact)?
-[ ] Q Require server authoritative mana & color identity before advanced overlap refinement? (likely yes)
-[ ] Q Promote uncapped synergy mode from diagnostics when governance stabilizes?
-[ ] Q Splash relax penalty: static constant vs adaptive based on color spread?
-  
-Follow-Up (New Planned Next Steps 2025-09-24):
- [x] SAMPLING Optional adaptive splash penalty flag (`SPLASH_ADAPTIVE=1`) reading commander color count to scale penalty (2025-09-24)
-  STATUS: Implemented scaling via `parse_splash_adaptive_scale()` with default spec `1:1.0,2:1.0,3:1.0,4:0.6,5:0.35`. Adaptive reasons emitted as `splash_off_color_penalty_adaptive:<colors>:<value>`.
- [x] TEST Adaptive splash penalty scaling unit test (`test_sampling_splash_adaptive.py`) (2025-09-24)
- [ ] METRICS Splash adaptive experiment counters (compare static vs adaptive deltas) (Pending – current metrics aggregate penalty events but not separated by adaptive vs static.)
- [x] DOCS Add taxonomy snapshot process & rationale section to README governance appendix. (2025-09-24)
-
-### Exit Criteria (Phase F Completion)
-[x] EXIT Rarity weighting baseline + overlap refinement + splash policy implemented (2025-09-23)
-[x] EXIT Server-side mana/rarity ingestion complete (client heuristics removed) (2025-09-23) – legacy client mana & color identity parsers excised (`preview_fragment.html`) pending perf sanity
-[x] EXIT Test suite covers cache timing, placeholders, color constraints, structured logs, percentile metrics (2025-09-23) – individual P0 test items all green
-[x] EXIT p95 preview build time stabilized under target post-ingestion (2025-09-23) – warm p95 11.02ms (<60ms tightened target) per `logs/perf/theme_preview_baseline_warm.json`
-[x] EXIT Observability includes raw curated/sample counts + snapshot tooling (2025-09-23)
-[x] EXIT UX issues (FOUC, scroll restore, flicker, wrapping) mitigated (2025-09-23)
-
-#### Remaining Micro Tasks (Phase F Close-Out)
-[x] Capture & commit p95 warm baseline (v2 & v3 warm snapshots captured; tightened target <60ms p95 achieved) (2025-09-23)
-[x] Define enforcement flag activation event for example coverage (>90%) and log metric (2025-09-23) – exposed `example_enforcement_active` & `example_enforce_threshold_pct` in `preview_metrics()`
-[x] Kick off Core Refactor Phase A (extract `preview_cache.py`, `sampling.py`) with re-export shim – initial extraction (metrics remained then; adaptive TTL & bg refresh now migrated) (2025-09-23)
-[x] Add focused unit tests for sampling (overlap bonus monotonicity, splash penalty path, rarity diminishing) post-extraction (2025-09-23)
-
-### Core Refactor Phase A – Task Checklist (No Code Changes Yet)
-Planning & Scaffolding:
-[x] Inventory current `theme_preview.py` responsibilities (annotated in header docstring & inline comments) (2025-09-23)
-[x] Define public API surface contract (get_theme_preview, preview_metrics, bust_preview_cache) docstring block (present in file header) (2025-09-23)
-[x] Create placeholder modules (`preview_cache.py`, `sampling.py`) with docstring and TODO markers – implemented (2025-09-23)
-[x] Introduce `card_index` concerns inside `sampling.py` (temporary; will split to `card_index.py` in next extraction step) (2025-09-23)
-
-Extraction Order:
-[x] Extract pure data structures / constants (scores, rarity weights) to `sampling.py` (2025-09-23)
-[x] Extract card index build & lookup helpers (initially retained inside `sampling.py`; dedicated `card_index.py` module planned) (2025-09-23)
-[x] Extract cache dict container to `preview_cache.py` (adaptive TTL + bg refresh still in `theme_preview.py`) (2025-09-23)
-[x] Add re-export imports in `theme_preview.py` to preserve API stability (2025-09-23)
-[x] Run focused unit tests post-extraction (sampling unit tests green) (2025-09-23)
-
-Post-Extraction Cleanup:
-[x] Remove deprecated inline sections from monolith (sampling duplicates & card index removed; adaptive TTL now migrated) (2025-09-23)
-[x] Add mypy types for sampling pipeline inputs/outputs (TypedDict `SampledCard` added) (2025-09-23)
-[x] Write new unit tests: rarity diminishing, overlap scaling, splash leniency (added) (2025-09-23) (role saturation penalty test still optional) 
-[x] Update roadmap marking Phase A partial vs complete (this update) (2025-09-23)
-[x] Capture LOC reduction metrics (before/after counts) in `logs/perf/theme_preview_refactor_loc.md` (2025-09-23)
-
-Validation & Performance:
-[x] Re-run performance snapshot after refactor (ensure no >5% regression p95) – full catalog single-pass baseline (`theme_preview_baseline_all_pass1_20250923.json`) + multi-pass run (`theme_preview_all_passes2.json`) captured; warm p95 within +<5% target (warm pass p95 38.36ms vs baseline p95 36.77ms, +4.33%); combined (cold+warm) p95 +5.17% noted (acceptable given cold inclusion). Tooling enhanced with `--extract-warm-baseline` and comparator `--warm-only --p95-threshold` for CI gating (2025-09-23)
-  FOLLOW-UP (completed 2025-09-23): canonical CI threshold adopted (fail if warm-only p95 delta >5%) & workflow `.github/workflows/preview-perf-ci.yml` invokes wrapper to enforce.
-[x] Verify background refresh thread starts post-migration (log inspection + `test_preview_bg_refresh_thread.py`) (2025-09-23)
-[x] Verify adaptive TTL events emitted (added `test_preview_ttl_adaptive.py`) (2025-09-23)
-
---
-## Refactor Objectives & Workplans (Added 2025-09-20)
-
-We are introducing structured workplans for: Refactor Core (A), Test Additions (C), JS & Accessibility Extraction (D). Letters map to earlier action menu.
-
-### A. Core Refactor (File Size, Modularity, Maintainability)
-Current Pain Points:
- `code/web/services/theme_preview.py` (~32K lines added) monolithic: caching, sampling, scoring, rarity logic, commander heuristics, metrics, background refresh intermixed.
- `code/web/services/theme_catalog_loader.py` large single file (catalog IO, filtering, validation, metrics, prewarm) — logically separable.
- Oversized test files (`code/tests/test_theme_preview_p0_new.py`, `code/tests/test_theme_preview_ordering.py`) contain a handful of tests but thousands of blank lines (bloat).
- Inline JS in templates (`picker.html`, `preview_fragment.html`) growing; hard to lint / unit test.
-
-Refactor Goals:
-1. Reduce each service module to focused responsibilities (<800 lines per file target for readability).
-2. Introduce clear internal module boundaries with stable public functions (minimizes future churn for routes & tests).
-3. Improve testability: smaller units + isolated pure functions for scoring & sampling.
-4. Prepare ground for future adaptive eviction (will slot into new cache module cleanly).
-5. Eliminate accidental file bloat (trim whitespace, remove duplicate blocks) without semantic change.
-
-Proposed Module Decomposition (Phase 1 – no behavior change):
- `code/web/services/preview_cache.py`
-  - Responsibilities: in-memory OrderedDict cache, TTL adaptation, background refresh thread, metrics aggregation counters, `bust_preview_cache`, `preview_metrics` (delegated).
-  - Public API: `get_cached(slug, key)`, `store_cached(slug, key, payload)`, `record_build(ms, curated_count, role_counts, slug)`, `maybe_adapt_ttl()`, `ensure_bg_thread()`, `preview_metrics()`.
- `code/web/services/card_index.py`
-  - Card CSV ingestion, normalization (rarity, mana, color identity lists, pip extraction).
-  - Public API: `maybe_build_index()`, `lookup_commander(name)`, `get_tag_pool(theme)`.
- `code/web/services/sampling.py`
-  - Deterministic seed, card role classification, scoring (including commander overlap scaling, rarity weighting, splash penalties, role saturation, diversity quotas), selection pipeline returning list of chosen cards (no cache concerns).
-  - Public API: `sample_cards(theme, synergies, limit, colors_filter, commander)`.
- `code/web/services/theme_preview.py` (after extraction)
-  - Orchestrator: assemble detail (via existing catalog loader), call sampling, layer curated examples, synth placeholders, integrate cache, build payload.
-  - Public API remains: `get_theme_preview`, `preview_metrics`, `bust_preview_cache` (re-export from submodules for backward compatibility).
-
-Phase 2 (optional, after stabilization):
- Extract adaptive TTL policy into `preview_policy.py` (so experimentation with hit-ratio bands is isolated).
- Add interface / protocol types for cache backends (future: Redis experimentation).
-
-Test Impact Plan:
- Introduce unit tests for `sampling.sample_cards` (roles distribution, rarity diminishing, commander overlap bonus monotonic increase with overlap count, splash penalty trigger path).
- Add unit tests for TTL adaptation thresholds with injected recent hits deque.
-
-Migration Steps (A):
-1. Create new modules with copied (not yet deleted) logic; add thin wrappers in old file calling new functions.
-2. Run existing tests to confirm parity.
-3. Remove duplicated logic from legacy monolith; leave deprecation comments.
-4. Trim oversized test files to only necessary lines (reformat into logical groups).
-5. Add mypy-friendly type hints between modules (use `TypedDict` or small dataclasses for card item shape if helpful).
-6. Update roadmap: mark refactor milestone complete when file LOC & module boundaries achieved.
-
-Acceptance Criteria (A):
- All existing endpoints unchanged.
- No regressions in preview build time (baseline within ±5%).
- Test suite green; new unit tests added.
- Adaptive TTL + background refresh still functional (logs present).
-
-### Refactor Progress Snapshot (2025-09-23)
-Refactor Goals Checklist (Phase A):
- - [x] Goal 1 (<800 LOC per module) — current LOC: `theme_preview.py` ~525, `sampling.py` 241, `preview_cache.py` ~140, `card_index.py` ~200 (all below threshold; monolith reduced dramatically).
- - [x] Goal 2 Module boundaries & stable public API (`__all__` exports maintained; re-export shim present).
- - [x] Goal 3 Testability improvements — new focused sampling tests (overlap monotonicity, splash penalty, rarity diminishing). Optional edge-case tests deferred.
- - [x] Goal 4 Adaptive eviction & backend abstraction implemented (2025-09-24) — heuristic scoring + metrics + overflow guard + backend interface extracted.
- - [x] Goal 5 File bloat eliminated — duplicated blocks & legacy inline logic removed; large helpers migrated.
-
-Phase 1 Decomposition Checklist:
- - [x] Extract `preview_cache.py` (cache container + TTL adaptation + bg refresh)
- - [x] Extract `sampling.py` (sampling & scoring pipeline)
- - [x] Extract `card_index.py` (CSV ingestion & normalization)
- - [x] Retain orchestrator in `theme_preview.py` (now focused on layering + metrics + cache usage)
- - [x] Deduplicate role helpers (`_classify_role`, `_seed_from`) (helpers removed from `theme_preview.py`; authoritative versions reside in `sampling.py`) (2025-09-23)
-
-Phase 2 (Completed 2025-09-24):
- - [x] Extract adaptive TTL policy tuning constants to `preview_policy.py` (2025-09-23)
- - [x] Introduce cache backend interface (protocol) for potential Redis experiment (2025-09-23) — `preview_cache_backend.py`
- - [x] Separate metrics aggregation into `preview_metrics.py` (2025-09-23)
- - [x] Scoring constants / rarity weights module (`sampling_config.py`) for cleaner tuning surface (2025-09-23)
- - [x] Implement adaptive eviction strategy (hit-ratio + recency + cost hybrid) & tests (2025-09-23)
- - [x] Add CI perf regression check (warm-only p95 threshold) (2025-09-23) — implemented via `.github/workflows/preview-perf-ci.yml` (fails if warm p95 delta >5%)
- - [x] Multi-pass CI variant flag (`--multi-pass`) for cold/warm differential diagnostics (2025-09-24)
-
-Performance & CI Follow-Ups:
- - [x] Commit canonical warm baseline produced via `--extract-warm-baseline` into `logs/perf/` (`theme_preview_warm_baseline.json`) (2025-09-23)
- - [x] Add CI helper script wrapper (`preview_perf_ci_check.py`) to generate candidate + compare with threshold (2025-09-23)
- - [x] Add GitHub Actions / task invoking wrapper: `python -m code.scripts.preview_perf_ci_check --baseline logs/perf/theme_preview_warm_baseline.json --p95-threshold 5` (2025-09-23) — realized in workflow `preview-perf-ci`
- - [x] Document perf workflow in `README.md` (section: Performance Baselines & CI Gate) (2025-09-23)
- - [x] (Optional) Provide multi-pass variant option in CI (flag) if future warm-only divergence observed (2025-09-23)
- - [x] Add CHANGELOG entry formalizing performance gating policy & warm baseline refresh procedure (criteria: intentional improvement >10% p95 OR drift >5% beyond tolerance) (2025-09-24) — consolidated with Deferred Return Tasks section entry
-
-Open Follow-Ups (Minor / Opportunistic):
- - [x] Role saturation penalty dedicated unit test (2025-09-23)
- - [x] card_index edge-case test (rarity normalization & duplicate name handling) (2025-09-23)
- - [x] Consolidate duplicate role/hash helpers into sampling (2025-09-24)
- - [x] Evaluate moving commander bias constants to config module for easier tuning (moved to `sampling_config.py`, imports updated) (2025-09-23)
- - [x] Add regression test: Scryfall query normalization strips synergy annotations (image + search URLs) (2025-09-23)
-
-Status Summary (Today): Phase A decomposition effectively complete; only minor dedup & optional tests outstanding. Phase 2 items queued; performance tooling & baseline captured enabling CI regression gate next. Synergy annotation Scryfall URL normalization bug fixed across templates & global JS (2025-09-23); regression test pending.
-
-Recent Change Note (2025-09-23): Added cache entry metadata (hit_count, last_access, build_cost_ms) & logging of cache hits. Adjusted warm latency test with guard for near-zero cold timing to reduce flakiness post-cache instrumentation.
-
-### Phase 2 Progress (2025-09-23 Increment)
- - [x] Extract adaptive TTL policy tuning constants to `preview_policy.py` (no behavior change; unit tests unaffected)
-   FOLLOW-UP: add env overrides & validation tests for bands/steps (new deferred task)
-
-### Adaptive Eviction Plan (Kickoff 2025-09-23)
-Goal: Replace current FIFO size-limited eviction with an adaptive heuristic combining recency, hit frequency, and rebuild cost to maximize effective hit rate while minimizing expensive rebuild churn.
-
-Data Model Additions (per cache entry):
- - inserted_at_ms (int)
- - last_access_ms (int) — update on each hit
- - hit_count (int)
- - build_cost_ms (int) — capture from metrics when storing
- - slug (theme identifier) + key (variant) retained
-
-Heuristic (Evict lowest ProtectionScore):
- ProtectionScore = (W_hits * log(1 + hit_count)) + (W_recency * recency_score) + (W_cost * cost_bucket) - (W_age * age_score)
-Where:
- - recency_score = 1 / (1 + minutes_since_last_access)
- - age_score = minutes_since_inserted
- - cost_bucket = 0..3 derived from build_cost_ms thresholds (e.g. <5ms=0, <15ms=1, <40ms=2, >=40ms=3)
- - Weights default (tunable via env): W_hits=3.0, W_recency=2.0, W_cost=1.0, W_age=1.5
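The heuristic above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's `compute_protection_score`: the `Entry` dataclass and the module-level constants are hypothetical names, and the weights and thresholds are the defaults quoted in the plan.

```python
import math
from dataclasses import dataclass

# Defaults quoted in the plan; the plan says these are tunable via env.
W_HITS, W_RECENCY, W_COST, W_AGE = 3.0, 2.0, 1.0, 1.5
COST_THRESHOLDS_MS = (5, 15, 40)  # example thresholds from the plan


@dataclass
class Entry:
    """Hypothetical cache-entry shape, for illustration only."""
    inserted_at_ms: int
    last_access_ms: int
    hit_count: int
    build_cost_ms: int


def cost_bucket(build_cost_ms: int) -> int:
    """Map build cost to 0..3: <5ms=0, <15ms=1, <40ms=2, >=40ms=3."""
    return sum(1 for threshold in COST_THRESHOLDS_MS if build_cost_ms >= threshold)


def protection_score(entry: Entry, now_ms: int) -> float:
    """Higher score means the entry is more worth keeping."""
    minutes_since_access = (now_ms - entry.last_access_ms) / 60_000
    minutes_since_insert = (now_ms - entry.inserted_at_ms) / 60_000
    recency_score = 1.0 / (1.0 + minutes_since_access)
    age_score = minutes_since_insert
    return (
        W_HITS * math.log(1 + entry.hit_count)
        + W_RECENCY * recency_score
        + W_COST * cost_bucket(entry.build_cost_ms)
        - W_AGE * age_score
    )
```

Note that `age_score` grows without bound, so with these default weights an entry that is never hit loses roughly 1.5 points per minute, while the hit term grows only logarithmically.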
-
-Algorithm:
- 1. On insertion when size > MAX: build candidate list (all entries OR bounded sample if size > SAMPLE_THRESHOLD).
- 2. Compute ProtectionScore for each candidate.
- 3. Evict N oldest/lowest-score entries until size <= MAX (normally N=1, loop in case of concurrent overshoot).
- 4. Record eviction event metric with reason fields: {hit_count, age_ms, build_cost_ms, protection_score}.
-
-Performance Safeguards:
- - If cache size > 2 * MAX (pathological), fall back to age-based eviction ignoring scores (O(n) guard path) and emit warning metric.
- - Optional SAMPLE_TOP_K (default disabled). When enabled and size > 2*MAX, sample K random entries + oldest X to bound calculation time.
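The eviction loop and its overflow guard could look roughly like the sketch below. It assumes a plain dict of entry dicts and a caller-supplied scoring function; the function name, the event fields, and the dict layout are illustrative and are not taken from `preview_cache.py`. Sampling is omitted.

```python
from typing import Callable, Dict, List


def evict_if_needed(
    cache: Dict[str, dict],
    max_size: int,
    score: Callable[[dict], float],
) -> List[dict]:
    """Evict entries until len(cache) <= max_size; return eviction events."""
    events: List[dict] = []
    while len(cache) > max_size:
        if len(cache) > 2 * max_size:
            # Pathological overflow: skip scoring and evict the oldest entry.
            victim = min(cache, key=lambda k: cache[k]["inserted_at_ms"])
            reason = "emergency_overflow"
            victim_score = None
        else:
            # Normal path: evict the entry with the lowest protection score.
            victim = min(cache, key=lambda k: score(cache[k]))
            reason = "low_score"
            victim_score = score(cache[victim])
        entry = cache.pop(victim)
        events.append(
            {
                "key": victim,
                "reason": reason,
                "hit_count": entry["hit_count"],
                "build_cost_ms": entry["build_cost_ms"],
                "protection_score": victim_score,
            }
        )
    return events
```

The returned events carry the reason fields listed in step 4, so a caller can forward them to whatever metrics or logging sink is in use.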
-
-Environment Variables (planned additions):
- - THEME_PREVIEW_EVICT_W_HITS / _W_RECENCY / _W_COST / _W_AGE
- - THEME_PREVIEW_EVICT_COST_THRESHOLDS (comma list e.g. "5,15,40")
- - THEME_PREVIEW_EVICT_SAMPLE_THRESHOLD (int) & THEME_PREVIEW_EVICT_SAMPLE_SIZE (int)
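As one possible way to read the comma-list variable, the sketch below parses `THEME_PREVIEW_EVICT_COST_THRESHOLDS` and falls back to the documented example values when the input is missing or malformed. The function name and the fallback policy are assumptions for illustration; the plan does not specify how invalid input is handled.

```python
import os
from typing import Mapping, Tuple

DEFAULT_COST_THRESHOLDS = (5, 15, 40)  # example values from the plan


def resolve_cost_thresholds(env: Mapping[str, str] = os.environ) -> Tuple[int, ...]:
    """Parse a comma list such as "5,15,40"; fall back to defaults on bad input."""
    raw = env.get("THEME_PREVIEW_EVICT_COST_THRESHOLDS", "")
    try:
        values = tuple(int(part.strip()) for part in raw.split(",") if part.strip())
    except ValueError:
        return DEFAULT_COST_THRESHOLDS
    # Assumed policy: thresholds must be non-empty and ascending to be usable.
    if not values or list(values) != sorted(values):
        return DEFAULT_COST_THRESHOLDS
    return values
```

Passing the environment as a mapping keeps the helper easy to exercise in tests without mutating `os.environ`.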
-
-Metrics Additions (`preview_metrics.py`):
- - eviction_total (counter)
- - eviction_by_reason buckets (low_score, emergency_overflow)
- - eviction_last (gauge snapshot of last event metadata)
- - eviction_hist_build_cost_ms (distribution)
-
-Testing Plan:
- 1. test_eviction_prefers_low_hit_old_entries: create synthetic entries with varying hit_count/age; assert low score evicted.
- 2. test_eviction_protects_hot_recent: recent high-hit entry retained when capacity exceeded.
- 3. test_eviction_cost_bias: two equally old entries different build_cost_ms; cheaper one evicted.
- 4. test_eviction_emergency_overflow: simulate size >2*MAX triggers age-only path and emits warning metric.
- 5. test_eviction_metrics_emitted: store then force eviction; assert counters increment & metadata present.
-
-Implementation Steps (Ordered):
- 1. Extend cache entry structure in `preview_cache.py` (introduce metadata fields) (IN PROGRESS 2025-09-23 ✅ base dict metadata: inserted_at, last_access, hit_count, build_cost_ms).
- 2. Capture build duration (already known at store time) into entry.build_cost_ms. (✅ implemented via store_cache_entry)
- 3. Update get/store paths to mutate hit_count & last_access_ms.
- 4. Add weight & threshold resolution helper (reads env once; cached, with reload guard for tests). (✅ implemented: _resolve_eviction_weights / _resolve_cost_thresholds / compute_protection_score)
- 5. Implement `_compute_protection_score(entry, now_ms)`.
- 6. Implement `_evict_if_needed()` invoked post-store under lock.
- 7. Wire metrics recording & add to `preview_metrics()` export.
- 8. Write unit tests with small MAX (e.g. set THEME_PREVIEW_CACHE_MAX=5) injecting synthetic entries via public API or helper. (IN PROGRESS: basic low-score eviction test added `test_preview_eviction_basic.py`; remaining: cost bias, hot retention, emergency overflow, metrics detail test)
- 9. Benchmark warm p95 to confirm <5% regression (update baseline if improved).
-10. Update roadmap & CHANGELOG (add feature note) once tests green.
-
-Acceptance Criteria:
- - All new tests green; no regression in existing preview tests.
- - Eviction events observable via metrics endpoint & structured logs.
- - Warm p95 delta within ±5% of baseline (or improved) post-feature.
- - Env weight overrides respected (smoke test via one test toggling W_HITS=0 to force different eviction order).
-
-Progress Note (2025-09-23): Steps 5-7 implemented (protection score via `compute_protection_score`, adaptive `evict_if_needed`, eviction metrics + structured log). Basic eviction test passing. Remaining tests & perf snapshot pending.
-
-Progress Update (2025-09-23 Later): Advanced eviction tests added & green:
- - test_preview_eviction_basic.py (low-score eviction)
- - test_preview_eviction_advanced.py (cost bias retention, hot entry retention, emergency overflow path trigger, env weight override)
-Phase 2 Step 8 is now complete (full test coverage for the initial heuristic). Next: Step 9 performance snapshot (warm p95 delta check <5%), then CHANGELOG + roadmap close-out for the eviction feature (Step 10). Removed the hard 50-entry floor in `evict_if_needed` to allow low-limit tests; operational deployments can enforce a higher floor via env. No existing tests regressed.
-
-Additional Progress (2025-09-23): Added `test_scryfall_name_normalization.py` ensuring synergy annotation suffix is stripped; roadmap follow-up item closed.
-
-Deferred (Post-MVP) Ideas:
- - Protect entries with curated_only flag separately (bonus weight) if evidence of churn emerges.
- - Adaptive weight tuning based on rolling hit-rate KPI.
- - Redis backend comparative experiment using same scoring logic.
-
-
-### C. Test Additions (Export Endpoints & Adaptive TTL)
-Objectives:
-1. Validate `/themes/preview/{theme}/export.json` & `.csv` endpoints (status 200, field completeness, curated_only filter semantics).
-2. Validate CSV header column order is stable.
-3. Smoke test adaptive TTL event emission (simulate hit/miss pattern to cross a band and assert printed `theme_preview_ttl_adapt`).
-4. Increase preview coverage for curated_only filtering (confirm role exclusion logic matching examples + curated synergy only).
-
-Test Files Plan:
- New `code/tests/test_preview_export_endpoints.py`:
-  - Parametrized theme slug (pick first theme from index) to avoid hard-coded `Blink` dependency.
-  - JSON export: assert keys subset {name, roles, score, rarity, mana_cost, color_identity_list, pip_colors}.
-  - curated_only=1: assert no sampled roles in roles set {payoff,enabler,support,wildcard}.
-  - CSV export: parse first line for header stability.
- New `code/tests/test_preview_ttl_adaptive.py`:
-  - Monkeypatch `_ADAPTATION_ENABLED = True`, set small window, inject sequence of hits/misses by calling `get_theme_preview` & optionally direct manipulation of deque if needed.
-  - Capture stdout; assert adaptation log appears with expected event.
-
-Non-Goals (C):
- Full statistical validation of score ordering (belongs in sampling unit tests under refactor A).
- Integration latency benchmarks (future optional performance tests).
-
-### D. JS Extraction & Accessibility Improvements
-Objectives:
-1. Move large inline scripts from `picker.html` & `preview_fragment.html` into static JS files for linting & reuse.
-2. Add proper modal semantics & focus management (role="dialog", aria-modal, focus trap, ESC close, return focus to invoker after close).
-3. Implement AbortController in search (cancel previous fetch) and disable refresh button while a preview fetch is in-flight.
-4. Provide minimal build (no bundler) using plain ES modules—keep dependencies zero.
-
-Planned Files:
- `code/web/static/js/theme_picker.js`
- `code/web/static/js/theme_preview_modal.js`
- (Optional) `code/web/static/js/util/accessibility.js` (trapFocus, restoreFocus helpers)
-
-Implementation Steps (D):
-1. Extract current inline JS blocks preserving order; wrap in IIFEs exported as functions if needed.
-2. Add `<script type="module" src="/static/js/theme_picker.js"></script>` in `base.html` or only on picker route template.
-3. Replace inline modal creation with accessible structure:
-   - Add container with `role="dialog" aria-labelledby="preview-heading" aria-modal="true"`.
-   - On open: store activeElement, focus first focusable (close button).
-   - On ESC or close: remove modal & restore focus.
-4. AbortController: hold reference in closure; on new search input, abort prior, then issue new fetch.
-5. Refresh button disable: set `disabled` + aria-busy while fetch pending; re-enable on completion or failure.
-6. Add minimal accessibility test (JS-free fallback: ensure list still renders). (Optional for now.)
-
-Acceptance Criteria (D):
- Picker & preview still function identically (manual smoke).
- Lighthouse / axe basic scan passes (no blocking dialog issues, focus trap working).
- Inline JS in templates reduced to <30 lines (just bootstrapping if any).
-
-### Cross-Cutting Risks & Mitigations
- Race conditions during refactor: mitigate by staged copy, then delete.
- Thread interactions (background refresh) in tests: set `THEME_PREVIEW_BG_REFRESH=0` within test environment to avoid nondeterminism.
- Potential path import churn: maintain re-export surface from `theme_preview.py` until downstream usages updated.
-
-### Tracking
-Add a new section in future updates summarizing A/C/D progress deltas; mark each Acceptance Criteria bullet as met with date.
-
---
-
-### Progress (2025-09-20 Increment)
- - Implemented commander overlap & diversity rationale tooltip (preview modal). Added dynamic list computing role distribution, distinct synergy overlaps, average overlaps, diversity heuristic score, curated share. Marked item complete in P1.
- - Added AbortController cancellation for rapid search requests in picker (resilience improvement).
- - Implemented simple list popularity quick filters (chips + select) and color identity multi-select filtering.
- - Updated theme detail layout: enlarged example card thumbnails and moved commander examples below cards (improves scan order & reduces vertical jump).
- - Mitigated FOUC and aligned skeleton layout; preview refresh now disabled while list fetch in-flight.
- - Added metrics snapshot CLI utility `code/scripts/preview_metrics_snapshot.py` (captures global + top N slow themes).
- - Catalog taxonomy rationale documented (`docs/theme_taxonomy_rationale.md`); accepted themes annotated and duplicates normalization logged.
- - Governance & editorial policies (examples threshold, splash relax policy) added to README and taxonomy rationale; enforcement gating documented.
- - Contributor diagnostics & validation failure modes section added (README governance segment + rationale doc).
- - Uncapped synergy mode exposure path documented & config guard clarified.
-
-
-### Success Metrics (Reference)
-[x] METRIC Metadata_info coverage >=99% (achieved)
-[ ] METRIC Generic fallback description KPI trending down per release window (continue tracking)
-[ ] METRIC Warmed preview median & p95 under established thresholds after ingestion (record baseline then ratchet)
-
---
-This unified ledger supersedes all prior phased or sectional lists. Historical narrative available via git history if needed.
-
-### Deferral Notes (Added 2025-09-24)
-The Price / legality snippet integration is deferred and will be handled holistically in the Budget Mode initiative (`roadmap_9_budget_mode.md`) to centralize price sourcing (API selection, caching, rate limiting), legality checks, and UI surfaces. This roadmap will only re-introduce a lightweight read-only badge if an interim need emerges.
-
-### Newly Deferred Return Tasks (Added 2025-09-23) (Updated 2025-09-24)
-[x] POLICY Env overrides for TTL bands & step sizes + tests (2025-09-24) — implemented via env parsing in `preview_policy.py` (`THEME_PREVIEW_TTL_BASE|_MIN|_MAX`, `THEME_PREVIEW_TTL_BANDS`, `THEME_PREVIEW_TTL_STEPS`)
-[x] PERF Multi-pass CI variant toggle (enable warm/cold delta diagnostics when divergence suspected) (2025-09-24)
-[x] CACHE Introduce backend interface & in-memory implementation wrapper (prep for Redis experiment) (2025-09-23)
-[x] CACHE Redis backend PoC + latency/CPU comparison & fallback logic (2025-09-24) — added `preview_cache_backend.py` optional Redis read/write-through (env THEME_PREVIEW_REDIS_URL). Memory remains source of truth; Redis used opportunistically on memory miss. Metrics expose redis_get_attempts/hits/errors & store_attempts/errors. Graceful fallback when library/connection absent verified via `test_preview_cache_redis_poc.py`.
-[x] DOCS CHANGELOG performance gating policy & baseline refresh procedure (2025-09-24)
-[x] SAMPLING Externalize scoring & rarity weights to `sampling_config.py` (2025-09-23)
-[x] METRICS Extract `preview_metrics.py` module (2025-09-23)