mirror of
https://github.com/mwisnowski/mtg_python_deckbuilder.git
synced 2026-03-24 14:06:31 +01:00
MTG Python Deckbuilder
[Unreleased]
Added
- Theme Editorial Quality & Standards: Complete editorial system for theme catalog curation
  - Editorial Metadata Fields: description_source (tracks provenance: official/inferred/custom) and popularity_pinned (manual tier override)
  - Heuristics Externalization: Theme classification rules moved to config/themes/editorial_heuristics.yml for maintainability
  - Enhanced Quality Scoring: Four-tier system (Excellent/Good/Fair/Poor) with 0.0-1.0 numerical scores based on uniqueness, duplication, description quality, and metadata completeness
  - CLI Linter: validate_theme_catalog.py --lint flag with configurable thresholds for duplication and quality warnings; provides actionable improvement suggestions
  - Editorial Documentation: Comprehensive guide at docs/theme_editorial_guide.md covering quality scoring, best practices, linter usage, and workflow examples
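As a rough illustration of the four-tier quality scoring described above, the sketch below combines four component scores into a single 0.0-1.0 value and maps it to a tier. The equal weights, the tier cut-offs, and the names `WEIGHTS`, `TIERS`, `quality_score`, and `quality_tier` are assumptions made for this example; the changelog does not state the actual formula, which is documented in docs/theme_editorial_guide.md.

```python
# Hypothetical equal weights; the real weighting is not given in this changelog.
# Each component is a "goodness" value in [0.0, 1.0], so "low_duplication"
# is high when a theme duplicates little of other themes.
WEIGHTS = {
    "uniqueness": 0.25,
    "low_duplication": 0.25,
    "description_quality": 0.25,
    "metadata_completeness": 0.25,
}

# Hypothetical tier cut-offs, checked from highest to lowest.
TIERS = [(0.75, "Excellent"), (0.5, "Good"), (0.25, "Fair"), (0.0, "Poor")]


def quality_score(components):
    """Return the weighted sum of component scores, each clamped to [0.0, 1.0]."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, components.get(name, 0.0)))
        total += weight * value
    return total


def quality_tier(score):
    """Map a 0.0-1.0 score to one of the four tier labels."""
    for cutoff, label in TIERS:
        if score >= cutoff:
            return label
    return "Poor"
```

Missing components count as 0.0 in this sketch, so incomplete metadata lowers the score instead of raising an error.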
- Theme Stripping Configuration: Configurable minimum card threshold for theme retention
  - THEME_MIN_CARDS Setting: Environment variable (default: 5) to strip themes with too few cards from catalogs and card metadata
  - Analysis Tooling: analyze_theme_distribution.py script to visualize theme distribution and identify stripping candidates
  - Core Threshold Logic: theme_stripper.py module with functions to identify and filter low-card-count themes
  - Catalog Stripping: Automated removal of low-card themes from YAML catalog with backup/logging via strip_catalog_themes.py script
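The threshold logic above can be sketched as follows. This is a minimal illustration, not the contents of theme_stripper.py: the function names `get_min_cards`, `themes_below_threshold`, and `strip_themes` are hypothetical, and cards are modeled simply as lists of theme tags. Only the THEME_MIN_CARDS variable and its default of 5 come from the changelog.

```python
import os
from collections import Counter


def get_min_cards(default=5):
    """Read THEME_MIN_CARDS from the environment, falling back to the default."""
    return int(os.environ.get("THEME_MIN_CARDS", default))


def themes_below_threshold(cards, min_cards):
    """Return the set of themes that appear on fewer than min_cards cards."""
    counts = Counter()
    for tags in cards:
        # set() ensures a theme is counted once per card, even if repeated.
        counts.update(set(tags))
    return {theme for theme, n in counts.items() if n < min_cards}


def strip_themes(cards, to_strip):
    """Return a copy of cards with every theme in to_strip removed."""
    return [[t for t in tags if t not in to_strip] for tags in cards]
```

For example, with a threshold of 2, a theme that appears on a single card is identified and removed while themes on two or more cards are kept:

```python
cards = [["Tokens", "Lifegain"], ["Tokens"], ["Tokens", "Obscure"]]
low = themes_below_threshold(cards, min_cards=2)
stripped = strip_themes(cards, low)
```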
Changed
- Build Process Modernization: Theme catalog generation now reads from parquet files instead of obsolete CSV format
  - Updated build_theme_catalog.py and extract_themes.py to use parquet data (matches rest of codebase)
  - Removed silent CSV exception handling (build now fails loudly if parquet read fails)
  - Added THEME_MIN_CARDS filtering directly in build pipeline (themes below threshold excluded during generation)
  - theme_list.json now auto-generated from stripped parquet data after theme stripping
  - Eliminated manual JSON stripping step (JSON is derived artifact, not source of truth)
- Parquet Theme Stripping: Strip low-card themes directly from card data files
  - Added strip_parquet_themes.py script with dry-run, verbose, and backup modes
  - Added parquet manipulation functions to theme_stripper.py: backup, filter, update, and strip operations
  - Handles multiple themeTags formats: numpy arrays, lists, and comma/pipe-separated strings
  - Stripped 97 theme tag occurrences from 30,674 cards in all_cards.parquet
  - Updated stripped_themes.yml log with 520 themes stripped from parquet source
  - Automatic integration: Theme stripping now runs automatically in run_tagging() after tagging completes (when THEME_MIN_CARDS > 1, default: 5)
  - Integrated into web UI setup, CLI tagging, and CI/CD workflows (build-similarity-cache)
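Handling several themeTags formats usually comes down to normalizing each value to a plain list before filtering. The sketch below shows one way to do that using only the standard library. The function name `normalize_theme_tags` is hypothetical, and treating `None` or a float (such as a NaN placeholder) as "no tags" is an assumption of this example, not documented behavior of theme_stripper.py.

```python
import re


def normalize_theme_tags(value):
    """Convert a themeTags value in any supported format to a list of strings."""
    # Assumption: missing values (None, or a float such as NaN) mean "no tags".
    if value is None or isinstance(value, float):
        return []
    if isinstance(value, str):
        # Split on commas or pipes and drop empty fragments.
        return [part.strip() for part in re.split(r"[,|]", value) if part.strip()]
    if hasattr(value, "tolist"):
        # numpy arrays expose tolist(); duck typing avoids importing numpy here.
        value = value.tolist()
    return [str(item).strip() for item in value if str(item).strip()]
```

With every value reduced to a list, the same filter can then be applied regardless of how the tags were stored.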
Fixed
No unreleased changes yet
Removed
No unreleased changes yet