mtg_python_deckbuilder/RELEASE_NOTES_TEMPLATE.md

# MTG Python Deckbuilder
## [Unreleased]
### Added
- **Theme Editorial Quality & Standards**: Complete editorial system for theme catalog curation
  - **Editorial Metadata Fields**: `description_source` (tracks provenance: official/inferred/custom) and `popularity_pinned` (manual tier override)
  - **Heuristics Externalization**: Theme classification rules moved to `config/themes/editorial_heuristics.yml` for maintainability
  - **Enhanced Quality Scoring**: Four-tier system (Excellent/Good/Fair/Poor) with 0.0-1.0 numerical scores based on uniqueness, duplication, description quality, and metadata completeness
  - **CLI Linter**: `validate_theme_catalog.py --lint` flag with configurable thresholds for duplication and quality warnings; reports actionable improvement suggestions
  - **Editorial Documentation**: Comprehensive guide at `docs/theme_editorial_guide.md` covering quality scoring, best practices, linter usage, and workflow examples
- **Theme Stripping Configuration**: Configurable minimum card threshold for theme retention
  - **THEME_MIN_CARDS Setting**: Environment variable (default: 5) that strips themes with too few cards from catalogs and card metadata
  - **Analysis Tooling**: `analyze_theme_distribution.py` script to visualize theme distribution and identify stripping candidates
  - **Core Threshold Logic**: `theme_stripper.py` module with functions to identify and filter low-card-count themes
  - **Catalog Stripping**: Automated removal of low-card themes from the YAML catalog, with backup and logging, via the `strip_catalog_themes.py` script
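
For reviewers unfamiliar with the threshold behavior, the following is a minimal sketch of how a `THEME_MIN_CARDS` cutoff could be applied. The function names (`get_min_cards`, `count_theme_cards`, `themes_below_threshold`) are hypothetical and are not the actual `theme_stripper.py` API; the sketch also assumes "too few cards" means a card count strictly below the threshold.

```python
import os
from collections import Counter
from typing import Iterable, List, Set


def get_min_cards(default: int = 5) -> int:
    """Read THEME_MIN_CARDS from the environment (hypothetical helper)."""
    raw = os.environ.get("THEME_MIN_CARDS", "")
    try:
        return int(raw)
    except ValueError:
        # Unset or non-numeric values fall back to the documented default of 5.
        return default


def count_theme_cards(card_themes: Iterable[List[str]]) -> Counter:
    """Count how many cards carry each theme, counting a card once per theme."""
    counts: Counter = Counter()
    for themes in card_themes:
        counts.update(set(themes))
    return counts


def themes_below_threshold(counts: Counter, min_cards: int) -> Set[str]:
    """Return themes whose card count is strictly below min_cards (assumed rule)."""
    return {theme for theme, n in counts.items() if n < min_cards}
```

With three cards tagged `["Tokens", "Lifegain"]`, `["Tokens"]`, and `["Tokens", "Obscure"]` and a threshold of 2, this sketch would flag `Lifegain` and `Obscure` as stripping candidates.
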
### Changed
- **Build Process Modernization**: Theme catalog generation now reads from parquet files instead of obsolete CSV format
  - Updated `build_theme_catalog.py` and `extract_themes.py` to use parquet data (matches rest of codebase)
  - Removed silent CSV exception handling (build now fails loudly if parquet read fails)
  - Added THEME_MIN_CARDS filtering directly in build pipeline (themes below threshold excluded during generation)
  - `theme_list.json` now auto-generated from stripped parquet data after theme stripping
  - Eliminated manual JSON stripping step (JSON is derived artifact, not source of truth)
- **Parquet Theme Stripping**: Strip low-card themes directly from card data files
  - Added `strip_parquet_themes.py` script with dry-run, verbose, and backup modes
  - Added parquet manipulation functions to `theme_stripper.py`: backup, filter, update, and strip operations
  - Handles multiple themeTags formats: numpy arrays, lists, and comma/pipe-separated strings
  - Stripped 97 theme tag occurrences from 30,674 cards in `all_cards.parquet`
  - Updated `stripped_themes.yml` log with 520 themes stripped from parquet source
  - **Automatic integration**: Theme stripping now runs automatically in `run_tagging()` after tagging completes (when `THEME_MIN_CARDS` > 1, default: 5)
  - Integrated into web UI setup, CLI tagging, and CI/CD workflows (build-similarity-cache)
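
As an illustration of the mixed `themeTags` formats mentioned above, here is a rough sketch of one way to normalize them before filtering. `normalize_theme_tags` and `strip_tags` are hypothetical names, not functions confirmed to exist in `theme_stripper.py`; numpy arrays are detected by duck typing on `tolist()` so the sketch runs without numpy installed.

```python
import re
from typing import Any, List, Set

# Assumed separators for string-encoded tags: comma or pipe.
_SEPARATORS = re.compile(r"[,|]")


def normalize_theme_tags(value: Any) -> List[str]:
    """Coerce a themeTags cell into a clean list of tag strings (hypothetical)."""
    if value is None or isinstance(value, float):
        # Missing cells often surface as None or NaN when read from parquet.
        return []
    if isinstance(value, str):
        parts = _SEPARATORS.split(value)
    elif hasattr(value, "tolist"):
        # numpy arrays expose tolist(); avoids importing numpy here.
        parts = value.tolist()
    else:
        parts = list(value)
    cleaned = (str(part).strip() for part in parts)
    return [tag for tag in cleaned if tag]


def strip_tags(value: Any, themes_to_strip: Set[str]) -> List[str]:
    """Return the normalized tags with stripped themes removed."""
    return [tag for tag in normalize_theme_tags(value) if tag not in themes_to_strip]
```

Normalizing to a plain list first keeps the filtering step independent of how a given parquet column happened to be serialized.
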
### Fixed
_No unreleased changes yet_
### Removed
_No unreleased changes yet_