Mirror of https://github.com/mwisnowski/mtg_python_deckbuilder.git, synced 2025-12-18 16:40:12 +01:00
Merge pull request #32 from mwisnowski/feature/tagging-refinement
Feature/tagging refinement
This commit is contained in: commit 4c79a7b45b

40 changed files with 5632 additions and 2789 deletions
@@ -92,6 +92,12 @@ WEB_AUTO_REFRESH_DAYS=7 # dockerhub: WEB_AUTO_REFRESH_DAYS="7"
 WEB_TAG_PARALLEL=1 # dockerhub: WEB_TAG_PARALLEL="1"
 WEB_TAG_WORKERS=2 # dockerhub: WEB_TAG_WORKERS="4"
 WEB_AUTO_ENFORCE=0 # dockerhub: WEB_AUTO_ENFORCE="0"
+
+# Tagging Refinement Feature Flags
+TAG_NORMALIZE_KEYWORDS=1 # dockerhub: TAG_NORMALIZE_KEYWORDS="1" # Normalize keywords & filter specialty mechanics
+TAG_PROTECTION_GRANTS=1 # dockerhub: TAG_PROTECTION_GRANTS="1" # Protection tag only for cards granting shields
+TAG_METADATA_SPLIT=1 # dockerhub: TAG_METADATA_SPLIT="1" # Separate metadata tags from themes in CSVs
+
 # DFC_COMPAT_SNAPSHOT=0 # 1=write legacy unmerged MDFC snapshots alongside merged catalogs (deprecated compatibility workflow)
 # WEB_CUSTOM_EXPORT_BASE= # Custom basename for exports (optional).
 # THEME_CATALOG_YAML_SCAN_INTERVAL_SEC=2.0 # Poll for YAML changes (dev)
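These flags are consumed as truthy environment strings; the same idiom appears later in this diff for `DEBUG_SPELL_POOLS` (`str(os.getenv(...)).strip().lower() in {"1","true","yes","on"}`). A minimal sketch of that parsing, with `flag_enabled` as an illustrative helper name rather than the repo's actual API:

```python
import os

# Values the repo's env checks treat as "on"
TRUTHY = {"1", "true", "yes", "on"}

def flag_enabled(name: str, default: str = "0") -> bool:
    """Return True when the environment variable holds a truthy string."""
    return str(os.getenv(name, default)).strip().lower() in TRUTHY

os.environ["TAG_NORMALIZE_KEYWORDS"] = "1"
print(flag_enabled("TAG_NORMALIZE_KEYWORDS"))  # True
```

Unset flags fall back to the default (`"0"`), so a missing variable reads as disabled.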
CHANGELOG.md (61 changes)

@@ -9,16 +9,69 @@ This format follows Keep a Changelog principles and aims for Semantic Versioning
 ## [Unreleased]
 ### Summary
-- _No unreleased changes yet_
+- Card tagging system improvements split metadata from gameplay themes for a cleaner deck building experience
+- Keyword normalization reduces specialty keyword noise by 96% while maintaining theme catalog quality
+- Protection tag now focuses on cards that grant shields to others, not just those with inherent protection
+- Web UI improvements: faster polling, fixed progress display, and theme refresh stability
+- **Protection System Overhaul**: Comprehensive enhancement to protection card detection, classification, and deck building
+  - Fine-grained scope metadata distinguishes self-protection from board-wide effects ("Your Permanents: Hexproof" vs "Self: Hexproof")
+  - Enhanced grant detection with Equipment/Aura patterns, phasing support, and complex trigger handling
+  - Intelligent deck builder filtering includes board-relevant protection while excluding self-only and type-specific cards
+  - Tiered pool limiting focuses on high-quality staples while maintaining variety across builds
+  - Improved scope tagging for cards with keyword-only protection effects (no grant text, just inherent keywords)
+- **Tagging Module Refactoring**: Large-scale refactor to improve code quality and maintainability
+  - Centralized regex patterns, extracted reusable utilities, decomposed complex functions
+  - Improved code organization and readability while maintaining 100% tagging accuracy

 ### Added
-- _None_
+- Metadata partition system separates diagnostic tags from gameplay themes in card data
+- Keyword normalization system with smart filtering of one-off specialty mechanics
+  - Allowlist preserves important keywords like Flying, Myriad, and Transform
+- Protection grant detection identifies cards that give Hexproof, Ward, or Indestructible to other permanents
+  - Automatic tagging for creature-type-specific protection (e.g., "Knights Gain Protection")
+- New `metadataTags` column in card data for bracket annotations and internal diagnostics
+- Static phasing keyword detection from the keywords field (catches creatures like Breezekeeper)
+- "Other X you control have Y" protection pattern for static ability grants
+- "Enchanted creature has phasing" pattern detection
+- Chosen-type blanket phasing patterns
+- Complex trigger phasing patterns (reactive, consequent, end-of-turn)
+- Protection scope filtering in the deck builder (feature flag: `TAG_PROTECTION_SCOPE`) intelligently selects board-relevant protection
+- Phasing cards with "Your Permanents:" or "Targeted:" metadata now tagged as Protection and included in the protection pool
+- Metadata tags temporarily visible in card hover previews for debugging (shows scope like "Your Permanents: Hexproof")
+- Web-slinging tagger function to identify cards with web-slinging mechanics

 ### Changed
-- _None_
+- Card tags now split between themes (for deck building) and metadata (for diagnostics)
+- Keywords now consolidate variants (e.g., "Commander ninjutsu" becomes "Ninjutsu")
+- Setup progress polling reduced from 3s to 5-10s intervals for better performance
+- Theme catalog streamlined from 753 to 736 themes (-2.3%) with improved quality
+- Protection tag refined to focus on 329 cards that grant shields (down from 1,166 with inherent effects)
+- Protection tag renamed to "Protective Effects" throughout the web interface to avoid confusion with the Magic keyword "protection"
+- Theme catalog automatically excludes metadata tags from theme suggestions
+- Grant detection now strips reminder text before pattern matching to avoid false positives
+- Deck builder protection phase now filters by scope metadata: includes "Your Permanents:", excludes "Self:" protection
+- Protection card selection now randomized per build for variety (using seeded RNG when deterministic mode is enabled)
+- Protection pool now limited to ~40-50 high-quality cards (tiered selection: top 3x target + 10-20 random extras)
+- Tagging module imports standardized with consistent organization and centralized constants

 ### Fixed
-- _None_
+- Setup progress now shows 100% completion instead of getting stuck at 99%
+- Theme catalog no longer continuously regenerates after setup completes
+- Health indicator polling optimized to reduce server load
+- Protection detection now correctly excludes creatures with only inherent keywords
+- Dive Down and Glint no longer falsely identified as granting to opponents (reminder text fix)
+- Drogskol Captain and Haytham Kenway now correctly get "Your Permanents" scope tags
+- 7 cards with the static Phasing keyword now properly detected (Breezekeeper, Teferi's Drake, etc.)
+- Type-specific protection grants (e.g., "Knights Gain Indestructible") now correctly excluded from the general protection pool
+- Protection scope filter now properly prioritizes exclusions over inclusions (fixes Knight Exemplar in non-Knight decks)
+- Inherent protection cards (Aysen Highway, Phantom Colossus, etc.) now correctly get "Self: Protection" metadata tags
+- Scope tagging now applies to ALL cards with protection effects, not just grant cards
+- Cloak of Invisibility and Teferi's Curse now get "Your Permanents: Phasing" tags
+- Shimmer now gets a "Blanket: Phasing" tag for its chosen-type effect
+- King of the Oathbreakers now gets a "Self: Phasing" tag for its reactive trigger
+- Cards with static keywords (Protection, Hexproof, Ward, Indestructible) in their keywords field now get proper scope metadata tags
+- Cards with X in their mana cost now properly identified and tagged with the "X Spells" theme for better deck building accuracy
+- Card tagging system enhanced with smarter pattern detection and more consistent categorization

 ## [2.5.2] - 2025-10-08
 ### Summary
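The keyword consolidation noted above ("Commander ninjutsu" becomes "Ninjutsu", with an allowlist keeping staples like Flying, Myriad, and Transform while one-off specialty mechanics are filtered) can be sketched as follows. `VARIANT_MAP`, `ALLOWLIST`, and `normalize_keywords` are illustrative names, not the repo's actual API:

```python
# Hypothetical sketch of variant folding + allowlist filtering.
VARIANT_MAP = {"commander ninjutsu": "Ninjutsu"}  # variant -> canonical keyword
ALLOWLIST = {"Flying", "Myriad", "Transform", "Ninjutsu"}

def normalize_keywords(keywords: list[str]) -> list[str]:
    out: list[str] = []
    for kw in keywords:
        kw = VARIANT_MAP.get(kw.lower(), kw)   # fold variants into canonical form
        if kw in ALLOWLIST and kw not in out:  # drop one-off specialty mechanics
            out.append(kw)
    return out

print(normalize_keywords(["Commander ninjutsu", "Flying", "Aura Swap"]))
# ['Ninjutsu', 'Flying']
```

The real allowlist is much larger; the point is that normalization happens before filtering, so variants survive under their canonical name.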
README.md (46 changes)

@@ -99,15 +99,51 @@ Execute saved configs without manual input.
 ### Initial Setup
 Refresh data and caches when formats shift.
-- Runs card downloads, CSV regeneration, tagging, and commander catalog rebuilds.
+- Runs card downloads, CSV regeneration, smart tagging (keywords + protection grants), and commander catalog rebuilds.
 - Controlled by `SHOW_SETUP=1` (on by default in compose).
-- Force a rebuild manually:
+- **Force a full rebuild (setup + tagging)**:
 ```powershell
-docker compose run --rm --entrypoint bash web -lc "python -m code.file_setup.setup"
+# Docker:
+docker compose run --rm web python -c "from code.file_setup.setup import initial_setup; from code.tagging.tagger import run_tagging; initial_setup(); run_tagging()"
+
+# Local (with venv activated):
+python -c "from code.file_setup.setup import initial_setup; from code.tagging.tagger import run_tagging; initial_setup(); run_tagging()"
+
+# With parallel processing (faster):
+python -c "from code.file_setup.setup import initial_setup; from code.tagging.tagger import run_tagging; initial_setup(); run_tagging(parallel=True)"
+
+# With parallel processing and custom worker count:
+python -c "from code.file_setup.setup import initial_setup; from code.tagging.tagger import run_tagging; initial_setup(); run_tagging(parallel=True, max_workers=4)"
 ```
-- Rebuild only the commander catalog:
+- **Rebuild only CSVs without tagging**:
 ```powershell
-docker compose run --rm --entrypoint bash web -lc "python -m code.scripts.refresh_commander_catalog"
+# Docker:
+docker compose run --rm web python -c "from code.file_setup.setup import initial_setup; initial_setup()"
+
+# Local:
+python -c "from code.file_setup.setup import initial_setup; initial_setup()"
+```
+- **Run only tagging (CSVs must exist)**:
+```powershell
+# Docker:
+docker compose run --rm web python -c "from code.tagging.tagger import run_tagging; run_tagging()"
+
+# Local:
+python -c "from code.tagging.tagger import run_tagging; run_tagging()"
+
+# With parallel processing (faster):
+python -c "from code.tagging.tagger import run_tagging; run_tagging(parallel=True)"
+
+# With parallel processing and custom worker count:
+python -c "from code.tagging.tagger import run_tagging; run_tagging(parallel=True, max_workers=4)"
+```
+- **Rebuild only the commander catalog**:
+```powershell
+# Docker:
+docker compose run --rm web python -m code.scripts.refresh_commander_catalog
+
+# Local:
+python -m code.scripts.refresh_commander_catalog
 ```

 ### Owned Library
@@ -1,26 +1,67 @@
 # MTG Python Deckbuilder ${VERSION}

+## [Unreleased]
 ### Summary
-- Builder responsiveness upgrades: smarter HTMX caching, shared debounce helpers, and virtualization hints keep long card lists responsive.
-- Commander catalog now ships skeleton placeholders, lazy commander art loading, and cached default results for faster repeat visits.
-- Deck summary streams via an HTMX fragment while virtualization powers summary lists without loading every row up front.
-- Mana analytics load on demand with collapsible sections and interactive chart tooltips that support click-to-pin comparisons.
+- Card tagging system improvements split metadata from gameplay themes for a cleaner deck building experience
+- Keyword normalization reduces specialty keyword noise by 96% while maintaining theme catalog quality
+- Protection tag now focuses on cards that grant shields to others, not just those with inherent protection
+- Web UI improvements: faster polling, fixed progress display, and theme refresh stability
+- Comprehensive enhancement to protection card detection, classification, and deck building
+  - Fine-grained scope metadata distinguishes self-protection from board-wide effects ("Your Permanents: Hexproof" vs "Self: Hexproof")
+  - Enhanced grant detection with Equipment/Aura patterns, phasing support, and complex trigger handling
+  - Intelligent deck builder filtering includes board-relevant protection while excluding self-only and type-specific cards
+  - Tiered pool limiting focuses on high-quality staples while maintaining variety across builds
+  - Improved scope tagging for cards with keyword-only protection effects (no grant text, just inherent keywords)
+- Large-scale refactor to improve code quality and maintainability
+  - Centralized regex patterns, extracted reusable utilities, decomposed complex functions
+  - Improved code organization and readability while maintaining 100% tagging accuracy

 ### Added
-- Skeleton placeholders accept `data-skeleton-label` microcopy and only surface after ~400 ms across the build wizard, stage navigator, and alternatives panel.
-- Must-have toggle API (`/build/must-haves/toggle`), telemetry ingestion route (`/telemetry/events`), and structured logging helpers capture include/exclude beacons.
-- Commander catalog results wrap in a deferred skeleton list while commander art lazy-loads via a new `IntersectionObserver` helper in `code/web/static/app.js`.
-- Collapsible accordions for Mana Overview and Test Hand sections defer heavy analytics until they are expanded.
-- Click-to-pin chart tooltips keep comparisons anchored and add copy-friendly working buttons.
-- Virtualized card lists automatically render only visible items once 12+ cards are present.
+- Metadata partition system separates diagnostic tags from gameplay themes in card data
+- Keyword normalization system with smart filtering of one-off specialty mechanics
+  - Allowlist preserves important keywords like Flying, Myriad, and Transform
+- Protection grant detection identifies cards that give Hexproof, Ward, or Indestructible to other permanents
+  - Automatic tagging for creature-type-specific protection (e.g., "Knights Gain Protection")
+- New `metadataTags` column in card data for bracket annotations and internal diagnostics
+- Static phasing keyword detection from the keywords field (catches creatures like Breezekeeper)
+- "Other X you control have Y" protection pattern for static ability grants
+- "Enchanted creature has phasing" pattern detection
+- Chosen-type blanket phasing patterns
+- Complex trigger phasing patterns (reactive, consequent, end-of-turn)
+- Protection scope filtering in the deck builder (feature flag: `TAG_PROTECTION_SCOPE`) intelligently selects board-relevant protection
+- Phasing cards with "Your Permanents:" or "Targeted:" metadata now tagged as Protection and included in the protection pool
+- Metadata tags temporarily visible in card hover previews for debugging (shows scope like "Your Permanents: Hexproof")
+- Web-slinging tagger function to identify cards with web-slinging mechanics

 ### Changed
-- Commander search and theme picker now share an intelligent debounce to prevent redundant requests while typing.
-- Card grids adopt modern containment rules to minimize layout recalculations on large decks.
-- Include/exclude buttons respond immediately with optimistic updates, reconciling gracefully if the server disagrees.
-- Frequently accessed views, like the commander catalog default, now pull from an in-memory cache for sub-200 ms reloads.
-- Deck review loads in focused chunks, keeping the initial page lean while analytics stream progressively.
-- Chart hover zones expand to full column width for easier interaction.
+- Card tags now split between themes (for deck building) and metadata (for diagnostics)
+- Keywords now consolidate variants (e.g., "Commander ninjutsu" becomes "Ninjutsu")
+- Setup progress polling reduced from 3s to 5-10s intervals for better performance
+- Theme catalog streamlined from 753 to 736 themes (-2.3%) with improved quality
+- Protection tag refined to focus on 329 cards that grant shields (down from 1,166 with inherent effects)
+- Protection tag renamed to "Protective Effects" throughout the web interface to avoid confusion with the Magic keyword "protection"
+- Theme catalog automatically excludes metadata tags from theme suggestions
+- Grant detection now strips reminder text before pattern matching to avoid false positives
+- Deck builder protection phase now filters by scope metadata: includes "Your Permanents:", excludes "Self:" protection
+- Protection card selection now randomized per build for variety (using seeded RNG when deterministic mode is enabled)
+- Protection pool now limited to ~40-50 high-quality cards (tiered selection: top 3x target + 10-20 random extras)
+- Tagging module imports standardized with consistent organization and centralized constants

 ### Fixed
-- _None_
+- Setup progress now shows 100% completion instead of getting stuck at 99%
+- Theme catalog no longer continuously regenerates after setup completes
+- Health indicator polling optimized to reduce server load
+- Protection detection now correctly excludes creatures with only inherent keywords
+- Dive Down and Glint no longer falsely identified as granting to opponents (reminder text fix)
+- Drogskol Captain and Haytham Kenway now correctly get "Your Permanents" scope tags
+- 7 cards with the static Phasing keyword now properly detected (Breezekeeper, Teferi's Drake, etc.)
+- Type-specific protection grants (e.g., "Knights Gain Indestructible") now correctly excluded from the general protection pool
+- Protection scope filter now properly prioritizes exclusions over inclusions (fixes Knight Exemplar in non-Knight decks)
+- Inherent protection cards (Aysen Highway, Phantom Colossus, etc.) now correctly get "Self: Protection" metadata tags
+- Scope tagging now applies to ALL cards with protection effects, not just grant cards
+- Cloak of Invisibility and Teferi's Curse now get "Your Permanents: Phasing" tags
+- Shimmer now gets a "Blanket: Phasing" tag for its chosen-type effect
+- King of the Oathbreakers now gets a "Self: Phasing" tag for its reactive trigger
+- Cards with static keywords (Protection, Hexproof, Ward, Indestructible) in their keywords field now get proper scope metadata tags
+- Cards with X in their mana cost now properly identified and tagged with the "X Spells" theme for better deck building accuracy
+- Card tagging system enhanced with smarter pattern detection and more consistent categorization
@@ -1,5 +0,0 @@
-import urllib.request, json
-raw = urllib.request.urlopen("http://localhost:8000/themes/metrics").read().decode()
-js=json.loads(raw)
-print('example_enforcement_active=', js.get('preview',{}).get('example_enforcement_active'))
-print('example_enforce_threshold_pct=', js.get('preview',{}).get('example_enforce_threshold_pct'))

@@ -1 +0,0 @@
-=\ 1\; & \c:/Users/Matt/mtg_python/mtg_python_deckbuilder/.venv/Scripts/python.exe\ code/scripts/build_theme_catalog.py --output config/themes/theme_list_tmp.json

@@ -1,3 +0,0 @@
-from code.web.services import orchestrator
-orchestrator._ensure_setup_ready(print, force=False)
-print('DONE')
@@ -1759,6 +1759,7 @@ class DeckBuilder(
                 entry['Synergy'] = synergy
             else:
                 # If no tags passed attempt enrichment from filtered pool first, then full snapshot
+                metadata_tags: list[str] = []
                 if not tags:
                     # Use filtered pool (_combined_cards_df) instead of unfiltered (_full_cards_df)
                     # This ensures exclude filtering is respected during card enrichment

@@ -1774,6 +1775,13 @@ class DeckBuilder(
                             # tolerate comma separated
                             parts = [p.strip().strip("'\"") for p in raw_tags.split(',')]
                             tags = [p for p in parts if p]
+                        # M5: Extract metadata tags for web UI display
+                        raw_meta = row_match.iloc[0].get('metadataTags', [])
+                        if isinstance(raw_meta, list):
+                            metadata_tags = [str(t).strip() for t in raw_meta if str(t).strip()]
+                        elif isinstance(raw_meta, str) and raw_meta.strip():
+                            parts = [p.strip().strip("'\"") for p in raw_meta.split(',')]
+                            metadata_tags = [p for p in parts if p]
                     except Exception:
                         pass
             # Enrich missing type and mana_cost for accurate categorization

@@ -1811,6 +1819,7 @@ class DeckBuilder(
                 'Mana Value': mana_value,
                 'Creature Types': creature_types,
                 'Tags': tags,
+                'MetadataTags': metadata_tags,  # M5: Store metadata tags for web UI
                 'Commander': is_commander,
                 'Count': 1,
                 'Role': (role or ('commander' if is_commander else None)),
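The `metadataTags` cell the hunk above reads can arrive either as a real list or as a comma-separated string, and the diff handles both branches. Lifted out of the class as a standalone sketch (the `parse_metadata_tags` name is illustrative; the repo inlines this logic):

```python
def parse_metadata_tags(raw_meta) -> list[str]:
    """Normalize a metadataTags cell that may be a list or a comma-separated string."""
    if isinstance(raw_meta, list):
        # List form: stringify, strip, and drop empties
        return [str(t).strip() for t in raw_meta if str(t).strip()]
    if isinstance(raw_meta, str) and raw_meta.strip():
        # String form: split on commas, trim quotes and whitespace
        parts = [p.strip().strip("'\"") for p in raw_meta.split(',')]
        return [p for p in parts if p]
    return []  # anything else (NaN, None) yields no tags

print(parse_metadata_tags("Your Permanents: Hexproof, Self: Ward"))
# ['Your Permanents: Hexproof', 'Self: Ward']
```

Non-list, non-string values (e.g. a pandas NaN) fall through to the empty list, matching the `metadata_tags: list[str] = []` initialization in the diff.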
@@ -438,7 +438,7 @@ DEFAULT_REMOVAL_COUNT: Final[int] = 10 # Default number of spot removal spells
 DEFAULT_WIPES_COUNT: Final[int] = 2 # Default number of board wipes

 DEFAULT_CARD_ADVANTAGE_COUNT: Final[int] = 10 # Default number of card advantage pieces
-DEFAULT_PROTECTION_COUNT: Final[int] = 8 # Default number of protection spells
+DEFAULT_PROTECTION_COUNT: Final[int] = 8 # Default number of protective effects (hexproof, indestructible, protection, ward, etc.)

 # Deck composition prompts
 DECK_COMPOSITION_PROMPTS: Final[Dict[str, str]] = {

@@ -450,7 +450,7 @@ DECK_COMPOSITION_PROMPTS: Final[Dict[str, str]] = {
     'removal': 'Enter desired number of spot removal spells (default: 10):',
     'wipes': 'Enter desired number of board wipes (default: 2):',
     'card_advantage': 'Enter desired number of card advantage pieces (default: 10):',
-    'protection': 'Enter desired number of protection spells (default: 8):',
+    'protection': 'Enter desired number of protective effects (default: 8):',
     'max_deck_price': 'Enter maximum total deck price in dollars (default: 400.0):',
     'max_card_price': 'Enter maximum price per card in dollars (default: 20.0):'
 }

@@ -511,7 +511,7 @@ DEFAULT_THEME_TAGS = [
     'Combat Matters', 'Control', 'Counters Matter', 'Energy',
     'Enter the Battlefield', 'Equipment', 'Exile Matters', 'Infect',
     'Interaction', 'Lands Matter', 'Leave the Battlefield', 'Legends Matter',
-    'Life Matters', 'Mill', 'Monarch', 'Protection', 'Ramp', 'Reanimate',
+    'Life Matters', 'Mill', 'Monarch', 'Protective Effects', 'Ramp', 'Reanimate',
     'Removal', 'Sacrifice Matters', 'Spellslinger', 'Stax', 'Superfriends',
     'Theft', 'Token Creation', 'Tokens Matter', 'Voltron', 'X Spells'
 ]
@@ -539,6 +539,10 @@ class SpellAdditionMixin:
         """Add protection spells to the deck.
         Selects cards tagged as 'protection', prioritizing by EDHREC rank and mana value.
         Avoids duplicates and commander card.
+
+        M5: When TAG_PROTECTION_SCOPE is enabled, filters to include only cards that
+        protect your board (Your Permanents:, {Type} Gain) and excludes self-only or
+        opponent protection cards.
         """
         target = self.ideal_counts.get('protection', 0)
         if target <= 0 or self._combined_cards_df is None:
@@ -546,14 +550,88 @@ class SpellAdditionMixin:
         already = {n.lower() for n in self.card_library.keys()}
         df = self._combined_cards_df.copy()
         df['_ltags'] = df.get('themeTags', []).apply(bu.normalize_tag_cell)
-        pool = df[df['_ltags'].apply(lambda tags: any('protection' in t for t in tags))]
+
+        # M5: Apply scope-based filtering if enabled
+        import settings as s
+        if getattr(s, 'TAG_PROTECTION_SCOPE', True):
+            # Check metadata tags for scope information
+            df['_meta_tags'] = df.get('metadataTags', []).apply(bu.normalize_tag_cell)
+
+            def is_board_relevant_protection(row):
+                """Check if a protection card helps protect your board.
+
+                Includes:
+                - Cards with "Your Permanents:" metadata (board-wide protection)
+                - Cards with "Blanket:" metadata (affects all permanents)
+                - Cards with "Targeted:" metadata (can target your stuff)
+                - Legacy cards without metadata tags
+
+                Excludes:
+                - "Self:" protection (only protects itself)
+                - "Opponent Permanents:" protection (helps opponents)
+                - Type-specific grants like "Knights Gain" (too narrow, handled by kindred synergies)
+                """
+                theme_tags = row.get('_ltags', [])
+                meta_tags = row.get('_meta_tags', [])
+
+                # First check if it has the general protection tag
+                has_protection = any('protection' in t for t in theme_tags)
+                if not has_protection:
+                    return False
+
+                # INCLUDE: board-relevant scopes
+                # "Your Permanents:", "Blanket:", "Targeted:"
+                has_board_scope = any(
+                    'your permanents:' in t or 'blanket:' in t or 'targeted:' in t
+                    for t in meta_tags
+                )
+
+                # EXCLUDE: self-only, opponent protection, or type-specific grants
+                # Check for type-specific grants FIRST (highest-priority exclusion)
+                has_type_specific = any(
+                    ' gain ' in t.lower()  # "Knights Gain", "Treefolk Gain", etc.
+                    for t in meta_tags
+                )
+
+                has_excluded_scope = any(
+                    'self:' in t or 'opponent permanents:' in t
+                    for t in meta_tags
+                )
+
+                # Include if board-relevant, or if no scope tags (legacy cards);
+                # ALWAYS exclude type-specific grants (too narrow for general protection)
+                if meta_tags:
+                    # Has metadata - use it for filtering:
+                    # exclude if type-specific OR self/opponent,
+                    if has_type_specific or has_excluded_scope:
+                        return False
+                    # otherwise include if board-relevant
+                    return has_board_scope
+                else:
+                    # No metadata - legacy card, include by default
+                    return True
+
+            pool = df[df.apply(is_board_relevant_protection, axis=1)]
+
+            # Log scope filtering stats
+            original_count = len(df[df['_ltags'].apply(lambda tags: any('protection' in t for t in tags))])
+            filtered_count = len(pool)
+            if original_count > filtered_count:
+                self.output_func(f"Protection scope filter: {filtered_count}/{original_count} cards (excluded {original_count - filtered_count} self-only/opponent cards)")
+        else:
+            # Legacy behavior: include all cards with the 'protection' tag
+            pool = df[df['_ltags'].apply(lambda tags: any('protection' in t for t in tags))]
+
         pool = pool[~pool['type'].fillna('').str.contains('Land', case=False, na=False)]
         commander_name = getattr(self, 'commander', None)
         if commander_name:
             pool = pool[pool['name'] != commander_name]
         pool = self._apply_bracket_pre_filters(pool)
         pool = bu.sort_by_priority(pool, ['edhrecRank','manaValue'])

         self._debug_dump_pool(pool, 'protection')

         try:
             if str(os.getenv('DEBUG_SPELL_POOLS', '')).strip().lower() in {"1","true","yes","on"}:
                 names = pool['name'].astype(str).head(30).tolist()
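The include/exclude precedence in `is_board_relevant_protection` is easier to see outside the DataFrame. A standalone restatement with plain lists (illustrative name `board_relevant`; it assumes tags are already lowercased, as `bu.normalize_tag_cell` produces):

```python
def board_relevant(theme_tags: list[str], meta_tags: list[str]) -> bool:
    """Decide whether a protection card belongs in the general protection pool."""
    if not any('protection' in t for t in theme_tags):
        return False  # not a protection card at all
    if not meta_tags:
        return True   # legacy card without scope metadata: include by default
    if any(' gain ' in t for t in meta_tags):
        return False  # type-specific grant ("knights gain ..."): too narrow
    if any('self:' in t or 'opponent permanents:' in t for t in meta_tags):
        return False  # self-only or opponent-helping protection
    # board-relevant scopes win only after the exclusions pass
    return any('your permanents:' in t or 'blanket:' in t or 'targeted:' in t
               for t in meta_tags)
```

Checking exclusions before the board-scope test is what fixes the Knight Exemplar case called out in the changelog: a card tagged both "knights gain indestructible" and board-relevant is still excluded.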
@@ -580,6 +658,48 @@ class SpellAdditionMixin:
         if existing >= target and to_add == 0:
             return
         target = to_add if existing < target else to_add

+        # M5: Limit pool size to manageable tier-based selection
+        # Strategy: Top tier (3x target) + random deeper selection
+        # This keeps the pool focused on high-quality options (~50-70 cards typical)
+        original_pool_size = len(pool)
+        if len(pool) > 0 and target > 0:
+            try:
+                # Tier 1: Top quality cards (3x target count)
+                tier1_size = min(3 * target, len(pool))
+                tier1 = pool.head(tier1_size).copy()
+
+                # Tier 2: Random additional cards from remaining pool (10-20 cards)
+                if len(pool) > tier1_size:
+                    remaining_pool = pool.iloc[tier1_size:].copy()
+                    tier2_size = min(
+                        self.rng.randint(10, 20) if hasattr(self, 'rng') and self.rng else 15,
+                        len(remaining_pool)
+                    )
+                    if hasattr(self, 'rng') and self.rng and len(remaining_pool) > tier2_size:
+                        # Use random.sample() to select random indices from the remaining pool
+                        tier2_indices = self.rng.sample(range(len(remaining_pool)), tier2_size)
+                        tier2 = remaining_pool.iloc[tier2_indices]
+                    else:
+                        tier2 = remaining_pool.head(tier2_size)
+                    pool = tier1._append(tier2, ignore_index=True)
+                else:
+                    pool = tier1
+
+                if len(pool) != original_pool_size:
+                    self.output_func(f"Protection pool limited: {len(pool)}/{original_pool_size} cards (tier1: {tier1_size}, tier2: {len(pool) - tier1_size})")
+            except Exception as e:
+                self.output_func(f"Warning: Pool limiting failed, using full pool: {e}")
+
+        # Shuffle pool for variety across builds (using seeded RNG for determinism)
+        try:
+            if hasattr(self, 'rng') and self.rng is not None:
+                pool_list = pool.to_dict('records')
+                self.rng.shuffle(pool_list)
+                import pandas as pd
+                pool = pd.DataFrame(pool_list)
+        except Exception:
+            pass
         added = 0
         added_names: List[str] = []
         for _, r in pool.iterrows():
|
|
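The tiered limiting above can be sketched in isolation with a plain list standing in for the DataFrame pool (`limit_pool` and the card names are ours for illustration; the real code works on `pool.head`/`iloc` and a pandas `_append`):

```python
import random

def limit_pool(pool, target, rng):
    """Tier 1: top 3x target entries; Tier 2: 10-20 random picks from the rest."""
    tier1_size = min(3 * target, len(pool))
    tier1 = pool[:tier1_size]
    remaining = pool[tier1_size:]
    tier2_size = min(rng.randint(10, 20), len(remaining))
    if len(remaining) > tier2_size:
        tier2 = rng.sample(remaining, tier2_size)  # random deeper selection
    else:
        tier2 = remaining[:tier2_size]
    return tier1 + tier2

rng = random.Random(42)  # seeded, like self.rng, so builds stay deterministic
pool = [f"card_{i}" for i in range(100)]
limited = limit_pool(pool, target=15, rng=rng)
print(len(limited))  # between 55 and 65: 45 tier-1 cards + 10-20 tier-2 picks
```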
@@ -878,7 +878,7 @@ class ReportingMixin:
         headers = [
             "Name","Count","Type","ManaCost","ManaValue","Colors","Power","Toughness",
-            "Role","SubRole","AddedBy","TriggerTag","Synergy","Tags","Text","DFCNote","Owned"
+            "Role","SubRole","AddedBy","TriggerTag","Synergy","Tags","MetadataTags","Text","DFCNote","Owned"
         ]

         header_suffix: List[str] = []

@@ -946,6 +946,9 @@ class ReportingMixin:
         role = info.get('Role', '') or ''
         tags = info.get('Tags', []) or []
         tags_join = '; '.join(tags)
+        # M5: Include metadata tags in export
+        metadata_tags = info.get('MetadataTags', []) or []
+        metadata_tags_join = '; '.join(metadata_tags)
         text_field = ''
         colors = ''
         power = ''

@@ -1014,6 +1017,7 @@ class ReportingMixin:
             info.get('TriggerTag') or '',
             info.get('Synergy') if info.get('Synergy') is not None else '',
             tags_join,
+            metadata_tags_join,  # M5: Include metadata tags
             text_field[:800] if isinstance(text_field, str) else str(text_field)[:800],
             dfc_note,
             owned_flag
@@ -2,7 +2,23 @@

 This module provides the main setup functionality for the MTG Python Deckbuilder
 application. It handles initial setup tasks such as downloading card data,
 creating color-filtered card lists, and generating commander-eligible card lists.

 Key Features:
 - Initial setup and configuration
@@ -197,7 +213,17 @@ def regenerate_csvs_all() -> None:
     download_cards_csv(MTGJSON_API_URL, f'{CSV_DIRECTORY}/cards.csv')

     logger.info('Loading and processing card data')
-    df = pd.read_csv(f'{CSV_DIRECTORY}/cards.csv', low_memory=False)
+    try:
+        df = pd.read_csv(f'{CSV_DIRECTORY}/cards.csv', low_memory=False)
+    except pd.errors.ParserError as e:
+        logger.warning(f'CSV parsing error encountered: {e}. Retrying with error handling...')
+        df = pd.read_csv(
+            f'{CSV_DIRECTORY}/cards.csv',
+            low_memory=False,
+            on_bad_lines='warn',  # Warn about malformed rows but continue
+            encoding_errors='replace'  # Replace bad encoding chars
+        )
+        logger.info('Successfully loaded card data with error handling (some rows may have been skipped)')

     logger.info('Regenerating color identity sorted files')
     save_color_filtered_csvs(df, CSV_DIRECTORY)
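The fallback path relies on two `pandas.read_csv` options available since pandas 1.3, `on_bad_lines` and `encoding_errors`; a minimal sketch of the skip behavior on an in-memory CSV (the card rows are invented):

```python
import io
import pandas as pd

# A CSV whose second data row has too many fields (a stray comma).
bad_csv = io.StringIO('name,mv\nLightning Bolt,1\nOops,2,extra\nCounterspell,2\n')

# on_bad_lines='skip' drops the malformed row instead of raising ParserError;
# on_bad_lines='warn' (as in regenerate_csvs_all) would keep going with a warning.
df = pd.read_csv(bad_csv, on_bad_lines='skip')
print(df['name'].tolist())  # ['Lightning Bolt', 'Counterspell']
```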
@@ -234,7 +260,12 @@ def regenerate_csv_by_color(color: str) -> None:
     download_cards_csv(MTGJSON_API_URL, f'{CSV_DIRECTORY}/cards.csv')

     logger.info('Loading and processing card data')
-    df = pd.read_csv(f'{CSV_DIRECTORY}/cards.csv', low_memory=False)
+    df = pd.read_csv(
+        f'{CSV_DIRECTORY}/cards.csv',
+        low_memory=False,
+        on_bad_lines='skip',  # Skip malformed rows (MTGJSON CSV has escaping issues)
+        encoding_errors='replace'  # Replace bad encoding chars
+    )

     logger.info(f'Regenerating {color} cards CSV')
     # Use shared utilities to base-filter once then slice color, honoring bans
203 code/scripts/audit_protection_full_v2.py (new file)
@@ -0,0 +1,203 @@
"""
Full audit of Protection-tagged cards with kindred metadata support (M2 Phase 2).

Created: October 8, 2025
Purpose: Audit and validate Protection tag precision after implementing grant detection.
Can be re-run periodically to check tagging quality.

This script audits ALL Protection-tagged cards and categorizes them:
- Grant: Gives broad protection to other permanents YOU control
- Kindred: Gives protection to specific creature types (metadata tags)
- Mixed: Both broad and kindred/inherent
- Inherent: Only has protection itself
- ConditionalSelf: Only conditionally grants to itself
- Opponent: Grants to opponent's permanents
- Neither: False positive

Outputs:
- m2_audit_v2.json: Full analysis with summary
- m2_audit_v2_grant.csv: Cards for main Protection tag
- m2_audit_v2_kindred.csv: Cards for kindred metadata tags
- m2_audit_v2_mixed.csv: Cards with both broad and kindred grants
- m2_audit_v2_conditional.csv: Conditional self-grants (exclude)
- m2_audit_v2_inherent.csv: Inherent protection only (exclude)
- m2_audit_v2_opponent.csv: Opponent grants (exclude)
- m2_audit_v2_neither.csv: False positives (exclude)
- m2_audit_v2_all.csv: All cards combined
"""

import sys
from pathlib import Path
import pandas as pd
import json

# Add project root to path
project_root = Path(__file__).parent.parent.parent
sys.path.insert(0, str(project_root))

from code.tagging.protection_grant_detection import (
    categorize_protection_card,
    get_kindred_protection_tags,
    is_granting_protection,
)

def load_all_cards():
    """Load all cards from color/identity CSV files."""
    csv_dir = project_root / 'csv_files'

    # Get all color/identity CSVs (not the raw cards.csv)
    csv_files = list(csv_dir.glob('*_cards.csv'))
    csv_files = [f for f in csv_files if f.stem not in ['cards', 'testdata']]

    all_cards = []
    for csv_file in csv_files:
        try:
            df = pd.read_csv(csv_file)
            all_cards.append(df)
        except Exception as e:
            print(f"Warning: Could not load {csv_file.name}: {e}")

    # Combine all DataFrames
    combined = pd.concat(all_cards, ignore_index=True)

    # Drop duplicates (cards appear in multiple color files)
    combined = combined.drop_duplicates(subset=['name'], keep='first')

    return combined

def audit_all_protection_cards():
    """Audit all Protection-tagged cards."""
    print("Loading all cards...")
    df = load_all_cards()

    print(f"Total cards loaded: {len(df)}")

    # Filter to Protection-tagged cards (column is 'themeTags' in color CSVs)
    df_prot = df[df['themeTags'].str.contains('Protection', case=False, na=False)].copy()

    print(f"Protection-tagged cards: {len(df_prot)}")

    # Categorize each card
    categories = []
    grants_list = []
    kindred_tags_list = []

    for idx, row in df_prot.iterrows():
        name = row['name']
        text = str(row.get('text', '')).replace('\\n', '\n')  # Convert escaped newlines to real newlines
        keywords = str(row.get('keywords', ''))
        card_type = str(row.get('type', ''))

        # Categorize with kindred exclusion enabled
        category = categorize_protection_card(name, text, keywords, card_type, exclude_kindred=True)

        # Check if it grants broadly
        grants_broad = is_granting_protection(text, keywords, exclude_kindred=True)

        # Get kindred tags
        kindred_tags = get_kindred_protection_tags(text)

        categories.append(category)
        grants_list.append(grants_broad)
        kindred_tags_list.append(', '.join(sorted(kindred_tags)) if kindred_tags else '')

    df_prot['category'] = categories
    df_prot['grants_broad'] = grants_list
    df_prot['kindred_tags'] = kindred_tags_list

    # Generate summary (convert numpy types to native Python for JSON serialization)
    summary = {
        'total': int(len(df_prot)),
        'categories': {k: int(v) for k, v in df_prot['category'].value_counts().to_dict().items()},
        'grants_broad_count': int(df_prot['grants_broad'].sum()),
        'kindred_cards_count': int((df_prot['kindred_tags'] != '').sum()),
    }

    # Calculate keep vs remove
    keep_categories = {'Grant', 'Mixed'}
    kindred_only = df_prot[df_prot['category'] == 'Kindred']
    keep_count = len(df_prot[df_prot['category'].isin(keep_categories)])
    remove_count = len(df_prot[~df_prot['category'].isin(keep_categories | {'Kindred'})])

    summary['keep_main_tag'] = keep_count
    summary['kindred_metadata'] = len(kindred_only)
    summary['remove'] = remove_count
    summary['precision_estimate'] = round((keep_count / len(df_prot)) * 100, 1) if len(df_prot) > 0 else 0

    # Print summary
    print(f"\n{'='*60}")
    print("AUDIT SUMMARY")
    print(f"{'='*60}")
    print(f"Total Protection-tagged cards: {summary['total']}")
    print(f"\nCategories:")
    for cat, count in sorted(summary['categories'].items()):
        pct = (count / summary['total']) * 100
        print(f"  {cat:20s} {count:4d} ({pct:5.1f}%)")

    print(f"\n{'='*60}")
    print(f"Main Protection tag: {keep_count:4d} ({keep_count/len(df_prot)*100:5.1f}%)")
    print(f"Kindred metadata only: {len(kindred_only):4d} ({len(kindred_only)/len(df_prot)*100:5.1f}%)")
    print(f"Remove: {remove_count:4d} ({remove_count/len(df_prot)*100:5.1f}%)")
    print(f"{'='*60}")
    print(f"Precision estimate: {summary['precision_estimate']}%")
    print(f"{'='*60}\n")

    # Export results
    output_dir = project_root / 'logs' / 'roadmaps' / 'source' / 'tagging_refinement'
    output_dir.mkdir(parents=True, exist_ok=True)

    # Export JSON summary
    with open(output_dir / 'm2_audit_v2.json', 'w') as f:
        json.dump({
            'summary': summary,
            'cards': df_prot[['name', 'type', 'category', 'grants_broad', 'kindred_tags', 'keywords', 'text']].to_dict(orient='records')
        }, f, indent=2)

    # Export CSVs by category
    export_cols = ['name', 'type', 'category', 'grants_broad', 'kindred_tags', 'keywords', 'text']

    # Grant category
    df_grant = df_prot[df_prot['category'] == 'Grant']
    df_grant[export_cols].to_csv(output_dir / 'm2_audit_v2_grant.csv', index=False)
    print(f"Exported {len(df_grant)} Grant cards to m2_audit_v2_grant.csv")

    # Kindred category
    df_kindred = df_prot[df_prot['category'] == 'Kindred']
    df_kindred[export_cols].to_csv(output_dir / 'm2_audit_v2_kindred.csv', index=False)
    print(f"Exported {len(df_kindred)} Kindred cards to m2_audit_v2_kindred.csv")

    # Mixed category
    df_mixed = df_prot[df_prot['category'] == 'Mixed']
    df_mixed[export_cols].to_csv(output_dir / 'm2_audit_v2_mixed.csv', index=False)
    print(f"Exported {len(df_mixed)} Mixed cards to m2_audit_v2_mixed.csv")

    # ConditionalSelf category
    df_conditional = df_prot[df_prot['category'] == 'ConditionalSelf']
    df_conditional[export_cols].to_csv(output_dir / 'm2_audit_v2_conditional.csv', index=False)
    print(f"Exported {len(df_conditional)} ConditionalSelf cards to m2_audit_v2_conditional.csv")

    # Inherent category
    df_inherent = df_prot[df_prot['category'] == 'Inherent']
    df_inherent[export_cols].to_csv(output_dir / 'm2_audit_v2_inherent.csv', index=False)
    print(f"Exported {len(df_inherent)} Inherent cards to m2_audit_v2_inherent.csv")

    # Opponent category
    df_opponent = df_prot[df_prot['category'] == 'Opponent']
    df_opponent[export_cols].to_csv(output_dir / 'm2_audit_v2_opponent.csv', index=False)
    print(f"Exported {len(df_opponent)} Opponent cards to m2_audit_v2_opponent.csv")

    # Neither category
    df_neither = df_prot[df_prot['category'] == 'Neither']
    df_neither[export_cols].to_csv(output_dir / 'm2_audit_v2_neither.csv', index=False)
    print(f"Exported {len(df_neither)} Neither cards to m2_audit_v2_neither.csv")

    # All cards
    df_prot[export_cols].to_csv(output_dir / 'm2_audit_v2_all.csv', index=False)
    print(f"Exported {len(df_prot)} total cards to m2_audit_v2_all.csv")

    print(f"\nAll files saved to: {output_dir}")

    return df_prot, summary

if __name__ == '__main__':
    df_results, summary = audit_all_protection_cards()
@@ -1,6 +1,7 @@
 from __future__ import annotations

 # Standard library imports
+import os
 from typing import Dict, List, Optional

 # ----------------------------------------------------------------------------------
@@ -99,3 +100,19 @@ FILL_NA_COLUMNS: Dict[str, Optional[str]] = {
     'colorIdentity': 'Colorless',  # Default color identity for cards without one
     'faceName': None  # Use card's name column value when face name is not available
 }
+
+# ----------------------------------------------------------------------------------
+# TAGGING REFINEMENT FEATURE FLAGS (M1-M5)
+# ----------------------------------------------------------------------------------
+
+# M1: Enable keyword normalization and singleton pruning (completed)
+TAG_NORMALIZE_KEYWORDS = os.getenv('TAG_NORMALIZE_KEYWORDS', '1').lower() not in ('0', 'false', 'off', 'disabled')
+
+# M2: Enable protection grant detection (completed)
+TAG_PROTECTION_GRANTS = os.getenv('TAG_PROTECTION_GRANTS', '1').lower() not in ('0', 'false', 'off', 'disabled')
+
+# M3: Enable metadata/theme partition (completed)
+TAG_METADATA_SPLIT = os.getenv('TAG_METADATA_SPLIT', '1').lower() not in ('0', 'false', 'off', 'disabled')
+
+# M5: Enable protection scope filtering in deck builder (completed - Phase 1-3, in progress Phase 4+)
+TAG_PROTECTION_SCOPE = os.getenv('TAG_PROTECTION_SCOPE', '1').lower() not in ('0', 'false', 'off', 'disabled')
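All four flags share the same opt-out parsing: only an explicit falsey token disables a flag, so unset variables and even empty strings count as enabled. A standalone sketch of that parsing (the `flag_enabled` helper and the `TAG_EXAMPLE_UNSET` name are ours for illustration):

```python
import os

def flag_enabled(name: str, default: str = '1') -> bool:
    """Mirror the repo's flag parsing: only an explicit '0'/'false'/'off'/'disabled'
    (any casing) disables a flag; anything else counts as enabled."""
    return os.getenv(name, default).lower() not in ('0', 'false', 'off', 'disabled')

os.environ['TAG_PROTECTION_GRANTS'] = 'OFF'
print(flag_enabled('TAG_PROTECTION_GRANTS'))   # False: 'OFF' lowercases to 'off'
print(flag_enabled('TAG_EXAMPLE_UNSET'))       # True: unset, falls back to default '1'
print(flag_enabled('TAG_EXAMPLE_UNSET', '0'))  # False: caller-supplied default '0'
```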
@@ -1,9 +1,11 @@
 from __future__ import annotations

+# Standard library imports
 import json
 from pathlib import Path
 from typing import Dict, Iterable, Set

+# Third-party imports
 import pandas as pd

 def _ensure_norm_series(df: pd.DataFrame, source_col: str, norm_col: str) -> pd.Series:
@@ -1,9 +1,11 @@
 from __future__ import annotations

+# Standard library imports
+import json
 from pathlib import Path
 from typing import List, Optional

-import json
+# Third-party imports
 from pydantic import BaseModel, Field
@@ -1,14 +1,17 @@
 from __future__ import annotations

-import json
+# Standard library imports
 import ast
+import json
+from collections import defaultdict
 from dataclasses import dataclass
 from pathlib import Path
-from typing import Dict, List, Set, DefaultDict
-from collections import defaultdict
+from typing import DefaultDict, Dict, List, Set

+# Third-party imports
 import pandas as pd

+# Local application imports
 from settings import CSV_DIRECTORY, SETUP_COLORS
@@ -73,6 +73,132 @@ def load_merge_summary() -> Dict[str, Any]:
     return {"updated_at": None, "colors": {}}


+def _merge_tag_columns(work_df: pd.DataFrame, group_sorted: pd.DataFrame, primary_idx: int) -> None:
+    """Merge list columns (themeTags, roleTags) into union values.
+
+    Args:
+        work_df: Working DataFrame to update
+        group_sorted: Sorted group of faces for a multi-face card
+        primary_idx: Index of primary face to update
+    """
+    for column in _LIST_UNION_COLUMNS:
+        if column in group_sorted.columns:
+            union_values = _merge_object_lists(group_sorted[column])
+            work_df.at[primary_idx, column] = union_values
+
+    if "keywords" in group_sorted.columns:
+        keyword_union = _merge_keywords(group_sorted["keywords"])
+        work_df.at[primary_idx, "keywords"] = _join_keywords(keyword_union)
+
+
+def _build_face_payload(face_row: pd.Series) -> Dict[str, Any]:
+    """Build face metadata payload from a single face row.
+
+    Args:
+        face_row: Single face row from grouped DataFrame
+
+    Returns:
+        Dictionary containing face metadata
+    """
+    text_val = face_row.get("text") or face_row.get("oracleText") or ""
+    mana_cost_val = face_row.get("manaCost", face_row.get("mana_cost", "")) or ""
+    mana_value_raw = face_row.get("manaValue", face_row.get("mana_value", ""))
+
+    try:
+        if mana_value_raw in (None, ""):
+            mana_value_val = None
+        else:
+            mana_value_val = float(mana_value_raw)
+            if math.isnan(mana_value_val):
+                mana_value_val = None
+    except Exception:
+        mana_value_val = None
+
+    type_val = face_row.get("type", "") or ""
+
+    return {
+        "face": str(face_row.get("faceName") or face_row.get("name") or ""),
+        "side": str(face_row.get("side") or ""),
+        "layout": str(face_row.get("layout") or ""),
+        "themeTags": _merge_object_lists([face_row.get("themeTags", [])]),
+        "roleTags": _merge_object_lists([face_row.get("roleTags", [])]),
+        "type": str(type_val),
+        "text": str(text_val),
+        "mana_cost": str(mana_cost_val),
+        "mana_value": mana_value_val,
+        "produces_mana": _text_produces_mana(text_val),
+        "is_land": 'land' in str(type_val).lower(),
+    }
+
+
+def _build_merge_detail(name: str, group_sorted: pd.DataFrame, faces_payload: List[Dict[str, Any]]) -> Dict[str, Any]:
+    """Build detailed merge information for a multi-face card group.
+
+    Args:
+        name: Card name
+        group_sorted: Sorted group of faces
+        faces_payload: List of face metadata dictionaries
+
+    Returns:
+        Dictionary containing merge details
+    """
+    layout_set = sorted({f.get("layout", "") for f in faces_payload if f.get("layout")})
+    removed_faces = faces_payload[1:] if len(faces_payload) > 1 else []
+
+    return {
+        "name": name,
+        "total_faces": len(group_sorted),
+        "dropped_faces": max(len(group_sorted) - 1, 0),
+        "layouts": layout_set,
+        "primary_face": faces_payload[0] if faces_payload else {},
+        "removed_faces": removed_faces,
+        "theme_tags": sorted({tag for face in faces_payload for tag in face.get("themeTags", [])}),
+        "role_tags": sorted({tag for face in faces_payload for tag in face.get("roleTags", [])}),
+        "faces": faces_payload,
+    }
+
+
+def _log_merge_summary(color: str, merged_count: int, drop_count: int, multi_face_count: int, logger) -> None:
+    """Log merge summary with structured and human-readable formats.
+
+    Args:
+        color: Color being processed
+        merged_count: Number of card groups merged
+        drop_count: Number of face rows dropped
+        multi_face_count: Total multi-face rows processed
+        logger: Logger instance
+    """
+    try:
+        logger.info(
+            "dfc_merge_summary %s",
+            json.dumps(
+                {
+                    "event": "dfc_merge_summary",
+                    "color": color,
+                    "groups_merged": merged_count,
+                    "faces_dropped": drop_count,
+                    "multi_face_rows": multi_face_count,
+                },
+                sort_keys=True,
+            ),
+        )
+    except Exception:
+        logger.info(
+            "dfc_merge_summary event=%s groups=%d dropped=%d rows=%d",
+            color,
+            merged_count,
+            drop_count,
+            multi_face_count,
+        )
+
+    logger.info(
+        "Merged %d multi-face card groups for %s (dropped %d extra faces)",
+        merged_count,
+        color,
+        drop_count,
+    )
+
+
 def merge_multi_face_rows(
     df: pd.DataFrame,
     color: str,
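The mana-value handling inside `_build_face_payload` is a small defensive coercion that can be exercised on its own; a sketch of the same rules (`coerce_mana_value` is our name for illustration, not part of the module):

```python
import math

def coerce_mana_value(raw):
    """Coerce a raw CSV cell to float, mapping None, '', NaN, and junk to None."""
    try:
        if raw in (None, ""):
            return None
        value = float(raw)
        return None if math.isnan(value) else value
    except Exception:
        return None

print(coerce_mana_value("3.0"))         # 3.0
print(coerce_mana_value(""))            # None (empty cell)
print(coerce_mana_value(float("nan")))  # None (NaN from pandas)
print(coerce_mana_value("Sorin"))       # None (ValueError swallowed)
```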
@@ -93,7 +219,6 @@ def merge_multi_face_rows(
         return df

     work_df = df.copy()

     layout_series = work_df["layout"].fillna("").astype(str).str.lower()
     multi_mask = layout_series.isin(_MULTI_FACE_LAYOUTS)

@@ -110,66 +235,15 @@ def merge_multi_face_rows(
         group_sorted = _sort_faces(group)
         primary_idx = group_sorted.index[0]
-        faces_payload: List[Dict[str, Any]] = []

-        for column in _LIST_UNION_COLUMNS:
-            if column in group_sorted.columns:
-                union_values = _merge_object_lists(group_sorted[column])
-                work_df.at[primary_idx, column] = union_values
-
-        if "keywords" in group_sorted.columns:
-            keyword_union = _merge_keywords(group_sorted["keywords"])
-            work_df.at[primary_idx, "keywords"] = _join_keywords(keyword_union)
-
-        for _, face_row in group_sorted.iterrows():
-            text_val = face_row.get("text") or face_row.get("oracleText") or ""
-            mana_cost_val = face_row.get("manaCost", face_row.get("mana_cost", "")) or ""
-            mana_value_raw = face_row.get("manaValue", face_row.get("mana_value", ""))
-            try:
-                if mana_value_raw in (None, ""):
-                    mana_value_val = None
-                else:
-                    mana_value_val = float(mana_value_raw)
-                    if math.isnan(mana_value_val):
-                        mana_value_val = None
-            except Exception:
-                mana_value_val = None
-            type_val = face_row.get("type", "") or ""
-            faces_payload.append(
-                {
-                    "face": str(face_row.get("faceName") or face_row.get("name") or ""),
-                    "side": str(face_row.get("side") or ""),
-                    "layout": str(face_row.get("layout") or ""),
-                    "themeTags": _merge_object_lists([face_row.get("themeTags", [])]),
-                    "roleTags": _merge_object_lists([face_row.get("roleTags", [])]),
-                    "type": str(type_val),
-                    "text": str(text_val),
-                    "mana_cost": str(mana_cost_val),
-                    "mana_value": mana_value_val,
-                    "produces_mana": _text_produces_mana(text_val),
-                    "is_land": 'land' in str(type_val).lower(),
-                }
-            )
-
-        for idx in group_sorted.index[1:]:
-            drop_indices.append(idx)
+        _merge_tag_columns(work_df, group_sorted, primary_idx)
+        faces_payload = [_build_face_payload(row) for _, row in group_sorted.iterrows()]
+        drop_indices.extend(group_sorted.index[1:])

         merged_count += 1
-        layout_set = sorted({f.get("layout", "") for f in faces_payload if f.get("layout")})
-        removed_faces = faces_payload[1:] if len(faces_payload) > 1 else []
-        merge_details.append(
-            {
-                "name": name,
-                "total_faces": len(group_sorted),
-                "dropped_faces": max(len(group_sorted) - 1, 0),
-                "layouts": layout_set,
-                "primary_face": faces_payload[0] if faces_payload else {},
-                "removed_faces": removed_faces,
-                "theme_tags": sorted({tag for face in faces_payload for tag in face.get("themeTags", [])}),
-                "role_tags": sorted({tag for face in faces_payload for tag in face.get("roleTags", [])}),
-                "faces": faces_payload,
-            }
-        )
+        merge_details.append(_build_merge_detail(name, group_sorted, faces_payload))

     if drop_indices:
         work_df = work_df.drop(index=drop_indices)

@@ -192,38 +266,10 @@ def merge_multi_face_rows(
         logger.warning("Failed to record DFC merge summary for %s: %s", color, exc)

     if logger is not None:
-        try:
-            logger.info(
-                "dfc_merge_summary %s",
-                json.dumps(
-                    {
-                        "event": "dfc_merge_summary",
-                        "color": color,
-                        "groups_merged": merged_count,
-                        "faces_dropped": len(drop_indices),
-                        "multi_face_rows": int(multi_mask.sum()),
-                    },
-                    sort_keys=True,
-                ),
-            )
-        except Exception:
-            logger.info(
-                "dfc_merge_summary event=%s groups=%d dropped=%d rows=%d",
-                color,
-                merged_count,
-                len(drop_indices),
-                int(multi_mask.sum()),
-            )
-        logger.info(
-            "Merged %d multi-face card groups for %s (dropped %d extra faces)",
-            merged_count,
-            color,
-            len(drop_indices),
-        )
+        _log_merge_summary(color, merged_count, len(drop_indices), int(multi_mask.sum()), logger)

     _persist_merge_summary(color, summary_payload, logger)

-    # Reset index to keep downstream expectations consistent.
     return work_df.reset_index(drop=True)
|
213 code/tagging/phasing_scope_detection.py Normal file

@@ -0,0 +1,213 @@
"""
Phasing Scope Detection Module

Detects the scope of phasing effects with multiple dimensions:
- Targeted: Phasing (any targeting effect)
- Self: Phasing (phases itself out)
- Your Permanents: Phasing (phases your permanents out)
- Opponent Permanents: Phasing (phases opponent permanents - removal)
- Blanket: Phasing (phases all permanents out)

Cards can have multiple scope tags (e.g., Targeted + Your Permanents).

Refactored in M2: Create Scope Detection Utilities to use generic scope detection.
"""

# Standard library imports
import re
from typing import Set

# Local application imports
from . import scope_detection_utils as scope_utils
from code.logging_util import get_logger

logger = get_logger(__name__)


# Phasing scope pattern definitions
def _get_phasing_scope_patterns() -> scope_utils.ScopePatterns:
    """
    Build scope patterns for phasing abilities.

    Returns:
        ScopePatterns object with compiled patterns
    """
    # Targeting patterns (special for phasing - detects "target...phases out")
    targeting_patterns = [
        re.compile(r'target\s+(?:\w+\s+)*(?:creature|permanent|artifact|enchantment|nonland\s+permanent)s?(?:[^.]*)?phases?\s+out', re.IGNORECASE),
        re.compile(r'target\s+player\s+controls[^.]*phases?\s+out', re.IGNORECASE),
    ]

    # Self-reference patterns
    self_patterns = [
        re.compile(r'this\s+(?:creature|permanent|artifact|enchantment)\s+phases?\s+out', re.IGNORECASE),
        re.compile(r'~\s+phases?\s+out', re.IGNORECASE),
        # Triggered self-phasing (King of the Oathbreakers)
        re.compile(r'whenever.*(?:becomes\s+the\s+target|becomes\s+target).*(?:it|this\s+creature)\s+phases?\s+out', re.IGNORECASE),
        # Consequent self-phasing (Cyclonus: "connive. Then...phase out")
        re.compile(r'(?:then|,)\s+(?:it|this\s+creature)\s+phases?\s+out', re.IGNORECASE),
        # At end of turn/combat self-phasing
        re.compile(r'(?:at\s+(?:the\s+)?end\s+of|after).*(?:it|this\s+creature)\s+phases?\s+out', re.IGNORECASE),
    ]

    # Opponent patterns
    opponent_patterns = [
        re.compile(r'target\s+(?:\w+\s+)*(?:creature|permanent)\s+an?\s+opponents?\s+controls?\s+phases?\s+out', re.IGNORECASE),
        # Unqualified targets (can target opponents' stuff if no "you control" restriction)
        re.compile(r'(?:up\s+to\s+)?(?:one\s+|x\s+|that\s+many\s+)?(?:other\s+)?(?:another\s+)?target\s+(?:\w+\s+)*(?:creature|permanent|artifact|enchantment|nonland\s+permanent)s?(?:[^.]*)?phases?\s+out', re.IGNORECASE),
        re.compile(r'target\s+(?:\w+\s+)*(?:creature|permanent|artifact|enchantment|land|nonland\s+permanent)(?:,|\s+and)?\s+(?:then|and)?\s+it\s+phases?\s+out', re.IGNORECASE),
    ]

    # Your permanents patterns
    your_patterns = [
        # Explicit "you control"
        re.compile(r'(?:target\s+)?(?:creatures?|permanents?|nonland\s+permanents?)\s+you\s+control\s+phases?\s+out', re.IGNORECASE),
        re.compile(r'(?:target\s+)?(?:other\s+)?(?:creatures?|permanents?)\s+you\s+control\s+phases?\s+out', re.IGNORECASE),
        re.compile(r'permanents?\s+you\s+control\s+phase\s+out', re.IGNORECASE),
        re.compile(r'(?:any|up\s+to)\s+(?:number\s+of\s+)?(?:target\s+)?(?:other\s+)?(?:creatures?|permanents?|nonland\s+permanents?)\s+you\s+control\s+phases?\s+out', re.IGNORECASE),
        re.compile(r'all\s+(?:creatures?|permanents?)\s+you\s+control\s+phase\s+out', re.IGNORECASE),
        re.compile(r'each\s+(?:creature|permanent)\s+you\s+control\s+phases?\s+out', re.IGNORECASE),
        # Pronoun reference to "you control" context
        re.compile(r'(?:creatures?|permanents?|planeswalkers?)\s+you\s+control[^.]*(?:those|the)\s+(?:creatures?|permanents?|planeswalkers?)\s+phase\s+out', re.IGNORECASE),
        re.compile(r'creature\s+you\s+control[^.]*(?:it)\s+phases?\s+out', re.IGNORECASE),
        re.compile(r'you\s+control.*those\s+(?:creatures?|permanents?|planeswalkers?)\s+phase\s+out', re.IGNORECASE),
        # Equipment/Aura
        re.compile(r'equipped\s+(?:creature|permanent)\s+(?:gets\s+[^.]*\s+and\s+)?phases?\s+out', re.IGNORECASE),
        re.compile(r'enchanted\s+(?:creature|permanent)\s+(?:gets\s+[^.]*\s+and\s+)?phases?\s+out', re.IGNORECASE),
        re.compile(r'enchanted\s+(?:creature|permanent)\s+(?:has|gains?)\s+phasing', re.IGNORECASE),
        re.compile(r'(?:equipped|enchanted)\s+(?:creature|permanent)[^.]*,?\s+(?:then\s+)?that\s+(?:creature|permanent)\s+phases?\s+out', re.IGNORECASE),
        # Target controlled by specific player
        re.compile(r'(?:each|target)\s+(?:creature|permanent)\s+target\s+player\s+controls\s+phases?\s+out', re.IGNORECASE),
    ]

    # Blanket patterns
    blanket_patterns = [
        re.compile(r'all\s+(?:nontoken\s+)?(?:creatures?|permanents?)(?:\s+of\s+that\s+type)?\s+(?:[^.]*\s+)?phase\s+out', re.IGNORECASE),
        re.compile(r'each\s+(?:creature|permanent)\s+(?:[^.]*\s+)?phases?\s+out', re.IGNORECASE),
        # Type-specific blanket (Shimmer)
        re.compile(r'each\s+(?:land|creature|permanent|artifact|enchantment)\s+of\s+the\s+chosen\s+type\s+has\s+phasing', re.IGNORECASE),
        re.compile(r'(?:lands?|creatures?|permanents?|artifacts?|enchantments?)\s+of\s+the\s+chosen\s+type\s+(?:have|has)\s+phasing', re.IGNORECASE),
        # Pronoun reference to "all creatures"
        re.compile(r'all\s+(?:nontoken\s+)?(?:creatures?|permanents?)[^.]*,?\s+(?:then\s+)?(?:those|the)\s+(?:creatures?|permanents?)\s+phase\s+out', re.IGNORECASE),
    ]

    return scope_utils.ScopePatterns(
        opponent=opponent_patterns,
        self_ref=self_patterns,
        your_permanents=your_patterns,
        blanket=blanket_patterns,
        targeted=targeting_patterns
    )


def get_phasing_scope_tags(text: str, card_name: str, keywords: str = '') -> Set[str]:
    """
    Get all phasing scope metadata tags for a card.

    A card can have multiple scope tags:
    - "Targeted: Phasing" - Uses targeting
    - "Self: Phasing" - Phases itself out
    - "Your Permanents: Phasing" - Phases your permanents
    - "Opponent Permanents: Phasing" - Phases opponent permanents (removal)
    - "Blanket: Phasing" - Phases all permanents

    Args:
        text: Card text
        card_name: Card name
        keywords: Card keywords (to check for static "Phasing" ability)

    Returns:
        Set of metadata tags
    """
    if not card_name:
        return set()

    text_lower = text.lower() if text else ''
    keywords_lower = keywords.lower() if keywords else ''
    tags = set()

    # Check for static "Phasing" keyword ability (self-phasing)
    # Only add Self tag if card doesn't grant phasing to others
    if 'phasing' in keywords_lower:
        # Define patterns for checking if card grants phasing to others
        grants_pattern = [re.compile(
            r'(other|target|each|all|enchanted|equipped|creatures? you control|permanents? you control).*phas',
            re.IGNORECASE
        )]

        is_static = scope_utils.check_static_keyword_legacy(
            keywords=keywords,
            static_keyword='phasing',
            text=text,
            grant_patterns=grants_pattern
        )

        if is_static:
            tags.add('Self: Phasing')
            return tags  # Early return - static keyword only

    # Check if phasing is mentioned in text
    if 'phas' not in text_lower:
        return tags

    # Build phasing patterns and detect scopes
    patterns = _get_phasing_scope_patterns()

    # Detect all scopes (phasing can have multiple)
    scopes = scope_utils.detect_multi_scope(
        text=text,
        card_name=card_name,
        ability_keyword='phas',  # Use 'phas' to catch both 'phase' and 'phasing'
        patterns=patterns,
        check_grant_verbs=False  # Phasing doesn't need grant verb checking
    )

    # Format scope tags with "Phasing" ability name
    for scope in scopes:
        if scope == "Targeted":
            tags.add("Targeted: Phasing")
        else:
            tags.add(scope_utils.format_scope_tag(scope, "Phasing"))
        logger.debug(f"Card '{card_name}': detected {scope}: Phasing")

    return tags


def has_phasing(text: str) -> bool:
    """
    Quick check if card text contains phasing keywords.

    Args:
        text: Card text

    Returns:
        True if phasing keyword found
    """
    if not text:
        return False

    text_lower = text.lower()

    # Check for phasing keywords
    phasing_keywords = [
        'phase out',
        'phases out',
        'phasing',
        'phase in',
        'phases in',
    ]

    return any(keyword in text_lower for keyword in phasing_keywords)


def is_removal_phasing(tags: Set[str]) -> bool:
    """
    Check if phasing effect acts as removal (targets opponent permanents).

    Args:
        tags: Set of phasing scope tags

    Returns:
        True if this is removal-style phasing
    """
    return "Opponent Permanents: Phasing" in tags
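The module above is, at heart, a set of regex scans over lowercased oracle text. As a quick illustration of the self-phasing case, here is a minimal standalone sketch; the card texts are hypothetical samples and it deliberately bypasses the repo's `scope_utils` helpers:

```python
import re

# Two of the self-phasing patterns from the module, reproduced standalone.
SELF_PHASING_PATTERNS = [
    re.compile(r'this\s+(?:creature|permanent|artifact|enchantment)\s+phases?\s+out', re.IGNORECASE),
    re.compile(r'whenever.*becomes\s+the\s+target.*(?:it|this\s+creature)\s+phases?\s+out', re.IGNORECASE),
]


def detect_self_phasing(text: str) -> bool:
    """Return True if any self-phasing pattern matches the card text."""
    return any(p.search(text) for p in SELF_PHASING_PATTERNS)


print(detect_self_phasing("Whenever this creature becomes the target of a spell, it phases out."))  # True
print(detect_self_phasing("Target creature you control phases out."))  # False (that is a "your permanents" scope)
```

Note how the second sample is rejected by the self patterns but would be caught by `your_patterns` in the real module, which is why scopes are detected independently and a card can carry several tags.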
551 code/tagging/protection_grant_detection.py Normal file

@@ -0,0 +1,551 @@
"""
Protection grant detection implementation for M2.

This module provides helpers to distinguish cards that grant protection effects
from cards that have inherent protection effects.

Usage in tagger.py:
    from code.tagging.protection_grant_detection import is_granting_protection

    if is_granting_protection(text, keywords):
        # Tag as Protection
"""
import re
from typing import List, Pattern, Set
from . import regex_patterns as rgx
from . import tag_utils
from .tag_constants import CONTEXT_WINDOW_SIZE, CREATURE_TYPES, PROTECTION_KEYWORDS


# Pre-compile kindred detection patterns at module load for performance
# Pattern: (compiled_regex, tag_name_template)
def _build_kindred_patterns() -> List[tuple[Pattern, str]]:
    """Build pre-compiled kindred patterns for all creature types.

    Returns:
        List of tuples containing (compiled_pattern, tag_name)
    """
    patterns = []

    for creature_type in CREATURE_TYPES:
        creature_lower = creature_type.lower()
        creature_escaped = re.escape(creature_lower)
        tag_name = f"{creature_type}s Gain Protection"
        pattern_templates = [
            rf'\bother {creature_escaped}s?\b.*\b(have|gain)\b',
            rf'\b{creature_escaped} creatures?\b.*\b(have|gain)\b',
            rf'\btarget {creature_escaped}\b.*\bgains?\b',
        ]

        for pattern_str in pattern_templates:
            try:
                compiled = re.compile(pattern_str, re.IGNORECASE)
                patterns.append((compiled, tag_name))
            except re.error:
                # Skip patterns that fail to compile
                pass

    return patterns


KINDRED_PATTERNS: List[tuple[Pattern, str]] = _build_kindred_patterns()


# Grant verb patterns - cards that give protection to other permanents
# These patterns look for grant verbs that affect OTHER permanents, not self
# M5: Added phasing support
# Pre-compiled at module load for performance
GRANT_VERB_PATTERNS: List[Pattern] = [
    re.compile(r'\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),
    re.compile(r'\bgive[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),
    re.compile(r'\bgrant[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),
    re.compile(r'\bhave\b.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),  # "have hexproof" static grants
    re.compile(r'\bget[s]?\b.*\+.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),  # "gets +X/+X and has hexproof" direct
    re.compile(r'\bget[s]?\b.*\+.*\band\b.*\b(gain[s]?|have)\b.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),  # "gets +X/+X and gains hexproof"
    re.compile(r'\bphases? out\b', re.IGNORECASE),  # M5: Direct phasing triggers (e.g., "it phases out")
]

# Self-reference patterns that should NOT count as granting
# Reminder text and keyword lines only
# M5: Added phasing support
# Pre-compiled at module load for performance
SELF_REFERENCE_PATTERNS: List[Pattern] = [
    re.compile(r'^\s*(hexproof|shroud|indestructible|ward|protection|phasing)', re.IGNORECASE),  # Start of text (keyword ability)
    re.compile(r'\([^)]*\b(hexproof|shroud|indestructible|ward|protection|phasing)[^)]*\)', re.IGNORECASE),  # Reminder text in parens
]

# Conditional self-grant patterns - activated/triggered abilities that grant to self
# Pre-compiled at module load for performance
CONDITIONAL_SELF_GRANT_PATTERNS: List[Pattern] = [
    # Activated abilities
    re.compile(r'\{[^}]*\}.*:.*\bthis (creature|permanent|artifact|enchantment)\b.*\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    re.compile(r'discard.*:.*\bthis (creature|permanent|artifact|enchantment)\b.*\bgain[s]?\b', re.IGNORECASE),
    re.compile(r'\{t\}.*:.*\bthis (creature|permanent|artifact|enchantment)\b.*\bgain[s]?\b', re.IGNORECASE),
    re.compile(r'sacrifice.*:.*\bthis (creature|permanent|artifact|enchantment)\b.*\bgain[s]?\b', re.IGNORECASE),
    re.compile(r'pay.*life.*:.*\bthis (creature|permanent|artifact|enchantment)\b.*\bgain[s]?\b', re.IGNORECASE),
    # Triggered abilities that grant to self only
    re.compile(r'whenever.*\b(this creature|this permanent|it)\b.*\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    re.compile(r'whenever you (cast|play|attack|cycle|discard|commit).*\b(this creature|this permanent|it)\b.*\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    re.compile(r'at the beginning.*\b(this creature|this permanent|it)\b.*\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    re.compile(r'whenever.*\b(this creature|this permanent)\b (attacks|enters|becomes).*\b(this creature|this permanent|it)\b.*\bgain[s]?\b', re.IGNORECASE),
    # Named self-references (e.g., "Pristine Skywise gains")
    re.compile(r'whenever you cast.*[A-Z][a-z]+.*gains.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    re.compile(r'whenever you.*[A-Z][a-z]+.*gains.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    # Static conditional abilities (as long as, if you control X)
    re.compile(r'as long as.*\b(this creature|this permanent|it|has)\b.*(has|gains?).*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
]

# Mass grant patterns - affects multiple creatures YOU control
# Pre-compiled at module load for performance
MASS_GRANT_PATTERNS: List[Pattern] = [
    re.compile(r'creatures you control (have|gain|get)', re.IGNORECASE),
    re.compile(r'other .* you control (have|gain|get)', re.IGNORECASE),
    re.compile(r'(artifacts?|enchantments?|permanents?) you control (have|gain|get)', re.IGNORECASE),  # Artifacts you control have...
    re.compile(r'other (creatures?|artifacts?|enchantments?) (have|gain|get)', re.IGNORECASE),  # Other creatures have...
    re.compile(r'all (creatures?|slivers?|permanents?) (have|gain|get)', re.IGNORECASE),  # All creatures/slivers have...
]

# Targeted grant patterns - must specify "you control"
# Pre-compiled at module load for performance
TARGETED_GRANT_PATTERNS: List[Pattern] = [
    re.compile(r'target .* you control (gains?|gets?|has)', re.IGNORECASE),
    re.compile(r'equipped creature (gains?|gets?|has)', re.IGNORECASE),
    re.compile(r'enchanted enchantment (gains?|gets?|has)', re.IGNORECASE),
]

# Exclusion patterns - cards that remove or prevent protection
# Pre-compiled at module load for performance
EXCLUSION_PATTERNS: List[Pattern] = [
    re.compile(r"can't have (hexproof|indestructible|ward|shroud)", re.IGNORECASE),
    re.compile(r"lose[s]? (hexproof|indestructible|ward|shroud|protection)", re.IGNORECASE),
    re.compile(r"without (hexproof|indestructible|ward|shroud)", re.IGNORECASE),
    re.compile(r"protection from.*can't", re.IGNORECASE),
]

# Opponent grant patterns - grants to opponent's permanents (EXCLUDE these)
# NOTE: "all creatures" and "all permanents" are BLANKET effects (help you too),
# not opponent grants. Only exclude effects that ONLY help opponents.
# Pre-compiled at module load for performance
OPPONENT_GRANT_PATTERNS: List[Pattern] = [
    rgx.TARGET_OPPONENT,
    rgx.EACH_OPPONENT,
    rgx.OPPONENT_CONTROL,
    re.compile(r'opponent.*permanents?.*have', re.IGNORECASE),  # opponent's permanents have
]

# Blanket grant patterns - affects all permanents regardless of controller
# These are VALID protection grants that should be tagged (Blanket scope in M5)
# Pre-compiled at module load for performance
BLANKET_GRANT_PATTERNS: List[Pattern] = [
    re.compile(r'\ball creatures? (have|gain|get)\b', re.IGNORECASE),  # All creatures gain hexproof
    re.compile(r'\ball permanents? (have|gain|get)\b', re.IGNORECASE),  # All permanents gain indestructible
    re.compile(r'\beach creature (has|gains?|gets?)\b', re.IGNORECASE),  # Each creature gains ward
    rgx.EACH_PLAYER,  # Each player gains hexproof (very rare but valid blanket)
]

# Kindred-specific grant patterns for metadata tagging
KINDRED_GRANT_PATTERNS = {
    'Knights Gain Protection': [
        r'knight[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other knight[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Merfolk Gain Protection': [
        r'merfolk you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other merfolk.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Zombies Gain Protection': [
        r'zombie[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other zombie[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'target.*zombie.*\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Vampires Gain Protection': [
        r'vampire[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other vampire[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Elves Gain Protection': [
        r'el(f|ves) you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other el(f|ves).*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Dragons Gain Protection': [
        r'dragon[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other dragon[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Goblins Gain Protection': [
        r'goblin[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other goblin[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Slivers Gain Protection': [
        r'sliver[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'all sliver[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other sliver[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Artifacts Gain Protection': [
        r'artifact[s]? you control (have|gain).*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other artifact[s]? (have|gain).*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Enchantments Gain Protection': [
        r'enchantment[s]? you control (have|gain).*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other enchantment[s]? (have|gain).*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
}


def get_kindred_protection_tags(text: str) -> Set[str]:
    """
    Identify kindred-specific protection grants for metadata tagging.

    Returns a set of metadata tag names like:
    - "Knights Gain Hexproof"
    - "Spiders Gain Ward"
    - "Artifacts Gain Indestructible"

    Uses both predefined patterns and dynamic creature type detection,
    with specific ability detection (hexproof, ward, indestructible, shroud, protection).

    IMPORTANT: Only tags the specific abilities that appear in the same sentence
    as the creature type grant to avoid false positives like Svyelun.
    """
    if not text:
        return set()

    text_lower = text.lower()
    tags = set()

    # Only proceed if protective abilities are present (performance optimization)
    protective_abilities = ['hexproof', 'shroud', 'indestructible', 'ward', 'protection']
    if not any(keyword in text_lower for keyword in protective_abilities):
        return tags

    for tag_base, patterns in KINDRED_GRANT_PATTERNS.items():
        for pattern in patterns:
            pattern_compiled = re.compile(pattern, re.IGNORECASE) if isinstance(pattern, str) else pattern
            match = pattern_compiled.search(text_lower)
            if match:
                creature_type = tag_base.split(' Gain ')[0]
                # Get the matched text to check which abilities are in this specific grant
                matched_text = match.group(0)
                # Only tag abilities that appear in the matched phrase
                if 'hexproof' in matched_text:
                    tags.add(f"{creature_type} Gain Hexproof")
                if 'shroud' in matched_text:
                    tags.add(f"{creature_type} Gain Shroud")
                if 'indestructible' in matched_text:
                    tags.add(f"{creature_type} Gain Indestructible")
                if 'ward' in matched_text:
                    tags.add(f"{creature_type} Gain Ward")
                if 'protection' in matched_text:
                    tags.add(f"{creature_type} Gain Protection")
                break  # Found match for this kindred type, move to next

    # Use pre-compiled patterns for all creature types
    for compiled_pattern, tag_template in KINDRED_PATTERNS:
        match = compiled_pattern.search(text_lower)
        if match:
            creature_type = tag_template.split(' Gain ')[0]
            # Get the matched text to check which abilities are in this specific grant
            matched_text = match.group(0)
            # Only tag abilities that appear in the matched phrase
            if 'hexproof' in matched_text:
                tags.add(f"{creature_type} Gain Hexproof")
            if 'shroud' in matched_text:
                tags.add(f"{creature_type} Gain Shroud")
            if 'indestructible' in matched_text:
                tags.add(f"{creature_type} Gain Indestructible")
            if 'ward' in matched_text:
                tags.add(f"{creature_type} Gain Ward")
            if 'protection' in matched_text:
                tags.add(f"{creature_type} Gain Protection")
            # Don't break - a card could grant to multiple creature types

    return tags
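The key idea in `get_kindred_protection_tags` is that only abilities occurring inside the matched grant phrase are tagged, which avoids attributing abilities from unrelated sentences. A simplified standalone sketch of that sentence-window approach (a hypothetical mini-example, not the module's actual pattern set):

```python
import re

# Simplified kindred-grant tagging: tag only the protective abilities
# that appear inside the matched grant phrase, not anywhere in the text.
ABILITIES = ['hexproof', 'shroud', 'indestructible', 'ward', 'protection']
ZOMBIE_GRANT = re.compile(r'other zombies?\b.*?\b(?:have|gain)\b[^.]*', re.IGNORECASE)


def zombie_grant_tags(text: str) -> set[str]:
    match = ZOMBIE_GRANT.search(text.lower())
    if not match:
        return set()
    window = match.group(0)  # only the grant phrase, up to the sentence end
    return {f"Zombies Gain {a.title()}" for a in ABILITIES if a in window}


print(sorted(zombie_grant_tags("Other Zombies you control have indestructible and ward {2}.")))
# ['Zombies Gain Indestructible', 'Zombies Gain Ward']
```

Because the window stops at the sentence boundary (`[^.]*`), a later sentence mentioning hexproof on the card itself would not produce a "Zombies Gain Hexproof" tag.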
def is_opponent_grant(text: str) -> bool:
    """
    Check if card grants protection to opponent's permanents ONLY.

    Returns True if this grants ONLY to opponents (should be excluded from Protection tag).
    Does NOT exclude blanket effects like "all creatures gain hexproof" which help you too.
    """
    if not text:
        return False

    text_lower = text.lower()

    # Remove reminder text (in parentheses) to avoid false positives
    # Reminder text often mentions "opponents control" for hexproof/shroud explanations
    text_no_reminder = tag_utils.strip_reminder_text(text_lower)
    for pattern in OPPONENT_GRANT_PATTERNS:
        match = pattern.search(text_no_reminder)
        if match:
            # Must be in context of granting protection
            if any(prot in text_lower for prot in ['hexproof', 'shroud', 'indestructible', 'ward', 'protection']):
                context = tag_utils.extract_context_window(
                    text_no_reminder, match.start(), match.end(),
                    window_size=CONTEXT_WINDOW_SIZE, include_before=True
                )

                # If "you control" appears in the context, it's limiting to YOUR permanents, not opponents
                if 'you control' not in context:
                    return True

    return False
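The reminder-text stripping step in `is_opponent_grant` matters because hexproof and shroud reminder text routinely says "spells or abilities your opponents control". A minimal standalone sketch of that idea (hypothetical sample texts; the module itself uses `tag_utils.strip_reminder_text` plus a context window rather than this bare regex):

```python
import re


def strip_reminder_text(text: str) -> str:
    # Drop parenthesized reminder text, which often mentions "opponents control".
    return re.sub(r'\([^)]*\)', '', text)


OPPONENT = re.compile(r'each opponent|target opponent|opponents? control', re.IGNORECASE)


def mentions_opponent_grant(text: str) -> bool:
    cleaned = strip_reminder_text(text.lower())
    return bool(OPPONENT.search(cleaned))


print(mentions_opponent_grant(
    "Hexproof (This creature can't be the target of spells your opponents control.)"))  # False
print(mentions_opponent_grant("Each opponent's creatures gain hexproof."))  # True
```

Without the stripping step, the first card would be misread as an opponent grant purely because of its reminder text.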
def has_conditional_self_grant(text: str) -> bool:
    """
    Check if card has any conditional self-grant patterns.
    This does NOT check if it ALSO grants to others.
    """
    if not text:
        return False

    text_lower = text.lower()
    for pattern in CONDITIONAL_SELF_GRANT_PATTERNS:
        if pattern.search(text_lower):
            return True

    return False


def is_conditional_self_grant(text: str) -> bool:
    """
    Check if card only conditionally grants protection to itself.

    Examples:
    - "{B}, Discard a card: This creature gains hexproof until end of turn."
    - "Whenever you cast a noncreature spell, untap this creature. It gains protection..."
    - "Whenever this creature attacks, it gains indestructible until end of turn."

    These should be excluded as they don't provide protection to OTHER permanents.
    """
    if not text:
        return False

    text_lower = text.lower()
    found_conditional_self = has_conditional_self_grant(text)

    if not found_conditional_self:
        return False

    # If we found a conditional self-grant, check if there's ALSO a grant to others
    other_grant_patterns = [
        rgx.OTHER_CREATURES,
        re.compile(r'creatures you control (have|gain)', re.IGNORECASE),
        re.compile(r'target (creature|permanent) you control gains', re.IGNORECASE),
        re.compile(r'another target (creature|permanent)', re.IGNORECASE),
        re.compile(r'equipped creature (has|gains)', re.IGNORECASE),
        re.compile(r'enchanted creature (has|gains)', re.IGNORECASE),
        re.compile(r'target legendary', re.IGNORECASE),
        re.compile(r'permanents you control gain', re.IGNORECASE),
    ]
    has_other_grant = any(pattern.search(text_lower) for pattern in other_grant_patterns)

    # Return True only if it's ONLY conditional self-grants (no other grants)
    return not has_other_grant


def _should_exclude_token_creation(text_lower: str) -> bool:
    """Check if card only creates tokens with protection (not granting to existing permanents).

    Args:
        text_lower: Lowercased card text

    Returns:
        True if card only creates tokens, False if it also grants
    """
    token_with_protection = re.compile(r'create.*token.*with.*(hexproof|shroud|indestructible|ward|protection)', re.IGNORECASE)
    if token_with_protection.search(text_lower):
        has_grant_to_others = any(pattern.search(text_lower) for pattern in MASS_GRANT_PATTERNS)
        return not has_grant_to_others
    return False


def _should_exclude_kindred_only(text: str, text_lower: str, exclude_kindred: bool) -> bool:
    """Check if card only grants to specific kindred types.

    Args:
        text: Original card text
        text_lower: Lowercased card text
        exclude_kindred: Whether to exclude kindred-specific grants

    Returns:
        True if card only has kindred grants, False if it has broad grants
    """
    if not exclude_kindred:
        return False

    kindred_tags = get_kindred_protection_tags(text)
    if not kindred_tags:
        return False

    broad_only_patterns = [
        re.compile(r'\bcreatures you control (have|gain)\b(?!.*(knight|merfolk|zombie|elf|dragon|goblin|sliver))', re.IGNORECASE),
        re.compile(r'\bpermanents you control (have|gain)\b', re.IGNORECASE),
        re.compile(r'\beach (creature|permanent) you control', re.IGNORECASE),
        re.compile(r'\ball (creatures?|permanents?)', re.IGNORECASE),
    ]

    has_broad_grant = any(pattern.search(text_lower) for pattern in broad_only_patterns)
    return not has_broad_grant


def _check_pattern_grants(text_lower: str, pattern_list: List[Pattern]) -> bool:
    """Check if text contains protection grants matching pattern list.

    Args:
        text_lower: Lowercased card text
        pattern_list: List of grant patterns to check

    Returns:
        True if protection grant found, False otherwise
    """
    for pattern in pattern_list:
        match = pattern.search(text_lower)
        if match:
            context = tag_utils.extract_context_window(text_lower, match.start(), match.end())
            if any(prot in context for prot in PROTECTION_KEYWORDS):
                return True
    return False


def _has_inherent_protection_only(text_lower: str, keywords: str, found_grant: bool) -> bool:
    """Check if card only has inherent protection without granting.

    Args:
        text_lower: Lowercased card text
        keywords: Card keywords
        found_grant: Whether a grant pattern was found

    Returns:
        True if card only has inherent protection, False otherwise
    """
    if not keywords:
        return False

    keywords_lower = keywords.lower()
    has_inherent = any(k in keywords_lower for k in PROTECTION_KEYWORDS)

    if not has_inherent or found_grant:
        return False

    stat_only_pattern = re.compile(r'(get[s]?|gain[s]?)\s+[+\-][0-9X]+/[+\-][0-9X]+', re.IGNORECASE)
    has_stat_only = bool(stat_only_pattern.search(text_lower))
    mentions_other_without_prot = False
    if 'other' in text_lower:
        other_idx = text_lower.find('other')
        remaining_text = text_lower[other_idx:]
        mentions_other_without_prot = not any(prot in remaining_text for prot in PROTECTION_KEYWORDS)

    return has_stat_only or mentions_other_without_prot


def is_granting_protection(text: str, keywords: str, exclude_kindred: bool = False) -> bool:
    """
    Determine if a card grants protection effects to other permanents.

    Returns True if the card gives/grants protection to other cards unconditionally.
    Returns False if:
    - Card only has inherent protection
    - Card only conditionally grants to itself
    - Card grants to opponent's permanents
|
||||||
|
- Card grants only to specific kindred types (when exclude_kindred=True)
|
||||||
|
- Card creates tokens with protection (not granting to existing permanents)
|
||||||
|
- Card only modifies non-protection stats of other permanents
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Card text to analyze
|
||||||
|
keywords: Card keywords (comma-separated)
|
||||||
|
exclude_kindred: If True, exclude kindred-specific grants
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if card grants broad protection, False otherwise
|
||||||
|
"""
|
||||||
|
if not text:
|
||||||
|
return False
|
||||||
|
|
||||||
|
text_lower = text.lower()
|
||||||
|
|
||||||
|
# Early exclusion checks
|
||||||
|
if is_opponent_grant(text):
|
||||||
|
return False
|
||||||
|
|
||||||
|
if is_conditional_self_grant(text):
|
||||||
|
return False
|
||||||
|
|
||||||
|
if any(pattern.search(text_lower) for pattern in EXCLUSION_PATTERNS):
|
||||||
|
return False
|
||||||
|
|
||||||
|
if _should_exclude_token_creation(text_lower):
|
||||||
|
return False
|
||||||
|
|
||||||
|
if _should_exclude_kindred_only(text, text_lower, exclude_kindred):
|
||||||
|
return False
|
||||||
|
found_grant = False
|
||||||
|
if _check_pattern_grants(text_lower, BLANKET_GRANT_PATTERNS):
|
||||||
|
found_grant = True
|
||||||
|
elif _check_pattern_grants(text_lower, MASS_GRANT_PATTERNS):
|
||||||
|
found_grant = True
|
||||||
|
elif _check_pattern_grants(text_lower, TARGETED_GRANT_PATTERNS):
|
||||||
|
found_grant = True
|
||||||
|
elif any(pattern.search(text_lower) for pattern in GRANT_VERB_PATTERNS):
|
||||||
|
found_grant = True
|
||||||
|
if _has_inherent_protection_only(text_lower, keywords, found_grant):
|
||||||
|
return False
|
||||||
|
|
||||||
|
return found_grant
|
||||||
|
|
||||||
|
|
||||||
|
def categorize_protection_card(name: str, text: str, keywords: str, card_type: str, exclude_kindred: bool = False) -> str:
|
||||||
|
"""
|
||||||
|
Categorize a Protection-tagged card for audit purposes.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
name: Card name
|
||||||
|
text: Card text
|
||||||
|
keywords: Card keywords
|
||||||
|
card_type: Card type line
|
||||||
|
exclude_kindred: If True, kindred-specific grants are categorized as metadata, not Grant
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
'Grant' - gives broad protection to others
|
||||||
|
'Kindred' - gives kindred-specific protection (metadata tag)
|
||||||
|
'Inherent' - has protection itself
|
||||||
|
'ConditionalSelf' - only conditionally grants to itself
|
||||||
|
'Opponent' - grants to opponent's permanents
|
||||||
|
'Neither' - false positive
|
||||||
|
"""
|
||||||
|
keywords_lower = keywords.lower() if keywords else ''
|
||||||
|
if is_opponent_grant(text):
|
||||||
|
return 'Opponent'
|
||||||
|
if is_conditional_self_grant(text):
|
||||||
|
return 'ConditionalSelf'
|
||||||
|
has_cond_self = has_conditional_self_grant(text)
|
||||||
|
has_inherent = any(k in keywords_lower for k in PROTECTION_KEYWORDS)
|
||||||
|
kindred_tags = get_kindred_protection_tags(text)
|
||||||
|
if kindred_tags and exclude_kindred:
|
||||||
|
grants_broad = is_granting_protection(text, keywords, exclude_kindred=True)
|
||||||
|
|
||||||
|
if grants_broad and has_inherent:
|
||||||
|
# Has inherent + kindred + broad grants
|
||||||
|
return 'Mixed'
|
||||||
|
elif grants_broad:
|
||||||
|
# Has kindred + broad grants (but no inherent)
|
||||||
|
# This is just Grant with kindred metadata tags
|
||||||
|
return 'Grant'
|
||||||
|
elif has_inherent:
|
||||||
|
# Has inherent + kindred only (not broad)
|
||||||
|
# This is still just Kindred category (inherent is separate from granting)
|
||||||
|
return 'Kindred'
|
||||||
|
else:
|
||||||
|
# Only kindred grants, no inherent or broad
|
||||||
|
return 'Kindred'
|
||||||
|
grants_protection = is_granting_protection(text, keywords, exclude_kindred=exclude_kindred)
|
||||||
|
|
||||||
|
# Categorize based on what it does
|
||||||
|
if grants_protection and has_cond_self:
|
||||||
|
# Has conditional self-grant + grants to others = Mixed
|
||||||
|
return 'Mixed'
|
||||||
|
elif grants_protection and has_inherent:
|
||||||
|
return 'Mixed' # Has inherent + grants broadly
|
||||||
|
elif grants_protection:
|
||||||
|
return 'Grant' # Only grants broadly
|
||||||
|
elif has_inherent:
|
||||||
|
return 'Inherent' # Only has inherent
|
||||||
|
else:
|
||||||
|
return 'Neither' # False positive
|
||||||
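The kindred-only exclusion above comes down to whether any of the `broad_only_patterns` regexes fire. A standalone sketch of just that check, with the patterns copied from `_should_exclude_kindred_only` (the sample oracle texts are illustrative, not from the repo's test suite):

```python
import re

# Patterns as written in _should_exclude_kindred_only; sample texts are illustrative.
BROAD_ONLY_PATTERNS = [
    re.compile(r'\bcreatures you control (have|gain)\b(?!.*(knight|merfolk|zombie|elf|dragon|goblin|sliver))', re.IGNORECASE),
    re.compile(r'\bpermanents you control (have|gain)\b', re.IGNORECASE),
    re.compile(r'\beach (creature|permanent) you control', re.IGNORECASE),
    re.compile(r'\ball (creatures?|permanents?)', re.IGNORECASE),
]

def has_broad_grant(text_lower: str) -> bool:
    """True when the text grants to an unqualified group rather than a single tribe."""
    return any(p.search(text_lower) for p in BROAD_ONLY_PATTERNS)

print(has_broad_grant("creatures you control have hexproof"))  # True  -> broad grant
print(has_broad_grant("knights you control gain hexproof"))    # False -> kindred-only
```

Note the negative lookahead only blocks a tribe word appearing *after* the grant verb; a tribal qualifier before "creatures you control" would not be caught by it, which is why the function also consults `get_kindred_protection_tags`.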
169 code/tagging/protection_scope_detection.py Normal file
@@ -0,0 +1,169 @@
"""
Protection Scope Detection Module

Detects the scope of protection effects (Self, Your Permanents, Blanket, Opponent Permanents)
to enable intelligent filtering in deck building.

Part of M5: Protection Effect Granularity milestone.
Refactored in M2: Create Scope Detection Utilities to use generic scope detection.
"""

# Standard library imports
import re
from typing import Optional, Set

# Local application imports
from code.logging_util import get_logger
from . import scope_detection_utils as scope_utils
from .tag_constants import PROTECTION_ABILITIES

logger = get_logger(__name__)


# Protection scope pattern definitions
def _get_protection_scope_patterns(ability: str) -> scope_utils.ScopePatterns:
    """
    Build scope patterns for protection abilities.

    Args:
        ability: Ability keyword (e.g., "hexproof", "ward")

    Returns:
        ScopePatterns object with compiled patterns
    """
    ability_lower = ability.lower()

    # Opponent patterns: grants protection TO opponent's permanents
    # Note: Must distinguish from hexproof reminder text "opponents control [spells/abilities]"
    opponent_patterns = [
        re.compile(r'creatures?\s+(?:your\s+)?opponents?\s+control\s+(?:have|gain)', re.IGNORECASE),
        re.compile(r'permanents?\s+(?:your\s+)?opponents?\s+control\s+(?:have|gain)', re.IGNORECASE),
        re.compile(r'each\s+creature\s+an?\s+opponent\s+controls?\s+(?:has|gains?)', re.IGNORECASE),
    ]

    # Self-reference patterns
    self_patterns = [
        # Tilde (~) - strong self-reference indicator
        re.compile(r'~\s+(?:has|gains?)\s+' + ability_lower, re.IGNORECASE),
        re.compile(r'~\s+is\s+' + ability_lower, re.IGNORECASE),
        # "this creature/permanent" pronouns
        re.compile(r'this\s+(?:creature|permanent|artifact|enchantment)\s+(?:has|gains?)\s+' + ability_lower, re.IGNORECASE),
        # Starts with ability (likely self)
        re.compile(r'^(?:has|gains?)\s+' + ability_lower, re.IGNORECASE),
    ]

    # Your permanents patterns
    your_patterns = [
        re.compile(r'(?:other\s+)?(?:creatures?|permanents?|artifacts?|enchantments?)\s+you\s+control', re.IGNORECASE),
        re.compile(r'your\s+(?:creatures?|permanents?|artifacts?|enchantments?)', re.IGNORECASE),
        re.compile(r'each\s+(?:creature|permanent)\s+you\s+control', re.IGNORECASE),
        re.compile(r'other\s+\w+s?\s+you\s+control', re.IGNORECASE),  # "Other Merfolk you control", etc.
        # "Other X you control...have Y" pattern for static grants
        re.compile(r'other\s+(?:\w+\s+)?(?:creatures?|permanents?)\s+you\s+control\s+(?:get\s+[^.]*\s+and\s+)?have\s+' + ability_lower, re.IGNORECASE),
        re.compile(r'other\s+\w+s?\s+you\s+control\s+(?:get\s+[^.]*\s+and\s+)?have\s+' + ability_lower, re.IGNORECASE),  # "Other Knights you control...have"
        re.compile(r'equipped\s+(?:creature|permanent)\s+(?:gets\s+[^.]*\s+and\s+)?(?:has|gains?)\s+(?:[^.]*\s+and\s+)?' + ability_lower, re.IGNORECASE),  # Equipment
        re.compile(r'enchanted\s+(?:creature|permanent)\s+(?:gets\s+[^.]*\s+and\s+)?(?:has|gains?)\s+(?:[^.]*\s+and\s+)?' + ability_lower, re.IGNORECASE),  # Aura
        re.compile(r'target\s+(?:\w+\s+)?(?:creature|permanent)\s+(?:gets\s+[^.]*\s+and\s+)?(?:gains?)\s+' + ability_lower, re.IGNORECASE),  # Target
    ]

    # Blanket patterns (no ownership qualifier)
    # Note: Abilities can be listed with "and" (e.g., "gain hexproof and indestructible")
    blanket_patterns = [
        re.compile(r'all\s+(?:creatures?|permanents?)\s+(?:have|gain)\s+(?:[^.]*\s+and\s+)?' + ability_lower, re.IGNORECASE),
        re.compile(r'each\s+(?:creature|permanent)\s+(?:has|gains?)\s+(?:[^.]*\s+and\s+)?' + ability_lower, re.IGNORECASE),
        re.compile(r'(?:creatures?|permanents?)\s+(?:have|gain)\s+(?:[^.]*\s+and\s+)?' + ability_lower, re.IGNORECASE),
    ]

    return scope_utils.ScopePatterns(
        opponent=opponent_patterns,
        self_ref=self_patterns,
        your_permanents=your_patterns,
        blanket=blanket_patterns
    )


def detect_protection_scope(text: str, card_name: str, ability: str, keywords: Optional[str] = None) -> Optional[str]:
    """
    Detect the scope of a protection effect.

    Detection priority order (prevents misclassification):
    0. Static keyword → "Self"
    1. Opponent ownership → "Opponent Permanents"
    2. Self-reference → "Self"
    3. Your ownership → "Your Permanents"
    4. No ownership qualifier → "Blanket"

    Args:
        text: Card text (lowercase for pattern matching)
        card_name: Card name (for self-reference detection)
        ability: Ability type (Ward, Hexproof, etc.)
        keywords: Optional keywords field for static keyword detection

    Returns:
        Scope prefix or None: "Self", "Your Permanents", "Blanket", "Opponent Permanents"
    """
    if not text or not ability:
        return None

    # Build patterns for this ability
    patterns = _get_protection_scope_patterns(ability)

    # Use generic scope detection with grant verb checking AND keywords
    return scope_utils.detect_scope(
        text=text,
        card_name=card_name,
        ability_keyword=ability,
        patterns=patterns,
        allow_multiple=False,
        check_grant_verbs=True,
        keywords=keywords
    )


def get_protection_scope_tags(text: str, card_name: str, keywords: Optional[str] = None) -> Set[str]:
    """
    Get all protection scope metadata tags for a card.

    A card can have multiple protection scopes (e.g., self-hexproof + grants ward to others).

    Args:
        text: Card text
        card_name: Card name
        keywords: Optional keywords field for static keyword detection

    Returns:
        Set of metadata tags like {"Self: Indestructible", "Your Permanents: Ward"}
    """
    if not text or not card_name:
        return set()

    scope_tags = set()

    # Check each protection ability
    for ability in PROTECTION_ABILITIES:
        scope = detect_protection_scope(text, card_name, ability, keywords)

        if scope:
            # Format: "{Scope}: {Ability}"
            tag = f"{scope}: {ability}"
            scope_tags.add(tag)
            logger.debug(f"Card '{card_name}': detected scope tag '{tag}'")

    return scope_tags


def has_any_protection(text: str) -> bool:
    """
    Quick check if card text contains any protection keywords.

    Args:
        text: Card text

    Returns:
        True if any protection keyword found
    """
    if not text:
        return False

    text_lower = text.lower()
    return any(ability.lower() in text_lower for ability in PROTECTION_ABILITIES)
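The priority cascade that `detect_protection_scope` delegates to `scope_utils.detect_scope` can be sketched standalone. This is a simplified stand-in (a few representative regexes instead of the repo's full `ScopePatterns`, and no static-keyword or grant-verb handling), showing why ordering matters: the blanket pattern would also match text that a more specific scope should claim first.

```python
import re
from typing import Optional

def detect_scope_sketch(text: str, ability: str) -> Optional[str]:
    """Simplified priority cascade mirroring detect_protection_scope's ordering.

    Assumes `ability` is a plain keyword like "hexproof" (no regex metacharacters).
    """
    t, a = text.lower(), ability.lower()
    # 1. Opponent ownership is checked first
    if re.search(r'creatures?\s+(?:your\s+)?opponents?\s+control\s+(?:have|gain)', t):
        return "Opponent Permanents"
    # 2. Self-reference (tilde or "this creature")
    if re.search(rf'(?:~|this\s+creature)\s+(?:has|gains?)\s+{a}', t):
        return "Self"
    # 3. Your ownership
    if re.search(r'(?:creatures?|permanents?)\s+you\s+control', t):
        return "Your Permanents"
    # 4. No ownership qualifier -> blanket
    if re.search(rf'(?:all\s+|each\s+)?(?:creatures?|permanents?)\s+(?:have|has|gains?)\s+{a}', t):
        return "Blanket"
    return None

print(detect_scope_sketch("other creatures you control have hexproof", "hexproof"))  # Your Permanents
print(detect_scope_sketch("all creatures have hexproof", "hexproof"))                # Blanket
```

Checking "you control" before the unqualified patterns is what keeps a symmetric-looking phrase like "creatures you control gain hexproof" out of the Blanket bucket.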
455 code/tagging/regex_patterns.py Normal file
@@ -0,0 +1,455 @@
|
||||||
|
"""
|
||||||
|
Centralized regex patterns for MTG card tagging.
|
||||||
|
|
||||||
|
All patterns compiled with re.IGNORECASE for case-insensitive matching.
|
||||||
|
Organized by semantic category for maintainability and reusability.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
from code.tagging import regex_patterns as rgx
|
||||||
|
|
||||||
|
mask = df['text'].str.contains(rgx.YOU_CONTROL, na=False)
|
||||||
|
if rgx.GRANT_HEXPROOF.search(text):
|
||||||
|
...
|
||||||
|
|
||||||
|
# Or use builder functions
|
||||||
|
pattern = rgx.ownership_pattern('creature', 'you')
|
||||||
|
mask = df['text'].str.contains(pattern, na=False)
|
||||||
|
"""
|
||||||
|
|
||||||
|
import re
|
||||||
|
from typing import Pattern, List
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# OWNERSHIP & CONTROLLER PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
YOU_CONTROL: Pattern = re.compile(r'you control', re.IGNORECASE)
|
||||||
|
THEY_CONTROL: Pattern = re.compile(r'they control', re.IGNORECASE)
|
||||||
|
OPPONENT_CONTROL: Pattern = re.compile(r'opponent[s]? control', re.IGNORECASE)
|
||||||
|
|
||||||
|
CREATURE_YOU_CONTROL: Pattern = re.compile(r'creature[s]? you control', re.IGNORECASE)
|
||||||
|
PERMANENT_YOU_CONTROL: Pattern = re.compile(r'permanent[s]? you control', re.IGNORECASE)
|
||||||
|
ARTIFACT_YOU_CONTROL: Pattern = re.compile(r'artifact[s]? you control', re.IGNORECASE)
|
||||||
|
ENCHANTMENT_YOU_CONTROL: Pattern = re.compile(r'enchantment[s]? you control', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# GRANT VERB PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
GAIN: Pattern = re.compile(r'\bgain[s]?\b', re.IGNORECASE)
|
||||||
|
HAS: Pattern = re.compile(r'\bhas\b', re.IGNORECASE)
|
||||||
|
HAVE: Pattern = re.compile(r'\bhave\b', re.IGNORECASE)
|
||||||
|
GET: Pattern = re.compile(r'\bget[s]?\b', re.IGNORECASE)
|
||||||
|
|
||||||
|
GRANT_VERBS: List[str] = ['gain', 'gains', 'has', 'have', 'get', 'gets']
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# TARGETING PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
TARGET_PLAYER: Pattern = re.compile(r'target player', re.IGNORECASE)
|
||||||
|
TARGET_OPPONENT: Pattern = re.compile(r'target opponent', re.IGNORECASE)
|
||||||
|
TARGET_CREATURE: Pattern = re.compile(r'target creature', re.IGNORECASE)
|
||||||
|
TARGET_PERMANENT: Pattern = re.compile(r'target permanent', re.IGNORECASE)
|
||||||
|
TARGET_ARTIFACT: Pattern = re.compile(r'target artifact', re.IGNORECASE)
|
||||||
|
TARGET_ENCHANTMENT: Pattern = re.compile(r'target enchantment', re.IGNORECASE)
|
||||||
|
|
||||||
|
EACH_PLAYER: Pattern = re.compile(r'each player', re.IGNORECASE)
|
||||||
|
EACH_OPPONENT: Pattern = re.compile(r'each opponent', re.IGNORECASE)
|
||||||
|
TARGET_YOU_CONTROL: Pattern = re.compile(r'target .* you control', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# PROTECTION ABILITY PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
HEXPROOF: Pattern = re.compile(r'\bhexproof\b', re.IGNORECASE)
|
||||||
|
SHROUD: Pattern = re.compile(r'\bshroud\b', re.IGNORECASE)
|
||||||
|
INDESTRUCTIBLE: Pattern = re.compile(r'\bindestructible\b', re.IGNORECASE)
|
||||||
|
WARD: Pattern = re.compile(r'\bward\b', re.IGNORECASE)
|
||||||
|
PROTECTION_FROM: Pattern = re.compile(r'protection from', re.IGNORECASE)
|
||||||
|
|
||||||
|
PROTECTION_ABILITIES: List[str] = ['hexproof', 'shroud', 'indestructible', 'ward', 'protection']
|
||||||
|
|
||||||
|
CANT_HAVE_PROTECTION: Pattern = re.compile(r"can't have (hexproof|indestructible|ward|shroud)", re.IGNORECASE)
|
||||||
|
LOSE_PROTECTION: Pattern = re.compile(r"lose[s]? (hexproof|indestructible|ward|shroud|protection)", re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# CARD DRAW PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
DRAW_A_CARD: Pattern = re.compile(r'draw[s]? (?:a|one) card', re.IGNORECASE)
|
||||||
|
DRAW_CARDS: Pattern = re.compile(r'draw[s]? (?:two|three|four|five|x|\d+) card', re.IGNORECASE)
|
||||||
|
DRAW: Pattern = re.compile(r'\bdraw[s]?\b', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# TOKEN CREATION PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
CREATE_TOKEN: Pattern = re.compile(r'create[s]?.*token', re.IGNORECASE)
|
||||||
|
PUT_TOKEN: Pattern = re.compile(r'put[s]?.*token', re.IGNORECASE)
|
||||||
|
|
||||||
|
CREATE_TREASURE: Pattern = re.compile(r'create.*treasure token', re.IGNORECASE)
|
||||||
|
CREATE_FOOD: Pattern = re.compile(r'create.*food token', re.IGNORECASE)
|
||||||
|
CREATE_CLUE: Pattern = re.compile(r'create.*clue token', re.IGNORECASE)
|
||||||
|
CREATE_BLOOD: Pattern = re.compile(r'create.*blood token', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# COUNTER PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
PLUS_ONE_COUNTER: Pattern = re.compile(r'\+1/\+1 counter', re.IGNORECASE)
|
||||||
|
MINUS_ONE_COUNTER: Pattern = re.compile(r'\-1/\-1 counter', re.IGNORECASE)
|
||||||
|
LOYALTY_COUNTER: Pattern = re.compile(r'loyalty counter', re.IGNORECASE)
|
||||||
|
PROLIFERATE: Pattern = re.compile(r'\bproliferate\b', re.IGNORECASE)
|
||||||
|
|
||||||
|
ONE_OR_MORE_COUNTERS: Pattern = re.compile(r'one or more counter', re.IGNORECASE)
|
||||||
|
ONE_OR_MORE_PLUS_ONE_COUNTERS: Pattern = re.compile(r'one or more \+1/\+1 counter', re.IGNORECASE)
|
||||||
|
IF_HAD_COUNTERS: Pattern = re.compile(r'if it had counter', re.IGNORECASE)
|
||||||
|
WITH_COUNTERS_ON_THEM: Pattern = re.compile(r'with counter[s]? on them', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# SACRIFICE & REMOVAL PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
SACRIFICE: Pattern = re.compile(r'sacrifice[s]?', re.IGNORECASE)
|
||||||
|
SACRIFICED: Pattern = re.compile(r'sacrificed', re.IGNORECASE)
|
||||||
|
DESTROY: Pattern = re.compile(r'destroy[s]?', re.IGNORECASE)
|
||||||
|
EXILE: Pattern = re.compile(r'exile[s]?', re.IGNORECASE)
|
||||||
|
EXILED: Pattern = re.compile(r'exiled', re.IGNORECASE)
|
||||||
|
|
||||||
|
SACRIFICE_DRAW: Pattern = re.compile(r'sacrifice (?:a|an) (?:artifact|creature|permanent)(?:[^,]*),?[^,]*draw', re.IGNORECASE)
|
||||||
|
SACRIFICE_COLON_DRAW: Pattern = re.compile(r'sacrifice [^:]+: draw', re.IGNORECASE)
|
||||||
|
SACRIFICED_COMMA_DRAW: Pattern = re.compile(r'sacrificed[^,]+, draw', re.IGNORECASE)
|
||||||
|
EXILE_RETURN_BATTLEFIELD: Pattern = re.compile(r'exile.*return.*to the battlefield', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# DISCARD PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
DISCARD_A_CARD: Pattern = re.compile(r'discard (?:a|one|two|three|x) card', re.IGNORECASE)
|
||||||
|
DISCARD_YOUR_HAND: Pattern = re.compile(r'discard your hand', re.IGNORECASE)
|
||||||
|
YOU_DISCARD: Pattern = re.compile(r'you discard', re.IGNORECASE)
|
||||||
|
|
||||||
|
# Discard triggers
|
||||||
|
WHENEVER_YOU_DISCARD: Pattern = re.compile(r'whenever you discard', re.IGNORECASE)
|
||||||
|
IF_YOU_DISCARDED: Pattern = re.compile(r'if you discarded', re.IGNORECASE)
|
||||||
|
WHEN_YOU_DISCARD: Pattern = re.compile(r'when you discard', re.IGNORECASE)
|
||||||
|
FOR_EACH_DISCARDED: Pattern = re.compile(r'for each card you discarded', re.IGNORECASE)
|
||||||
|
|
||||||
|
# Opponent discard
|
||||||
|
TARGET_PLAYER_DISCARDS: Pattern = re.compile(r'target player discards', re.IGNORECASE)
|
||||||
|
TARGET_OPPONENT_DISCARDS: Pattern = re.compile(r'target opponent discards', re.IGNORECASE)
|
||||||
|
EACH_PLAYER_DISCARDS: Pattern = re.compile(r'each player discards', re.IGNORECASE)
|
||||||
|
EACH_OPPONENT_DISCARDS: Pattern = re.compile(r'each opponent discards', re.IGNORECASE)
|
||||||
|
THAT_PLAYER_DISCARDS: Pattern = re.compile(r'that player discards', re.IGNORECASE)
|
||||||
|
|
||||||
|
# Discard cost
|
||||||
|
ADDITIONAL_COST_DISCARD: Pattern = re.compile(r'as an additional cost to (?:cast this spell|activate this ability),? discard (?:a|one) card', re.IGNORECASE)
|
||||||
|
ADDITIONAL_COST_DISCARD_SHORT: Pattern = re.compile(r'as an additional cost,? discard (?:a|one) card', re.IGNORECASE)
|
||||||
|
|
||||||
|
MADNESS: Pattern = re.compile(r'\bmadness\b', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# DAMAGE & LIFE LOSS PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
DEALS_ONE_DAMAGE: Pattern = re.compile(r'deals\s+1\s+damage', re.IGNORECASE)
|
||||||
|
EXACTLY_ONE_DAMAGE: Pattern = re.compile(r'exactly\s+1\s+damage', re.IGNORECASE)
|
||||||
|
LOSES_ONE_LIFE: Pattern = re.compile(r'loses\s+1\s+life', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# COST REDUCTION PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
COST_LESS: Pattern = re.compile(r'cost[s]? \{[\d\w]\} less', re.IGNORECASE)
|
||||||
|
COST_LESS_TO_CAST: Pattern = re.compile(r'cost[s]? less to cast', re.IGNORECASE)
|
||||||
|
WITH_X_IN_COST: Pattern = re.compile(r'with \{[xX]\} in (?:its|their)', re.IGNORECASE)
|
||||||
|
AFFINITY_FOR: Pattern = re.compile(r'affinity for', re.IGNORECASE)
|
||||||
|
SPELLS_COST: Pattern = re.compile(r'spells cost', re.IGNORECASE)
|
||||||
|
SPELLS_YOU_CAST_COST: Pattern = re.compile(r'spells you cast cost', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# MONARCH & INITIATIVE PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
BECOME_MONARCH: Pattern = re.compile(r'becomes? the monarch', re.IGNORECASE)
|
||||||
|
IS_MONARCH: Pattern = re.compile(r'is the monarch', re.IGNORECASE)
|
||||||
|
WAS_MONARCH: Pattern = re.compile(r'was the monarch', re.IGNORECASE)
|
||||||
|
YOU_ARE_MONARCH: Pattern = re.compile(r"you are the monarch|you're the monarch", re.IGNORECASE)
|
||||||
|
YOU_BECOME_MONARCH: Pattern = re.compile(r'you become the monarch', re.IGNORECASE)
|
||||||
|
CANT_BECOME_MONARCH: Pattern = re.compile(r"can't become the monarch", re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# KEYWORD ABILITY PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
PARTNER_BASIC: Pattern = re.compile(r'\bpartner\b(?!\s*(?:with|[-—–]))', re.IGNORECASE)
|
||||||
|
PARTNER_WITH: Pattern = re.compile(r'partner with', re.IGNORECASE)
|
||||||
|
PARTNER_SURVIVORS: Pattern = re.compile(r'Partner\s*[-—–]\s*Survivors', re.IGNORECASE)
|
||||||
|
PARTNER_FATHER_SON: Pattern = re.compile(r'Partner\s*[-—–]\s*Father\s*&\s*Son', re.IGNORECASE)
|
||||||
|
|
||||||
|
FLYING: Pattern = re.compile(r'\bflying\b', re.IGNORECASE)
|
||||||
|
VIGILANCE: Pattern = re.compile(r'\bvigilance\b', re.IGNORECASE)
|
||||||
|
TRAMPLE: Pattern = re.compile(r'\btrample\b', re.IGNORECASE)
|
||||||
|
HASTE: Pattern = re.compile(r'\bhaste\b', re.IGNORECASE)
|
||||||
|
LIFELINK: Pattern = re.compile(r'\blifelink\b', re.IGNORECASE)
|
||||||
|
DEATHTOUCH: Pattern = re.compile(r'\bdeathtouch\b', re.IGNORECASE)
|
||||||
|
DOUBLE_STRIKE: Pattern = re.compile(r'double strike', re.IGNORECASE)
|
||||||
|
FIRST_STRIKE: Pattern = re.compile(r'first strike', re.IGNORECASE)
|
||||||
|
MENACE: Pattern = re.compile(r'\bmenace\b', re.IGNORECASE)
|
||||||
|
REACH: Pattern = re.compile(r'\breach\b', re.IGNORECASE)
|
||||||
|
|
||||||
|
UNDYING: Pattern = re.compile(r'\bundying\b', re.IGNORECASE)
|
||||||
|
PERSIST: Pattern = re.compile(r'\bpersist\b', re.IGNORECASE)
|
||||||
|
PHASING: Pattern = re.compile(r'\bphasing\b', re.IGNORECASE)
|
||||||
|
FLASH: Pattern = re.compile(r'\bflash\b', re.IGNORECASE)
|
||||||
|
TOXIC: Pattern = re.compile(r'toxic\s*\d+', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# RETURN TO BATTLEFIELD PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
RETURN_TO_BATTLEFIELD: Pattern = re.compile(r'return.*to the battlefield', re.IGNORECASE)
|
||||||
|
RETURN_IT_TO_BATTLEFIELD: Pattern = re.compile(r'return it to the battlefield', re.IGNORECASE)
|
||||||
|
RETURN_THAT_CARD_TO_BATTLEFIELD: Pattern = re.compile(r'return that card to the battlefield', re.IGNORECASE)
|
||||||
|
RETURN_THEM_TO_BATTLEFIELD: Pattern = re.compile(r'return them to the battlefield', re.IGNORECASE)
|
||||||
|
RETURN_THOSE_CARDS_TO_BATTLEFIELD: Pattern = re.compile(r'return those cards to the battlefield', re.IGNORECASE)
|
||||||
|
|
||||||
|
RETURN_TO_HAND: Pattern = re.compile(r'return.*to.*hand', re.IGNORECASE)
|
||||||
|
RETURN_YOU_CONTROL_TO_HAND: Pattern = re.compile(r'return target.*you control.*to.*hand', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# SCOPE & QUALIFIER PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
OTHER_CREATURES: Pattern = re.compile(r'other creature[s]?', re.IGNORECASE)
|
||||||
|
ALL_CREATURES: Pattern = re.compile(r'\ball creature[s]?\b', re.IGNORECASE)
|
||||||
|
ALL_PERMANENTS: Pattern = re.compile(r'\ball permanent[s]?\b', re.IGNORECASE)
|
||||||
|
ALL_SLIVERS: Pattern = re.compile(r'\ball sliver[s]?\b', re.IGNORECASE)
|
||||||
|
|
||||||
|
EQUIPPED_CREATURE: Pattern = re.compile(r'equipped creature', re.IGNORECASE)
|
||||||
|
ENCHANTED_CREATURE: Pattern = re.compile(r'enchanted creature', re.IGNORECASE)
|
||||||
|
ENCHANTED_PERMANENT: Pattern = re.compile(r'enchanted permanent', re.IGNORECASE)
|
||||||
|
ENCHANTED_ENCHANTMENT: Pattern = re.compile(r'enchanted enchantment', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# COMBAT PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
ATTACK: Pattern = re.compile(r'\battack[s]?\b', re.IGNORECASE)
|
||||||
|
ATTACKS: Pattern = re.compile(r'\battacks\b', re.IGNORECASE)
|
||||||
|
BLOCK: Pattern = re.compile(r'\bblock[s]?\b', re.IGNORECASE)
|
||||||
|
BLOCKS: Pattern = re.compile(r'\bblocks\b', re.IGNORECASE)
|
||||||
|
COMBAT_DAMAGE: Pattern = re.compile(r'combat damage', re.IGNORECASE)
|
||||||
|
|
||||||
|
WHENEVER_ATTACKS: Pattern = re.compile(r'whenever .* attacks', re.IGNORECASE)
|
||||||
|
WHEN_ATTACKS: Pattern = re.compile(r'when .* attacks', re.IGNORECASE)
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# TYPE LINE PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
INSTANT: Pattern = re.compile(r'\bInstant\b', re.IGNORECASE)
|
||||||
|
SORCERY: Pattern = re.compile(r'\bSorcery\b', re.IGNORECASE)
|
||||||
|
ARTIFACT: Pattern = re.compile(r'\bArtifact\b', re.IGNORECASE)
|
||||||
|
ENCHANTMENT: Pattern = re.compile(r'\bEnchantment\b', re.IGNORECASE)
|
||||||
|
CREATURE: Pattern = re.compile(r'\bCreature\b', re.IGNORECASE)
|
||||||
|
PLANESWALKER: Pattern = re.compile(r'\bPlaneswalker\b', re.IGNORECASE)
|
||||||
|
LAND: Pattern = re.compile(r'\bLand\b', re.IGNORECASE)
|
||||||
|
|
||||||
|
AURA: Pattern = re.compile(r'\bAura\b', re.IGNORECASE)
|
||||||
|
EQUIPMENT: Pattern = re.compile(r'\bEquipment\b', re.IGNORECASE)
|
||||||
|
VEHICLE: Pattern = re.compile(r'\bVehicle\b', re.IGNORECASE)
|
||||||
|
SAGA: Pattern = re.compile(r'\bSaga\b', re.IGNORECASE)
|
||||||
|
|
||||||
|
NONCREATURE: Pattern = re.compile(r'noncreature', re.IGNORECASE)
|
||||||
|
|
# =============================================================================
# PATTERN BUILDER FUNCTIONS
# =============================================================================


def ownership_pattern(subject: str, owner: str = "you") -> Pattern:
    """
    Build an ownership pattern like 'creatures you control' or 'permanents opponent controls'.

    Args:
        subject: The card type (e.g., 'creature', 'permanent', 'artifact')
        owner: Controller ('you', 'opponent', 'they', etc.)

    Returns:
        Compiled regex pattern

    Examples:
        >>> ownership_pattern('creature', 'you')
        # Matches "creatures you control"
        >>> ownership_pattern('artifact', 'opponent')
        # Matches "artifacts opponent controls"
    """
    pattern = fr'{subject}[s]?\s+{owner}\s+control[s]?'
    return re.compile(pattern, re.IGNORECASE)
def grant_pattern(subject: str, verb: str, ability: str) -> Pattern:
    """
    Build a grant pattern like 'creatures you control gain hexproof'.

    Args:
        subject: What gains the ability ('creatures you control', 'target creature', etc.)
        verb: Grant verb ('gain', 'has', 'get', etc.)
        ability: Ability granted ('hexproof', 'flying', 'ward', etc.)

    Returns:
        Compiled regex pattern

    Examples:
        >>> grant_pattern('creatures you control', 'gain', 'hexproof')
        # Matches "creatures you control gain hexproof"
    """
    pattern = fr'{subject}\s+{verb}[s]?\s+{ability}'
    return re.compile(pattern, re.IGNORECASE)
def token_creation_pattern(quantity: str, token_type: str) -> Pattern:
    """
    Build a token creation pattern like 'create two 1/1 Soldier tokens'.

    Args:
        quantity: Number word or variable ('one', 'two', 'x', etc.)
        token_type: Token name ('treasure', 'food', 'soldier', etc.)

    Returns:
        Compiled regex pattern

    Examples:
        >>> token_creation_pattern('two', 'treasure')
        # Matches "create two Treasure tokens"
    """
    pattern = fr'create[s]?\s+(?:{quantity})\s+.*{token_type}\s+token'
    return re.compile(pattern, re.IGNORECASE)
def kindred_grant_pattern(tribe: str, ability: str) -> Pattern:
    """
    Build a kindred grant pattern like 'knights you control gain protection'.

    Args:
        tribe: Creature type ('knight', 'elf', 'zombie', etc.)
        ability: Ability granted ('hexproof', 'protection', etc.)

    Returns:
        Compiled regex pattern

    Examples:
        >>> kindred_grant_pattern('knight', 'hexproof')
        # Matches "Knights you control gain hexproof"
    """
    pattern = fr'{tribe}[s]?\s+you\s+control.*\b{ability}\b'
    return re.compile(pattern, re.IGNORECASE)
def targeting_pattern(target: str, subject: Optional[str] = None) -> Pattern:
    """
    Build a targeting pattern like 'target creature you control'.

    Args:
        target: What is targeted ('player', 'opponent', 'creature', etc.)
        subject: Optional qualifier ('you control', 'opponent controls', etc.)

    Returns:
        Compiled regex pattern

    Examples:
        >>> targeting_pattern('creature', 'you control')
        # Matches "target creature you control"
        >>> targeting_pattern('opponent')
        # Matches "target opponent"
    """
    if subject:
        pattern = fr'target\s+{target}\s+{subject}'
    else:
        pattern = fr'target\s+{target}'
    return re.compile(pattern, re.IGNORECASE)
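The builder functions above are thin f-string wrappers around `re.compile`. A quick standalone sanity check (the helpers are reproduced inline so the snippet runs without importing the module; names match the definitions above):

```python
import re

# Inline copies of the builders above, so this snippet runs standalone.
def ownership_pattern(subject: str, owner: str = "you") -> re.Pattern:
    return re.compile(fr'{subject}[s]?\s+{owner}\s+control[s]?', re.IGNORECASE)

def grant_pattern(subject: str, verb: str, ability: str) -> re.Pattern:
    return re.compile(fr'{subject}\s+{verb}[s]?\s+{ability}', re.IGNORECASE)

def token_creation_pattern(quantity: str, token_type: str) -> re.Pattern:
    return re.compile(fr'create[s]?\s+(?:{quantity})\s+.*{token_type}\s+token', re.IGNORECASE)

# Pluralized subject and optional trailing 's' on the verb both match.
assert ownership_pattern('creature').search("Creatures you control get +1/+1")
assert grant_pattern('creatures you control', 'gain', 'hexproof').search(
    "Creatures you control gain hexproof until end of turn")
# '.*' tolerates words between the quantity and the token type.
assert token_creation_pattern('two', 'treasure').search("Create two Treasure tokens")
```

Note that the arguments are interpolated into the regex unescaped, so callers must pass regex-safe strings (or `re.escape` them first).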
# =============================================================================
# MODULE EXPORTS
# =============================================================================

__all__ = [
    # Ownership
    'YOU_CONTROL', 'THEY_CONTROL', 'OPPONENT_CONTROL',
    'CREATURE_YOU_CONTROL', 'PERMANENT_YOU_CONTROL', 'ARTIFACT_YOU_CONTROL',
    'ENCHANTMENT_YOU_CONTROL',

    # Grant verbs
    'GAIN', 'HAS', 'HAVE', 'GET', 'GRANT_VERBS',

    # Targeting
    'TARGET_PLAYER', 'TARGET_OPPONENT', 'TARGET_CREATURE', 'TARGET_PERMANENT',
    'TARGET_ARTIFACT', 'TARGET_ENCHANTMENT', 'EACH_PLAYER', 'EACH_OPPONENT',
    'TARGET_YOU_CONTROL',

    # Protection abilities
    'HEXPROOF', 'SHROUD', 'INDESTRUCTIBLE', 'WARD', 'PROTECTION_FROM',
    'PROTECTION_ABILITIES', 'CANT_HAVE_PROTECTION', 'LOSE_PROTECTION',

    # Draw
    'DRAW_A_CARD', 'DRAW_CARDS', 'DRAW',

    # Tokens
    'CREATE_TOKEN', 'PUT_TOKEN',
    'CREATE_TREASURE', 'CREATE_FOOD', 'CREATE_CLUE', 'CREATE_BLOOD',

    # Counters
    'PLUS_ONE_COUNTER', 'MINUS_ONE_COUNTER', 'LOYALTY_COUNTER', 'PROLIFERATE',
    'ONE_OR_MORE_COUNTERS', 'ONE_OR_MORE_PLUS_ONE_COUNTERS', 'IF_HAD_COUNTERS', 'WITH_COUNTERS_ON_THEM',

    # Removal
    'SACRIFICE', 'SACRIFICED', 'DESTROY', 'EXILE', 'EXILED',
    'SACRIFICE_DRAW', 'SACRIFICE_COLON_DRAW', 'SACRIFICED_COMMA_DRAW',
    'EXILE_RETURN_BATTLEFIELD',

    # Discard
    'DISCARD_A_CARD', 'DISCARD_YOUR_HAND', 'YOU_DISCARD',
    'WHENEVER_YOU_DISCARD', 'IF_YOU_DISCARDED', 'WHEN_YOU_DISCARD', 'FOR_EACH_DISCARDED',
    'TARGET_PLAYER_DISCARDS', 'TARGET_OPPONENT_DISCARDS', 'EACH_PLAYER_DISCARDS',
    'EACH_OPPONENT_DISCARDS', 'THAT_PLAYER_DISCARDS',
    'ADDITIONAL_COST_DISCARD', 'ADDITIONAL_COST_DISCARD_SHORT', 'MADNESS',

    # Damage & Life Loss
    'DEALS_ONE_DAMAGE', 'EXACTLY_ONE_DAMAGE', 'LOSES_ONE_LIFE',

    # Cost reduction
    'COST_LESS', 'COST_LESS_TO_CAST', 'WITH_X_IN_COST', 'AFFINITY_FOR', 'SPELLS_COST', 'SPELLS_YOU_CAST_COST',

    # Monarch
    'BECOME_MONARCH', 'IS_MONARCH', 'WAS_MONARCH', 'YOU_ARE_MONARCH',
    'YOU_BECOME_MONARCH', 'CANT_BECOME_MONARCH',

    # Keywords
    'PARTNER_BASIC', 'PARTNER_WITH', 'PARTNER_SURVIVORS', 'PARTNER_FATHER_SON',
    'FLYING', 'VIGILANCE', 'TRAMPLE', 'HASTE', 'LIFELINK', 'DEATHTOUCH',
    'DOUBLE_STRIKE', 'FIRST_STRIKE', 'MENACE', 'REACH',
    'UNDYING', 'PERSIST', 'PHASING', 'FLASH', 'TOXIC',

    # Return
    'RETURN_TO_BATTLEFIELD', 'RETURN_IT_TO_BATTLEFIELD', 'RETURN_THAT_CARD_TO_BATTLEFIELD',
    'RETURN_THEM_TO_BATTLEFIELD', 'RETURN_THOSE_CARDS_TO_BATTLEFIELD',
    'RETURN_TO_HAND', 'RETURN_YOU_CONTROL_TO_HAND',

    # Scope
    'OTHER_CREATURES', 'ALL_CREATURES', 'ALL_PERMANENTS', 'ALL_SLIVERS',
    'EQUIPPED_CREATURE', 'ENCHANTED_CREATURE', 'ENCHANTED_PERMANENT', 'ENCHANTED_ENCHANTMENT',

    # Combat
    'ATTACK', 'ATTACKS', 'BLOCK', 'BLOCKS', 'COMBAT_DAMAGE',
    'WHENEVER_ATTACKS', 'WHEN_ATTACKS',

    # Type line
    'INSTANT', 'SORCERY', 'ARTIFACT', 'ENCHANTMENT', 'CREATURE', 'PLANESWALKER', 'LAND',
    'AURA', 'EQUIPMENT', 'VEHICLE', 'SAGA', 'NONCREATURE',

    # Builders
    'ownership_pattern', 'grant_pattern', 'token_creation_pattern',
    'kindred_grant_pattern', 'targeting_pattern',
]
420 code/tagging/scope_detection_utils.py Normal file

@@ -0,0 +1,420 @@
"""
Scope Detection Utilities

Generic utilities for detecting the scope of card abilities (protection, phasing, etc.).
Provides reusable pattern-matching logic to avoid duplication across modules.

Created as part of the M2: Create Scope Detection Utilities milestone.
"""

# Standard library imports
import re
from dataclasses import dataclass
from typing import List, Optional, Set

# Local application imports
from . import regex_patterns as rgx
from . import tag_utils
from code.logging_util import get_logger

logger = get_logger(__name__)
@dataclass
class ScopePatterns:
    """
    Pattern collections for scope detection.

    Attributes:
        opponent: Patterns that indicate opponent ownership
        self_ref: Patterns that indicate self-reference
        your_permanents: Patterns that indicate "you control"
        blanket: Patterns that indicate no ownership qualifier
        targeted: Patterns that indicate targeting (optional)
    """
    opponent: List[re.Pattern]
    self_ref: List[re.Pattern]
    your_permanents: List[re.Pattern]
    blanket: List[re.Pattern]
    targeted: Optional[List[re.Pattern]] = None
def detect_scope(
    text: str,
    card_name: str,
    ability_keyword: str,
    patterns: ScopePatterns,
    allow_multiple: bool = False,
    check_grant_verbs: bool = False,
    keywords: Optional[str] = None,
) -> Optional[str]:
    """
    Generic scope detection with priority ordering.

    Detection priority (prevents misclassification):
        0. Static keyword (in keywords field or simple list) → "Self"
        1. Opponent ownership → "Opponent Permanents"
        2. Self-reference → "Self"
        3. Your ownership → "Your Permanents"
        4. No ownership qualifier → "Blanket"

    Args:
        text: Card text
        card_name: Card name (for self-reference detection)
        ability_keyword: Ability keyword to look for (e.g., "hexproof", "phasing")
        patterns: ScopePatterns object with pattern collections
        allow_multiple: If True, returns Set[str] instead of a single scope
        check_grant_verbs: If True, checks for grant verbs before assuming "Self"
        keywords: Optional keywords field from card data (for static keyword detection)

    Returns:
        Scope string or None: "Self", "Your Permanents", "Blanket", "Opponent Permanents"
        If allow_multiple=True, returns Set[str] with all matching scopes
    """
    if not text or not ability_keyword:
        return set() if allow_multiple else None

    text_lower = text.lower()
    ability_lower = ability_keyword.lower()
    card_name_lower = card_name.lower() if card_name else ''

    # Check if the ability is mentioned in the text
    if ability_lower not in text_lower:
        return set() if allow_multiple else None

    # Priority 0: Check if this is a static keyword ability.
    # Static keywords appear in the keywords field or as simple comma-separated lists
    # without grant verbs (e.g., "Flying, first strike, protection from black").
    if check_static_keyword(ability_keyword, keywords, text):
        if allow_multiple:
            return {"Self"}
        else:
            return "Self"

    if allow_multiple:
        scopes = set()
    else:
        scopes = None

    # Priority 1: Opponent ownership
    for pattern in patterns.opponent:
        if pattern.search(text_lower):
            if allow_multiple:
                scopes.add("Opponent Permanents")
                break
            else:
                return "Opponent Permanents"

    # Priority 2: Self-reference
    is_self = _check_self_reference(text_lower, card_name_lower, ability_lower, patterns.self_ref)

    # If check_grant_verbs is True, verify we don't have grant patterns before assuming Self
    if is_self and check_grant_verbs:
        has_grant_pattern = _has_grant_verbs(text_lower)
        if not has_grant_pattern:
            if allow_multiple:
                scopes.add("Self")
            else:
                return "Self"
    elif is_self:
        if allow_multiple:
            scopes.add("Self")
        else:
            return "Self"

    # Priority 3: Your ownership
    for pattern in patterns.your_permanents:
        if pattern.search(text_lower):
            if allow_multiple:
                scopes.add("Your Permanents")
                break
            else:
                return "Your Permanents"

    # Priority 4: Blanket (no ownership qualifier)
    for pattern in patterns.blanket:
        if pattern.search(text_lower):
            # Double-check no ownership was missed
            if not rgx.YOU_CONTROL.search(text_lower) and 'opponent' not in text_lower:
                if allow_multiple:
                    scopes.add("Blanket")
                    break
                else:
                    return "Blanket"

    return scopes if allow_multiple else None
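The priority ordering in `detect_scope` can be illustrated with a pared-down standalone sketch. The hexproof patterns below are hypothetical examples, and this simplified version omits the static-keyword, grant-verb, and self-reference handling of the real function:

```python
import re
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ScopePatterns:
    opponent: List[re.Pattern]
    self_ref: List[re.Pattern]
    your_permanents: List[re.Pattern]
    blanket: List[re.Pattern]

# Hypothetical hexproof patterns, for illustration only.
HEXPROOF_SCOPES = ScopePatterns(
    opponent=[re.compile(r'creatures your opponents control .*hexproof', re.IGNORECASE)],
    self_ref=[re.compile(r'this creature has hexproof', re.IGNORECASE)],
    your_permanents=[re.compile(r'creatures you control .*hexproof', re.IGNORECASE)],
    blanket=[re.compile(r'all creatures .*hexproof', re.IGNORECASE)],
)

def detect_scope_simple(text: str, patterns: ScopePatterns) -> Optional[str]:
    """First match wins: opponent > self > your permanents > blanket."""
    text = text.lower()
    if any(p.search(text) for p in patterns.opponent):
        return "Opponent Permanents"
    if any(p.search(text) for p in patterns.self_ref):
        return "Self"
    if any(p.search(text) for p in patterns.your_permanents):
        return "Your Permanents"
    if any(p.search(text) for p in patterns.blanket):
        return "Blanket"
    return None

assert detect_scope_simple("Creatures you control have hexproof", HEXPROOF_SCOPES) == "Your Permanents"
```

Checking opponent patterns first is what prevents "creatures your opponents control" from being swallowed by a broader "creatures … control" match later in the chain.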
def detect_multi_scope(
    text: str,
    card_name: str,
    ability_keyword: str,
    patterns: ScopePatterns,
    check_grant_verbs: bool = False,
    keywords: Optional[str] = None,
) -> Set[str]:
    """
    Detect multiple scopes for cards with multiple effects.

    Some cards grant abilities to multiple scopes:
    - Self-hexproof + grants ward to others
    - Target phasing + your permanents phasing

    Args:
        text: Card text
        card_name: Card name
        ability_keyword: Ability keyword to look for
        patterns: ScopePatterns object
        check_grant_verbs: If True, checks for grant verbs before assuming "Self"
        keywords: Optional keywords field for static keyword detection

    Returns:
        Set of scope strings
    """
    scopes = set()

    if not text or not ability_keyword:
        return scopes

    text_lower = text.lower()
    ability_lower = ability_keyword.lower()
    card_name_lower = card_name.lower() if card_name else ''

    # Check for a static keyword first
    if check_static_keyword(ability_keyword, keywords, text):
        scopes.add("Self")
        # Static keywords usually imply a single scope,
        # but continue checking in case there are additional effects.

    # Check if the ability is mentioned
    if ability_lower not in text_lower:
        return scopes

    # Check opponent patterns
    if any(pattern.search(text_lower) for pattern in patterns.opponent):
        scopes.add("Opponent Permanents")

    # Check self-reference
    is_self = _check_self_reference(text_lower, card_name_lower, ability_lower, patterns.self_ref)

    if is_self:
        if check_grant_verbs:
            has_grant_pattern = _has_grant_verbs(text_lower)
            if not has_grant_pattern:
                scopes.add("Self")
        else:
            scopes.add("Self")

    # Check your permanents
    if any(pattern.search(text_lower) for pattern in patterns.your_permanents):
        scopes.add("Your Permanents")

    # Check blanket (no ownership)
    has_blanket = any(pattern.search(text_lower) for pattern in patterns.blanket)
    no_ownership = not rgx.YOU_CONTROL.search(text_lower) and 'opponent' not in text_lower

    if has_blanket and no_ownership:
        scopes.add("Blanket")

    # Optional: check for targeting
    if patterns.targeted:
        if any(pattern.search(text_lower) for pattern in patterns.targeted):
            scopes.add("Targeted")

    return scopes
def _check_self_reference(
    text_lower: str,
    card_name_lower: str,
    ability_lower: str,
    self_patterns: List[re.Pattern]
) -> bool:
    """
    Check if text contains self-reference patterns.

    Args:
        text_lower: Lowercase card text
        card_name_lower: Lowercase card name
        ability_lower: Lowercase ability keyword
        self_patterns: List of self-reference patterns

    Returns:
        True if a self-reference is found
    """
    # Check provided self patterns
    for pattern in self_patterns:
        if pattern.search(text_lower):
            return True

    # Check for a card name reference (if provided)
    if card_name_lower:
        card_name_escaped = re.escape(card_name_lower)
        card_name_pattern = re.compile(rf'\b{card_name_escaped}\b', re.IGNORECASE)

        if card_name_pattern.search(text_lower):
            # Make sure it's in a self-ability context
            self_context_patterns = [
                re.compile(rf'\b{card_name_escaped}\s+(?:has|gains?)\s+{ability_lower}', re.IGNORECASE),
                re.compile(rf'\b{card_name_escaped}\s+is\s+{ability_lower}', re.IGNORECASE),
            ]

            for pattern in self_context_patterns:
                if pattern.search(text_lower):
                    return True

    return False
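The card-name fallback above builds its context patterns on the fly from the escaped name. A standalone illustration, using a hypothetical card name and rules text:

```python
import re

# Hypothetical card name and text, for illustration only.
card_name = "grizzled outrider"
text = "grizzled outrider has hexproof as long as you control another elf."

# Same construction as _check_self_reference: escape the name, then
# require a "has/gains <ability>" context right after it.
escaped = re.escape(card_name)
context = re.compile(rf'\b{escaped}\s+(?:has|gains?)\s+hexproof', re.IGNORECASE)

assert context.search(text)
```

`re.escape` matters here because real card names can contain regex metacharacters (commas, apostrophes, periods).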
def _has_grant_verbs(text_lower: str) -> bool:
    """
    Check if text contains grant verb patterns.

    Used to distinguish inherent abilities from granted abilities.

    Args:
        text_lower: Lowercase card text

    Returns:
        True if grant verbs are found
    """
    grant_patterns = [
        re.compile(r'(?:have|gain|grant|give|get)[s]?\s+', re.IGNORECASE),
        rgx.OTHER_CREATURES,
        rgx.CREATURE_YOU_CONTROL,
        rgx.PERMANENT_YOU_CONTROL,
        rgx.EQUIPPED_CREATURE,
        rgx.ENCHANTED_CREATURE,
        rgx.TARGET_CREATURE,
    ]

    return any(pattern.search(text_lower) for pattern in grant_patterns)
def format_scope_tag(scope: str, ability: str) -> str:
    """
    Format a scope and ability into a metadata tag.

    Args:
        scope: Scope string (e.g., "Self", "Your Permanents")
        ability: Ability name (e.g., "Hexproof", "Phasing")

    Returns:
        Formatted tag string (e.g., "Self: Hexproof")
    """
    return f"{scope}: {ability}"
def has_keyword(text: str, keywords: List[str]) -> bool:
    """
    Quick check whether card text contains any of the specified keywords.

    Args:
        text: Card text
        keywords: List of keywords to search for

    Returns:
        True if any keyword is found
    """
    if not text:
        return False

    text_lower = text.lower()
    return any(keyword.lower() in text_lower for keyword in keywords)
def check_static_keyword(
    ability_keyword: str,
    keywords: Optional[str] = None,
    text: Optional[str] = None
) -> bool:
    """
    Check if a card has the ability as a static keyword (not granted to others).

    A static keyword is one that appears:
    1. In the keywords field, OR
    2. As a simple comma-separated list without grant verbs
       (e.g., "Flying, first strike, protection from black")

    Args:
        ability_keyword: Ability to check (e.g., "Protection", "Hexproof")
        keywords: Optional keywords field from card data
        text: Optional card text for fallback detection

    Returns:
        True if the ability appears as a static keyword
    """
    ability_lower = ability_keyword.lower()

    # Check the keywords field first (most reliable)
    if keywords:
        keywords_lower = keywords.lower()
        if ability_lower in keywords_lower:
            return True

    # Fallback: check if the ability appears in a simple comma-separated keyword list
    # without grant verbs, e.g. "Flying, first strike, vigilance, protection from black".
    if text:
        text_lower = text.lower()

        # Check if the ability appears in the text but WITHOUT grant verbs
        if ability_lower in text_lower:
            # Grant verbs would indicate this is NOT a static keyword
            grant_verbs = ['have', 'has', 'gain', 'gains', 'get', 'gets', 'grant', 'grants', 'give', 'gives']

            # Find the position of the ability in the text
            ability_pos = text_lower.find(ability_lower)

            # Check the 50 characters before the ability for grant verbs.
            # This catches patterns like "creatures gain protection" or "has hexproof".
            context_before = text_lower[max(0, ability_pos - 50):ability_pos]

            # If no grant verbs are found nearby, it's likely a static keyword
            if not any(verb in context_before for verb in grant_verbs):
                # Additional check: is it part of a comma-separated list?
                # This helps with "Flying, first strike, protection from X" patterns.
                context_before_30 = text_lower[max(0, ability_pos - 30):ability_pos]
                if ',' in context_before_30 or ability_pos < 10:
                    return True

    return False
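The static-keyword heuristic can be exercised directly; `check_static_keyword` is reproduced inline here so the snippet runs standalone:

```python
from typing import Optional

def check_static_keyword(ability_keyword: str,
                         keywords: Optional[str] = None,
                         text: Optional[str] = None) -> bool:
    # Reproduced from the module above.
    ability_lower = ability_keyword.lower()
    if keywords and ability_lower in keywords.lower():
        return True
    if text:
        text_lower = text.lower()
        if ability_lower in text_lower:
            grant_verbs = ['have', 'has', 'gain', 'gains', 'get', 'gets',
                           'grant', 'grants', 'give', 'gives']
            ability_pos = text_lower.find(ability_lower)
            context_before = text_lower[max(0, ability_pos - 50):ability_pos]
            if not any(verb in context_before for verb in grant_verbs):
                context_before_30 = text_lower[max(0, ability_pos - 30):ability_pos]
                if ',' in context_before_30 or ability_pos < 10:
                    return True
    return False

# Keywords-field hit: static.
assert check_static_keyword("Protection", keywords="Flying, Protection from black")
# Comma-separated keyword line without grant verbs: static.
assert check_static_keyword("protection", text="Flying, first strike, protection from black")
# Granted to others ("gain" appears just before the ability): not static.
assert not check_static_keyword("protection", text="Creatures you control gain protection from red")
```

The 50-character lookbehind window is a heuristic, so unusual sentence structures can still slip through; the keywords field remains the authoritative signal.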
def check_static_keyword_legacy(
    keywords: str,
    static_keyword: str,
    text: str,
    grant_patterns: Optional[List[re.Pattern]] = None
) -> bool:
    """
    LEGACY: Check if a card has a static keyword without granting it to others.

    Used for abilities like "Phasing" that can be both static and granted.

    Args:
        keywords: Card keywords field
        static_keyword: Keyword to search for (e.g., "phasing")
        text: Card text
        grant_patterns: Optional patterns to check for granting language

    Returns:
        True if the static keyword is found and not granted to others
    """
    if not keywords:
        return False

    keywords_lower = keywords.lower()

    if static_keyword.lower() not in keywords_lower:
        return False

    # If grant patterns are provided, check whether the card grants to others
    if grant_patterns:
        text_no_reminder = tag_utils.strip_reminder_text(text.lower()) if text else ''
        grants_to_others = any(pattern.search(text_no_reminder) for pattern in grant_patterns)

        # Only return True if NOT granting to others
        return not grants_to_others

    return True
@ -1,13 +1,59 @@
|
||||||
from typing import Dict, List, Final
|
"""
|
||||||
|
Tag Constants Module
|
||||||
|
|
||||||
|
Centralized constants for card tagging and theme detection across the MTG deckbuilder.
|
||||||
|
This module contains all shared constants used by the tagging system including:
|
||||||
|
- Card types and creature types
|
||||||
|
- Pattern groups and regex fragments
|
||||||
|
- Tag groupings and relationships
|
||||||
|
- Protection and ability keywords
|
||||||
|
- Magic numbers and thresholds
|
||||||
|
"""
|
||||||
|
|
||||||
|
from typing import Dict, Final, List
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# TABLE OF CONTENTS
|
||||||
|
# =============================================================================
|
||||||
|
# 1. TRIGGERS & BASIC PATTERNS
|
||||||
|
# 2. TAG GROUPS & RELATIONSHIPS
|
||||||
|
# 3. PATTERN GROUPS & REGEX FRAGMENTS
|
||||||
|
# 4. PHRASE GROUPS
|
||||||
|
# 5. COUNTER TYPES
|
||||||
|
# 6. CREATURE TYPES
|
||||||
|
# 7. NON-CREATURE TYPES & SPECIAL TYPES
|
||||||
|
# 8. PROTECTION & ABILITY KEYWORDS
|
||||||
|
# 9. TOKEN TYPES
|
||||||
|
# 10. MAGIC NUMBERS & THRESHOLDS
|
||||||
|
# 11. DATAFRAME COLUMN REQUIREMENTS
|
||||||
|
# 12. TYPE-TAG MAPPINGS
|
||||||
|
# 13. DRAW-RELATED CONSTANTS
|
||||||
|
# 14. EQUIPMENT-RELATED CONSTANTS
|
||||||
|
# 15. AURA & VOLTRON CONSTANTS
|
||||||
|
# 16. LANDS MATTER PATTERNS
|
||||||
|
# 17. SACRIFICE & GRAVEYARD PATTERNS
|
||||||
|
# 18. CREATURE-RELATED PATTERNS
|
||||||
|
# 19. TOKEN-RELATED PATTERNS
|
||||||
|
# 20. REMOVAL & DESTRUCTION PATTERNS
|
||||||
|
# 21. SPELL-RELATED PATTERNS
|
||||||
|
# 22. MISC PATTERNS & EXCLUSIONS
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# 1. TRIGGERS & BASIC PATTERNS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
TRIGGERS: List[str] = ['when', 'whenever', 'at']
|
TRIGGERS: List[str] = ['when', 'whenever', 'at']
|
||||||
|
|
||||||
NUM_TO_SEARCH: List[str] = ['a', 'an', 'one', '1', 'two', '2', 'three', '3', 'four','4', 'five', '5',
|
NUM_TO_SEARCH: List[str] = [
|
||||||
'six', '6', 'seven', '7', 'eight', '8', 'nine', '9', 'ten', '10',
|
'a', 'an', 'one', '1', 'two', '2', 'three', '3', 'four', '4', 'five', '5',
|
||||||
'x','one or more']
|
'six', '6', 'seven', '7', 'eight', '8', 'nine', '9', 'ten', '10',
|
||||||
|
'x', 'one or more'
|
||||||
|
]
|
||||||
|
|
||||||
|
# =============================================================================
|
||||||
|
# 2. TAG GROUPS & RELATIONSHIPS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
# Constants for common tag groupings
|
|
||||||
TAG_GROUPS: Dict[str, List[str]] = {
|
TAG_GROUPS: Dict[str, List[str]] = {
|
||||||
"Cantrips": ["Cantrips", "Card Draw", "Spellslinger", "Spells Matter"],
|
"Cantrips": ["Cantrips", "Card Draw", "Spellslinger", "Spells Matter"],
|
||||||
"Tokens": ["Token Creation", "Tokens Matter"],
|
"Tokens": ["Token Creation", "Tokens Matter"],
|
||||||
|
|
@ -19,8 +65,11 @@ TAG_GROUPS: Dict[str, List[str]] = {
|
||||||
"Spells": ["Spellslinger", "Spells Matter"]
|
"Spells": ["Spellslinger", "Spells Matter"]
|
||||||
}
|
}
|
||||||
|
|
||||||
# Common regex patterns
|
# =============================================================================
|
||||||
PATTERN_GROUPS: Dict[str, str] = {
|
# 3. PATTERN GROUPS & REGEX FRAGMENTS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
PATTERN_GROUPS: Dict[str, str] = {
|
||||||
"draw": r"draw[s]? a card|draw[s]? one card",
|
"draw": r"draw[s]? a card|draw[s]? one card",
|
||||||
"combat": r"attack[s]?|block[s]?|combat damage",
|
"combat": r"attack[s]?|block[s]?|combat damage",
|
||||||
"tokens": r"create[s]? .* token|put[s]? .* token",
|
"tokens": r"create[s]? .* token|put[s]? .* token",
|
||||||
|
|
@ -30,7 +79,10 @@ PATTERN_GROUPS: Dict[str, str] = {
|
||||||
"cost_reduction": r"cost[s]? \{[\d\w]\} less|affinity for|cost[s]? less to cast|chosen type cost|copy cost|from exile cost|from exile this turn cost|from your graveyard cost|has undaunted|have affinity for artifacts|other than your hand cost|spells cost|spells you cast cost|that target .* cost|those spells cost|you cast cost|you pay cost"
|
"cost_reduction": r"cost[s]? \{[\d\w]\} less|affinity for|cost[s]? less to cast|chosen type cost|copy cost|from exile cost|from exile this turn cost|from your graveyard cost|has undaunted|have affinity for artifacts|other than your hand cost|spells cost|spells you cast cost|that target .* cost|those spells cost|you cast cost|you pay cost"
|
||||||
}
|
}
|
||||||
|
|
||||||
# Common phrase groups (lists) used across taggers
|
# =============================================================================
|
||||||
|
# 4. PHRASE GROUPS
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
PHRASE_GROUPS: Dict[str, List[str]] = {
|
PHRASE_GROUPS: Dict[str, List[str]] = {
|
||||||
# Variants for monarch wording
|
# Variants for monarch wording
|
||||||
"monarch": [
|
"monarch": [
|
||||||
|
|
@ -52,11 +104,15 @@ PHRASE_GROUPS: Dict[str, List[str]] = {
|
||||||
r"return .* to the battlefield"
|
r"return .* to the battlefield"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
# Common action patterns
|
|
||||||
CREATE_ACTION_PATTERN: Final[str] = r"create|put"
|
CREATE_ACTION_PATTERN: Final[str] = r"create|put"
|
||||||
|
|
||||||
# Creature/Counter types
|
# =============================================================================
|
||||||
COUNTER_TYPES: List[str] = [r'\+0/\+1', r'\+0/\+2', r'\+1/\+0', r'\+1/\+2', r'\+2/\+0', r'\+2/\+2',
|
# 5. COUNTER TYPES
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
COUNTER_TYPES: List[str] = [
|
||||||
|
r'\+0/\+1', r'\+0/\+2', r'\+1/\+0', r'\+1/\+2', r'\+2/\+0', r'\+2/\+2',
|
||||||
'-0/-1', '-0/-2', '-1/-0', '-1/-2', '-2/-0', '-2/-2',
|
'-0/-1', '-0/-2', '-1/-0', '-1/-2', '-2/-0', '-2/-2',
|
||||||
'Acorn', 'Aegis', 'Age', 'Aim', 'Arrow', 'Arrowhead','Awakening',
|
'Acorn', 'Aegis', 'Age', 'Aim', 'Arrow', 'Arrowhead','Awakening',
|
||||||
'Bait', 'Blaze', 'Blessing', 'Blight',' Blood', 'Bloddline',
|
'Bait', 'Blaze', 'Blessing', 'Blight',' Blood', 'Bloddline',
|
||||||
|
|
@ -90,9 +146,15 @@ COUNTER_TYPES: List[str] = [r'\+0/\+1', r'\+0/\+2', r'\+1/\+0', r'\+1/\+2', r'\+
|
||||||
'Task', 'Ticket', 'Tide', 'Time', 'Tower', 'Training', 'Trap',
|
'Task', 'Ticket', 'Tide', 'Time', 'Tower', 'Training', 'Trap',
|
||||||
'Treasure', 'Unity', 'Unlock', 'Valor', 'Velocity', 'Verse',
|
'Treasure', 'Unity', 'Unlock', 'Valor', 'Velocity', 'Verse',
|
||||||
'Vitality', 'Void', 'Volatile', 'Vortex', 'Vow', 'Voyage', 'Wage',
|
'Vitality', 'Void', 'Volatile', 'Vortex', 'Vow', 'Voyage', 'Wage',
|
||||||
'Winch', 'Wind', 'Wish']
|
'Winch', 'Wind', 'Wish'
|
||||||
|
]
|
||||||
|
|
||||||
-CREATURE_TYPES: List[str] = ['Advisor', 'Aetherborn', 'Alien', 'Ally', 'Angel', 'Antelope', 'Ape', 'Archer', 'Archon', 'Armadillo',
+# =============================================================================
+# 6. CREATURE TYPES
+# =============================================================================
+
+CREATURE_TYPES: List[str] = [
+    'Advisor', 'Aetherborn', 'Alien', 'Ally', 'Angel', 'Antelope', 'Ape', 'Archer', 'Archon', 'Armadillo',
     'Army', 'Artificer', 'Assassin', 'Assembly-Worker', 'Astartes', 'Atog', 'Aurochs', 'Automaton',
     'Avatar', 'Azra', 'Badger', 'Balloon', 'Barbarian', 'Bard', 'Basilisk', 'Bat', 'Bear', 'Beast', 'Beaver',
     'Beeble', 'Beholder', 'Berserker', 'Bird', 'Blinkmoth', 'Boar', 'Brainiac', 'Bringer', 'Brushwagg',
@@ -122,9 +184,15 @@ CREATURE_TYPES: List[str] = ['Advisor', 'Aetherborn', 'Alien', 'Ally', 'Angel',
     'Thopter', 'Thrull', 'Tiefling', 'Time Lord', 'Toy', 'Treefolk', 'Trilobite', 'Triskelavite', 'Troll',
     'Turtle', 'Tyranid', 'Unicorn', 'Urzan', 'Vampire', 'Varmint', 'Vedalken', 'Volver', 'Wall', 'Walrus',
     'Warlock', 'Warrior', 'Wasp', 'Weasel', 'Weird', 'Werewolf', 'Whale', 'Wizard', 'Wolf', 'Wolverine', 'Wombat',
-    'Worm', 'Wraith', 'Wurm', 'Yeti', 'Zombie', 'Zubera']
+    'Worm', 'Wraith', 'Wurm', 'Yeti', 'Zombie', 'Zubera'
+]
-NON_CREATURE_TYPES: List[str] = ['Legendary', 'Creature', 'Enchantment', 'Artifact',
+# =============================================================================
+# 7. NON-CREATURE TYPES & SPECIAL TYPES
+# =============================================================================
+
+NON_CREATURE_TYPES: List[str] = [
+    'Legendary', 'Creature', 'Enchantment', 'Artifact',
     'Battle', 'Sorcery', 'Instant', 'Land', '-', '—',
     'Blood', 'Clue', 'Food', 'Gold', 'Incubator',
     'Junk', 'Map', 'Powerstone', 'Treasure',
@@ -136,23 +204,66 @@ NON_CREATURE_TYPES: List[str] = ['Legendary', 'Creature', 'Enchantment', 'Artifa
     'Shrine',
     'Plains', 'Island', 'Swamp', 'Forest', 'Mountain',
     'Cave', 'Desert', 'Gate', 'Lair', 'Locus', 'Mine',
-    'Power-Plant', 'Sphere', 'Tower', 'Urza\'s']
+    'Power-Plant', 'Sphere', 'Tower', 'Urza\'s'
+]

 OUTLAW_TYPES: List[str] = ['Assassin', 'Mercenary', 'Pirate', 'Rogue', 'Warlock']
-ENCHANTMENT_TOKENS: List[str] = ['Cursed Role', 'Monster Role', 'Royal Role', 'Sorcerer Role',
-    'Virtuous Role', 'Wicked Role', 'Young Hero Role', 'Shard']
-ARTIFACT_TOKENS: List[str] = ['Blood', 'Clue', 'Food', 'Gold', 'Incubator',
-    'Junk','Map','Powerstone', 'Treasure']
+# =============================================================================
+# 8. PROTECTION & ABILITY KEYWORDS
+# =============================================================================
+
+PROTECTION_ABILITIES: List[str] = [
+    'Protection',
+    'Ward',
+    'Hexproof',
+    'Shroud',
+    'Indestructible'
+]
+
+PROTECTION_KEYWORDS: Final[frozenset] = frozenset({
+    'hexproof',
+    'shroud',
+    'indestructible',
+    'ward',
+    'protection from',
+    'protection',
+})
+
+# =============================================================================
+# 9. TOKEN TYPES
+# =============================================================================
+
+ENCHANTMENT_TOKENS: List[str] = [
+    'Cursed Role', 'Monster Role', 'Royal Role', 'Sorcerer Role',
+    'Virtuous Role', 'Wicked Role', 'Young Hero Role', 'Shard'
+]
+
+ARTIFACT_TOKENS: List[str] = [
+    'Blood', 'Clue', 'Food', 'Gold', 'Incubator',
+    'Junk', 'Map', 'Powerstone', 'Treasure'
+]
+
+# =============================================================================
+# 10. MAGIC NUMBERS & THRESHOLDS
+# =============================================================================
+
+CONTEXT_WINDOW_SIZE: Final[int] = 70  # Characters to examine around a regex match
+
+# =============================================================================
+# 11. DATAFRAME COLUMN REQUIREMENTS
+# =============================================================================

-# Constants for DataFrame validation and processing
 REQUIRED_COLUMNS: List[str] = [
     'name', 'faceName', 'edhrecRank', 'colorIdentity', 'colors',
     'manaCost', 'manaValue', 'type', 'creatureTypes', 'text',
     'power', 'toughness', 'keywords', 'themeTags', 'layout', 'side'
 ]

-# Mapping of card types to their corresponding theme tags
+# =============================================================================
+# 12. TYPE-TAG MAPPINGS
+# =============================================================================
+
 TYPE_TAG_MAPPING: Dict[str, List[str]] = {
     'Artifact': ['Artifacts Matter'],
     'Battle': ['Battles Matter'],
@@ -166,7 +277,10 @@ TYPE_TAG_MAPPING: Dict[str, List[str]] = {
     'Sorcery': ['Spells Matter', 'Spellslinger']
 }

-# Constants for draw-related functionality
+# =============================================================================
+# 13. DRAW-RELATED CONSTANTS
+# =============================================================================
+
 DRAW_RELATED_TAGS: List[str] = [
     'Card Draw',  # General card draw effects
     'Conditional Draw',  # Draw effects with conditions/triggers
@@ -175,16 +289,18 @@ DRAW_RELATED_TAGS: List[str] = [
     'Loot',  # Draw + discard effects
     'Replacement Draw',  # Effects that modify or replace draws
     'Sacrifice to Draw',  # Draw effects requiring sacrificing permanents
     'Unconditional Draw'  # Pure card draw without conditions
 ]

-# Text patterns that exclude cards from being tagged as unconditional draw
 DRAW_EXCLUSION_PATTERNS: List[str] = [
     'annihilator',  # Eldrazi mechanic that can match 'draw' patterns
     'ravenous',  # Keyword that can match 'draw' patterns
 ]

-# Equipment-related constants
+# =============================================================================
+# 14. EQUIPMENT-RELATED CONSTANTS
+# =============================================================================
+
 EQUIPMENT_EXCLUSIONS: List[str] = [
     'Bruenor Battlehammer',  # Equipment cost reduction
     'Nazahn, Revered Bladesmith',  # Equipment tutor
@@ -223,7 +339,10 @@ EQUIPMENT_TEXT_PATTERNS: List[str] = [
     'unequip',  # Equipment removal
 ]

-# Aura-related constants
+# =============================================================================
+# 15. AURA & VOLTRON CONSTANTS
+# =============================================================================
+
 AURA_SPECIFIC_CARDS: List[str] = [
     'Ardenn, Intrepid Archaeologist',  # Aura movement
     'Calix, Guided By Fate',  # Create duplicate Auras
@@ -267,7 +386,10 @@ VOLTRON_PATTERNS: List[str] = [
     'reconfigure'
 ]

-# Constants for lands matter functionality
+# =============================================================================
+# 16. LANDS MATTER PATTERNS
+# =============================================================================
+
 LANDS_MATTER_PATTERNS: Dict[str, List[str]] = {
     'land_play': [
         'play a land',
@@ -850,3 +972,109 @@ TOPDECK_EXCLUSION_PATTERNS: List[str] = [
     'look at the top card of target player\'s library',
     'reveal the top card of target player\'s library'
 ]
+
+# ==============================================================================
+# Keyword Normalization (M1 - Tagging Refinement)
+# ==============================================================================
+
+# Keyword normalization map: variant -> canonical
+# Maps Commander-specific and variant keywords to their canonical forms
+KEYWORD_NORMALIZATION_MAP: Dict[str, str] = {
+    # Commander variants
+    'Commander ninjutsu': 'Ninjutsu',
+    'Commander Ninjutsu': 'Ninjutsu',
+
+    # Partner variants (already excluded but mapped for reference)
+    'Partner with': 'Partner',
+    'Choose a Background': 'Choose a Background',  # Keep distinct
+    "Doctor's Companion": "Doctor's Companion",  # Keep distinct
+
+    # Case normalization for common keywords (most are already correct)
+    'flying': 'Flying',
+    'trample': 'Trample',
+    'vigilance': 'Vigilance',
+    'haste': 'Haste',
+    'deathtouch': 'Deathtouch',
+    'lifelink': 'Lifelink',
+    'menace': 'Menace',
+    'reach': 'Reach',
+}
+
+# Keywords that should never appear in theme tags
+# Already excluded during keyword tagging, but documented here
+KEYWORD_EXCLUSION_SET: set[str] = {
+    'partner',  # Already excluded in tag_for_keywords
+}
+
+# Keyword allowlist - keywords that should survive singleton pruning
+# Seeded from top keywords and theme whitelist
+KEYWORD_ALLOWLIST: set[str] = {
+    # Evergreen keywords (top 50 from baseline)
+    'Flying', 'Enchant', 'Trample', 'Vigilance', 'Haste', 'Equip', 'Flash',
+    'Mill', 'Scry', 'Transform', 'Cycling', 'First strike', 'Reach', 'Menace',
+    'Lifelink', 'Treasure', 'Defender', 'Deathtouch', 'Kicker', 'Flashback',
+    'Protection', 'Surveil', 'Landfall', 'Crew', 'Ward', 'Morph', 'Devoid',
+    'Investigate', 'Fight', 'Food', 'Partner', 'Double strike', 'Indestructible',
+    'Threshold', 'Proliferate', 'Convoke', 'Hexproof', 'Cumulative upkeep',
+    'Goad', 'Delirium', 'Prowess', 'Suspend', 'Affinity', 'Madness', 'Manifest',
+    'Amass', 'Domain', 'Unearth', 'Explore', 'Changeling',
+
+    # Additional important mechanics
+    'Myriad', 'Cascade', 'Storm', 'Dredge', 'Delve', 'Escape', 'Mutate',
+    'Ninjutsu', 'Overload', 'Rebound', 'Retrace', 'Bloodrush', 'Cipher',
+    'Extort', 'Evolve', 'Undying', 'Persist', 'Wither', 'Infect', 'Annihilator',
+    'Exalted', 'Phasing', 'Shadow', 'Horsemanship', 'Banding', 'Rampage',
+    'Shroud', 'Split second', 'Totem armor', 'Living weapon', 'Undaunted',
+    'Improvise', 'Surge', 'Emerge', 'Escalate', 'Meld', 'Afflict',
+    'Aftermath', 'Embalm', 'Eternalize', 'Exert', 'Fabricate',
+    'Assist', 'Jump-start', 'Mentor', 'Riot', 'Spectacle', 'Addendum',
+    'Afterlife', 'Adapt', 'Enrage', 'Ascend', 'Learn', 'Boast', 'Foretell',
+    'Squad', 'Encore', 'Daybound', 'Nightbound', 'Disturb', 'Cleave', 'Training',
+    'Reconfigure', 'Blitz', 'Casualty', 'Connive', 'Hideaway', 'Prototype',
+    'Read ahead', 'Living metal', 'More than meets the eye', 'Ravenous',
+    'Toxic', 'For Mirrodin!', 'Backup', 'Bargain', 'Craft', 'Freerunning',
+    'Plot', 'Spree', 'Offspring', 'Bestow', 'Monstrosity', 'Tribute',
+
+    # Partner mechanics (distinct types)
+    'Choose a Background', "Doctor's Companion",
+
+    # Token types (frequently used)
+    'Blood', 'Clue', 'Food', 'Gold', 'Treasure', 'Powerstone',
+
+    # Common ability words
+    'Landfall', 'Raid', 'Revolt', 'Threshold', 'Metalcraft', 'Morbid',
+    'Bloodthirst', 'Battalion', 'Channel', 'Grandeur', 'Kinship', 'Sweep',
+    'Radiance', 'Join forces', 'Fateful hour', 'Inspired', 'Heroic',
+    'Constellation', 'Strive', 'Ferocious', 'Formidable', 'Renown',
+    'Tempting offer', 'Will of the council', 'Parley', 'Adamant', 'Devotion',
+}
+
+# ==============================================================================
+# Metadata Tag Classification (M3 - Tagging Refinement)
+# ==============================================================================
+
+# Metadata tag prefixes - tags starting with these are classified as metadata
+METADATA_TAG_PREFIXES: List[str] = [
+    'Applied:',
+    'Bracket:',
+    'Diagnostic:',
+    'Internal:',
+]
+
+# Specific metadata tags (full match) - additional tags to classify as metadata
+# These are typically diagnostic, bracket-related, or internal annotations
+METADATA_TAG_ALLOWLIST: set[str] = {
+    # Bracket annotations
+    'Bracket: Game Changer',
+    'Bracket: Staple',
+    'Bracket: Format Warping',
+
+    # Cost reduction diagnostics (from Applied: namespace)
+    'Applied: Cost Reduction',
+
+    # Kindred-specific protection metadata (from M2)
+    # Format: "{CreatureType}s Gain Protection"
+    # These are auto-generated for kindred-specific protection grants
+    # Example: "Knights Gain Protection", "Frogs Gain Protection"
+    # Note: These are dynamically generated, so we match via prefix in classify_tag
+}
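The normalization map and allowlist above are intended to be applied together during keyword tagging. The following sketch (not part of the module; `canonicalize` is a hypothetical name and the constants are a small inlined subset) shows how the variant-to-canonical mapping and singleton pruning compose:

```python
# Illustrative subset of the real constants above
NORMALIZATION_MAP = {'Commander Ninjutsu': 'Ninjutsu', 'flying': 'Flying'}
ALLOWLIST = {'Flying', 'Ninjutsu'}

def canonicalize(keywords, frequency):
    out = set()
    for kw in keywords:
        # Map variant spellings to their canonical form
        canon = NORMALIZATION_MAP.get(kw, kw)
        # Singleton pruning: a keyword seen only once survives only if allowlisted
        if frequency.get(kw, 0) == 1 and canon not in ALLOWLIST:
            continue
        out.add(canon)
    return sorted(out)
```

A one-off like `'Allons-y!'` would be dropped here, while `'Commander Ninjutsu'` collapses into the canonical `'Ninjutsu'`.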
@@ -13,18 +13,11 @@ The module is designed to work with pandas DataFrames containing card data and p
 vectorized operations for efficient processing of large card collections.
 """
 from __future__ import annotations

-# Standard library imports
 import re
-from typing import List, Set, Union, Any, Tuple
 from functools import lru_cache
+from typing import Any, List, Set, Tuple, Union
 import numpy as np

-# Third-party imports
 import pandas as pd

-# Local application imports
 from . import tag_constants
@@ -58,7 +51,6 @@ def _ensure_norm_series(df: pd.DataFrame, source_col: str, norm_col: str) -> pd.
     """
     if norm_col in df.columns:
         return df[norm_col]
-    # Create normalized string series
     series = df[source_col].fillna('') if source_col in df.columns else pd.Series([''] * len(df), index=df.index)
     series = series.astype(str)
     df[norm_col] = series
@@ -120,8 +112,6 @@ def create_type_mask(df: pd.DataFrame, type_text: Union[str, List[str]], regex:

     if len(df) == 0:
         return pd.Series([], dtype=bool)

-    # Use normalized cached series
     type_series = _ensure_norm_series(df, 'type', '__type_s')

     if regex:
@@ -160,8 +150,6 @@ def create_text_mask(df: pd.DataFrame, type_text: Union[str, List[str]], regex:

     if len(df) == 0:
         return pd.Series([], dtype=bool)

-    # Use normalized cached series
     text_series = _ensure_norm_series(df, 'text', '__text_s')

     if regex:
@@ -192,10 +180,7 @@ def create_keyword_mask(df: pd.DataFrame, type_text: Union[str, List[str]], rege
         TypeError: If type_text is not a string or list of strings
         ValueError: If required 'keywords' column is missing from DataFrame
     """
-    # Validate required columns
     validate_dataframe_columns(df, {'keywords'})

-    # Handle empty DataFrame case
     if len(df) == 0:
         return pd.Series([], dtype=bool)

@@ -206,8 +191,6 @@ def create_keyword_mask(df: pd.DataFrame, type_text: Union[str, List[str]], rege
         type_text = [type_text]
     elif not isinstance(type_text, list):
         raise TypeError("type_text must be a string or list of strings")

-    # Use normalized cached series for keywords
     keywords = _ensure_norm_series(df, 'keywords', '__keywords_s')

     if regex:
@@ -245,8 +228,6 @@ def create_name_mask(df: pd.DataFrame, type_text: Union[str, List[str]], regex:

     if len(df) == 0:
         return pd.Series([], dtype=bool)

-    # Use normalized cached series
     name_series = _ensure_norm_series(df, 'name', '__name_s')

     if regex:
@@ -324,21 +305,14 @@ def create_tag_mask(df: pd.DataFrame, tag_patterns: Union[str, List[str]], colum
         Boolean Series indicating matching rows

     Examples:
-        # Match cards with draw-related tags
         >>> mask = create_tag_mask(df, ['Card Draw', 'Conditional Draw'])
         >>> mask = create_tag_mask(df, 'Unconditional Draw')
     """
     if isinstance(tag_patterns, str):
         tag_patterns = [tag_patterns]

-    # Handle empty DataFrame case
     if len(df) == 0:
         return pd.Series([], dtype=bool)

-    # Create mask for each pattern
     masks = [df[column].apply(lambda x: any(pattern in tag for tag in x)) for pattern in tag_patterns]

-    # Combine masks with OR
     return pd.concat(masks, axis=1).any(axis=1)
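The tag-mask pattern above (per-row substring match against a list-valued `themeTags` column, combined with OR, then tags applied as a sorted union) can be exercised in isolation. A minimal self-contained sketch, assuming the same DataFrame shape the module expects:

```python
import pandas as pd

# themeTags holds a Python list per row, as in the module's DataFrames
df = pd.DataFrame({
    'name': ['Archivist', 'Bear'],
    'themeTags': [['Card Draw', 'Wizards Matter'], ['Vanilla']],
})

# Mirror of create_tag_mask: one mask per pattern, OR-combined
patterns = ['Card Draw', 'Conditional Draw']
masks = [df['themeTags'].apply(lambda tags: any(p in t for t in tags)) for p in patterns]
mask = pd.concat(masks, axis=1).any(axis=1)

# Mirror of apply_tag_vectorized: union new tags in, keep rows sorted
df.loc[mask, 'themeTags'] = df.loc[mask, 'themeTags'].apply(
    lambda tags: sorted(set(tags + ['Unconditional Draw']))
)
```

Note that the match is a substring test (`p in t`), so `'Card Draw'` also matches a tag like `'Conditional Card Draw'`; that is the same behavior as the module's `create_tag_mask`.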
 def validate_dataframe_columns(df: pd.DataFrame, required_columns: Set[str]) -> None:
@@ -365,11 +339,7 @@ def apply_tag_vectorized(df: pd.DataFrame, mask: pd.Series[bool], tags: Union[st
     """
     if not isinstance(tags, list):
         tags = [tags]

-    # Get current tags for masked rows
     current_tags = df.loc[mask, 'themeTags']

-    # Add new tags
     df.loc[mask, 'themeTags'] = current_tags.apply(lambda x: sorted(list(set(x + tags))))

 def apply_rules(df: pd.DataFrame, rules: List[dict]) -> None:
@@ -463,7 +433,6 @@ def create_numbered_phrase_mask(
         numbers = tag_constants.NUM_TO_SEARCH
     # Normalize verbs to list
     verbs = [verb] if isinstance(verb, str) else verb
-    # Build patterns
     if noun:
         patterns = [fr"{v}\s+{num}\s+{noun}" for v in verbs for num in numbers]
     else:
@@ -490,13 +459,8 @@ def create_mass_damage_mask(df: pd.DataFrame) -> pd.Series[bool]:
     Returns:
         Boolean Series indicating which cards have mass damage effects
     """
-    # Create patterns for numeric damage
     number_patterns = [create_damage_pattern(i) for i in range(1, 21)]

-    # Add X damage pattern
     number_patterns.append(create_damage_pattern('X'))

-    # Add patterns for damage targets
     target_patterns = [
         'to each creature',
         'to all creatures',
@@ -504,9 +468,385 @@ def create_mass_damage_mask(df: pd.DataFrame) -> pd.Series[bool]:
         'to each opponent',
         'to everything'
     ]

-    # Create masks
     damage_mask = create_text_mask(df, number_patterns)
     target_mask = create_text_mask(df, target_patterns)

     return damage_mask & target_mask
+
+
+# ==============================================================================
+# Keyword Normalization (M1 - Tagging Refinement)
+# ==============================================================================
+
+def normalize_keywords(
+    raw: Union[List[str], Set[str], Tuple[str, ...]],
+    allowlist: Set[str],
+    frequency_map: dict[str, int]
+) -> list[str]:
+    """Normalize keyword strings for theme tagging.
+
+    Applies normalization rules:
+    1. Case normalization (via normalization map)
+    2. Canonical mapping (e.g., "Commander Ninjutsu" -> "Ninjutsu")
+    3. Singleton pruning (unless allowlisted)
+    4. Deduplication
+    5. Exclusion of blacklisted keywords
+
+    Args:
+        raw: Iterable of raw keyword strings
+        allowlist: Set of keywords that should survive singleton pruning
+        frequency_map: Dict mapping keywords to their occurrence count
+
+    Returns:
+        Deduplicated, sorted list of normalized keywords
+
+    Raises:
+        ValueError: If raw is not iterable
+
+    Examples:
+        >>> normalize_keywords(
+        ...     ['Commander Ninjutsu', 'Flying', 'Allons-y!'],
+        ...     {'Flying', 'Ninjutsu'},
+        ...     {'Commander Ninjutsu': 2, 'Flying': 100, 'Allons-y!': 1}
+        ... )
+        ['Flying', 'Ninjutsu']  # 'Allons-y!' pruned as singleton
+    """
+    if not hasattr(raw, '__iter__') or isinstance(raw, (str, bytes)):
+        raise ValueError(f"raw must be iterable, got {type(raw)}")
+
+    normalized_keywords: set[str] = set()
+
+    for keyword in raw:
+        if not isinstance(keyword, str):
+            continue
+        keyword = keyword.strip()
+        if not keyword:
+            continue
+        if keyword.lower() in tag_constants.KEYWORD_EXCLUSION_SET:
+            continue
+        normalized = tag_constants.KEYWORD_NORMALIZATION_MAP.get(keyword, keyword)
+        frequency = frequency_map.get(keyword, 0)
+        is_singleton = frequency == 1
+        is_allowlisted = normalized in allowlist or keyword in allowlist
+
+        # Prune singletons that aren't allowlisted
+        if is_singleton and not is_allowlisted:
+            continue
+
+        normalized_keywords.add(normalized)
+
+    return sorted(list(normalized_keywords))
+
+# ==============================================================================
+# M3: Metadata vs Theme Tag Classification
+# ==============================================================================
+
+def classify_tag(tag: str) -> str:
+    """Classify a tag as either 'metadata' or 'theme'.
+
+    Metadata tags are diagnostic, bracket-related, or internal annotations that
+    should not appear in theme catalogs or player-facing tag lists. Theme tags
+    represent gameplay mechanics and deck archetypes.
+
+    Classification rules (in order of precedence):
+    1. Prefix match: Tags starting with METADATA_TAG_PREFIXES → metadata
+    2. Exact match: Tags in METADATA_TAG_ALLOWLIST → metadata
+    3. Kindred pattern: "{Type}s Gain Protection" → metadata
+    4. Default: All other tags → theme
+
+    Args:
+        tag: Tag string to classify
+
+    Returns:
+        "metadata" or "theme"
+
+    Examples:
+        >>> classify_tag("Applied: Cost Reduction")
+        'metadata'
+        >>> classify_tag("Bracket: Game Changer")
+        'metadata'
+        >>> classify_tag("Knights Gain Protection")
+        'metadata'
+        >>> classify_tag("Card Draw")
+        'theme'
+        >>> classify_tag("Spellslinger")
+        'theme'
+    """
+    # Prefix-based classification
+    for prefix in tag_constants.METADATA_TAG_PREFIXES:
+        if tag.startswith(prefix):
+            return "metadata"
+
+    # Exact match classification
+    if tag in tag_constants.METADATA_TAG_ALLOWLIST:
+        return "metadata"
+
+    # Kindred protection metadata patterns: "{Type} Gain {Ability}"
+    # Covers all protective abilities: Protection, Ward, Hexproof, Shroud, Indestructible
+    # Examples: "Knights Gain Protection", "Spiders Gain Ward", "Merfolk Gain Ward"
+    # Note: Checks for " Gain " pattern since some creature types like "Merfolk" don't end in 's'
+    kindred_abilities = ["Protection", "Ward", "Hexproof", "Shroud", "Indestructible"]
+    for ability in kindred_abilities:
+        if " Gain " in tag and tag.endswith(ability):
+            return "metadata"
+
+    # Protection scope metadata patterns (M5): "{Scope}: {Ability}"
+    # Indicates whether protection applies to self, your permanents, all permanents, or opponent's permanents
+    # Examples: "Self: Hexproof", "Your Permanents: Ward", "Blanket: Indestructible"
+    # These enable the deck builder to filter for board-relevant protection vs self-only
+    protection_scopes = ["Self:", "Your Permanents:", "Blanket:", "Opponent Permanents:"]
+    for scope in protection_scopes:
+        if tag.startswith(scope):
+            return "metadata"
+
+    # Phasing scope metadata patterns: "{Scope}: Phasing"
+    # Indicates whether phasing applies to self, your permanents, all permanents, or opponents
+    # Examples: "Self: Phasing", "Your Permanents: Phasing", "Blanket: Phasing",
+    # "Targeted: Phasing", "Opponent Permanents: Phasing"
+    # Similar to protection scopes, enables filtering for board-relevant phasing
+    # Opponent Permanents: Phasing also triggers the Removal tag (removal-style phasing)
+    if tag in ["Self: Phasing", "Your Permanents: Phasing", "Blanket: Phasing",
+               "Targeted: Phasing", "Opponent Permanents: Phasing"]:
+        return "metadata"
+
+    # Default: treat as theme tag
+    return "theme"
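The precedence order in `classify_tag` (prefix match, then exact match, then the kindred `" Gain "` pattern, then the theme default) can be condensed into a self-contained sketch with the relevant constants inlined for illustration; it exploits the fact that `str.startswith`/`str.endswith` accept tuples:

```python
# Inlined subset of tag_constants, for illustration only
METADATA_TAG_PREFIXES = ('Applied:', 'Bracket:', 'Diagnostic:', 'Internal:')
KINDRED_ABILITIES = ('Protection', 'Ward', 'Hexproof', 'Shroud', 'Indestructible')

def classify(tag: str) -> str:
    # 1. Prefix match (str.startswith accepts a tuple of prefixes)
    if tag.startswith(METADATA_TAG_PREFIXES):
        return 'metadata'
    # 2. Kindred pattern: "{Type} Gain {Ability}", e.g. "Knights Gain Ward"
    if ' Gain ' in tag and tag.endswith(KINDRED_ABILITIES):
        return 'metadata'
    # 3. Everything else is a player-facing theme tag
    return 'theme'
```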
+
+# --- Text Processing Helpers (M0.6) ---------------------------------------------------------
+def strip_reminder_text(text: str) -> str:
+    """Remove reminder text (content in parentheses) from card text.
+
+    Reminder text often contains keywords and patterns that can cause false positives
+    in pattern matching. This function strips all parenthetical content to focus on
+    the actual game text.
+
+    Args:
+        text: Card text possibly containing reminder text in parentheses
+
+    Returns:
+        Text with all parenthetical content removed
+
+    Example:
+        >>> strip_reminder_text("Hexproof (This creature can't be the target of spells)")
+        "Hexproof "
+    """
+    if not text:
+        return text
+    return re.sub(r'\([^)]*\)', '', text)
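The helper above is a single `re.sub`; one behavioral detail worth noting is that `[^)]*` stops at the first `)`, so each parenthetical is removed independently and nested parentheses are only partially stripped (rare on real card text). A standalone mirror for a quick check:

```python
import re

# Mirror of strip_reminder_text above, kept standalone for testing
def strip_reminder(text: str) -> str:
    return re.sub(r'\([^)]*\)', '', text) if text else text
```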
+
+def extract_context_window(text: str, match_start: int, match_end: int,
+                           window_size: int | None = None, include_before: bool = False) -> str:
+    """Extract a context window around a regex match for validation.
+
+    When pattern matching finds a potential match, we often need to examine
+    the surrounding text to validate the match or check for additional keywords.
+    This function extracts a window of text around the match position.
+
+    Args:
+        text: Full text to extract context from
+        match_start: Start position of the regex match
+        match_end: End position of the regex match
+        window_size: Number of characters to include after the match.
+            If None, uses CONTEXT_WINDOW_SIZE from tag_constants (default: 70).
+            To include context before the match, use include_before=True.
+        include_before: If True, includes window_size characters before the match
+            in addition to after. If False (default), only includes after.
+
+    Returns:
+        Substring of text containing the match plus surrounding context
+
+    Example:
+        >>> text = "Creatures you control have hexproof and vigilance"
+        >>> match = re.search(r'creatures you control', text, re.IGNORECASE)
+        >>> extract_context_window(text, match.start(), match.end(), window_size=30)
+        'Creatures you control have hexproof and vigilance'
+    """
+    if not text:
+        return text
+    if window_size is None:
+        window_size = tag_constants.CONTEXT_WINDOW_SIZE
+
+    # Calculate window boundaries, clamped to the text
+    if include_before:
+        context_start = max(0, match_start - window_size)
+    else:
+        context_start = match_start
+
+    context_end = min(len(text), match_end + window_size)
+
+    return text[context_start:context_end]
# --- Enhanced Tagging Utilities (M3.5/M3.6) ----------------------------------------------------
|
||||||
|
|
||||||
|
def build_combined_mask(
|
||||||
|
df: pd.DataFrame,
|
||||||
|
text_patterns: Union[str, List[str], None] = None,
|
||||||
|
type_patterns: Union[str, List[str], None] = None,
|
||||||
|
keyword_patterns: Union[str, List[str], None] = None,
|
||||||
|
name_list: Union[List[str], None] = None,
|
||||||
|
exclusion_patterns: Union[str, List[str], None] = None,
|
||||||
|
combine_with_or: bool = True
|
||||||
|
) -> pd.Series[bool]:
|
||||||
|
"""Build a combined boolean mask from multiple pattern types.
|
||||||
|
|
||||||
|
This utility reduces boilerplate when creating complex masks by combining
|
||||||
|
text, type, keyword, and name patterns into a single mask. Patterns are
|
||||||
|
combined with OR by default, but can be combined with AND.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
df: DataFrame to search
|
||||||
|
text_patterns: Patterns to match in 'text' column
|
||||||
|
type_patterns: Patterns to match in 'type' column
|
||||||
|
keyword_patterns: Patterns to match in 'keywords' column
|
||||||
|
name_list: List of exact card names to match
|
||||||
|
exclusion_patterns: Text patterns to exclude from final mask
|
||||||
|
combine_with_or: If True, combine masks with OR (default).
|
||||||
|
If False, combine with AND (requires all conditions)
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Boolean Series combining all specified patterns
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> # Match cards with flying OR haste, exclude creatures
|
||||||
|
>>> mask = build_combined_mask(
|
||||||
|
... df,
|
||||||
|
... keyword_patterns=['Flying', 'Haste'],
|
||||||
|
... exclusion_patterns='Creature'
|
||||||
|
... )
|
||||||
|
"""
|
||||||
|
if combine_with_or:
|
||||||
|
result = pd.Series([False] * len(df), index=df.index)
|
||||||
|
else:
|
||||||
|
result = pd.Series([True] * len(df), index=df.index)
|
||||||
|
masks = []
|
||||||
|
|
||||||
|
if text_patterns is not None:
|
||||||
|
masks.append(create_text_mask(df, text_patterns))
|
||||||
|
|
||||||
|
if type_patterns is not None:
|
||||||
|
masks.append(create_type_mask(df, type_patterns))
|
||||||
|
|
||||||
|
if keyword_patterns is not None:
|
||||||
|
masks.append(create_keyword_mask(df, keyword_patterns))
|
||||||
|
|
||||||
|
if name_list is not None:
|
||||||
|
masks.append(create_name_mask(df, name_list))
|
||||||
|
if masks:
|
||||||
|
if combine_with_or:
|
||||||
|
for mask in masks:
|
||||||
|
result |= mask
|
||||||
|
else:
|
||||||
|
for mask in masks:
|
||||||
|
result &= mask
|
||||||
|
if exclusion_patterns is not None:
|
||||||
|
exclusion_mask = create_text_mask(df, exclusion_patterns)
|
||||||
|
result &= ~exclusion_mask
|
||||||
|
|
||||||
|
return result
|
||||||
|
|
||||||
|
|
||||||
|
def tag_with_logging(
|
||||||
|
df: pd.DataFrame,
|
||||||
|
mask: pd.Series[bool],
|
||||||
|
tags: Union[str, List[str]],
|
||||||
|
log_message: str,
|
||||||
|
color: str = '',
|
||||||
|
logger=None
|
||||||
|
) -> int:
|
||||||
|
"""Apply tags with standardized logging.
|
||||||
|
|
||||||
|
This utility wraps the common pattern of applying tags and logging the count.
|
||||||
|
It provides consistent formatting for log messages across the tagging module.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
df: DataFrame to modify
|
||||||
|
mask: Boolean mask indicating which rows to tag
|
||||||
|
tags: Tag(s) to apply
|
||||||
|
log_message: Description of what's being tagged (e.g., "flying creatures")
|
||||||
|
color: Color identifier for context (optional)
|
||||||
|
logger: Logger instance to use (optional, uses print if None)
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Count of cards tagged
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> count = tag_with_logging(
|
||||||
|
... df,
|
||||||
|
... flying_mask,
|
||||||
|
... 'Flying',
|
||||||
|
... 'creatures with flying ability',
|
||||||
|
... color='blue',
|
||||||
|
... logger=logger
|
||||||
|
... )
|
||||||
|
# Logs: "Tagged 42 blue creatures with flying ability"
|
||||||
|
"""
|
||||||
|
count = mask.sum()
|
||||||
|
if count > 0:
|
||||||
|
apply_tag_vectorized(df, mask, tags)
|
||||||
|
color_part = f'{color} ' if color else ''
|
||||||
|
full_message = f'Tagged {count} {color_part}{log_message}'
|
||||||
|
|
||||||
|
if logger:
|
||||||
|
logger.info(full_message)
|
||||||
|
else:
|
||||||
|
print(full_message)
|
||||||
|
|
||||||
|
return count
|
||||||
|
|
||||||
|
|
||||||
|
def tag_with_rules_and_logging(
|
||||||
|
df: pd.DataFrame,
|
||||||
|
rules: List[dict],
|
||||||
|
summary_message: str,
|
||||||
|
color: str = '',
|
||||||
|
logger=None
|
||||||
|
) -> int:
|
||||||
|
"""Apply multiple tag rules with summarized logging.
|
||||||
|
|
||||||
|
This utility combines apply_rules with logging, providing a summary of
|
||||||
|
all cards affected across multiple rules.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
df: DataFrame to modify
|
||||||
|
rules: List of rule dicts (each with 'mask' and 'tags')
|
||||||
|
summary_message: Overall description (e.g., "card draw effects")
|
||||||
|
color: Color identifier for context (optional)
|
||||||
|
logger: Logger instance to use (optional)
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Total count of unique cards affected by any rule
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> rules = [
|
||||||
|
... {'mask': flying_mask, 'tags': ['Flying']},
|
||||||
|
... {'mask': haste_mask, 'tags': ['Haste', 'Aggro']}
|
||||||
|
... ]
|
||||||
|
>>> count = tag_with_rules_and_logging(
|
||||||
|
... df, rules, 'evasive creatures', color='red', logger=logger
|
||||||
|
... )
|
||||||
|
"""
|
||||||
|
affected = pd.Series([False] * len(df), index=df.index)
|
||||||
|
for rule in rules:
|
||||||
|
mask = rule.get('mask')
|
||||||
|
if callable(mask):
|
||||||
|
mask = mask(df)
|
||||||
|
if mask is not None and mask.any():
|
||||||
|
tags = rule.get('tags', [])
|
||||||
|
apply_tag_vectorized(df, mask, tags)
|
||||||
|
affected |= mask
|
||||||
|
|
||||||
|
count = affected.sum()
|
||||||
|
color_part = f'{color} ' if color else ''
|
||||||
|
full_message = f'Tagged {count} {color_part}{summary_message}'
|
||||||
|
|
||||||
|
if logger:
|
||||||
|
logger.info(full_message)
|
||||||
|
else:
|
||||||
|
print(full_message)
|
||||||
|
|
||||||
|
return count
|
||||||
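The OR/AND combination and exclusion behavior of `build_combined_mask` can be sketched with simplified stand-ins for the helper functions. Note that `create_text_mask` and `create_keyword_mask` below are minimal assumptions for illustration, not the module's real implementations (which support richer pattern handling):

```python
import pandas as pd

# Simplified stand-ins for the module's mask helpers (assumptions).
def create_text_mask(df, patterns):
    if isinstance(patterns, str):
        patterns = [patterns]
    mask = pd.Series(False, index=df.index)
    for p in patterns:
        mask |= df['text'].str.contains(p, case=False, na=False)
    return mask

def create_keyword_mask(df, patterns):
    if isinstance(patterns, str):
        patterns = [patterns]
    mask = pd.Series(False, index=df.index)
    for p in patterns:
        mask |= df['keywords'].str.contains(p, case=False, na=False)
    return mask

df = pd.DataFrame({
    'name': ['Serra Angel', 'Lightning Bolt', 'Grizzly Bears'],
    'text': ['Flying, vigilance', 'Deal 3 damage', ''],
    'keywords': ['Flying, Vigilance', '', ''],
})

# OR-combine keyword and text matches, as build_combined_mask does by default.
mask = create_keyword_mask(df, ['Flying']) | create_text_mask(df, ['damage'])
print(df.loc[mask, 'name'].tolist())  # ['Serra Angel', 'Lightning Bolt']

# Exclusion works by AND-ing the negated exclusion mask.
mask &= ~create_text_mask(df, ['vigilance'])
print(df.loc[mask, 'name'].tolist())  # ['Lightning Bolt']
```

The same shape applies when `combine_with_or=False`: start from an all-True Series and `&=` each sub-mask instead.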
File diff suppressed because it is too large
@@ -4,7 +4,7 @@ from pathlib import Path

 import pytest

-from headless_runner import _resolve_additional_theme_inputs, _parse_theme_list
+from code.headless_runner import resolve_additional_theme_inputs as _resolve_additional_theme_inputs, _parse_theme_list


 def _write_catalog(path: Path) -> None:
182 code/tests/test_keyword_normalization.py Normal file

@@ -0,0 +1,182 @@
"""Tests for keyword normalization (M1 - Tagging Refinement)."""
from __future__ import annotations

import pytest

from code.tagging import tag_utils, tag_constants


class TestKeywordNormalization:
    """Test suite for normalize_keywords function."""

    def test_canonical_mappings(self):
        """Test that variant keywords map to canonical forms."""
        raw = ['Commander Ninjutsu', 'Flying', 'Trample']
        allowlist = tag_constants.KEYWORD_ALLOWLIST
        frequency_map = {
            'Commander Ninjutsu': 2,
            'Flying': 100,
            'Trample': 50
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Ninjutsu' in result
        assert 'Flying' in result
        assert 'Trample' in result
        assert 'Commander Ninjutsu' not in result

    def test_singleton_pruning(self):
        """Test that singleton keywords are pruned unless allowlisted."""
        raw = ['Allons-y!', 'Flying', 'Take 59 Flights of Stairs']
        allowlist = {'Flying'}  # Only Flying is allowlisted
        frequency_map = {
            'Allons-y!': 1,
            'Flying': 100,
            'Take 59 Flights of Stairs': 1
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Flying' in result
        assert 'Allons-y!' not in result
        assert 'Take 59 Flights of Stairs' not in result

    def test_case_normalization(self):
        """Test that keywords are normalized to proper case."""
        raw = ['flying', 'TRAMPLE', 'vigilance']
        allowlist = {'Flying', 'Trample', 'Vigilance'}
        frequency_map = {
            'flying': 100,
            'TRAMPLE': 50,
            'vigilance': 75
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        # Case normalization happens via the map
        # If not in map, original case is preserved
        assert len(result) == 3

    def test_partner_exclusion(self):
        """Test that partner keywords remain excluded."""
        raw = ['Partner', 'Flying', 'Trample']
        allowlist = {'Flying', 'Trample'}
        frequency_map = {
            'Partner': 50,
            'Flying': 100,
            'Trample': 50
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Flying' in result
        assert 'Trample' in result
        assert 'Partner' not in result  # Excluded
        assert 'partner' not in result

    def test_empty_input(self):
        """Test that empty input returns empty list."""
        result = tag_utils.normalize_keywords([], set(), {})
        assert result == []

    def test_whitespace_handling(self):
        """Test that whitespace is properly stripped."""
        raw = [' Flying ', 'Trample ', ' Vigilance']
        allowlist = {'Flying', 'Trample', 'Vigilance'}
        frequency_map = {
            'Flying': 100,
            'Trample': 50,
            'Vigilance': 75
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Flying' in result
        assert 'Trample' in result
        assert 'Vigilance' in result

    def test_deduplication(self):
        """Test that duplicate keywords are deduplicated."""
        raw = ['Flying', 'Flying', 'Trample', 'Flying']
        allowlist = {'Flying', 'Trample'}
        frequency_map = {
            'Flying': 100,
            'Trample': 50
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert result.count('Flying') == 1
        assert result.count('Trample') == 1

    def test_non_string_entries_skipped(self):
        """Test that non-string entries are safely skipped."""
        raw = ['Flying', None, 123, 'Trample', '']
        allowlist = {'Flying', 'Trample'}
        frequency_map = {
            'Flying': 100,
            'Trample': 50
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Flying' in result
        assert 'Trample' in result
        assert len(result) == 2

    def test_invalid_input_raises_error(self):
        """Test that non-iterable input raises ValueError."""
        with pytest.raises(ValueError, match="raw must be iterable"):
            tag_utils.normalize_keywords("not-a-list", set(), {})

    def test_allowlist_preserves_singletons(self):
        """Test that allowlisted keywords survive even if they're singletons."""
        raw = ['Myriad', 'Flying', 'Cascade']
        allowlist = {'Flying', 'Myriad', 'Cascade'}  # All allowlisted
        frequency_map = {
            'Myriad': 1,  # Singleton
            'Flying': 100,
            'Cascade': 1  # Singleton
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Myriad' in result  # Preserved despite being singleton
        assert 'Flying' in result
        assert 'Cascade' in result  # Preserved despite being singleton


class TestKeywordIntegration:
    """Integration tests for keyword normalization in tagging flow."""

    def test_normalization_preserves_evergreen_keywords(self):
        """Test that common evergreen keywords are always preserved."""
        evergreen = ['Flying', 'Trample', 'Vigilance', 'Haste', 'Deathtouch', 'Lifelink']
        allowlist = tag_constants.KEYWORD_ALLOWLIST
        frequency_map = {kw: 100 for kw in evergreen}  # All common

        result = tag_utils.normalize_keywords(evergreen, allowlist, frequency_map)

        for kw in evergreen:
            assert kw in result

    def test_crossover_keywords_pruned(self):
        """Test that crossover-specific singletons are pruned."""
        crossover_singletons = [
            'Gae Bolg',  # Final Fantasy
            'Psychic Defense',  # Warhammer 40K
            'Allons-y!',  # Doctor Who
            'Flying'  # Evergreen (control)
        ]
        allowlist = {'Flying'}  # Only Flying allowed
        frequency_map = {
            'Gae Bolg': 1,
            'Psychic Defense': 1,
            'Allons-y!': 1,
            'Flying': 100
        }

        result = tag_utils.normalize_keywords(crossover_singletons, allowlist, frequency_map)

        assert result == ['Flying']  # Only evergreen survived
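The behavior these tests pin down can be sketched as a small normalization pipeline. This is a minimal stand-in, not the repository's actual `tag_utils.normalize_keywords`; in particular `CANONICAL_MAP` and `EXCLUDED` are assumptions standing in for whatever constants the real implementation uses:

```python
# Assumed stand-ins for the real canonical-form map and exclusion set.
CANONICAL_MAP = {'Commander Ninjutsu': 'Ninjutsu'}
EXCLUDED = {'Partner'}

def normalize_keywords(raw, allowlist, frequency_map):
    # Reject non-iterable (or plain string) input, as the tests expect.
    if isinstance(raw, str) or not hasattr(raw, '__iter__'):
        raise ValueError("raw must be iterable")
    result = []
    for kw in raw:
        if not isinstance(kw, str):
            continue  # skip None, ints, etc.
        kw = kw.strip()
        if not kw:
            continue
        freq = frequency_map.get(kw, 0)
        kw = CANONICAL_MAP.get(kw, kw)  # map variants to canonical forms
        if kw in EXCLUDED:
            continue
        # Prune singleton (frequency <= 1) keywords unless allowlisted.
        if freq <= 1 and kw not in allowlist:
            continue
        if kw not in result:  # deduplicate, preserving order
            result.append(kw)
    return result

print(normalize_keywords(
    ['Commander Ninjutsu', 'Flying', 'Allons-y!'],
    {'Flying'},
    {'Commander Ninjutsu': 2, 'Flying': 100, 'Allons-y!': 1},
))  # ['Ninjutsu', 'Flying']
```

The singleton check is what drops crossover-set oddities like "Allons-y!" while the allowlist keeps rare-but-real mechanics such as Myriad.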
300 code/tests/test_metadata_partition.py Normal file

@@ -0,0 +1,300 @@
"""Tests for M3 metadata/theme tag partition functionality.

Tests cover:
- Tag classification (metadata vs theme)
- Column creation and data migration
- Feature flag behavior
- Compatibility with missing columns
- CSV read/write with new schema
"""
import pandas as pd
import pytest
from code.tagging import tag_utils
from code.tagging.tagger import _apply_metadata_partition


class TestTagClassification:
    """Tests for classify_tag function."""

    def test_prefix_based_metadata(self):
        """Metadata tags identified by prefix."""
        assert tag_utils.classify_tag("Applied: Cost Reduction") == "metadata"
        assert tag_utils.classify_tag("Bracket: Game Changer") == "metadata"
        assert tag_utils.classify_tag("Diagnostic: Test") == "metadata"
        assert tag_utils.classify_tag("Internal: Debug") == "metadata"

    def test_exact_match_metadata(self):
        """Metadata tags identified by exact match."""
        assert tag_utils.classify_tag("Bracket: Game Changer") == "metadata"
        assert tag_utils.classify_tag("Bracket: Staple") == "metadata"

    def test_kindred_protection_metadata(self):
        """Kindred protection tags are metadata."""
        assert tag_utils.classify_tag("Knights Gain Protection") == "metadata"
        assert tag_utils.classify_tag("Frogs Gain Protection") == "metadata"
        assert tag_utils.classify_tag("Zombies Gain Protection") == "metadata"

    def test_theme_classification(self):
        """Regular gameplay tags are themes."""
        assert tag_utils.classify_tag("Card Draw") == "theme"
        assert tag_utils.classify_tag("Spellslinger") == "theme"
        assert tag_utils.classify_tag("Tokens Matter") == "theme"
        assert tag_utils.classify_tag("Ramp") == "theme"
        assert tag_utils.classify_tag("Protection") == "theme"

    def test_edge_cases(self):
        """Edge cases in tag classification."""
        # Empty string
        assert tag_utils.classify_tag("") == "theme"

        # Similar but not exact matches
        assert tag_utils.classify_tag("Apply: Something") == "theme"  # Wrong prefix
        assert tag_utils.classify_tag("Knights Have Protection") == "theme"  # Not "Gain"

        # Case sensitivity
        assert tag_utils.classify_tag("applied: Cost Reduction") == "theme"  # Lowercase


class TestMetadataPartition:
    """Tests for _apply_metadata_partition function."""

    def test_basic_partition(self, monkeypatch):
        """Basic partition splits tags correctly."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A', 'Card B'],
            'themeTags': [
                ['Card Draw', 'Applied: Cost Reduction'],
                ['Spellslinger', 'Bracket: Game Changer', 'Tokens Matter']
            ]
        })

        df_out, diag = _apply_metadata_partition(df)

        # Check theme tags
        assert df_out.loc[0, 'themeTags'] == ['Card Draw']
        assert df_out.loc[1, 'themeTags'] == ['Spellslinger', 'Tokens Matter']

        # Check metadata tags
        assert df_out.loc[0, 'metadataTags'] == ['Applied: Cost Reduction']
        assert df_out.loc[1, 'metadataTags'] == ['Bracket: Game Changer']

        # Check diagnostics
        assert diag['enabled'] is True
        assert diag['rows_with_tags'] == 2
        assert diag['metadata_tags_moved'] == 2
        assert diag['theme_tags_kept'] == 3

    def test_empty_tags(self, monkeypatch):
        """Handles empty tag lists."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A', 'Card B'],
            'themeTags': [[], ['Card Draw']]
        })

        df_out, diag = _apply_metadata_partition(df)

        assert df_out.loc[0, 'themeTags'] == []
        assert df_out.loc[0, 'metadataTags'] == []
        assert df_out.loc[1, 'themeTags'] == ['Card Draw']
        assert df_out.loc[1, 'metadataTags'] == []

        assert diag['rows_with_tags'] == 1

    def test_all_metadata_tags(self, monkeypatch):
        """Handles rows with only metadata tags."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Applied: Cost Reduction', 'Bracket: Game Changer']]
        })

        df_out, diag = _apply_metadata_partition(df)

        assert df_out.loc[0, 'themeTags'] == []
        assert df_out.loc[0, 'metadataTags'] == ['Applied: Cost Reduction', 'Bracket: Game Changer']

        assert diag['metadata_tags_moved'] == 2
        assert diag['theme_tags_kept'] == 0

    def test_all_theme_tags(self, monkeypatch):
        """Handles rows with only theme tags."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Card Draw', 'Ramp', 'Spellslinger']]
        })

        df_out, diag = _apply_metadata_partition(df)

        assert df_out.loc[0, 'themeTags'] == ['Card Draw', 'Ramp', 'Spellslinger']
        assert df_out.loc[0, 'metadataTags'] == []

        assert diag['metadata_tags_moved'] == 0
        assert diag['theme_tags_kept'] == 3

    def test_feature_flag_disabled(self, monkeypatch):
        """Feature flag disables partition."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '0')

        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Card Draw', 'Applied: Cost Reduction']]
        })

        df_out, diag = _apply_metadata_partition(df)

        # Should not create metadataTags column
        assert 'metadataTags' not in df_out.columns

        # Should not modify themeTags
        assert df_out.loc[0, 'themeTags'] == ['Card Draw', 'Applied: Cost Reduction']

        # Should indicate disabled
        assert diag['enabled'] is False

    def test_missing_theme_tags_column(self, monkeypatch):
        """Handles missing themeTags column gracefully."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A'],
            'other_column': ['value']
        })

        df_out, diag = _apply_metadata_partition(df)

        # Should return unchanged
        assert 'themeTags' not in df_out.columns
        assert 'metadataTags' not in df_out.columns

        # Should indicate error
        assert diag['enabled'] is True
        assert 'error' in diag

    def test_non_list_tags(self, monkeypatch):
        """Handles non-list values in themeTags."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A', 'Card B', 'Card C'],
            'themeTags': [['Card Draw'], None, 'not a list']
        })

        df_out, diag = _apply_metadata_partition(df)

        # Only first row should be processed
        assert df_out.loc[0, 'themeTags'] == ['Card Draw']
        assert df_out.loc[0, 'metadataTags'] == []

        assert diag['rows_with_tags'] == 1

    def test_kindred_protection_partition(self, monkeypatch):
        """Kindred protection tags are moved to metadata."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Protection', 'Knights Gain Protection', 'Card Draw']]
        })

        df_out, diag = _apply_metadata_partition(df)

        assert 'Protection' in df_out.loc[0, 'themeTags']
        assert 'Card Draw' in df_out.loc[0, 'themeTags']
        assert 'Knights Gain Protection' in df_out.loc[0, 'metadataTags']

    def test_diagnostics_structure(self, monkeypatch):
        """Diagnostics contain expected fields."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Card Draw', 'Applied: Cost Reduction']]
        })

        df_out, diag = _apply_metadata_partition(df)

        # Check required diagnostic fields
        assert 'enabled' in diag
        assert 'total_rows' in diag
        assert 'rows_with_tags' in diag
        assert 'metadata_tags_moved' in diag
        assert 'theme_tags_kept' in diag
        assert 'unique_metadata_tags' in diag
        assert 'unique_theme_tags' in diag
        assert 'most_common_metadata' in diag
        assert 'most_common_themes' in diag

        # Check types
        assert isinstance(diag['most_common_metadata'], list)
        assert isinstance(diag['most_common_themes'], list)


class TestCSVCompatibility:
    """Tests for CSV read/write with new schema."""

    def test_csv_roundtrip_with_metadata(self, tmp_path, monkeypatch):
        """CSV roundtrip preserves both columns."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        csv_path = tmp_path / "test_cards.csv"

        # Create initial dataframe
        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Card Draw', 'Ramp']],
            'metadataTags': [['Applied: Cost Reduction']]
        })

        # Write to CSV
        df.to_csv(csv_path, index=False)

        # Read back
        df_read = pd.read_csv(
            csv_path,
            converters={'themeTags': pd.eval, 'metadataTags': pd.eval}
        )

        # Verify data preserved
        assert df_read.loc[0, 'themeTags'] == ['Card Draw', 'Ramp']
        assert df_read.loc[0, 'metadataTags'] == ['Applied: Cost Reduction']

    def test_csv_backward_compatible(self, tmp_path, monkeypatch):
        """Can read old CSVs without metadataTags."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        csv_path = tmp_path / "old_cards.csv"

        # Create old-style CSV without metadataTags
        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Card Draw', 'Applied: Cost Reduction']]
        })
        df.to_csv(csv_path, index=False)

        # Read back
        df_read = pd.read_csv(csv_path, converters={'themeTags': pd.eval})

        # Should read successfully
        assert 'themeTags' in df_read.columns
        assert 'metadataTags' not in df_read.columns
        assert df_read.loc[0, 'themeTags'] == ['Card Draw', 'Applied: Cost Reduction']

        # Apply partition
        df_partitioned, _ = _apply_metadata_partition(df_read)

        # Should now have both columns
        assert 'themeTags' in df_partitioned.columns
        assert 'metadataTags' in df_partitioned.columns
        assert df_partitioned.loc[0, 'themeTags'] == ['Card Draw']
        assert df_partitioned.loc[0, 'metadataTags'] == ['Applied: Cost Reduction']


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
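The split these tests verify amounts to partitioning each row's tag list by `classify_tag`. A minimal stand-in of that partition step follows; the prefix tuple and the "<Kindred> Gain Protection" check mirror what the tests exercise, but the real constants live in the repository's tagging modules, so treat these as assumptions:

```python
# Assumed metadata prefixes, mirroring the test expectations.
METADATA_PREFIXES = ('Applied:', 'Bracket:', 'Diagnostic:', 'Internal:')

def classify_tag(tag):
    # Prefix-based metadata (case-sensitive, so "applied:" stays a theme),
    # plus the "<Kindred> Gain Protection" family.
    if tag.startswith(METADATA_PREFIXES):
        return "metadata"
    if tag.endswith(" Gain Protection"):
        return "metadata"
    # Default: treat as theme tag
    return "theme"

def partition_tags(tags):
    # Split one tag list into (themeTags, metadataTags), preserving order.
    themes = [t for t in tags if classify_tag(t) == "theme"]
    metadata = [t for t in tags if classify_tag(t) == "metadata"]
    return themes, metadata

print(partition_tags(['Card Draw', 'Applied: Cost Reduction', 'Knights Gain Protection']))
# (['Card Draw'], ['Applied: Cost Reduction', 'Knights Gain Protection'])
```

Applying this per row, writing the two lists into `themeTags` and `metadataTags`, and counting moves/keeps yields the diagnostics shape the tests check.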
169 code/tests/test_protection_grant_detection.py Normal file

@@ -0,0 +1,169 @@
"""
Tests for protection grant detection (M2).

Tests the ability to distinguish between cards that grant protection
and cards that have inherent protection.
"""

import pytest
from code.tagging.protection_grant_detection import (
    is_granting_protection,
    categorize_protection_card
)


class TestGrantDetection:
    """Test grant verb detection."""

    def test_gains_hexproof(self):
        """Cards with 'gains hexproof' should be detected as granting."""
        text = "Target creature gains hexproof until end of turn."
        assert is_granting_protection(text, "")

    def test_gives_indestructible(self):
        """Cards with 'gives indestructible' should be detected as granting."""
        text = "This creature gives target creature indestructible."
        assert is_granting_protection(text, "")

    def test_creatures_you_control_have(self):
        """Mass grant pattern should be detected."""
        text = "Creatures you control have hexproof."
        assert is_granting_protection(text, "")

    def test_equipped_creature_gets(self):
        """Equipment grant pattern should be detected."""
        text = "Equipped creature gets +2/+2 and has indestructible."
        assert is_granting_protection(text, "")


class TestInherentDetection:
    """Test inherent protection detection."""

    def test_creature_with_hexproof_keyword(self):
        """Creature with hexproof keyword should not be detected as granting."""
        text = "Hexproof (This creature can't be the target of spells or abilities.)"
        keywords = "Hexproof"
        assert not is_granting_protection(text, keywords)

    def test_indestructible_artifact(self):
        """Artifact with indestructible keyword should not be detected as granting."""
        text = "Indestructible"
        keywords = "Indestructible"
        assert not is_granting_protection(text, keywords)

    def test_ward_creature(self):
        """Creature with Ward should not be detected as granting (unless it grants to others)."""
        text = "Ward {2}"
        keywords = "Ward"
        assert not is_granting_protection(text, keywords)


class TestMixedCases:
    """Test cards that both grant and have protection."""

    def test_creature_with_self_grant(self):
        """Creature that grants itself protection should be detected."""
        text = "This creature gains indestructible until end of turn."
        keywords = ""
        assert is_granting_protection(text, keywords)

    def test_equipment_with_inherent_and_grant(self):
        """Equipment with indestructible that grants protection."""
        text = "Indestructible. Equipped creature has hexproof."
        keywords = "Indestructible"
        # Should be detected as granting because of "has hexproof"
        assert is_granting_protection(text, keywords)


class TestExclusions:
    """Test exclusion patterns."""

    def test_cant_have_hexproof(self):
        """Cards that prevent protection should not be tagged."""
        text = "Creatures your opponents control can't have hexproof."
        assert not is_granting_protection(text, "")

    def test_loses_indestructible(self):
        """Cards that remove protection should not be tagged."""
        text = "Target creature loses indestructible until end of turn."
        assert not is_granting_protection(text, "")


class TestEdgeCases:
    """Test edge cases and special patterns."""

    def test_protection_from_color(self):
        """Protection from [quality] in keywords without grant text."""
        text = "Protection from red"
        keywords = "Protection from red"
        assert not is_granting_protection(text, keywords)

    def test_empty_text(self):
        """Empty text should return False."""
        assert not is_granting_protection("", "")

    def test_none_text(self):
        """None text should return False."""
        assert not is_granting_protection(None, "")


class TestCategorization:
    """Test full card categorization."""

    def test_shell_shield_is_grant(self):
        """Shell Shield grants hexproof - should be Grant."""
        text = "Target creature gets +0/+3 and gains hexproof until end of turn."
        cat = categorize_protection_card("Shell Shield", text, "", "Instant")
        assert cat == "Grant"

    def test_geist_of_saint_traft_is_mixed(self):
        """Geist has hexproof and creates tokens - Mixed."""
        text = "Hexproof. Whenever this attacks, create a token."
        keywords = "Hexproof"
        cat = categorize_protection_card("Geist", text, keywords, "Creature")
        # Has hexproof keyword, so inherent
        assert cat in ("Inherent", "Mixed")

    def test_darksteel_brute_is_inherent(self):
        """Darksteel Brute has indestructible - should be Inherent."""
        text = "Indestructible"
        keywords = "Indestructible"
        cat = categorize_protection_card("Darksteel Brute", text, keywords, "Artifact")
        assert cat == "Inherent"

    def test_scion_of_oona_is_grant(self):
        """Scion of Oona grants shroud to other faeries - should be Grant."""
        text = "Other Faeries you control have shroud."
        keywords = "Flying, Flash"
        cat = categorize_protection_card("Scion of Oona", text, keywords, "Creature")
        assert cat == "Grant"


class TestRealWorldCards:
    """Test against actual card samples from baseline audit."""

    def test_bulwark_ox(self):
        """Bulwark Ox - grants hexproof and indestructible."""
        text = "Sacrifice: Creatures you control with counters gain hexproof and indestructible"
assert is_granting_protection(text, "")
|
||||||
|
|
||||||
|
def test_bloodsworn_squire(self):
|
||||||
|
"""Bloodsworn Squire - grants itself indestructible."""
|
||||||
|
text = "This creature gains indestructible until end of turn"
|
||||||
|
assert is_granting_protection(text, "")
|
||||||
|
|
||||||
|
def test_kaldra_compleat(self):
|
||||||
|
"""Kaldra Compleat - equipment with indestructible that grants."""
|
||||||
|
text = "Indestructible. Equipped creature gets +5/+5 and has indestructible"
|
||||||
|
keywords = "Indestructible"
|
||||||
|
assert is_granting_protection(text, keywords)
|
||||||
|
|
||||||
|
def test_ward_sliver(self):
|
||||||
|
"""Ward Sliver - grants protection to all slivers."""
|
||||||
|
text = "All Slivers have protection from the chosen color"
|
||||||
|
assert is_granting_protection(text, "")
|
||||||
|
|
||||||
|
def test_rebbec(self):
|
||||||
|
"""Rebbec - grants protection to artifacts."""
|
||||||
|
text = "Artifacts you control have protection from each mana value"
|
||||||
|
assert is_granting_protection(text, "")
|
||||||
|
|
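The tests above exercise `is_granting_protection` and `categorize_protection_card` without showing their bodies. A minimal hypothetical sketch that satisfies every assertion in this file — not the PR's actual implementation, whose real pattern list is certainly larger:

```python
import re

# Abilities treated as protective (assumption inferred from the tests above).
_ABILITIES = r"(?:hexproof|shroud|indestructible|protection|ward)"

# A card "grants" when its text gives an ability via gains/has/have.
_GRANT_RE = re.compile(rf"\b(?:gains?|has|have)\s+{_ABILITIES}\b", re.IGNORECASE)

# Negated forms ("can't have", "loses") must not count as grants.
_EXCLUDE_RE = re.compile(rf"\b(?:can't have|loses)\s+{_ABILITIES}\b", re.IGNORECASE)


def is_granting_protection(text, keywords):
    """True when the card's text grants a protective ability."""
    if not text:
        return False
    if _EXCLUDE_RE.search(text):
        return False
    return bool(_GRANT_RE.search(text))


def categorize_protection_card(name, text, keywords, type_line):
    """Classify a card as Grant / Inherent / Mixed / None."""
    grants = is_granting_protection(text, keywords)
    inherent = any(
        a in (keywords or "").lower()
        for a in ("hexproof", "shroud", "indestructible", "protection", "ward")
    )
    if grants and inherent:
        return "Mixed"
    if grants:
        return "Grant"
    if inherent:
        return "Inherent"
    return "None"
```

The key design point the tests encode: exclusions are checked before grant patterns, so "can't have hexproof" never registers as a grant even though it contains "have hexproof".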
@@ -170,7 +170,7 @@ def _step5_summary_placeholder_html(token: int, *, message: str | None = None) -
     return (
         f'<div id="deck-summary" data-summary '
         f'hx-get="/build/step5/summary?token={token}" '
-        'hx-trigger="load, step5:refresh from:body" hx-swap="outerHTML">'
+        'hx-trigger="step5:refresh from:body" hx-swap="outerHTML">'
         f'<div class="muted" style="margin-top:1rem;">{_esc(text)}</div>'
         '</div>'
     )
@@ -159,11 +159,18 @@ def _read_csv_summary(csv_path: Path) -> Tuple[dict, Dict[str, int], Dict[str, i
             # Type counts/cards (exclude commander entry from distribution)
             if not is_commander:
                 type_counts[cat] = type_counts.get(cat, 0) + cnt
+            # M5: Extract metadata tags column if present
+            metadata_tags_raw = ''
+            metadata_idx = headers.index('MetadataTags') if 'MetadataTags' in headers else -1
+            if metadata_idx >= 0 and metadata_idx < len(row):
+                metadata_tags_raw = row[metadata_idx] or ''
+            metadata_tags_list = [t.strip() for t in metadata_tags_raw.split(';') if t.strip()]
             type_cards.setdefault(cat, []).append({
                 'name': name,
                 'count': cnt,
                 'role': role,
                 'tags': tags_list,
+                'metadata_tags': metadata_tags_list,  # M5: Include metadata tags
             })

             # Curve
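The hunk above reads the new `MetadataTags` CSV column defensively: a missing column or a short row yields an empty list, and the raw value is semicolon-delimited. Extracted into a standalone helper for illustration (the function name is hypothetical; the column name and `';'` delimiter come from the diff):

```python
def parse_metadata_tags(headers, row):
    """Split a semicolon-delimited 'MetadataTags' CSV column into a clean list.

    Returns [] when the column is absent or the row is too short,
    mirroring the defensive parsing in the hunk above.
    """
    idx = headers.index('MetadataTags') if 'MetadataTags' in headers else -1
    raw = row[idx] if 0 <= idx < len(row) else ''
    # Strip whitespace around each tag and drop empties from trailing ';'
    return [t.strip() for t in (raw or '').split(';') if t.strip()]
```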
@@ -900,7 +900,7 @@ def ideal_labels() -> Dict[str, str]:
         'removal': 'Spot Removal',
         'wipes': 'Board Wipes',
         'card_advantage': 'Card Advantage',
-        'protection': 'Protection',
+        'protection': 'Protective Effects',
     }
@@ -1181,6 +1181,9 @@ def _ensure_setup_ready(out, force: bool = False) -> None:
                 # Only flip phase if previous run finished
                 if st.get('phase') in {'themes','themes-fast'}:
                     st['phase'] = 'done'
+                    # Also ensure percent is 100 when done
+                    if st.get('finished_at'):
+                        st['percent'] = 100
                 with open(status_path, 'w', encoding='utf-8') as _wf:
                     json.dump(st, _wf)
         except Exception:
@@ -1463,16 +1466,17 @@ def _ensure_setup_ready(out, force: bool = False) -> None:
         except Exception:
             pass

-    # Unconditional fallback: if (for any reason) no theme export ran above, perform a fast-path export now.
-    # This guarantees that clicking Run Setup/Tagging always leaves themes current even when tagging wasn't needed.
+    # Conditional fallback: only run theme export if refresh_needed was True but somehow no export performed.
+    # This avoids repeated exports when setup is already complete and _ensure_setup_ready is called again.
     try:
-        if not theme_export_performed:
+        if not theme_export_performed and refresh_needed:
             _refresh_theme_catalog(out, force=False, fast_path=True)
     except Exception:
         pass
     else:  # If export just ran (either earlier or via fallback), ensure enrichment ran (safety double-call guard inside helper)
         try:
-            _run_theme_metadata_enrichment(out)
+            if theme_export_performed or refresh_needed:
+                _run_theme_metadata_enrichment(out)
         except Exception:
             pass
@@ -1907,7 +1911,7 @@ def _make_stages(b: DeckBuilder) -> List[Dict[str, Any]]:
         ("removal", "Confirm Removal", "add_removal"),
         ("wipes", "Confirm Board Wipes", "add_board_wipes"),
         ("card_advantage", "Confirm Card Advantage", "add_card_advantage"),
-        ("protection", "Confirm Protection", "add_protection"),
+        ("protection", "Confirm Protective Effects", "add_protection"),
     ]
     any_granular = any(callable(getattr(b, rn, None)) for _key, _label, rn in spell_categories)
     if any_granular:
@@ -309,7 +309,8 @@
           .catch(function(){ /* noop */ });
       } catch(e) {}
     }
-    setInterval(pollStatus, 3000);
+    // Poll every 10 seconds instead of 3 to reduce server load (only for header indicator)
+    setInterval(pollStatus, 10000);
     pollStatus();

     // Health indicator poller
@@ -1011,6 +1012,7 @@
       var role = (attr('data-role')||'').trim();
       var reasonsRaw = attr('data-reasons')||'';
       var tagsRaw = attr('data-tags')||'';
+      var metadataTagsRaw = attr('data-metadata-tags')||''; // M5: Extract metadata tags
       var reasonsRaw = attr('data-reasons')||'';
       var roleEl = panel.querySelector('.hcp-role');
       var hasFlip = !!card.querySelector('.dfc-toggle');
@@ -1115,6 +1117,14 @@
         tagsEl.style.display = 'none';
       } else {
         var tagText = allTags.map(displayLabel).join(', ');
+        // M5: Temporarily append metadata tags for debugging
+        if(metadataTagsRaw && metadataTagsRaw.trim()){
+          var metaTags = metadataTagsRaw.split(',').map(function(t){return t.trim();}).filter(Boolean);
+          if(metaTags.length){
+            var metaText = metaTags.map(displayLabel).join(', ');
+            tagText = tagText ? (tagText + ' | META: ' + metaText) : ('META: ' + metaText);
+          }
+        }
         tagsEl.textContent = tagText;
         tagsEl.style.display = tagText ? '' : 'none';
       }
@@ -462,11 +462,12 @@
     <!-- controls now above -->

     {% if allow_must_haves %}
-      {% include "partials/include_exclude_summary.html" with oob=False %}
+      {% set oob = False %}
+      {% include "partials/include_exclude_summary.html" %}
     {% endif %}
     <div id="deck-summary" data-summary
          hx-get="/build/step5/summary?token={{ summary_token }}"
-         hx-trigger="load, step5:refresh from:body"
+         hx-trigger="load once, step5:refresh from:body"
          hx-swap="outerHTML">
       <div class="muted" style="margin-top:1rem;">
         {% if summary_ready %}Loading deck summary…{% else %}Deck summary will appear after the build completes.{% endif %}
@@ -74,7 +74,7 @@
   {% set owned = (owned_set is defined and c.name and (c.name|lower in owned_set)) %}
   <span class="count">{{ cnt }}</span>
   <span class="times">x</span>
-  <span class="name dfc-anchor" title="{{ c.name }}" data-card-name="{{ c.name }}" data-count="{{ cnt }}" data-role="{{ c.role }}" data-tags="{{ (c.tags|map('trim')|join(', ')) if c.tags else '' }}"{% if overlaps %} data-overlaps="{{ overlaps|join(', ') }}"{% endif %}>{{ c.name }}</span>
+  <span class="name dfc-anchor" title="{{ c.name }}" data-card-name="{{ c.name }}" data-count="{{ cnt }}" data-role="{{ c.role }}" data-tags="{{ (c.tags|map('trim')|join(', ')) if c.tags else '' }}"{% if c.metadata_tags %} data-metadata-tags="{{ (c.metadata_tags|map('trim')|join(', ')) }}"{% endif %}{% if overlaps %} data-overlaps="{{ overlaps|join(', ') }}"{% endif %}>{{ c.name }}</span>
   <span class="flip-slot" aria-hidden="true">
     {% if c.dfc_land %}
       <span class="dfc-land-chip {% if c.dfc_adds_extra_land %}extra{% else %}counts{% endif %}" title="{{ c.dfc_note or 'Modal double-faced land' }}">DFC land{% if c.dfc_adds_extra_land %} +1{% endif %}</span>
@@ -127,7 +127,8 @@
         .then(update)
         .catch(function(){});
     }
-    setInterval(poll, 3000);
+    // Poll every 5 seconds instead of 3 to reduce server load
+    setInterval(poll, 5000);
     poll();
   })();
 </script>
File diff suppressed because it is too large
@@ -99,6 +99,12 @@ services:
       WEB_AUTO_REFRESH_DAYS: "7"  # Refresh cards.csv if older than N days; 0=never
       WEB_TAG_PARALLEL: "1"       # 1=parallelize tagging
       WEB_TAG_WORKERS: "4"        # Worker count when parallel tagging
+
+      # Tagging Refinement Feature Flags
+      TAG_NORMALIZE_KEYWORDS: "1" # 1=normalize keywords & filter specialty mechanics (recommended)
+      TAG_PROTECTION_GRANTS: "1"  # 1=Protection tag only for cards granting shields (recommended)
+      TAG_METADATA_SPLIT: "1"     # 1=separate metadata tags from themes in CSVs (recommended)
+
       THEME_CATALOG_MODE: "merge" # Use merged Phase B catalog builder (with YAML export)
       THEME_YAML_FAST_SKIP: "0"   # 1=allow skipping per-theme YAML on fast path (rare; default always export)
       # Live YAML scan interval in seconds for change detection (dev convenience)
@@ -101,6 +101,12 @@ services:
       WEB_AUTO_REFRESH_DAYS: "7"  # Refresh cards.csv if older than N days; 0=never
       WEB_TAG_PARALLEL: "1"       # 1=parallelize tagging
       WEB_TAG_WORKERS: "4"        # Worker count when parallel tagging
+
+      # Tagging Refinement Feature Flags
+      TAG_NORMALIZE_KEYWORDS: "1" # 1=normalize keywords & filter specialty mechanics (recommended)
+      TAG_PROTECTION_GRANTS: "1"  # 1=Protection tag only for cards granting shields (recommended)
+      TAG_METADATA_SPLIT: "1"     # 1=separate metadata tags from themes in CSVs (recommended)
+
       THEME_CATALOG_MODE: "merge" # Use merged Phase B catalog builder (with YAML export)
       THEME_YAML_FAST_SKIP: "0"   # 1=allow skipping per-theme YAML on fast path (rare; default always export)
       # Live YAML scan interval in seconds for change detection (dev convenience)
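The three `TAG_*` entries added to both compose files (and to `.env.example`) are plain "0"/"1" environment flags. A hedged sketch of how such a flag might be read on the Python side — the helper name is hypothetical, not taken from this PR:

```python
import os


def flag_enabled(name: str, default: bool = True) -> bool:
    """Interpret a '1'/'0'-style environment flag, tolerating common
    truthy spellings. Hypothetical helper for illustration only."""
    raw = os.getenv(name)
    if raw is None:
        return default  # unset flag falls back to the documented default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Example: gate the metadata split behavior on TAG_METADATA_SPLIT
os.environ["TAG_METADATA_SPLIT"] = "1"
print(flag_enabled("TAG_METADATA_SPLIT"))  # prints True
```

Defaulting unset flags to the recommended value keeps existing deployments on the new behavior without a compose-file change.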