mirror of
https://github.com/mwisnowski/mtg_python_deckbuilder.git
synced 2025-12-16 15:40:12 +01:00
Merge pull request #32 from mwisnowski/feature/tagging-refinement
Feature/tagging refinement
This commit is contained in:
commit
4c79a7b45b
40 changed files with 5632 additions and 2789 deletions
@ -92,6 +92,12 @@ WEB_AUTO_REFRESH_DAYS=7 # dockerhub: WEB_AUTO_REFRESH_DAYS="7"
WEB_TAG_PARALLEL=1 # dockerhub: WEB_TAG_PARALLEL="1"
WEB_TAG_WORKERS=2 # dockerhub: WEB_TAG_WORKERS="4"
WEB_AUTO_ENFORCE=0 # dockerhub: WEB_AUTO_ENFORCE="0"

# Tagging Refinement Feature Flags
TAG_NORMALIZE_KEYWORDS=1 # dockerhub: TAG_NORMALIZE_KEYWORDS="1" # Normalize keywords & filter specialty mechanics
TAG_PROTECTION_GRANTS=1 # dockerhub: TAG_PROTECTION_GRANTS="1" # Protection tag only for cards granting shields
TAG_METADATA_SPLIT=1 # dockerhub: TAG_METADATA_SPLIT="1" # Separate metadata tags from themes in CSVs

# DFC_COMPAT_SNAPSHOT=0 # 1=write legacy unmerged MDFC snapshots alongside merged catalogs (deprecated compatibility workflow)
# WEB_CUSTOM_EXPORT_BASE= # Custom basename for exports (optional).
# THEME_CATALOG_YAML_SCAN_INTERVAL_SEC=2.0 # Poll for YAML changes (dev)
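For reference, truthy feature flags like these are typically parsed with the `{"1","true","yes","on"}` convention that also appears later in this diff (in the `DEBUG_SPELL_POOLS` check). A minimal illustrative sketch; the helper name is assumed, not part of the repository:

```python
import os

TRUTHY = {"1", "true", "yes", "on"}

def env_flag(name: str, default: str = "0") -> bool:
    """Interpret an environment variable as a boolean feature flag."""
    return os.getenv(name, default).strip().lower() in TRUTHY

# Example: TAG_NORMALIZE_KEYWORDS=1 enables keyword normalization.
os.environ["TAG_NORMALIZE_KEYWORDS"] = "1"
print(env_flag("TAG_NORMALIZE_KEYWORDS"))  # True
```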

61 CHANGELOG.md
@ -9,16 +9,69 @@ This format follows Keep a Changelog principles and aims for Semantic Versioning

## [Unreleased]
### Summary
- Card tagging system improvements split metadata from gameplay themes for cleaner deck building experience
- Keyword normalization reduces specialty keyword noise by 96% while maintaining theme catalog quality
- Protection tag now focuses on cards that grant shields to others, not just those with inherent protection
- Web UI improvements: faster polling, fixed progress display, and theme refresh stability
- **Protection System Overhaul**: Comprehensive enhancement to protection card detection, classification, and deck building
  - Fine-grained scope metadata distinguishes self-protection from board-wide effects ("Your Permanents: Hexproof" vs "Self: Hexproof")
  - Enhanced grant detection with Equipment/Aura patterns, phasing support, and complex trigger handling
  - Intelligent deck builder filtering includes board-relevant protection while excluding self-only and type-specific cards
  - Tiered pool limiting focuses on high-quality staples while maintaining variety across builds
  - Improved scope tagging for cards with keyword-only protection effects (no grant text, just inherent keywords)
- **Tagging Module Refactoring**: Large-scale refactor to improve code quality and maintainability
  - Centralized regex patterns, extracted reusable utilities, decomposed complex functions
  - Improved code organization and readability while maintaining 100% tagging accuracy

### Added
- Metadata partition system separates diagnostic tags from gameplay themes in card data
- Keyword normalization system with smart filtering of one-off specialty mechanics
  - Allowlist preserves important keywords like Flying, Myriad, and Transform
- Protection grant detection identifies cards that give Hexproof, Ward, or Indestructible to other permanents
  - Automatic tagging for creature-type-specific protection (e.g., "Knights Gain Protection")
- New `metadataTags` column in card data for bracket annotations and internal diagnostics
- Static phasing keyword detection from keywords field (catches creatures like Breezekeeper)
- "Other X you control have Y" protection pattern for static ability grants
- "Enchanted creature has phasing" pattern detection
- Chosen type blanket phasing patterns
- Complex trigger phasing patterns (reactive, consequent, end-of-turn)
- Protection scope filtering in deck builder (feature flag: `TAG_PROTECTION_SCOPE`) intelligently selects board-relevant protection
- Phasing cards with "Your Permanents:" or "Targeted:" metadata now tagged as Protection and included in protection pool
- Metadata tags temporarily visible in card hover previews for debugging (shows scope like "Your Permanents: Hexproof")
- Web-slinging tagger function to identify cards with web-slinging mechanics

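The grant-detection idea above can be sketched with a single simplified pattern. This is illustrative only; the repository centralizes its real (larger) regexes elsewhere, and the function name is assumed:

```python
import re

# Simplified pattern: matches text granting a shield keyword to other
# permanents you control ("other creatures you control have hexproof").
GRANT_RE = re.compile(
    r"(?:other\s+)?(?:creatures?|permanents?)\s+you\s+control\s+"
    r"(?:have|gains?)\s+(hexproof|ward|indestructible)",
    re.IGNORECASE,
)

def grants_protection(oracle_text: str) -> bool:
    # Strip reminder text (parenthesized) first, as the changelog notes,
    # so reminder wording cannot trigger a false positive.
    cleaned = re.sub(r"\([^)]*\)", "", oracle_text or "")
    return GRANT_RE.search(cleaned) is not None

print(grants_protection("Other creatures you control have hexproof."))  # True
print(grants_protection("Hexproof (This creature can't be the target of spells.)"))  # False
```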

### Changed
- Card tags now split between themes (for deck building) and metadata (for diagnostics)
- Keywords now consolidate variants (e.g., "Commander ninjutsu" becomes "Ninjutsu")
- Setup progress polling reduced from 3s to 5-10s intervals for better performance
- Theme catalog streamlined from 753 to 736 themes (-2.3%) with improved quality
- Protection tag refined to focus on 329 cards that grant shields (down from 1,166 with inherent effects)
- Protection tag renamed to "Protective Effects" throughout web interface to avoid confusion with the Magic keyword "protection"
- Theme catalog automatically excludes metadata tags from theme suggestions
- Grant detection now strips reminder text before pattern matching to avoid false positives
- Deck builder protection phase now filters by scope metadata: includes "Your Permanents:", excludes "Self:" protection
- Protection card selection now randomized per build for variety (using seeded RNG when deterministic mode enabled)
- Protection pool now limited to ~40-50 high-quality cards (tiered selection: top 3x target + random 10-20 extras)
- Tagging module imports standardized with consistent organization and centralized constants

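Variant consolidation as described above amounts to a lookup table keyed on the lower-cased keyword. A toy sketch (the "Commander ninjutsu" mapping comes from the changelog entry; the second entry and the function name are hypothetical):

```python
# Illustrative variant-consolidation map; the real normalization table in the
# repository is larger and driven by an allowlist.
KEYWORD_VARIANTS = {
    "commander ninjutsu": "Ninjutsu",  # from the changelog example
    "megamorph": "Morph",              # hypothetical additional entry
}

def normalize_keyword(keyword: str) -> str:
    return KEYWORD_VARIANTS.get(keyword.strip().lower(), keyword.strip())

print(normalize_keyword("Commander ninjutsu"))  # Ninjutsu
print(normalize_keyword("Flying"))              # Flying
```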

### Fixed
- Setup progress now shows 100% completion instead of getting stuck at 99%
- Theme catalog no longer continuously regenerates after setup completes
- Health indicator polling optimized to reduce server load
- Protection detection now correctly excludes creatures with only inherent keywords
- Dive Down, Glint no longer falsely identified as granting to opponents (reminder text fix)
- Drogskol Captain, Haytham Kenway now correctly get "Your Permanents" scope tags
- 7 cards with static Phasing keyword now properly detected (Breezekeeper, Teferi's Drake, etc.)
- Type-specific protection grants (e.g., "Knights Gain Indestructible") now correctly excluded from general protection pool
- Protection scope filter now properly prioritizes exclusions over inclusions (fixes Knight Exemplar in non-Knight decks)
- Inherent protection cards (Aysen Highway, Phantom Colossus, etc.) now correctly get "Self: Protection" metadata tags
- Scope tagging now applies to ALL cards with protection effects, not just grant cards
- Cloak of Invisibility, Teferi's Curse now get "Your Permanents: Phasing" tags
- Shimmer now gets "Blanket: Phasing" tag for chosen type effect
- King of the Oathbreakers now gets "Self: Phasing" tag for reactive trigger
- Cards with static keywords (Protection, Hexproof, Ward, Indestructible) in their keywords field now get proper scope metadata tags
- Cards with X in their mana cost now properly identified and tagged with "X Spells" theme for better deck building accuracy
- Card tagging system enhanced with smarter pattern detection and more consistent categorization

## [2.5.2] - 2025-10-08
### Summary

46 README.md
@ -99,15 +99,51 @@ Execute saved configs without manual input.

### Initial Setup
Refresh data and caches when formats shift.
- Runs card downloads, CSV regeneration, smart tagging (keywords + protection grants), and commander catalog rebuilds.
- Controlled by `SHOW_SETUP=1` (on by default in compose).
- **Force a full rebuild (setup + tagging)**:
  ```powershell
  # Docker:
  docker compose run --rm web python -c "from code.file_setup.setup import initial_setup; from code.tagging.tagger import run_tagging; initial_setup(); run_tagging()"

  # Local (with venv activated):
  python -c "from code.file_setup.setup import initial_setup; from code.tagging.tagger import run_tagging; initial_setup(); run_tagging()"

  # With parallel processing (faster):
  python -c "from code.file_setup.setup import initial_setup; from code.tagging.tagger import run_tagging; initial_setup(); run_tagging(parallel=True)"

  # With parallel processing and custom worker count:
  python -c "from code.file_setup.setup import initial_setup; from code.tagging.tagger import run_tagging; initial_setup(); run_tagging(parallel=True, max_workers=4)"
  ```
- **Rebuild only CSVs without tagging**:
  ```powershell
  # Docker:
  docker compose run --rm web python -c "from code.file_setup.setup import initial_setup; initial_setup()"

  # Local:
  python -c "from code.file_setup.setup import initial_setup; initial_setup()"
  ```
- **Run only tagging (CSVs must exist)**:
  ```powershell
  # Docker:
  docker compose run --rm web python -c "from code.tagging.tagger import run_tagging; run_tagging()"

  # Local:
  python -c "from code.tagging.tagger import run_tagging; run_tagging()"

  # With parallel processing (faster):
  python -c "from code.tagging.tagger import run_tagging; run_tagging(parallel=True)"

  # With parallel processing and custom worker count:
  python -c "from code.tagging.tagger import run_tagging; run_tagging(parallel=True, max_workers=4)"
  ```
- **Rebuild only the commander catalog**:
  ```powershell
  # Docker:
  docker compose run --rm web python -m code.scripts.refresh_commander_catalog

  # Local:
  python -m code.scripts.refresh_commander_catalog
  ```

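When no explicit `max_workers` is passed, the worker count presumably falls back to the `WEB_TAG_WORKERS` environment variable shown in the compose config earlier. The resolution logic here is an assumption for illustration, not the repository's actual implementation:

```python
import os

def resolve_tag_workers(default: int = 2) -> int:
    """Pick a tagging worker count from WEB_TAG_WORKERS, falling back to a default."""
    raw = os.getenv("WEB_TAG_WORKERS", "").strip()
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default

os.environ["WEB_TAG_WORKERS"] = "4"
print(resolve_tag_workers())  # 4
```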
### Owned Library

# MTG Python Deckbuilder ${VERSION}

## [Unreleased]
### Summary
- Builder responsiveness upgrades: smarter HTMX caching, shared debounce helpers, and virtualization hints keep long card lists responsive.
- Commander catalog now ships skeleton placeholders, lazy commander art loading, and cached default results for faster repeat visits.
- Deck summary streams via an HTMX fragment while virtualization powers summary lists without loading every row up front.
- Mana analytics load on demand with collapsible sections and interactive chart tooltips that support click-to-pin comparisons.
- Card tagging system improvements split metadata from gameplay themes for cleaner deck building experience
- Keyword normalization reduces specialty keyword noise by 96% while maintaining theme catalog quality
- Protection tag now focuses on cards that grant shields to others, not just those with inherent protection
- Web UI improvements: faster polling, fixed progress display, and theme refresh stability
- Comprehensive enhancement to protection card detection, classification, and deck building
  - Fine-grained scope metadata distinguishes self-protection from board-wide effects ("Your Permanents: Hexproof" vs "Self: Hexproof")
  - Enhanced grant detection with Equipment/Aura patterns, phasing support, and complex trigger handling
  - Intelligent deck builder filtering includes board-relevant protection while excluding self-only and type-specific cards
  - Tiered pool limiting focuses on high-quality staples while maintaining variety across builds
  - Improved scope tagging for cards with keyword-only protection effects (no grant text, just inherent keywords)
- Large-scale refactor to improve code quality and maintainability
  - Centralized regex patterns, extracted reusable utilities, decomposed complex functions
  - Improved code organization and readability while maintaining 100% tagging accuracy

### Added
- Skeleton placeholders accept `data-skeleton-label` microcopy and only surface after ~400 ms across the build wizard, stage navigator, and alternatives panel.
- Must-have toggle API (`/build/must-haves/toggle`), telemetry ingestion route (`/telemetry/events`), and structured logging helpers capture include/exclude beacons.
- Commander catalog results wrap in a deferred skeleton list while commander art lazy-loads via a new `IntersectionObserver` helper in `code/web/static/app.js`.
- Collapsible accordions for Mana Overview and Test Hand sections defer heavy analytics until they are expanded.
- Click-to-pin chart tooltips keep comparisons anchored and add copy-friendly working buttons.
- Virtualized card lists automatically render only visible items once 12+ cards are present.
- Metadata partition system separates diagnostic tags from gameplay themes in card data
- Keyword normalization system with smart filtering of one-off specialty mechanics
  - Allowlist preserves important keywords like Flying, Myriad, and Transform
- Protection grant detection identifies cards that give Hexproof, Ward, or Indestructible to other permanents
  - Automatic tagging for creature-type-specific protection (e.g., "Knights Gain Protection")
- New `metadataTags` column in card data for bracket annotations and internal diagnostics
- Static phasing keyword detection from keywords field (catches creatures like Breezekeeper)
- "Other X you control have Y" protection pattern for static ability grants
- "Enchanted creature has phasing" pattern detection
- Chosen type blanket phasing patterns
- Complex trigger phasing patterns (reactive, consequent, end-of-turn)
- Protection scope filtering in deck builder (feature flag: `TAG_PROTECTION_SCOPE`) intelligently selects board-relevant protection
- Phasing cards with "Your Permanents:" or "Targeted:" metadata now tagged as Protection and included in protection pool
- Metadata tags temporarily visible in card hover previews for debugging (shows scope like "Your Permanents: Hexproof")
- Web-slinging tagger function to identify cards with web-slinging mechanics

### Changed
- Commander search and theme picker now share an intelligent debounce to prevent redundant requests while typing.
- Card grids adopt modern containment rules to minimize layout recalculations on large decks.
- Include/exclude buttons respond immediately with optimistic updates, reconciling gracefully if the server disagrees.
- Frequently accessed views, like the commander catalog default, now pull from an in-memory cache for sub-200 ms reloads.
- Deck review loads in focused chunks, keeping the initial page lean while analytics stream progressively.
- Chart hover zones expand to full column width for easier interaction.
- Card tags now split between themes (for deck building) and metadata (for diagnostics)
- Keywords now consolidate variants (e.g., "Commander ninjutsu" becomes "Ninjutsu")
- Setup progress polling reduced from 3s to 5-10s intervals for better performance
- Theme catalog streamlined from 753 to 736 themes (-2.3%) with improved quality
- Protection tag refined to focus on 329 cards that grant shields (down from 1,166 with inherent effects)
- Protection tag renamed to "Protective Effects" throughout web interface to avoid confusion with the Magic keyword "protection"
- Theme catalog automatically excludes metadata tags from theme suggestions
- Grant detection now strips reminder text before pattern matching to avoid false positives
- Deck builder protection phase now filters by scope metadata: includes "Your Permanents:", excludes "Self:" protection
- Protection card selection now randomized per build for variety (using seeded RNG when deterministic mode enabled)
- Protection pool now limited to ~40-50 high-quality cards (tiered selection: top 3x target + random 10-20 extras)
- Tagging module imports standardized with consistent organization and centralized constants

### Fixed
- Setup progress now shows 100% completion instead of getting stuck at 99%
- Theme catalog no longer continuously regenerates after setup completes
- Health indicator polling optimized to reduce server load
- Protection detection now correctly excludes creatures with only inherent keywords
- Dive Down, Glint no longer falsely identified as granting to opponents (reminder text fix)
- Drogskol Captain, Haytham Kenway now correctly get "Your Permanents" scope tags
- 7 cards with static Phasing keyword now properly detected (Breezekeeper, Teferi's Drake, etc.)
- Type-specific protection grants (e.g., "Knights Gain Indestructible") now correctly excluded from general protection pool
- Protection scope filter now properly prioritizes exclusions over inclusions (fixes Knight Exemplar in non-Knight decks)
- Inherent protection cards (Aysen Highway, Phantom Colossus, etc.) now correctly get "Self: Protection" metadata tags
- Scope tagging now applies to ALL cards with protection effects, not just grant cards
- Cloak of Invisibility, Teferi's Curse now get "Your Permanents: Phasing" tags
- Shimmer now gets "Blanket: Phasing" tag for chosen type effect
- King of the Oathbreakers now gets "Self: Phasing" tag for reactive trigger
- Cards with static keywords (Protection, Hexproof, Ward, Indestructible) in their keywords field now get proper scope metadata tags
- Cards with X in their mana cost now properly identified and tagged with "X Spells" theme for better deck building accuracy
- Card tagging system enhanced with smarter pattern detection and more consistent categorization


@ -1,5 +0,0 @@
import urllib.request, json
raw = urllib.request.urlopen("http://localhost:8000/themes/metrics").read().decode()
js=json.loads(raw)
print('example_enforcement_active=', js.get('preview',{}).get('example_enforcement_active'))
print('example_enforce_threshold_pct=', js.get('preview',{}).get('example_enforce_threshold_pct'))

@ -1 +0,0 @@
=\ 1\; & \c:/Users/Matt/mtg_python/mtg_python_deckbuilder/.venv/Scripts/python.exe\ code/scripts/build_theme_catalog.py --output config/themes/theme_list_tmp.json

@ -1,3 +0,0 @@
from code.web.services import orchestrator
orchestrator._ensure_setup_ready(print, force=False)
print('DONE')


@ -1759,6 +1759,7 @@ class DeckBuilder(
            entry['Synergy'] = synergy
        else:
            # If no tags passed attempt enrichment from filtered pool first, then full snapshot
            metadata_tags: list[str] = []
            if not tags:
                # Use filtered pool (_combined_cards_df) instead of unfiltered (_full_cards_df)
                # This ensures exclude filtering is respected during card enrichment

@ -1774,6 +1775,13 @@ class DeckBuilder(
                        # tolerate comma separated
                        parts = [p.strip().strip("'\"") for p in raw_tags.split(',')]
                        tags = [p for p in parts if p]
                    # M5: Extract metadata tags for web UI display
                    raw_meta = row_match.iloc[0].get('metadataTags', [])
                    if isinstance(raw_meta, list):
                        metadata_tags = [str(t).strip() for t in raw_meta if str(t).strip()]
                    elif isinstance(raw_meta, str) and raw_meta.strip():
                        parts = [p.strip().strip("'\"") for p in raw_meta.split(',')]
                        metadata_tags = [p for p in parts if p]
                except Exception:
                    pass
            # Enrich missing type and mana_cost for accurate categorization

@ -1811,6 +1819,7 @@ class DeckBuilder(
                'Mana Value': mana_value,
                'Creature Types': creature_types,
                'Tags': tags,
                'MetadataTags': metadata_tags,  # M5: Store metadata tags for web UI
                'Commander': is_commander,
                'Count': 1,
                'Role': (role or ('commander' if is_commander else None)),

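The list-vs-comma-string handling above appears twice (for `themeTags` and `metadataTags`). A hypothetical consolidated helper, with the name assumed rather than taken from the repository, would look like:

```python
def parse_tag_cell(raw) -> list[str]:
    """Normalize a tag cell that may be a list or a comma-separated string."""
    if isinstance(raw, list):
        return [str(t).strip() for t in raw if str(t).strip()]
    if isinstance(raw, str) and raw.strip():
        # Tolerate comma-separated strings with stray quotes, as the diff does.
        parts = [p.strip().strip("'\"") for p in raw.split(",")]
        return [p for p in parts if p]
    return []

print(parse_tag_cell("'Hexproof', Ward"))  # ['Hexproof', 'Ward']
print(parse_tag_cell(["Self: Phasing"]))   # ['Self: Phasing']
```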

@ -438,7 +438,7 @@ DEFAULT_REMOVAL_COUNT: Final[int] = 10 # Default number of spot removal spells
DEFAULT_WIPES_COUNT: Final[int] = 2 # Default number of board wipes

DEFAULT_CARD_ADVANTAGE_COUNT: Final[int] = 10 # Default number of card advantage pieces
DEFAULT_PROTECTION_COUNT: Final[int] = 8 # Default number of protective effects (hexproof, indestructible, protection, ward, etc.)

# Deck composition prompts
DECK_COMPOSITION_PROMPTS: Final[Dict[str, str]] = {

@ -450,7 +450,7 @@ DECK_COMPOSITION_PROMPTS: Final[Dict[str, str]] = {
    'removal': 'Enter desired number of spot removal spells (default: 10):',
    'wipes': 'Enter desired number of board wipes (default: 2):',
    'card_advantage': 'Enter desired number of card advantage pieces (default: 10):',
    'protection': 'Enter desired number of protective effects (default: 8):',
    'max_deck_price': 'Enter maximum total deck price in dollars (default: 400.0):',
    'max_card_price': 'Enter maximum price per card in dollars (default: 20.0):'
}

@ -511,7 +511,7 @@ DEFAULT_THEME_TAGS = [
    'Combat Matters', 'Control', 'Counters Matter', 'Energy',
    'Enter the Battlefield', 'Equipment', 'Exile Matters', 'Infect',
    'Interaction', 'Lands Matter', 'Leave the Battlefield', 'Legends Matter',
    'Life Matters', 'Mill', 'Monarch', 'Protective Effects', 'Ramp', 'Reanimate',
    'Removal', 'Sacrifice Matters', 'Spellslinger', 'Stax', 'Superfriends',
    'Theft', 'Token Creation', 'Tokens Matter', 'Voltron', 'X Spells'
]

@ -539,6 +539,10 @@ class SpellAdditionMixin:
        """Add protection spells to the deck.

        Selects cards tagged as 'protection', prioritizing by EDHREC rank and mana value.
        Avoids duplicates and commander card.

        M5: When TAG_PROTECTION_SCOPE is enabled, filters to include only cards that
        protect your board (Your Permanents:, {Type} Gain) and excludes self-only or
        opponent protection cards.
        """
        target = self.ideal_counts.get('protection', 0)
        if target <= 0 or self._combined_cards_df is None:

@ -546,14 +550,88 @@ class SpellAdditionMixin:
        already = {n.lower() for n in self.card_library.keys()}
        df = self._combined_cards_df.copy()
        df['_ltags'] = df.get('themeTags', []).apply(bu.normalize_tag_cell)

        # M5: Apply scope-based filtering if enabled
        import settings as s
        if getattr(s, 'TAG_PROTECTION_SCOPE', True):
            # Check metadata tags for scope information
            df['_meta_tags'] = df.get('metadataTags', []).apply(bu.normalize_tag_cell)

            def is_board_relevant_protection(row):
                """Check if protection card helps protect your board.

                Includes:
                - Cards with "Your Permanents:" metadata (board-wide protection)
                - Cards with "Blanket:" metadata (affects all permanents)
                - Cards with "Targeted:" metadata (can target your stuff)
                - Legacy cards without metadata tags

                Excludes:
                - "Self:" protection (only protects itself)
                - "Opponent Permanents:" protection (helps opponents)
                - Type-specific grants like "Knights Gain" (too narrow, handled by kindred synergies)
                """
                theme_tags = row.get('_ltags', [])
                meta_tags = row.get('_meta_tags', [])

                # First check if it has general protection tag
                has_protection = any('protection' in t for t in theme_tags)
                if not has_protection:
                    return False

                # INCLUDE: Board-relevant scopes
                # "Your Permanents:", "Blanket:", "Targeted:"
                has_board_scope = any(
                    'your permanents:' in t or 'blanket:' in t or 'targeted:' in t
                    for t in meta_tags
                )

                # EXCLUDE: Self-only, opponent protection, or type-specific grants
                # Check for type-specific grants FIRST (highest priority exclusion)
                has_type_specific = any(
                    ' gain ' in t.lower()  # "Knights Gain", "Treefolk Gain", etc.
                    for t in meta_tags
                )

                has_excluded_scope = any(
                    'self:' in t or
                    'opponent permanents:' in t
                    for t in meta_tags
                )

                # Include if board-relevant, or if no scope tags (legacy cards)
                # ALWAYS exclude type-specific grants (too narrow for general protection)
                if meta_tags:
                    # Has metadata - use it for filtering
                    # Exclude if type-specific OR self/opponent
                    if has_type_specific or has_excluded_scope:
                        return False
                    # Otherwise include if board-relevant
                    return has_board_scope
                else:
                    # No metadata - legacy card, include by default
                    return True

            pool = df[df.apply(is_board_relevant_protection, axis=1)]

            # Log scope filtering stats
            original_count = len(df[df['_ltags'].apply(lambda tags: any('protection' in t for t in tags))])
            filtered_count = len(pool)
            if original_count > filtered_count:
                self.output_func(f"Protection scope filter: {filtered_count}/{original_count} cards (excluded {original_count - filtered_count} self-only/opponent cards)")
        else:
            # Legacy behavior: include all cards with 'protection' tag
            pool = df[df['_ltags'].apply(lambda tags: any('protection' in t for t in tags))]

        pool = pool[~pool['type'].fillna('').str.contains('Land', case=False, na=False)]
        commander_name = getattr(self, 'commander', None)
        if commander_name:
            pool = pool[pool['name'] != commander_name]
        pool = self._apply_bracket_pre_filters(pool)
        pool = bu.sort_by_priority(pool, ['edhrecRank','manaValue'])

        self._debug_dump_pool(pool, 'protection')

        try:
            if str(os.getenv('DEBUG_SPELL_POOLS', '')).strip().lower() in {"1","true","yes","on"}:
                names = pool['name'].astype(str).head(30).tolist()

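A self-contained toy run of the scope rules shown above, using plain lists instead of the builder's DataFrame rows (tag strings are assumed lower-cased, as `normalize_tag_cell` implies; the function name is illustrative):

```python
def board_relevant(theme_tags, meta_tags):
    """Mirror the diff's inclusion/exclusion order: type-specific grants and
    self/opponent scopes are excluded first, then a board-relevant scope is
    required whenever metadata exists; legacy cards pass by default."""
    if not any('protection' in t for t in theme_tags):
        return False
    if meta_tags:
        if any(' gain ' in t for t in meta_tags):          # "knights gain ..."
            return False
        if any('self:' in t or 'opponent permanents:' in t for t in meta_tags):
            return False
        return any(s in t for t in meta_tags
                   for s in ('your permanents:', 'blanket:', 'targeted:'))
    return True  # legacy card with no scope metadata

print(board_relevant(['protection'], ['your permanents: hexproof']))  # True
print(board_relevant(['protection'], ['self: hexproof']))             # False
print(board_relevant(['protection'], []))                             # True
```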
|
@ -580,6 +658,48 @@ class SpellAdditionMixin:
|
|||
if existing >= target and to_add == 0:
|
||||
return
|
||||
target = to_add if existing < target else to_add
|
||||
|
||||
# M5: Limit pool size to manageable tier-based selection
|
||||
# Strategy: Top tier (3x target) + random deeper selection
|
||||
# This keeps the pool focused on high-quality options (~50-70 cards typical)
|
||||
original_pool_size = len(pool)
|
||||
if len(pool) > 0 and target > 0:
|
||||
try:
|
||||
# Tier 1: Top quality cards (3x target count)
|
||||
tier1_size = min(3 * target, len(pool))
|
||||
tier1 = pool.head(tier1_size).copy()
|
||||
|
||||
# Tier 2: Random additional cards from remaining pool (10-20 cards)
|
||||
if len(pool) > tier1_size:
|
||||
remaining_pool = pool.iloc[tier1_size:].copy()
|
||||
tier2_size = min(
|
||||
self.rng.randint(10, 20) if hasattr(self, 'rng') and self.rng else 15,
|
||||
len(remaining_pool)
|
||||
)
|
||||
if hasattr(self, 'rng') and self.rng and len(remaining_pool) > tier2_size:
|
||||
# Use random.sample() to select random indices from the remaining pool
|
||||
tier2_indices = self.rng.sample(range(len(remaining_pool)), tier2_size)
|
||||
tier2 = remaining_pool.iloc[tier2_indices]
|
||||
else:
|
||||
tier2 = remaining_pool.head(tier2_size)
|
||||
pool = tier1._append(tier2, ignore_index=True)
|
||||
else:
|
||||
pool = tier1
|
||||
|
||||
if len(pool) != original_pool_size:
|
||||
self.output_func(f"Protection pool limited: {len(pool)}/{original_pool_size} cards (tier1: {tier1_size}, tier2: {len(pool) - tier1_size})")
|
||||
except Exception as e:
|
||||
self.output_func(f"Warning: Pool limiting failed, using full pool: {e}")
|
||||
|
||||
# Shuffle pool for variety across builds (using seeded RNG for determinism)
|
||||
try:
|
||||
if hasattr(self, 'rng') and self.rng is not None:
|
||||
pool_list = pool.to_dict('records')
|
||||
self.rng.shuffle(pool_list)
|
||||
import pandas as pd
|
||||
pool = pd.DataFrame(pool_list)
|
||||
except Exception:
|
||||
pass
|
||||
added = 0
|
||||
added_names: List[str] = []
|
||||
for _, r in pool.iterrows():
|
||||
|
|
|

@ -878,7 +878,7 @@ class ReportingMixin:

        headers = [
            "Name","Count","Type","ManaCost","ManaValue","Colors","Power","Toughness",
            "Role","SubRole","AddedBy","TriggerTag","Synergy","Tags","MetadataTags","Text","DFCNote","Owned"
        ]

        header_suffix: List[str] = []

@ -946,6 +946,9 @@ class ReportingMixin:
            role = info.get('Role', '') or ''
            tags = info.get('Tags', []) or []
            tags_join = '; '.join(tags)
            # M5: Include metadata tags in export
            metadata_tags = info.get('MetadataTags', []) or []
            metadata_tags_join = '; '.join(metadata_tags)
            text_field = ''
            colors = ''
            power = ''

@ -1014,6 +1017,7 @@ class ReportingMixin:
                info.get('TriggerTag') or '',
                info.get('Synergy') if info.get('Synergy') is not None else '',
                tags_join,
                metadata_tags_join,  # M5: Include metadata tags
                text_field[:800] if isinstance(text_field, str) else str(text_field)[:800],
                dfc_note,
                owned_flag


@ -2,7 +2,23 @@

This module provides the main setup functionality for the MTG Python Deckbuilder
application. It handles initial setup tasks such as downloading card data,
creating color-filtered card lists, and generating commander-eligible card lists.

Key Features:
- Initial setup and configuration

@ -197,7 +213,17 @@ def regenerate_csvs_all() -> None:

    download_cards_csv(MTGJSON_API_URL, f'{CSV_DIRECTORY}/cards.csv')

    logger.info('Loading and processing card data')
    df = pd.read_csv(f'{CSV_DIRECTORY}/cards.csv', low_memory=False)
    try:
        df = pd.read_csv(f'{CSV_DIRECTORY}/cards.csv', low_memory=False)
    except pd.errors.ParserError as e:
        logger.warning(f'CSV parsing error encountered: {e}. Retrying with error handling...')
        df = pd.read_csv(
            f'{CSV_DIRECTORY}/cards.csv',
            low_memory=False,
            on_bad_lines='warn',  # Warn about malformed rows but continue
            encoding_errors='replace'  # Replace bad encoding chars
        )
        logger.info(f'Successfully loaded card data with error handling (some rows may have been skipped)')

    logger.info('Regenerating color identity sorted files')
    save_color_filtered_csvs(df, CSV_DIRECTORY)
@ -234,7 +260,12 @@ def regenerate_csv_by_color(color: str) -> None:

    download_cards_csv(MTGJSON_API_URL, f'{CSV_DIRECTORY}/cards.csv')

    logger.info('Loading and processing card data')
    df = pd.read_csv(f'{CSV_DIRECTORY}/cards.csv', low_memory=False)
    df = pd.read_csv(
        f'{CSV_DIRECTORY}/cards.csv',
        low_memory=False,
        on_bad_lines='skip',  # Skip malformed rows (MTGJSON CSV has escaping issues)
        encoding_errors='replace'  # Replace bad encoding chars
    )

    logger.info(f'Regenerating {color} cards CSV')
    # Use shared utilities to base-filter once then slice color, honoring bans
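The `on_bad_lines='skip'` fallback added in these hunks can be demonstrated on a toy in-memory CSV; the card rows below are invented for illustration and are not from the real MTGJSON dump:

```python
import io

import pandas as pd

# A 2-column CSV where the third line has one field too many -- the kind of
# malformed row the MTGJSON export occasionally produces.
raw = "name,color\nSol Ring,Colorless\nBad,row,extra\nLlanowar Elves,Green\n"

# on_bad_lines='skip' drops the malformed row instead of raising ParserError
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(list(df["name"]))  # ['Sol Ring', 'Llanowar Elves']
```

With `on_bad_lines='warn'` (as in the `regenerate_csvs_all` hunk) pandas emits a warning per skipped row instead of staying silent.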
203  code/scripts/audit_protection_full_v2.py  Normal file
@ -0,0 +1,203 @@

"""
Full audit of Protection-tagged cards with kindred metadata support (M2 Phase 2).

Created: October 8, 2025
Purpose: Audit and validate Protection tag precision after implementing grant detection.
Can be re-run periodically to check tagging quality.

This script audits ALL Protection-tagged cards and categorizes them:
- Grant: Gives broad protection to other permanents YOU control
- Kindred: Gives protection to specific creature types (metadata tags)
- Mixed: Both broad and kindred/inherent
- Inherent: Only has protection itself
- ConditionalSelf: Only conditionally grants to itself
- Opponent: Grants to opponent's permanents
- Neither: False positive

Outputs:
- m2_audit_v2.json: Full analysis with summary
- m2_audit_v2_grant.csv: Cards for main Protection tag
- m2_audit_v2_kindred.csv: Cards for kindred metadata tags
- m2_audit_v2_mixed.csv: Cards with both broad and kindred grants
- m2_audit_v2_conditional.csv: Conditional self-grants (exclude)
- m2_audit_v2_inherent.csv: Inherent protection only (exclude)
- m2_audit_v2_opponent.csv: Opponent grants (exclude)
- m2_audit_v2_neither.csv: False positives (exclude)
- m2_audit_v2_all.csv: All cards combined
"""

import sys
from pathlib import Path
import pandas as pd
import json

# Add project root to path
project_root = Path(__file__).parent.parent.parent
sys.path.insert(0, str(project_root))

from code.tagging.protection_grant_detection import (
    categorize_protection_card,
    get_kindred_protection_tags,
    is_granting_protection,
)

def load_all_cards():
    """Load all cards from color/identity CSV files."""
    csv_dir = project_root / 'csv_files'

    # Get all color/identity CSVs (not the raw cards.csv)
    csv_files = list(csv_dir.glob('*_cards.csv'))
    csv_files = [f for f in csv_files if f.stem not in ['cards', 'testdata']]

    all_cards = []
    for csv_file in csv_files:
        try:
            df = pd.read_csv(csv_file)
            all_cards.append(df)
        except Exception as e:
            print(f"Warning: Could not load {csv_file.name}: {e}")

    # Combine all DataFrames
    combined = pd.concat(all_cards, ignore_index=True)

    # Drop duplicates (cards appear in multiple color files)
    combined = combined.drop_duplicates(subset=['name'], keep='first')

    return combined

def audit_all_protection_cards():
    """Audit all Protection-tagged cards."""
    print("Loading all cards...")
    df = load_all_cards()

    print(f"Total cards loaded: {len(df)}")

    # Filter to Protection-tagged cards (column is 'themeTags' in color CSVs)
    df_prot = df[df['themeTags'].str.contains('Protection', case=False, na=False)].copy()

    print(f"Protection-tagged cards: {len(df_prot)}")

    # Categorize each card
    categories = []
    grants_list = []
    kindred_tags_list = []

    for idx, row in df_prot.iterrows():
        name = row['name']
        text = str(row.get('text', '')).replace('\\n', '\n')  # Convert escaped newlines to real newlines
        keywords = str(row.get('keywords', ''))
        card_type = str(row.get('type', ''))

        # Categorize with kindred exclusion enabled
        category = categorize_protection_card(name, text, keywords, card_type, exclude_kindred=True)

        # Check if it grants broadly
        grants_broad = is_granting_protection(text, keywords, exclude_kindred=True)

        # Get kindred tags
        kindred_tags = get_kindred_protection_tags(text)

        categories.append(category)
        grants_list.append(grants_broad)
        kindred_tags_list.append(', '.join(sorted(kindred_tags)) if kindred_tags else '')

    df_prot['category'] = categories
    df_prot['grants_broad'] = grants_list
    df_prot['kindred_tags'] = kindred_tags_list

    # Generate summary (convert numpy types to native Python for JSON serialization)
    summary = {
        'total': int(len(df_prot)),
        'categories': {k: int(v) for k, v in df_prot['category'].value_counts().to_dict().items()},
        'grants_broad_count': int(df_prot['grants_broad'].sum()),
        'kindred_cards_count': int((df_prot['kindred_tags'] != '').sum()),
    }

    # Calculate keep vs remove
    keep_categories = {'Grant', 'Mixed'}
    kindred_only = df_prot[df_prot['category'] == 'Kindred']
    keep_count = len(df_prot[df_prot['category'].isin(keep_categories)])
    remove_count = len(df_prot[~df_prot['category'].isin(keep_categories | {'Kindred'})])

    summary['keep_main_tag'] = keep_count
    summary['kindred_metadata'] = len(kindred_only)
    summary['remove'] = remove_count
    summary['precision_estimate'] = round((keep_count / len(df_prot)) * 100, 1) if len(df_prot) > 0 else 0

    # Print summary
    print(f"\n{'='*60}")
    print("AUDIT SUMMARY")
    print(f"{'='*60}")
    print(f"Total Protection-tagged cards: {summary['total']}")
    print(f"\nCategories:")
    for cat, count in sorted(summary['categories'].items()):
        pct = (count / summary['total']) * 100
        print(f"  {cat:20s} {count:4d} ({pct:5.1f}%)")

    print(f"\n{'='*60}")
    print(f"Main Protection tag:   {keep_count:4d} ({keep_count/len(df_prot)*100:5.1f}%)")
    print(f"Kindred metadata only: {len(kindred_only):4d} ({len(kindred_only)/len(df_prot)*100:5.1f}%)")
    print(f"Remove:                {remove_count:4d} ({remove_count/len(df_prot)*100:5.1f}%)")
    print(f"{'='*60}")
    print(f"Precision estimate: {summary['precision_estimate']}%")
    print(f"{'='*60}\n")

    # Export results
    output_dir = project_root / 'logs' / 'roadmaps' / 'source' / 'tagging_refinement'
    output_dir.mkdir(parents=True, exist_ok=True)

    # Export JSON summary
    with open(output_dir / 'm2_audit_v2.json', 'w') as f:
        json.dump({
            'summary': summary,
            'cards': df_prot[['name', 'type', 'category', 'grants_broad', 'kindred_tags', 'keywords', 'text']].to_dict(orient='records')
        }, f, indent=2)

    # Export CSVs by category
    export_cols = ['name', 'type', 'category', 'grants_broad', 'kindred_tags', 'keywords', 'text']

    # Grant category
    df_grant = df_prot[df_prot['category'] == 'Grant']
    df_grant[export_cols].to_csv(output_dir / 'm2_audit_v2_grant.csv', index=False)
    print(f"Exported {len(df_grant)} Grant cards to m2_audit_v2_grant.csv")

    # Kindred category
    df_kindred = df_prot[df_prot['category'] == 'Kindred']
    df_kindred[export_cols].to_csv(output_dir / 'm2_audit_v2_kindred.csv', index=False)
    print(f"Exported {len(df_kindred)} Kindred cards to m2_audit_v2_kindred.csv")

    # Mixed category
    df_mixed = df_prot[df_prot['category'] == 'Mixed']
    df_mixed[export_cols].to_csv(output_dir / 'm2_audit_v2_mixed.csv', index=False)
    print(f"Exported {len(df_mixed)} Mixed cards to m2_audit_v2_mixed.csv")

    # ConditionalSelf category
    df_conditional = df_prot[df_prot['category'] == 'ConditionalSelf']
    df_conditional[export_cols].to_csv(output_dir / 'm2_audit_v2_conditional.csv', index=False)
    print(f"Exported {len(df_conditional)} ConditionalSelf cards to m2_audit_v2_conditional.csv")

    # Inherent category
    df_inherent = df_prot[df_prot['category'] == 'Inherent']
    df_inherent[export_cols].to_csv(output_dir / 'm2_audit_v2_inherent.csv', index=False)
    print(f"Exported {len(df_inherent)} Inherent cards to m2_audit_v2_inherent.csv")

    # Opponent category
    df_opponent = df_prot[df_prot['category'] == 'Opponent']
    df_opponent[export_cols].to_csv(output_dir / 'm2_audit_v2_opponent.csv', index=False)
    print(f"Exported {len(df_opponent)} Opponent cards to m2_audit_v2_opponent.csv")

    # Neither category
    df_neither = df_prot[df_prot['category'] == 'Neither']
    df_neither[export_cols].to_csv(output_dir / 'm2_audit_v2_neither.csv', index=False)
    print(f"Exported {len(df_neither)} Neither cards to m2_audit_v2_neither.csv")

    # All cards
    df_prot[export_cols].to_csv(output_dir / 'm2_audit_v2_all.csv', index=False)
    print(f"Exported {len(df_prot)} total cards to m2_audit_v2_all.csv")

    print(f"\nAll files saved to: {output_dir}")

    return df_prot, summary

if __name__ == '__main__':
    df_results, summary = audit_all_protection_cards()
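The keep/remove arithmetic in `audit_all_protection_cards` reduces to a simple ratio. With hypothetical category counts (not real audit output), it works out as:

```python
# Hypothetical category counts, mirroring the audit's split: Grant and
# Mixed keep the main Protection tag, Kindred becomes metadata tags,
# everything else is removed.
counts = {'Grant': 700, 'Mixed': 50, 'Kindred': 120, 'Inherent': 80, 'Neither': 50}

total = sum(counts.values())
keep = counts['Grant'] + counts['Mixed']
remove = total - keep - counts['Kindred']
precision = round(keep / total * 100, 1)

print(keep, counts['Kindred'], remove, precision)  # 750 120 130 75.0
```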
@ -1,6 +1,7 @@

from __future__ import annotations

# Standard library imports
import os
from typing import Dict, List, Optional

# ----------------------------------------------------------------------------------
@ -98,4 +99,20 @@ CSV_DIRECTORY: str = 'csv_files'

FILL_NA_COLUMNS: Dict[str, Optional[str]] = {
    'colorIdentity': 'Colorless',  # Default color identity for cards without one
    'faceName': None  # Use card's name column value when face name is not available
}
}

# ----------------------------------------------------------------------------------
# TAGGING REFINEMENT FEATURE FLAGS (M1-M5)
# ----------------------------------------------------------------------------------

# M1: Enable keyword normalization and singleton pruning (completed)
TAG_NORMALIZE_KEYWORDS = os.getenv('TAG_NORMALIZE_KEYWORDS', '1').lower() not in ('0', 'false', 'off', 'disabled')

# M2: Enable protection grant detection (completed)
TAG_PROTECTION_GRANTS = os.getenv('TAG_PROTECTION_GRANTS', '1').lower() not in ('0', 'false', 'off', 'disabled')

# M3: Enable metadata/theme partition (completed)
TAG_METADATA_SPLIT = os.getenv('TAG_METADATA_SPLIT', '1').lower() not in ('0', 'false', 'off', 'disabled')

# M5: Enable protection scope filtering in deck builder (completed - Phase 1-3, in progress Phase 4+)
TAG_PROTECTION_SCOPE = os.getenv('TAG_PROTECTION_SCOPE', '1').lower() not in ('0', 'false', 'off', 'disabled')
@ -1,9 +1,11 @@

from __future__ import annotations

# Standard library imports
import json
from pathlib import Path
from typing import Dict, Iterable, Set

# Third-party imports
import pandas as pd

def _ensure_norm_series(df: pd.DataFrame, source_col: str, norm_col: str) -> pd.Series:
@ -1,9 +1,11 @@

from __future__ import annotations

# Standard library imports
import json
from pathlib import Path
from typing import List, Optional

import json
# Third-party imports
from pydantic import BaseModel, Field
@ -1,14 +1,17 @@

from __future__ import annotations

import json
# Standard library imports
import ast
import json
from collections import defaultdict
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List, Set, DefaultDict
from collections import defaultdict
from typing import DefaultDict, Dict, List, Set

# Third-party imports
import pandas as pd

# Local application imports
from settings import CSV_DIRECTORY, SETUP_COLORS
@ -73,6 +73,132 @@ def load_merge_summary() -> Dict[str, Any]:

    return {"updated_at": None, "colors": {}}


def _merge_tag_columns(work_df: pd.DataFrame, group_sorted: pd.DataFrame, primary_idx: int) -> None:
    """Merge list columns (themeTags, roleTags) into union values.

    Args:
        work_df: Working DataFrame to update
        group_sorted: Sorted group of faces for a multi-face card
        primary_idx: Index of primary face to update
    """
    for column in _LIST_UNION_COLUMNS:
        if column in group_sorted.columns:
            union_values = _merge_object_lists(group_sorted[column])
            work_df.at[primary_idx, column] = union_values

    if "keywords" in group_sorted.columns:
        keyword_union = _merge_keywords(group_sorted["keywords"])
        work_df.at[primary_idx, "keywords"] = _join_keywords(keyword_union)


def _build_face_payload(face_row: pd.Series) -> Dict[str, Any]:
    """Build face metadata payload from a single face row.

    Args:
        face_row: Single face row from grouped DataFrame

    Returns:
        Dictionary containing face metadata
    """
    text_val = face_row.get("text") or face_row.get("oracleText") or ""
    mana_cost_val = face_row.get("manaCost", face_row.get("mana_cost", "")) or ""
    mana_value_raw = face_row.get("manaValue", face_row.get("mana_value", ""))

    try:
        if mana_value_raw in (None, ""):
            mana_value_val = None
        else:
            mana_value_val = float(mana_value_raw)
            if math.isnan(mana_value_val):
                mana_value_val = None
    except Exception:
        mana_value_val = None

    type_val = face_row.get("type", "") or ""

    return {
        "face": str(face_row.get("faceName") or face_row.get("name") or ""),
        "side": str(face_row.get("side") or ""),
        "layout": str(face_row.get("layout") or ""),
        "themeTags": _merge_object_lists([face_row.get("themeTags", [])]),
        "roleTags": _merge_object_lists([face_row.get("roleTags", [])]),
        "type": str(type_val),
        "text": str(text_val),
        "mana_cost": str(mana_cost_val),
        "mana_value": mana_value_val,
        "produces_mana": _text_produces_mana(text_val),
        "is_land": 'land' in str(type_val).lower(),
    }


def _build_merge_detail(name: str, group_sorted: pd.DataFrame, faces_payload: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build detailed merge information for a multi-face card group.

    Args:
        name: Card name
        group_sorted: Sorted group of faces
        faces_payload: List of face metadata dictionaries

    Returns:
        Dictionary containing merge details
    """
    layout_set = sorted({f.get("layout", "") for f in faces_payload if f.get("layout")})
    removed_faces = faces_payload[1:] if len(faces_payload) > 1 else []

    return {
        "name": name,
        "total_faces": len(group_sorted),
        "dropped_faces": max(len(group_sorted) - 1, 0),
        "layouts": layout_set,
        "primary_face": faces_payload[0] if faces_payload else {},
        "removed_faces": removed_faces,
        "theme_tags": sorted({tag for face in faces_payload for tag in face.get("themeTags", [])}),
        "role_tags": sorted({tag for face in faces_payload for tag in face.get("roleTags", [])}),
        "faces": faces_payload,
    }


def _log_merge_summary(color: str, merged_count: int, drop_count: int, multi_face_count: int, logger) -> None:
    """Log merge summary with structured and human-readable formats.

    Args:
        color: Color being processed
        merged_count: Number of card groups merged
        drop_count: Number of face rows dropped
        multi_face_count: Total multi-face rows processed
        logger: Logger instance
    """
    try:
        logger.info(
            "dfc_merge_summary %s",
            json.dumps(
                {
                    "event": "dfc_merge_summary",
                    "color": color,
                    "groups_merged": merged_count,
                    "faces_dropped": drop_count,
                    "multi_face_rows": multi_face_count,
                },
                sort_keys=True,
            ),
        )
    except Exception:
        logger.info(
            "dfc_merge_summary event=%s groups=%d dropped=%d rows=%d",
            color,
            merged_count,
            drop_count,
            multi_face_count,
        )

    logger.info(
        "Merged %d multi-face card groups for %s (dropped %d extra faces)",
        merged_count,
        color,
        drop_count,
    )


def merge_multi_face_rows(
    df: pd.DataFrame,
    color: str,
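`_merge_tag_columns` delegates the list columns to `_merge_object_lists`; the union semantics can be approximated with a small standalone helper (this sketch is not the project's implementation, just its observable behavior on plain lists):

```python
def union_tags(tag_lists):
    # Order-preserving, de-duplicated union across all faces of a card:
    # the first face's tags come first, later faces only contribute new tags.
    seen, merged = set(), []
    for tags in tag_lists:
        for tag in tags:
            if tag not in seen:
                seen.add(tag)
                merged.append(tag)
    return merged

print(union_tags([["Removal", "Blink"], ["Blink", "Ramp"]]))  # ['Removal', 'Blink', 'Ramp']
```

This is why a merged MDFC row can carry theme tags from both faces without duplicates.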
@ -93,7 +219,6 @@ def merge_multi_face_rows(
        return df

    work_df = df.copy()

    layout_series = work_df["layout"].fillna("").astype(str).str.lower()
    multi_mask = layout_series.isin(_MULTI_FACE_LAYOUTS)
@ -110,66 +235,15 @@ def merge_multi_face_rows(

        group_sorted = _sort_faces(group)
        primary_idx = group_sorted.index[0]
        faces_payload: List[Dict[str, Any]] = []

        for column in _LIST_UNION_COLUMNS:
            if column in group_sorted.columns:
                union_values = _merge_object_lists(group_sorted[column])
                work_df.at[primary_idx, column] = union_values
        _merge_tag_columns(work_df, group_sorted, primary_idx)

        if "keywords" in group_sorted.columns:
            keyword_union = _merge_keywords(group_sorted["keywords"])
            work_df.at[primary_idx, "keywords"] = _join_keywords(keyword_union)

        for _, face_row in group_sorted.iterrows():
            text_val = face_row.get("text") or face_row.get("oracleText") or ""
            mana_cost_val = face_row.get("manaCost", face_row.get("mana_cost", "")) or ""
            mana_value_raw = face_row.get("manaValue", face_row.get("mana_value", ""))
            try:
                if mana_value_raw in (None, ""):
                    mana_value_val = None
                else:
                    mana_value_val = float(mana_value_raw)
                    if math.isnan(mana_value_val):
                        mana_value_val = None
            except Exception:
                mana_value_val = None
            type_val = face_row.get("type", "") or ""
            faces_payload.append(
                {
                    "face": str(face_row.get("faceName") or face_row.get("name") or ""),
                    "side": str(face_row.get("side") or ""),
                    "layout": str(face_row.get("layout") or ""),
                    "themeTags": _merge_object_lists([face_row.get("themeTags", [])]),
                    "roleTags": _merge_object_lists([face_row.get("roleTags", [])]),
                    "type": str(type_val),
                    "text": str(text_val),
                    "mana_cost": str(mana_cost_val),
                    "mana_value": mana_value_val,
                    "produces_mana": _text_produces_mana(text_val),
                    "is_land": 'land' in str(type_val).lower(),
                }
            )

        for idx in group_sorted.index[1:]:
            drop_indices.append(idx)
        faces_payload = [_build_face_payload(row) for _, row in group_sorted.iterrows()]

        drop_indices.extend(group_sorted.index[1:])

        merged_count += 1
        layout_set = sorted({f.get("layout", "") for f in faces_payload if f.get("layout")})
        removed_faces = faces_payload[1:] if len(faces_payload) > 1 else []
        merge_details.append(
            {
                "name": name,
                "total_faces": len(group_sorted),
                "dropped_faces": max(len(group_sorted) - 1, 0),
                "layouts": layout_set,
                "primary_face": faces_payload[0] if faces_payload else {},
                "removed_faces": removed_faces,
                "theme_tags": sorted({tag for face in faces_payload for tag in face.get("themeTags", [])}),
                "role_tags": sorted({tag for face in faces_payload for tag in face.get("roleTags", [])}),
                "faces": faces_payload,
            }
        )
        merge_details.append(_build_merge_detail(name, group_sorted, faces_payload))

    if drop_indices:
        work_df = work_df.drop(index=drop_indices)
@ -192,38 +266,10 @@ def merge_multi_face_rows(

            logger.warning("Failed to record DFC merge summary for %s: %s", color, exc)

    if logger is not None:
        try:
            logger.info(
                "dfc_merge_summary %s",
                json.dumps(
                    {
                        "event": "dfc_merge_summary",
                        "color": color,
                        "groups_merged": merged_count,
                        "faces_dropped": len(drop_indices),
                        "multi_face_rows": int(multi_mask.sum()),
                    },
                    sort_keys=True,
                ),
            )
        except Exception:
            logger.info(
                "dfc_merge_summary event=%s groups=%d dropped=%d rows=%d",
                color,
                merged_count,
                len(drop_indices),
                int(multi_mask.sum()),
            )
        logger.info(
            "Merged %d multi-face card groups for %s (dropped %d extra faces)",
            merged_count,
            color,
            len(drop_indices),
        )
        _log_merge_summary(color, merged_count, len(drop_indices), int(multi_mask.sum()), logger)

    _persist_merge_summary(color, summary_payload, logger)

    # Reset index to keep downstream expectations consistent.
    return work_df.reset_index(drop=True)
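The structured `dfc_merge_summary` line pairs a stable event name with a sorted JSON payload, which keeps logs both grep-friendly and machine-parseable. A minimal reproduction of the format (the counts are example values, not real merge output):

```python
import json

payload = {
    "event": "dfc_merge_summary",
    "color": "blue",        # example values only
    "groups_merged": 12,
    "faces_dropped": 12,
    "multi_face_rows": 24,
}

# sort_keys=True makes the emitted line byte-stable across runs,
# so log diffing and downstream parsing stay deterministic.
line = "dfc_merge_summary " + json.dumps(payload, sort_keys=True)
print(line)
```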
213  code/tagging/phasing_scope_detection.py  Normal file
@ -0,0 +1,213 @@

"""
Phasing Scope Detection Module

Detects the scope of phasing effects with multiple dimensions:
- Targeted: Phasing (any targeting effect)
- Self: Phasing (phases itself out)
- Your Permanents: Phasing (phases your permanents out)
- Opponent Permanents: Phasing (phases opponent permanents - removal)
- Blanket: Phasing (phases all permanents out)

Cards can have multiple scope tags (e.g., Targeted + Your Permanents).

Refactored in M2: Create Scope Detection Utilities to use generic scope detection.
"""

# Standard library imports
import re
from typing import Set

# Local application imports
from . import scope_detection_utils as scope_utils
from code.logging_util import get_logger

logger = get_logger(__name__)


# Phasing scope pattern definitions
def _get_phasing_scope_patterns() -> scope_utils.ScopePatterns:
    """
    Build scope patterns for phasing abilities.

    Returns:
        ScopePatterns object with compiled patterns
    """
    # Targeting patterns (special for phasing - detects "target...phases out")
    targeting_patterns = [
        re.compile(r'target\s+(?:\w+\s+)*(?:creature|permanent|artifact|enchantment|nonland\s+permanent)s?(?:[^.]*)?phases?\s+out', re.IGNORECASE),
        re.compile(r'target\s+player\s+controls[^.]*phases?\s+out', re.IGNORECASE),
    ]

    # Self-reference patterns
    self_patterns = [
        re.compile(r'this\s+(?:creature|permanent|artifact|enchantment)\s+phases?\s+out', re.IGNORECASE),
        re.compile(r'~\s+phases?\s+out', re.IGNORECASE),
        # Triggered self-phasing (King of the Oathbreakers)
        re.compile(r'whenever.*(?:becomes\s+the\s+target|becomes\s+target).*(?:it|this\s+creature)\s+phases?\s+out', re.IGNORECASE),
        # Consequent self-phasing (Cyclonus: "connive. Then...phase out")
        re.compile(r'(?:then|,)\s+(?:it|this\s+creature)\s+phases?\s+out', re.IGNORECASE),
        # At end of turn/combat self-phasing
        re.compile(r'(?:at\s+(?:the\s+)?end\s+of|after).*(?:it|this\s+creature)\s+phases?\s+out', re.IGNORECASE),
    ]

    # Opponent patterns
    opponent_patterns = [
        re.compile(r'target\s+(?:\w+\s+)*(?:creature|permanent)\s+an?\s+opponents?\s+controls?\s+phases?\s+out', re.IGNORECASE),
        # Unqualified targets (can target opponents' stuff if no "you control" restriction)
        re.compile(r'(?:up\s+to\s+)?(?:one\s+|x\s+|that\s+many\s+)?(?:other\s+)?(?:another\s+)?target\s+(?:\w+\s+)*(?:creature|permanent|artifact|enchantment|nonland\s+permanent)s?(?:[^.]*)?phases?\s+out', re.IGNORECASE),
        re.compile(r'target\s+(?:\w+\s+)*(?:creature|permanent|artifact|enchantment|land|nonland\s+permanent)(?:,|\s+and)?\s+(?:then|and)?\s+it\s+phases?\s+out', re.IGNORECASE),
    ]

    # Your permanents patterns
    your_patterns = [
        # Explicit "you control"
        re.compile(r'(?:target\s+)?(?:creatures?|permanents?|nonland\s+permanents?)\s+you\s+control\s+phases?\s+out', re.IGNORECASE),
        re.compile(r'(?:target\s+)?(?:other\s+)?(?:creatures?|permanents?)\s+you\s+control\s+phases?\s+out', re.IGNORECASE),
        re.compile(r'permanents?\s+you\s+control\s+phase\s+out', re.IGNORECASE),
        re.compile(r'(?:any|up\s+to)\s+(?:number\s+of\s+)?(?:target\s+)?(?:other\s+)?(?:creatures?|permanents?|nonland\s+permanents?)\s+you\s+control\s+phases?\s+out', re.IGNORECASE),
        re.compile(r'all\s+(?:creatures?|permanents?)\s+you\s+control\s+phase\s+out', re.IGNORECASE),
        re.compile(r'each\s+(?:creature|permanent)\s+you\s+control\s+phases?\s+out', re.IGNORECASE),
        # Pronoun reference to "you control" context
        re.compile(r'(?:creatures?|permanents?|planeswalkers?)\s+you\s+control[^.]*(?:those|the)\s+(?:creatures?|permanents?|planeswalkers?)\s+phase\s+out', re.IGNORECASE),
        re.compile(r'creature\s+you\s+control[^.]*(?:it)\s+phases?\s+out', re.IGNORECASE),
        re.compile(r'you\s+control.*those\s+(?:creatures?|permanents?|planeswalkers?)\s+phase\s+out', re.IGNORECASE),
        # Equipment/Aura
        re.compile(r'equipped\s+(?:creature|permanent)\s+(?:gets\s+[^.]*\s+and\s+)?phases?\s+out', re.IGNORECASE),
        re.compile(r'enchanted\s+(?:creature|permanent)\s+(?:gets\s+[^.]*\s+and\s+)?phases?\s+out', re.IGNORECASE),
        re.compile(r'enchanted\s+(?:creature|permanent)\s+(?:has|gains?)\s+phasing', re.IGNORECASE),
        re.compile(r'(?:equipped|enchanted)\s+(?:creature|permanent)[^.]*,?\s+(?:then\s+)?that\s+(?:creature|permanent)\s+phases?\s+out', re.IGNORECASE),
        # Target controlled by specific player
        re.compile(r'(?:each|target)\s+(?:creature|permanent)\s+target\s+player\s+controls\s+phases?\s+out', re.IGNORECASE),
    ]

    # Blanket patterns
    blanket_patterns = [
        re.compile(r'all\s+(?:nontoken\s+)?(?:creatures?|permanents?)(?:\s+of\s+that\s+type)?\s+(?:[^.]*\s+)?phase\s+out', re.IGNORECASE),
        re.compile(r'each\s+(?:creature|permanent)\s+(?:[^.]*\s+)?phases?\s+out', re.IGNORECASE),
        # Type-specific blanket (Shimmer)
        re.compile(r'each\s+(?:land|creature|permanent|artifact|enchantment)\s+of\s+the\s+chosen\s+type\s+has\s+phasing', re.IGNORECASE),
        re.compile(r'(?:lands?|creatures?|permanents?|artifacts?|enchantments?)\s+of\s+the\s+chosen\s+type\s+(?:have|has)\s+phasing', re.IGNORECASE),
        # Pronoun reference to "all creatures"
        re.compile(r'all\s+(?:nontoken\s+)?(?:creatures?|permanents?)[^.]*,?\s+(?:then\s+)?(?:those|the)\s+(?:creatures?|permanents?)\s+phase\s+out', re.IGNORECASE),
    ]

    return scope_utils.ScopePatterns(
        opponent=opponent_patterns,
        self_ref=self_patterns,
        your_permanents=your_patterns,
        blanket=blanket_patterns,
        targeted=targeting_patterns
    )


def get_phasing_scope_tags(text: str, card_name: str, keywords: str = '') -> Set[str]:
    """
    Get all phasing scope metadata tags for a card.

    A card can have multiple scope tags:
    - "Targeted: Phasing" - Uses targeting
    - "Self: Phasing" - Phases itself out
    - "Your Permanents: Phasing" - Phases your permanents
    - "Opponent Permanents: Phasing" - Phases opponent permanents (removal)
    - "Blanket: Phasing" - Phases all permanents

    Args:
        text: Card text
        card_name: Card name
        keywords: Card keywords (to check for static "Phasing" ability)

    Returns:
        Set of metadata tags
    """
    if not card_name:
        return set()

    text_lower = text.lower() if text else ''
    keywords_lower = keywords.lower() if keywords else ''
    tags = set()

    # Check for static "Phasing" keyword ability (self-phasing)
    # Only add Self tag if card doesn't grant phasing to others
    if 'phasing' in keywords_lower:
        # Define patterns for checking if card grants phasing to others
        grants_pattern = [re.compile(
            r'(other|target|each|all|enchanted|equipped|creatures? you control|permanents? you control).*phas',
            re.IGNORECASE
        )]

        is_static = scope_utils.check_static_keyword_legacy(
            keywords=keywords,
            static_keyword='phasing',
            text=text,
            grant_patterns=grants_pattern
        )

        if is_static:
            tags.add('Self: Phasing')
            return tags  # Early return - static keyword only

    # Check if phasing is mentioned in text
    if 'phas' not in text_lower:
        return tags

    # Build phasing patterns and detect scopes
    patterns = _get_phasing_scope_patterns()

    # Detect all scopes (phasing can have multiple)
    scopes = scope_utils.detect_multi_scope(
        text=text,
        card_name=card_name,
        ability_keyword='phas',  # Use 'phas' to catch both 'phase' and 'phasing'
        patterns=patterns,
        check_grant_verbs=False  # Phasing doesn't need grant verb checking
    )

    # Format scope tags with "Phasing" ability name
    for scope in scopes:
        if scope == "Targeted":
            tags.add("Targeted: Phasing")
        else:
            tags.add(scope_utils.format_scope_tag(scope, "Phasing"))
        logger.debug(f"Card '{card_name}': detected {scope}: Phasing")

    return tags


def has_phasing(text: str) -> bool:
    """
    Quick check if card text contains phasing keywords.

    Args:
        text: Card text

    Returns:
        True if phasing keyword found
    """
    if not text:
        return False

    text_lower = text.lower()

    # Check for phasing keywords
    phasing_keywords = [
        'phase out',
        'phases out',
        'phasing',
        'phase in',
        'phases in',
    ]

    return any(keyword in text_lower for keyword in phasing_keywords)


def is_removal_phasing(tags: Set[str]) -> bool:
    """
    Check if phasing effect acts as removal (targets opponent permanents).

    Args:
        tags: Set of phasing scope tags

    Returns:
        True if this is removal-style phasing
    """
    return "Opponent Permanents: Phasing" in tags
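One of the "Your Permanents" patterns above can be exercised directly; the oracle-text strings here are invented examples, not real card text:

```python
import re

# Same pattern as in the your_patterns list of _get_phasing_scope_patterns
pat = re.compile(r'permanents?\s+you\s+control\s+phase\s+out', re.IGNORECASE)

print(bool(pat.search("Until your next turn, permanents you control phase out.")))  # True
print(bool(pat.search("Target creature an opponent controls phases out.")))         # False
```

The second line would instead be caught by the opponent patterns and surface as the removal-style "Opponent Permanents: Phasing" tag.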
551  code/tagging/protection_grant_detection.py  Normal file
@ -0,0 +1,551 @@
|
|||
"""
Protection grant detection implementation for M2.

This module provides helpers to distinguish cards that grant protection effects
from cards that have inherent protection effects.

Usage in tagger.py:
    from code.tagging.protection_grant_detection import is_granting_protection

    if is_granting_protection(text, keywords):
        # Tag as Protection
"""
import re
from typing import List, Pattern, Set

from . import regex_patterns as rgx
from . import tag_utils
from .tag_constants import CONTEXT_WINDOW_SIZE, CREATURE_TYPES, PROTECTION_KEYWORDS


# Pre-compile kindred detection patterns at module load for performance
# Pattern: (compiled_regex, tag_name_template)
def _build_kindred_patterns() -> List[tuple[Pattern, str]]:
    """Build pre-compiled kindred patterns for all creature types.

    Returns:
        List of tuples containing (compiled_pattern, tag_name)
    """
    patterns = []

    for creature_type in CREATURE_TYPES:
        creature_lower = creature_type.lower()
        creature_escaped = re.escape(creature_lower)
        tag_name = f"{creature_type}s Gain Protection"
        pattern_templates = [
            rf'\bother {creature_escaped}s?\b.*\b(have|gain)\b',
            rf'\b{creature_escaped} creatures?\b.*\b(have|gain)\b',
            rf'\btarget {creature_escaped}\b.*\bgains?\b',
        ]

        for pattern_str in pattern_templates:
            try:
                compiled = re.compile(pattern_str, re.IGNORECASE)
                patterns.append((compiled, tag_name))
            except re.error:
                # Skip patterns that fail to compile
                pass

    return patterns


KINDRED_PATTERNS: List[tuple[Pattern, str]] = _build_kindred_patterns()


# Grant verb patterns - cards that give protection to other permanents
# These patterns look for grant verbs that affect OTHER permanents, not self
# M5: Added phasing support
# Pre-compiled at module load for performance
GRANT_VERB_PATTERNS: List[Pattern] = [
    re.compile(r'\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),
    re.compile(r'\bgive[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),
    re.compile(r'\bgrant[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),
    re.compile(r'\bhave\b.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),  # "have hexproof" static grants
    re.compile(r'\bget[s]?\b.*\+.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),  # "gets +X/+X and has hexproof" direct
    re.compile(r'\bget[s]?\b.*\+.*\band\b.*\b(gain[s]?|have)\b.*\b(hexproof|shroud|indestructible|ward|protection|phasing)\b', re.IGNORECASE),  # "gets +X/+X and gains hexproof"
    re.compile(r'\bphases? out\b', re.IGNORECASE),  # M5: Direct phasing triggers (e.g., "it phases out")
]

# Self-reference patterns that should NOT count as granting
# Reminder text and keyword lines only
# M5: Added phasing support
# Pre-compiled at module load for performance
SELF_REFERENCE_PATTERNS: List[Pattern] = [
    re.compile(r'^\s*(hexproof|shroud|indestructible|ward|protection|phasing)', re.IGNORECASE),  # Start of text (keyword ability)
    re.compile(r'\([^)]*\b(hexproof|shroud|indestructible|ward|protection|phasing)[^)]*\)', re.IGNORECASE),  # Reminder text in parens
]

# Conditional self-grant patterns - activated/triggered abilities that grant to self
# Pre-compiled at module load for performance
CONDITIONAL_SELF_GRANT_PATTERNS: List[Pattern] = [
    # Activated abilities
    re.compile(r'\{[^}]*\}.*:.*\bthis (creature|permanent|artifact|enchantment)\b.*\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    re.compile(r'discard.*:.*\bthis (creature|permanent|artifact|enchantment)\b.*\bgain[s]?\b', re.IGNORECASE),
    re.compile(r'\{t\}.*:.*\bthis (creature|permanent|artifact|enchantment)\b.*\bgain[s]?\b', re.IGNORECASE),
    re.compile(r'sacrifice.*:.*\bthis (creature|permanent|artifact|enchantment)\b.*\bgain[s]?\b', re.IGNORECASE),
    re.compile(r'pay.*life.*:.*\bthis (creature|permanent|artifact|enchantment)\b.*\bgain[s]?\b', re.IGNORECASE),
    # Triggered abilities that grant to self only
    re.compile(r'whenever.*\b(this creature|this permanent|it)\b.*\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    re.compile(r'whenever you (cast|play|attack|cycle|discard|commit).*\b(this creature|this permanent|it)\b.*\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    re.compile(r'at the beginning.*\b(this creature|this permanent|it)\b.*\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    re.compile(r'whenever.*\b(this creature|this permanent)\b (attacks|enters|becomes).*\b(this creature|this permanent|it)\b.*\bgain[s]?\b', re.IGNORECASE),
    # Named self-references (e.g., "Pristine Skywise gains")
    re.compile(r'whenever you cast.*[A-Z][a-z]+.*gains.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    re.compile(r'whenever you.*[A-Z][a-z]+.*gains.*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
    # Static conditional abilities (as long as, if you control X)
    re.compile(r'as long as.*\b(this creature|this permanent|it|has)\b.*(has|gains?).*\b(hexproof|shroud|indestructible|ward|protection)\b', re.IGNORECASE),
]

# Mass grant patterns - affects multiple creatures YOU control
# Pre-compiled at module load for performance
MASS_GRANT_PATTERNS: List[Pattern] = [
    re.compile(r'creatures you control (have|gain|get)', re.IGNORECASE),
    re.compile(r'other .* you control (have|gain|get)', re.IGNORECASE),
    re.compile(r'(artifacts?|enchantments?|permanents?) you control (have|gain|get)', re.IGNORECASE),  # Artifacts you control have...
    re.compile(r'other (creatures?|artifacts?|enchantments?) (have|gain|get)', re.IGNORECASE),  # Other creatures have...
    re.compile(r'all (creatures?|slivers?|permanents?) (have|gain|get)', re.IGNORECASE),  # All creatures/slivers have...
]

# Targeted grant patterns - must specify "you control"
# Pre-compiled at module load for performance
TARGETED_GRANT_PATTERNS: List[Pattern] = [
    re.compile(r'target .* you control (gains?|gets?|has)', re.IGNORECASE),
    re.compile(r'equipped creature (gains?|gets?|has)', re.IGNORECASE),
    re.compile(r'enchanted enchantment (gains?|gets?|has)', re.IGNORECASE),
]

# Exclusion patterns - cards that remove or prevent protection
# Pre-compiled at module load for performance
EXCLUSION_PATTERNS: List[Pattern] = [
    re.compile(r"can't have (hexproof|indestructible|ward|shroud)", re.IGNORECASE),
    re.compile(r"lose[s]? (hexproof|indestructible|ward|shroud|protection)", re.IGNORECASE),
    re.compile(r"without (hexproof|indestructible|ward|shroud)", re.IGNORECASE),
    re.compile(r"protection from.*can't", re.IGNORECASE),
]

# Opponent grant patterns - grants to opponent's permanents (EXCLUDE these)
# NOTE: "all creatures" and "all permanents" are BLANKET effects (help you too),
# not opponent grants. Only exclude effects that ONLY help opponents.
# Pre-compiled at module load for performance
OPPONENT_GRANT_PATTERNS: List[Pattern] = [
    rgx.TARGET_OPPONENT,
    rgx.EACH_OPPONENT,
    rgx.OPPONENT_CONTROL,
    re.compile(r'opponent.*permanents?.*have', re.IGNORECASE),  # opponent's permanents have
]

# Blanket grant patterns - affects all permanents regardless of controller
# These are VALID protection grants that should be tagged (Blanket scope in M5)
# Pre-compiled at module load for performance
BLANKET_GRANT_PATTERNS: List[Pattern] = [
    re.compile(r'\ball creatures? (have|gain|get)\b', re.IGNORECASE),  # All creatures gain hexproof
    re.compile(r'\ball permanents? (have|gain|get)\b', re.IGNORECASE),  # All permanents gain indestructible
    re.compile(r'\beach creature (has|gains?|gets?)\b', re.IGNORECASE),  # Each creature gains ward
    rgx.EACH_PLAYER,  # Each player gains hexproof (very rare but valid blanket)
]

# Kindred-specific grant patterns for metadata tagging
KINDRED_GRANT_PATTERNS = {
    'Knights Gain Protection': [
        r'knight[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other knight[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Merfolk Gain Protection': [
        r'merfolk you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other merfolk.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Zombies Gain Protection': [
        r'zombie[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other zombie[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'target.*zombie.*\bgain[s]?\b.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Vampires Gain Protection': [
        r'vampire[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other vampire[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Elves Gain Protection': [
        r'el(f|ves) you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other el(f|ves).*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Dragons Gain Protection': [
        r'dragon[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other dragon[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Goblins Gain Protection': [
        r'goblin[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other goblin[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Slivers Gain Protection': [
        r'sliver[s]? you control.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'all sliver[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other sliver[s]?.*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Artifacts Gain Protection': [
        r'artifact[s]? you control (have|gain).*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other artifact[s]? (have|gain).*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
    'Enchantments Gain Protection': [
        r'enchantment[s]? you control (have|gain).*\b(hexproof|shroud|indestructible|ward|protection)\b',
        r'other enchantment[s]? (have|gain).*\b(hexproof|shroud|indestructible|ward|protection)\b',
    ],
}
def get_kindred_protection_tags(text: str) -> Set[str]:
    """
    Identify kindred-specific protection grants for metadata tagging.

    Returns a set of metadata tag names like:
    - "Knights Gain Hexproof"
    - "Spiders Gain Ward"
    - "Artifacts Gain Indestructible"

    Uses both predefined patterns and dynamic creature type detection,
    with specific ability detection (hexproof, ward, indestructible, shroud, protection).

    IMPORTANT: Only tags the specific abilities that appear in the same sentence
    as the creature type grant to avoid false positives like Svyelun.
    """
    if not text:
        return set()

    text_lower = text.lower()
    tags = set()

    # Only proceed if protective abilities are present (performance optimization)
    protective_abilities = ['hexproof', 'shroud', 'indestructible', 'ward', 'protection']
    if not any(keyword in text_lower for keyword in protective_abilities):
        return tags

    for tag_base, patterns in KINDRED_GRANT_PATTERNS.items():
        for pattern in patterns:
            pattern_compiled = re.compile(pattern, re.IGNORECASE) if isinstance(pattern, str) else pattern
            match = pattern_compiled.search(text_lower)
            if match:
                creature_type = tag_base.split(' Gain ')[0]
                # Get the matched text to check which abilities are in this specific grant
                matched_text = match.group(0)
                # Only tag abilities that appear in the matched phrase
                if 'hexproof' in matched_text:
                    tags.add(f"{creature_type} Gain Hexproof")
                if 'shroud' in matched_text:
                    tags.add(f"{creature_type} Gain Shroud")
                if 'indestructible' in matched_text:
                    tags.add(f"{creature_type} Gain Indestructible")
                if 'ward' in matched_text:
                    tags.add(f"{creature_type} Gain Ward")
                if 'protection' in matched_text:
                    tags.add(f"{creature_type} Gain Protection")
                break  # Found match for this kindred type, move to next

    # Use pre-compiled patterns for all creature types
    for compiled_pattern, tag_template in KINDRED_PATTERNS:
        match = compiled_pattern.search(text_lower)
        if match:
            creature_type = tag_template.split(' Gain ')[0]
            # Get the matched text to check which abilities are in this specific grant
            matched_text = match.group(0)
            # Only tag abilities that appear in the matched phrase
            if 'hexproof' in matched_text:
                tags.add(f"{creature_type} Gain Hexproof")
            if 'shroud' in matched_text:
                tags.add(f"{creature_type} Gain Shroud")
            if 'indestructible' in matched_text:
                tags.add(f"{creature_type} Gain Indestructible")
            if 'ward' in matched_text:
                tags.add(f"{creature_type} Gain Ward")
            if 'protection' in matched_text:
                tags.add(f"{creature_type} Gain Protection")
        # Don't break - a card could grant to multiple creature types

    return tags


def is_opponent_grant(text: str) -> bool:
    """
    Check if card grants protection to opponent's permanents ONLY.

    Returns True if this grants ONLY to opponents (should be excluded from Protection tag).
    Does NOT exclude blanket effects like "all creatures gain hexproof" which help you too.
    """
    if not text:
        return False

    text_lower = text.lower()

    # Remove reminder text (in parentheses) to avoid false positives
    # Reminder text often mentions "opponents control" for hexproof/shroud explanations
    text_no_reminder = tag_utils.strip_reminder_text(text_lower)
    for pattern in OPPONENT_GRANT_PATTERNS:
        match = pattern.search(text_no_reminder)
        if match:
            # Must be in context of granting protection
            if any(prot in text_lower for prot in ['hexproof', 'shroud', 'indestructible', 'ward', 'protection']):
                context = tag_utils.extract_context_window(
                    text_no_reminder, match.start(), match.end(),
                    window_size=CONTEXT_WINDOW_SIZE, include_before=True
                )

                # If "you control" appears in the context, it's limiting to YOUR permanents, not opponents
                if 'you control' not in context:
                    return True

    return False


def has_conditional_self_grant(text: str) -> bool:
    """
    Check if card has any conditional self-grant patterns.
    This does NOT check if it ALSO grants to others.
    """
    if not text:
        return False

    text_lower = text.lower()
    for pattern in CONDITIONAL_SELF_GRANT_PATTERNS:
        if pattern.search(text_lower):
            return True

    return False
def is_conditional_self_grant(text: str) -> bool:
    """
    Check if card only conditionally grants protection to itself.

    Examples:
    - "{B}, Discard a card: This creature gains hexproof until end of turn."
    - "Whenever you cast a noncreature spell, untap this creature. It gains protection..."
    - "Whenever this creature attacks, it gains indestructible until end of turn."

    These should be excluded as they don't provide protection to OTHER permanents.
    """
    if not text:
        return False

    text_lower = text.lower()
    found_conditional_self = has_conditional_self_grant(text)

    if not found_conditional_self:
        return False

    # If we found a conditional self-grant, check if there's ALSO a grant to others
    other_grant_patterns = [
        rgx.OTHER_CREATURES,
        re.compile(r'creatures you control (have|gain)', re.IGNORECASE),
        re.compile(r'target (creature|permanent) you control gains', re.IGNORECASE),
        re.compile(r'another target (creature|permanent)', re.IGNORECASE),
        re.compile(r'equipped creature (has|gains)', re.IGNORECASE),
        re.compile(r'enchanted creature (has|gains)', re.IGNORECASE),
        re.compile(r'target legendary', re.IGNORECASE),
        re.compile(r'permanents you control gain', re.IGNORECASE),
    ]
    has_other_grant = any(pattern.search(text_lower) for pattern in other_grant_patterns)

    # Return True only if it's ONLY conditional self-grants (no other grants)
    return not has_other_grant


def _should_exclude_token_creation(text_lower: str) -> bool:
    """Check if card only creates tokens with protection (not granting to existing permanents).

    Args:
        text_lower: Lowercased card text

    Returns:
        True if card only creates tokens, False if it also grants
    """
    token_with_protection = re.compile(r'create.*token.*with.*(hexproof|shroud|indestructible|ward|protection)', re.IGNORECASE)
    if token_with_protection.search(text_lower):
        has_grant_to_others = any(pattern.search(text_lower) for pattern in MASS_GRANT_PATTERNS)
        return not has_grant_to_others
    return False


def _should_exclude_kindred_only(text: str, text_lower: str, exclude_kindred: bool) -> bool:
    """Check if card only grants to specific kindred types.

    Args:
        text: Original card text
        text_lower: Lowercased card text
        exclude_kindred: Whether to exclude kindred-specific grants

    Returns:
        True if card only has kindred grants, False if it has broad grants
    """
    if not exclude_kindred:
        return False

    kindred_tags = get_kindred_protection_tags(text)
    if not kindred_tags:
        return False

    broad_only_patterns = [
        re.compile(r'\bcreatures you control (have|gain)\b(?!.*(knight|merfolk|zombie|elf|dragon|goblin|sliver))', re.IGNORECASE),
        re.compile(r'\bpermanents you control (have|gain)\b', re.IGNORECASE),
        re.compile(r'\beach (creature|permanent) you control', re.IGNORECASE),
        re.compile(r'\ball (creatures?|permanents?)', re.IGNORECASE),
    ]

    has_broad_grant = any(pattern.search(text_lower) for pattern in broad_only_patterns)
    return not has_broad_grant


def _check_pattern_grants(text_lower: str, pattern_list: List[Pattern]) -> bool:
    """Check if text contains protection grants matching pattern list.

    Args:
        text_lower: Lowercased card text
        pattern_list: List of grant patterns to check

    Returns:
        True if protection grant found, False otherwise
    """
    for pattern in pattern_list:
        match = pattern.search(text_lower)
        if match:
            context = tag_utils.extract_context_window(text_lower, match.start(), match.end())
            if any(prot in context for prot in PROTECTION_KEYWORDS):
                return True
    return False


def _has_inherent_protection_only(text_lower: str, keywords: str, found_grant: bool) -> bool:
    """Check if card only has inherent protection without granting.

    Args:
        text_lower: Lowercased card text
        keywords: Card keywords
        found_grant: Whether a grant pattern was found

    Returns:
        True if card only has inherent protection, False otherwise
    """
    if not keywords:
        return False

    keywords_lower = keywords.lower()
    has_inherent = any(k in keywords_lower for k in PROTECTION_KEYWORDS)

    if not has_inherent or found_grant:
        return False

    stat_only_pattern = re.compile(r'(get[s]?|gain[s]?)\s+[+\-][0-9X]+/[+\-][0-9X]+', re.IGNORECASE)
    has_stat_only = bool(stat_only_pattern.search(text_lower))
    mentions_other_without_prot = False
    if 'other' in text_lower:
        other_idx = text_lower.find('other')
        remaining_text = text_lower[other_idx:]
        mentions_other_without_prot = not any(prot in remaining_text for prot in PROTECTION_KEYWORDS)

    return has_stat_only or mentions_other_without_prot


def is_granting_protection(text: str, keywords: str, exclude_kindred: bool = False) -> bool:
    """
    Determine if a card grants protection effects to other permanents.

    Returns True if the card gives/grants protection to other cards unconditionally.
    Returns False if:
    - Card only has inherent protection
    - Card only conditionally grants to itself
    - Card grants to opponent's permanents
    - Card grants only to specific kindred types (when exclude_kindred=True)
    - Card creates tokens with protection (not granting to existing permanents)
    - Card only modifies non-protection stats of other permanents

    Args:
        text: Card text to analyze
        keywords: Card keywords (comma-separated)
        exclude_kindred: If True, exclude kindred-specific grants

    Returns:
        True if card grants broad protection, False otherwise
    """
    if not text:
        return False

    text_lower = text.lower()

    # Early exclusion checks
    if is_opponent_grant(text):
        return False

    if is_conditional_self_grant(text):
        return False

    if any(pattern.search(text_lower) for pattern in EXCLUSION_PATTERNS):
        return False

    if _should_exclude_token_creation(text_lower):
        return False

    if _should_exclude_kindred_only(text, text_lower, exclude_kindred):
        return False

    found_grant = False
    if _check_pattern_grants(text_lower, BLANKET_GRANT_PATTERNS):
        found_grant = True
    elif _check_pattern_grants(text_lower, MASS_GRANT_PATTERNS):
        found_grant = True
    elif _check_pattern_grants(text_lower, TARGETED_GRANT_PATTERNS):
        found_grant = True
    elif any(pattern.search(text_lower) for pattern in GRANT_VERB_PATTERNS):
        found_grant = True

    if _has_inherent_protection_only(text_lower, keywords, found_grant):
        return False

    return found_grant
def categorize_protection_card(name: str, text: str, keywords: str, card_type: str, exclude_kindred: bool = False) -> str:
    """
    Categorize a Protection-tagged card for audit purposes.

    Args:
        name: Card name
        text: Card text
        keywords: Card keywords
        card_type: Card type line
        exclude_kindred: If True, kindred-specific grants are categorized as metadata, not Grant

    Returns:
        'Grant' - gives broad protection to others
        'Kindred' - gives kindred-specific protection (metadata tag)
        'Inherent' - has protection itself
        'ConditionalSelf' - only conditionally grants to itself
        'Opponent' - grants to opponent's permanents
        'Mixed' - combines broad grants with inherent or conditional self-protection
        'Neither' - false positive
    """
    keywords_lower = keywords.lower() if keywords else ''
    if is_opponent_grant(text):
        return 'Opponent'
    if is_conditional_self_grant(text):
        return 'ConditionalSelf'

    has_cond_self = has_conditional_self_grant(text)
    has_inherent = any(k in keywords_lower for k in PROTECTION_KEYWORDS)
    kindred_tags = get_kindred_protection_tags(text)
    if kindred_tags and exclude_kindred:
        grants_broad = is_granting_protection(text, keywords, exclude_kindred=True)

        if grants_broad and has_inherent:
            # Has inherent + kindred + broad grants
            return 'Mixed'
        elif grants_broad:
            # Has kindred + broad grants (but no inherent)
            # This is just Grant with kindred metadata tags
            return 'Grant'
        elif has_inherent:
            # Has inherent + kindred only (not broad)
            # This is still just Kindred category (inherent is separate from granting)
            return 'Kindred'
        else:
            # Only kindred grants, no inherent or broad
            return 'Kindred'

    grants_protection = is_granting_protection(text, keywords, exclude_kindred=exclude_kindred)

    # Categorize based on what it does
    if grants_protection and has_cond_self:
        # Has conditional self-grant + grants to others = Mixed
        return 'Mixed'
    elif grants_protection and has_inherent:
        return 'Mixed'  # Has inherent + grants broadly
    elif grants_protection:
        return 'Grant'  # Only grants broadly
    elif has_inherent:
        return 'Inherent'  # Only has inherent
    else:
        return 'Neither'  # False positive
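To illustrate the grant-versus-inherent distinction the module draws, here is a small standalone check using copies of two patterns defined above (the sample oracle texts are hypothetical):

```python
import re

# Copies of patterns from the module: a mass-grant pattern and the
# start-of-text keyword self-reference pattern
MASS_GRANT = re.compile(r'creatures you control (have|gain|get)', re.IGNORECASE)
SELF_KW = re.compile(r'^\s*(hexproof|shroud|indestructible|ward|protection|phasing)', re.IGNORECASE)

granting = "Creatures you control gain hexproof until end of turn."
inherent = "Hexproof (This creature can't be the target of spells or abilities your opponents control.)"

print(bool(MASS_GRANT.search(granting)))  # True  -> grants to others, tagged Protection
print(bool(MASS_GRANT.search(inherent)))  # False -> no grant to others
print(bool(SELF_KW.match(inherent)))      # True  -> keyword line, self-reference only
```

This is the core of the "Protection tag only for cards granting shields" refinement: a bare keyword line matches the self-reference filter, while only grant-shaped phrasing produces the tag.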
code/tagging/protection_scope_detection.py (new file, 169 lines)
@@ -0,0 +1,169 @@
"""
|
||||
Protection Scope Detection Module
|
||||
|
||||
Detects the scope of protection effects (Self, Your Permanents, Blanket, Opponent Permanents)
|
||||
to enable intelligent filtering in deck building.
|
||||
|
||||
Part of M5: Protection Effect Granularity milestone.
|
||||
Refactored in M2: Create Scope Detection Utilities to use generic scope detection.
|
||||
"""
|
||||
|
||||
# Standard library imports
|
||||
import re
|
||||
from typing import Optional, Set
|
||||
|
||||
# Local application imports
|
||||
from code.logging_util import get_logger
|
||||
from . import scope_detection_utils as scope_utils
|
||||
from .tag_constants import PROTECTION_ABILITIES
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
# Protection scope pattern definitions
|
||||
def _get_protection_scope_patterns(ability: str) -> scope_utils.ScopePatterns:
|
||||
"""
|
||||
Build scope patterns for protection abilities.
|
||||
|
||||
Args:
|
||||
ability: Ability keyword (e.g., "hexproof", "ward")
|
||||
|
||||
Returns:
|
||||
ScopePatterns object with compiled patterns
|
||||
"""
|
||||
ability_lower = ability.lower()
|
||||
|
||||
# Opponent patterns: grants protection TO opponent's permanents
|
||||
# Note: Must distinguish from hexproof reminder text "opponents control [spells/abilities]"
|
||||
opponent_patterns = [
|
||||
re.compile(r'creatures?\s+(?:your\s+)?opponents?\s+control\s+(?:have|gain)', re.IGNORECASE),
|
||||
re.compile(r'permanents?\s+(?:your\s+)?opponents?\s+control\s+(?:have|gain)', re.IGNORECASE),
|
||||
re.compile(r'each\s+creature\s+an?\s+opponent\s+controls?\s+(?:has|gains?)', re.IGNORECASE),
|
||||
]
|
||||
|
||||
# Self-reference patterns
|
||||
self_patterns = [
|
||||
# Tilde (~) - strong self-reference indicator
|
||||
re.compile(r'~\s+(?:has|gains?)\s+' + ability_lower, re.IGNORECASE),
|
||||
re.compile(r'~\s+is\s+' + ability_lower, re.IGNORECASE),
|
||||
# "this creature/permanent" pronouns
|
||||
re.compile(r'this\s+(?:creature|permanent|artifact|enchantment)\s+(?:has|gains?)\s+' + ability_lower, re.IGNORECASE),
|
||||
# Starts with ability (likely self)
|
||||
re.compile(r'^(?:has|gains?)\s+' + ability_lower, re.IGNORECASE),
|
||||
]
|
||||
|
||||
# Your permanents patterns
|
||||
your_patterns = [
|
||||
re.compile(r'(?:other\s+)?(?:creatures?|permanents?|artifacts?|enchantments?)\s+you\s+control', re.IGNORECASE),
|
||||
re.compile(r'your\s+(?:creatures?|permanents?|artifacts?|enchantments?)', re.IGNORECASE),
|
||||
re.compile(r'each\s+(?:creature|permanent)\s+you\s+control', re.IGNORECASE),
|
||||
re.compile(r'other\s+\w+s?\s+you\s+control', re.IGNORECASE), # "Other Merfolk you control", etc.
|
||||
# "Other X you control...have Y" pattern for static grants
|
||||
re.compile(r'other\s+(?:\w+\s+)?(?:creatures?|permanents?)\s+you\s+control\s+(?:get\s+[^.]*\s+and\s+)?have\s+' + ability_lower, re.IGNORECASE),
|
||||
re.compile(r'other\s+\w+s?\s+you\s+control\s+(?:get\s+[^.]*\s+and\s+)?have\s+' + ability_lower, re.IGNORECASE), # "Other Knights you control...have"
|
||||
re.compile(r'equipped\s+(?:creature|permanent)\s+(?:gets\s+[^.]*\s+and\s+)?(?:has|gains?)\s+(?:[^.]*\s+and\s+)?' + ability_lower, re.IGNORECASE), # Equipment
|
||||
re.compile(r'enchanted\s+(?:creature|permanent)\s+(?:gets\s+[^.]*\s+and\s+)?(?:has|gains?)\s+(?:[^.]*\s+and\s+)?' + ability_lower, re.IGNORECASE), # Aura
|
||||
re.compile(r'target\s+(?:\w+\s+)?(?:creature|permanent)\s+(?:gets\s+[^.]*\s+and\s+)?(?:gains?)\s+' + ability_lower, re.IGNORECASE), # Target
|
||||
]
|
||||
|
||||
# Blanket patterns (no ownership qualifier)
|
||||
# Note: Abilities can be listed with "and" (e.g., "gain hexproof and indestructible")
|
||||
blanket_patterns = [
|
||||
re.compile(r'all\s+(?:creatures?|permanents?)\s+(?:have|gain)\s+(?:[^.]*\s+and\s+)?' + ability_lower, re.IGNORECASE),
|
||||
re.compile(r'each\s+(?:creature|permanent)\s+(?:has|gains?)\s+(?:[^.]*\s+and\s+)?' + ability_lower, re.IGNORECASE),
|
||||
re.compile(r'(?:creatures?|permanents?)\s+(?:have|gain)\s+(?:[^.]*\s+and\s+)?' + ability_lower, re.IGNORECASE),
|
||||
]
|
||||
|
||||
return scope_utils.ScopePatterns(
|
||||
opponent=opponent_patterns,
|
||||
self_ref=self_patterns,
|
||||
your_permanents=your_patterns,
|
||||
blanket=blanket_patterns
|
)


def detect_protection_scope(text: str, card_name: str, ability: str, keywords: Optional[str] = None) -> Optional[str]:
    """
    Detect the scope of a protection effect.

    Detection priority order (prevents misclassification):
    0. Static keyword → "Self"
    1. Opponent ownership → "Opponent Permanents"
    2. Self-reference → "Self"
    3. Your ownership → "Your Permanents"
    4. No ownership qualifier → "Blanket"

    Args:
        text: Card text (lowercase for pattern matching)
        card_name: Card name (for self-reference detection)
        ability: Ability type (Ward, Hexproof, etc.)
        keywords: Optional keywords field for static keyword detection

    Returns:
        Scope prefix or None: "Self", "Your Permanents", "Blanket", "Opponent Permanents"
    """
    if not text or not ability:
        return None

    # Build patterns for this ability
    patterns = _get_protection_scope_patterns(ability)

    # Use generic scope detection with grant verb checking AND keywords
    return scope_utils.detect_scope(
        text=text,
        card_name=card_name,
        ability_keyword=ability,
        patterns=patterns,
        allow_multiple=False,
        check_grant_verbs=True,
        keywords=keywords,
    )


def get_protection_scope_tags(text: str, card_name: str, keywords: Optional[str] = None) -> Set[str]:
    """
    Get all protection scope metadata tags for a card.

    A card can have multiple protection scopes (e.g., self-hexproof + grants ward to others).

    Args:
        text: Card text
        card_name: Card name
        keywords: Optional keywords field for static keyword detection

    Returns:
        Set of metadata tags like {"Self: Indestructible", "Your Permanents: Ward"}
    """
    if not text or not card_name:
        return set()

    scope_tags = set()

    # Check each protection ability
    for ability in PROTECTION_ABILITIES:
        scope = detect_protection_scope(text, card_name, ability, keywords)

        if scope:
            # Format: "{Scope}: {Ability}"
            tag = f"{scope}: {ability}"
            scope_tags.add(tag)
            logger.debug(f"Card '{card_name}': detected scope tag '{tag}'")

    return scope_tags


def has_any_protection(text: str) -> bool:
    """
    Quick check if card text contains any protection keywords.

    Args:
        text: Card text

    Returns:
        True if any protection keyword found
    """
    if not text:
        return False

    text_lower = text.lower()
    return any(ability.lower() in text_lower for ability in PROTECTION_ABILITIES)
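The priority cascade documented above (opponent → self → yours → blanket) can be illustrated with a self-contained sketch. The function name and the regexes here are simplified stand-ins, not the repo's actual `_get_protection_scope_patterns` output:

```python
import re

def sketch_protection_scope(text: str, card_name: str, ability: str):
    """Toy version of the priority cascade: opponent > self > yours > blanket."""
    t = text.lower()
    a = ability.lower()
    if a not in t:
        return None
    # Priority 1: opponent ownership mentioned near the ability
    if re.search(rf'opponents? control[^.]*\b{a}\b', t):
        return "Opponent Permanents"
    # Priority 2: the card names itself as having/gaining the ability
    if re.search(rf'{re.escape(card_name.lower())}\s+(?:has|gains?)\s+{a}', t):
        return "Self"
    # Priority 3: granted to things you control
    if re.search(rf'you control\s+(?:have|gains?)\s+{a}', t):
        return "Your Permanents"
    # Priority 4: ability present with no ownership qualifier
    return "Blanket"

print(sketch_protection_scope("Creatures you control have hexproof.", "Witchbane Orb", "hexproof"))
# Your Permanents
print(sketch_protection_scope("Sigarda, Host of Herons has hexproof.", "Sigarda, Host of Herons", "hexproof"))
# Self
```

Checking opponent ownership before self-reference matters because a card's own name can appear in a clause about opponents' permanents; the real implementation applies the same ordering.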
code/tagging/regex_patterns.py (new file, 455 lines)
@@ -0,0 +1,455 @@
"""
Centralized regex patterns for MTG card tagging.

All patterns compiled with re.IGNORECASE for case-insensitive matching.
Organized by semantic category for maintainability and reusability.

Usage:
    from code.tagging import regex_patterns as rgx

    mask = df['text'].str.contains(rgx.YOU_CONTROL, na=False)
    if rgx.HEXPROOF.search(text):
        ...

    # Or use builder functions
    pattern = rgx.ownership_pattern('creature', 'you')
    mask = df['text'].str.contains(pattern, na=False)
"""

import re
from typing import Pattern, List

# =============================================================================
# OWNERSHIP & CONTROLLER PATTERNS
# =============================================================================

YOU_CONTROL: Pattern = re.compile(r'you control', re.IGNORECASE)
THEY_CONTROL: Pattern = re.compile(r'they control', re.IGNORECASE)
OPPONENT_CONTROL: Pattern = re.compile(r'opponent[s]? control', re.IGNORECASE)

CREATURE_YOU_CONTROL: Pattern = re.compile(r'creature[s]? you control', re.IGNORECASE)
PERMANENT_YOU_CONTROL: Pattern = re.compile(r'permanent[s]? you control', re.IGNORECASE)
ARTIFACT_YOU_CONTROL: Pattern = re.compile(r'artifact[s]? you control', re.IGNORECASE)
ENCHANTMENT_YOU_CONTROL: Pattern = re.compile(r'enchantment[s]? you control', re.IGNORECASE)

# =============================================================================
# GRANT VERB PATTERNS
# =============================================================================

GAIN: Pattern = re.compile(r'\bgain[s]?\b', re.IGNORECASE)
HAS: Pattern = re.compile(r'\bhas\b', re.IGNORECASE)
HAVE: Pattern = re.compile(r'\bhave\b', re.IGNORECASE)
GET: Pattern = re.compile(r'\bget[s]?\b', re.IGNORECASE)

GRANT_VERBS: List[str] = ['gain', 'gains', 'has', 'have', 'get', 'gets']

# =============================================================================
# TARGETING PATTERNS
# =============================================================================

TARGET_PLAYER: Pattern = re.compile(r'target player', re.IGNORECASE)
TARGET_OPPONENT: Pattern = re.compile(r'target opponent', re.IGNORECASE)
TARGET_CREATURE: Pattern = re.compile(r'target creature', re.IGNORECASE)
TARGET_PERMANENT: Pattern = re.compile(r'target permanent', re.IGNORECASE)
TARGET_ARTIFACT: Pattern = re.compile(r'target artifact', re.IGNORECASE)
TARGET_ENCHANTMENT: Pattern = re.compile(r'target enchantment', re.IGNORECASE)

EACH_PLAYER: Pattern = re.compile(r'each player', re.IGNORECASE)
EACH_OPPONENT: Pattern = re.compile(r'each opponent', re.IGNORECASE)
TARGET_YOU_CONTROL: Pattern = re.compile(r'target .* you control', re.IGNORECASE)

# =============================================================================
# PROTECTION ABILITY PATTERNS
# =============================================================================

HEXPROOF: Pattern = re.compile(r'\bhexproof\b', re.IGNORECASE)
SHROUD: Pattern = re.compile(r'\bshroud\b', re.IGNORECASE)
INDESTRUCTIBLE: Pattern = re.compile(r'\bindestructible\b', re.IGNORECASE)
WARD: Pattern = re.compile(r'\bward\b', re.IGNORECASE)
PROTECTION_FROM: Pattern = re.compile(r'protection from', re.IGNORECASE)

PROTECTION_ABILITIES: List[str] = ['hexproof', 'shroud', 'indestructible', 'ward', 'protection']

CANT_HAVE_PROTECTION: Pattern = re.compile(r"can't have (hexproof|indestructible|ward|shroud)", re.IGNORECASE)
LOSE_PROTECTION: Pattern = re.compile(r"lose[s]? (hexproof|indestructible|ward|shroud|protection)", re.IGNORECASE)

# =============================================================================
# CARD DRAW PATTERNS
# =============================================================================

DRAW_A_CARD: Pattern = re.compile(r'draw[s]? (?:a|one) card', re.IGNORECASE)
DRAW_CARDS: Pattern = re.compile(r'draw[s]? (?:two|three|four|five|x|\d+) card', re.IGNORECASE)
DRAW: Pattern = re.compile(r'\bdraw[s]?\b', re.IGNORECASE)

# =============================================================================
# TOKEN CREATION PATTERNS
# =============================================================================

CREATE_TOKEN: Pattern = re.compile(r'create[s]?.*token', re.IGNORECASE)
PUT_TOKEN: Pattern = re.compile(r'put[s]?.*token', re.IGNORECASE)

CREATE_TREASURE: Pattern = re.compile(r'create.*treasure token', re.IGNORECASE)
CREATE_FOOD: Pattern = re.compile(r'create.*food token', re.IGNORECASE)
CREATE_CLUE: Pattern = re.compile(r'create.*clue token', re.IGNORECASE)
CREATE_BLOOD: Pattern = re.compile(r'create.*blood token', re.IGNORECASE)

# =============================================================================
# COUNTER PATTERNS
# =============================================================================

PLUS_ONE_COUNTER: Pattern = re.compile(r'\+1/\+1 counter', re.IGNORECASE)
MINUS_ONE_COUNTER: Pattern = re.compile(r'\-1/\-1 counter', re.IGNORECASE)
LOYALTY_COUNTER: Pattern = re.compile(r'loyalty counter', re.IGNORECASE)
PROLIFERATE: Pattern = re.compile(r'\bproliferate\b', re.IGNORECASE)

ONE_OR_MORE_COUNTERS: Pattern = re.compile(r'one or more counter', re.IGNORECASE)
ONE_OR_MORE_PLUS_ONE_COUNTERS: Pattern = re.compile(r'one or more \+1/\+1 counter', re.IGNORECASE)
IF_HAD_COUNTERS: Pattern = re.compile(r'if it had counter', re.IGNORECASE)
WITH_COUNTERS_ON_THEM: Pattern = re.compile(r'with counter[s]? on them', re.IGNORECASE)

# =============================================================================
# SACRIFICE & REMOVAL PATTERNS
# =============================================================================

SACRIFICE: Pattern = re.compile(r'sacrifice[s]?', re.IGNORECASE)
SACRIFICED: Pattern = re.compile(r'sacrificed', re.IGNORECASE)
DESTROY: Pattern = re.compile(r'destroy[s]?', re.IGNORECASE)
EXILE: Pattern = re.compile(r'exile[s]?', re.IGNORECASE)
EXILED: Pattern = re.compile(r'exiled', re.IGNORECASE)

SACRIFICE_DRAW: Pattern = re.compile(r'sacrifice (?:a|an) (?:artifact|creature|permanent)(?:[^,]*),?[^,]*draw', re.IGNORECASE)
SACRIFICE_COLON_DRAW: Pattern = re.compile(r'sacrifice [^:]+: draw', re.IGNORECASE)
SACRIFICED_COMMA_DRAW: Pattern = re.compile(r'sacrificed[^,]+, draw', re.IGNORECASE)
EXILE_RETURN_BATTLEFIELD: Pattern = re.compile(r'exile.*return.*to the battlefield', re.IGNORECASE)

# =============================================================================
# DISCARD PATTERNS
# =============================================================================

DISCARD_A_CARD: Pattern = re.compile(r'discard (?:a|one|two|three|x) card', re.IGNORECASE)
DISCARD_YOUR_HAND: Pattern = re.compile(r'discard your hand', re.IGNORECASE)
YOU_DISCARD: Pattern = re.compile(r'you discard', re.IGNORECASE)

# Discard triggers
WHENEVER_YOU_DISCARD: Pattern = re.compile(r'whenever you discard', re.IGNORECASE)
IF_YOU_DISCARDED: Pattern = re.compile(r'if you discarded', re.IGNORECASE)
WHEN_YOU_DISCARD: Pattern = re.compile(r'when you discard', re.IGNORECASE)
FOR_EACH_DISCARDED: Pattern = re.compile(r'for each card you discarded', re.IGNORECASE)

# Opponent discard
TARGET_PLAYER_DISCARDS: Pattern = re.compile(r'target player discards', re.IGNORECASE)
TARGET_OPPONENT_DISCARDS: Pattern = re.compile(r'target opponent discards', re.IGNORECASE)
EACH_PLAYER_DISCARDS: Pattern = re.compile(r'each player discards', re.IGNORECASE)
EACH_OPPONENT_DISCARDS: Pattern = re.compile(r'each opponent discards', re.IGNORECASE)
THAT_PLAYER_DISCARDS: Pattern = re.compile(r'that player discards', re.IGNORECASE)

# Discard cost
ADDITIONAL_COST_DISCARD: Pattern = re.compile(r'as an additional cost to (?:cast this spell|activate this ability),? discard (?:a|one) card', re.IGNORECASE)
ADDITIONAL_COST_DISCARD_SHORT: Pattern = re.compile(r'as an additional cost,? discard (?:a|one) card', re.IGNORECASE)

MADNESS: Pattern = re.compile(r'\bmadness\b', re.IGNORECASE)

# =============================================================================
# DAMAGE & LIFE LOSS PATTERNS
# =============================================================================

DEALS_ONE_DAMAGE: Pattern = re.compile(r'deals\s+1\s+damage', re.IGNORECASE)
EXACTLY_ONE_DAMAGE: Pattern = re.compile(r'exactly\s+1\s+damage', re.IGNORECASE)
LOSES_ONE_LIFE: Pattern = re.compile(r'loses\s+1\s+life', re.IGNORECASE)

# =============================================================================
# COST REDUCTION PATTERNS
# =============================================================================

COST_LESS: Pattern = re.compile(r'cost[s]? \{[\d\w]\} less', re.IGNORECASE)
COST_LESS_TO_CAST: Pattern = re.compile(r'cost[s]? less to cast', re.IGNORECASE)
WITH_X_IN_COST: Pattern = re.compile(r'with \{[xX]\} in (?:its|their)', re.IGNORECASE)
AFFINITY_FOR: Pattern = re.compile(r'affinity for', re.IGNORECASE)
SPELLS_COST: Pattern = re.compile(r'spells cost', re.IGNORECASE)
SPELLS_YOU_CAST_COST: Pattern = re.compile(r'spells you cast cost', re.IGNORECASE)

# =============================================================================
# MONARCH & INITIATIVE PATTERNS
# =============================================================================

BECOME_MONARCH: Pattern = re.compile(r'becomes? the monarch', re.IGNORECASE)
IS_MONARCH: Pattern = re.compile(r'is the monarch', re.IGNORECASE)
WAS_MONARCH: Pattern = re.compile(r'was the monarch', re.IGNORECASE)
YOU_ARE_MONARCH: Pattern = re.compile(r"you are the monarch|you're the monarch", re.IGNORECASE)
YOU_BECOME_MONARCH: Pattern = re.compile(r'you become the monarch', re.IGNORECASE)
CANT_BECOME_MONARCH: Pattern = re.compile(r"can't become the monarch", re.IGNORECASE)

# =============================================================================
# KEYWORD ABILITY PATTERNS
# =============================================================================

PARTNER_BASIC: Pattern = re.compile(r'\bpartner\b(?!\s*(?:with|[-—–]))', re.IGNORECASE)
PARTNER_WITH: Pattern = re.compile(r'partner with', re.IGNORECASE)
PARTNER_SURVIVORS: Pattern = re.compile(r'Partner\s*[-—–]\s*Survivors', re.IGNORECASE)
PARTNER_FATHER_SON: Pattern = re.compile(r'Partner\s*[-—–]\s*Father\s*&\s*Son', re.IGNORECASE)

FLYING: Pattern = re.compile(r'\bflying\b', re.IGNORECASE)
VIGILANCE: Pattern = re.compile(r'\bvigilance\b', re.IGNORECASE)
TRAMPLE: Pattern = re.compile(r'\btrample\b', re.IGNORECASE)
HASTE: Pattern = re.compile(r'\bhaste\b', re.IGNORECASE)
LIFELINK: Pattern = re.compile(r'\blifelink\b', re.IGNORECASE)
DEATHTOUCH: Pattern = re.compile(r'\bdeathtouch\b', re.IGNORECASE)
DOUBLE_STRIKE: Pattern = re.compile(r'double strike', re.IGNORECASE)
FIRST_STRIKE: Pattern = re.compile(r'first strike', re.IGNORECASE)
MENACE: Pattern = re.compile(r'\bmenace\b', re.IGNORECASE)
REACH: Pattern = re.compile(r'\breach\b', re.IGNORECASE)

UNDYING: Pattern = re.compile(r'\bundying\b', re.IGNORECASE)
PERSIST: Pattern = re.compile(r'\bpersist\b', re.IGNORECASE)
PHASING: Pattern = re.compile(r'\bphasing\b', re.IGNORECASE)
FLASH: Pattern = re.compile(r'\bflash\b', re.IGNORECASE)
TOXIC: Pattern = re.compile(r'toxic\s*\d+', re.IGNORECASE)

# =============================================================================
# RETURN TO BATTLEFIELD PATTERNS
# =============================================================================

RETURN_TO_BATTLEFIELD: Pattern = re.compile(r'return.*to the battlefield', re.IGNORECASE)
RETURN_IT_TO_BATTLEFIELD: Pattern = re.compile(r'return it to the battlefield', re.IGNORECASE)
RETURN_THAT_CARD_TO_BATTLEFIELD: Pattern = re.compile(r'return that card to the battlefield', re.IGNORECASE)
RETURN_THEM_TO_BATTLEFIELD: Pattern = re.compile(r'return them to the battlefield', re.IGNORECASE)
RETURN_THOSE_CARDS_TO_BATTLEFIELD: Pattern = re.compile(r'return those cards to the battlefield', re.IGNORECASE)

RETURN_TO_HAND: Pattern = re.compile(r'return.*to.*hand', re.IGNORECASE)
RETURN_YOU_CONTROL_TO_HAND: Pattern = re.compile(r'return target.*you control.*to.*hand', re.IGNORECASE)

# =============================================================================
# SCOPE & QUALIFIER PATTERNS
# =============================================================================

OTHER_CREATURES: Pattern = re.compile(r'other creature[s]?', re.IGNORECASE)
ALL_CREATURES: Pattern = re.compile(r'\ball creature[s]?\b', re.IGNORECASE)
ALL_PERMANENTS: Pattern = re.compile(r'\ball permanent[s]?\b', re.IGNORECASE)
ALL_SLIVERS: Pattern = re.compile(r'\ball sliver[s]?\b', re.IGNORECASE)

EQUIPPED_CREATURE: Pattern = re.compile(r'equipped creature', re.IGNORECASE)
ENCHANTED_CREATURE: Pattern = re.compile(r'enchanted creature', re.IGNORECASE)
ENCHANTED_PERMANENT: Pattern = re.compile(r'enchanted permanent', re.IGNORECASE)
ENCHANTED_ENCHANTMENT: Pattern = re.compile(r'enchanted enchantment', re.IGNORECASE)

# =============================================================================
# COMBAT PATTERNS
# =============================================================================

ATTACK: Pattern = re.compile(r'\battack[s]?\b', re.IGNORECASE)
ATTACKS: Pattern = re.compile(r'\battacks\b', re.IGNORECASE)
BLOCK: Pattern = re.compile(r'\bblock[s]?\b', re.IGNORECASE)
BLOCKS: Pattern = re.compile(r'\bblocks\b', re.IGNORECASE)
COMBAT_DAMAGE: Pattern = re.compile(r'combat damage', re.IGNORECASE)

WHENEVER_ATTACKS: Pattern = re.compile(r'whenever .* attacks', re.IGNORECASE)
WHEN_ATTACKS: Pattern = re.compile(r'when .* attacks', re.IGNORECASE)

# =============================================================================
# TYPE LINE PATTERNS
# =============================================================================

INSTANT: Pattern = re.compile(r'\bInstant\b', re.IGNORECASE)
SORCERY: Pattern = re.compile(r'\bSorcery\b', re.IGNORECASE)
ARTIFACT: Pattern = re.compile(r'\bArtifact\b', re.IGNORECASE)
ENCHANTMENT: Pattern = re.compile(r'\bEnchantment\b', re.IGNORECASE)
CREATURE: Pattern = re.compile(r'\bCreature\b', re.IGNORECASE)
PLANESWALKER: Pattern = re.compile(r'\bPlaneswalker\b', re.IGNORECASE)
LAND: Pattern = re.compile(r'\bLand\b', re.IGNORECASE)

AURA: Pattern = re.compile(r'\bAura\b', re.IGNORECASE)
EQUIPMENT: Pattern = re.compile(r'\bEquipment\b', re.IGNORECASE)
VEHICLE: Pattern = re.compile(r'\bVehicle\b', re.IGNORECASE)
SAGA: Pattern = re.compile(r'\bSaga\b', re.IGNORECASE)

NONCREATURE: Pattern = re.compile(r'noncreature', re.IGNORECASE)

# =============================================================================
# PATTERN BUILDER FUNCTIONS
# =============================================================================

def ownership_pattern(subject: str, owner: str = "you") -> Pattern:
    """
    Build ownership pattern like 'creatures you control', 'permanents opponent controls'.

    Args:
        subject: The card type (e.g., 'creature', 'permanent', 'artifact')
        owner: Controller ('you', 'opponent', 'they', etc.)

    Returns:
        Compiled regex pattern

    Examples:
        >>> ownership_pattern('creature', 'you')
        # Matches "creatures you control"
        >>> ownership_pattern('artifact', 'opponent')
        # Matches "artifacts opponent controls"
    """
    pattern = fr'{subject}[s]?\s+{owner}\s+control[s]?'
    return re.compile(pattern, re.IGNORECASE)
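As a quick sanity check, the builder can be exercised standalone. The function below mirrors `ownership_pattern` above verbatim (restated only because the repo module path `code.tagging.regex_patterns` is not importable outside the project):

```python
import re

def ownership_pattern(subject: str, owner: str = "you") -> "re.Pattern":
    # Mirrors the builder above: '<subject>(s) <owner> control(s)'
    return re.compile(fr'{subject}[s]?\s+{owner}\s+control[s]?', re.IGNORECASE)

p = ownership_pattern('creature', 'you')
print(bool(p.search("Creatures you control get +1/+1.")))      # True
print(bool(p.search("Each opponent sacrifices a creature.")))  # False
```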

def grant_pattern(subject: str, verb: str, ability: str) -> Pattern:
    """
    Build grant pattern like 'creatures you control gain hexproof'.

    Args:
        subject: What gains the ability ('creatures you control', 'target creature', etc.)
        verb: Grant verb ('gain', 'has', 'get', etc.)
        ability: Ability granted ('hexproof', 'flying', 'ward', etc.)

    Returns:
        Compiled regex pattern

    Examples:
        >>> grant_pattern('creatures you control', 'gain', 'hexproof')
        # Matches "creatures you control gain hexproof"
    """
    pattern = fr'{subject}\s+{verb}[s]?\s+{ability}'
    return re.compile(pattern, re.IGNORECASE)


def token_creation_pattern(quantity: str, token_type: str) -> Pattern:
    """
    Build token creation pattern like 'create two 1/1 Soldier tokens'.

    Args:
        quantity: Number word or variable ('one', 'two', 'x', etc.)
        token_type: Token name ('treasure', 'food', 'soldier', etc.)

    Returns:
        Compiled regex pattern

    Examples:
        >>> token_creation_pattern('two', 'treasure')
        # Matches "create two Treasure tokens"
    """
    pattern = fr'create[s]?\s+(?:{quantity})\s+.*{token_type}\s+token'
    return re.compile(pattern, re.IGNORECASE)


def kindred_grant_pattern(tribe: str, ability: str) -> Pattern:
    """
    Build kindred grant pattern like 'knights you control gain protection'.

    Args:
        tribe: Creature type ('knight', 'elf', 'zombie', etc.)
        ability: Ability granted ('hexproof', 'protection', etc.)

    Returns:
        Compiled regex pattern

    Examples:
        >>> kindred_grant_pattern('knight', 'hexproof')
        # Matches "Knights you control gain hexproof"
    """
    pattern = fr'{tribe}[s]?\s+you\s+control.*\b{ability}\b'
    return re.compile(pattern, re.IGNORECASE)


def targeting_pattern(target: str, subject: str = None) -> Pattern:
    """
    Build targeting pattern like 'target creature you control'.

    Args:
        target: What is targeted ('player', 'opponent', 'creature', etc.)
        subject: Optional qualifier ('you control', 'opponent controls', etc.)

    Returns:
        Compiled regex pattern

    Examples:
        >>> targeting_pattern('creature', 'you control')
        # Matches "target creature you control"
        >>> targeting_pattern('opponent')
        # Matches "target opponent"
    """
    if subject:
        pattern = fr'target\s+{target}\s+{subject}'
    else:
        pattern = fr'target\s+{target}'
    return re.compile(pattern, re.IGNORECASE)
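The same smoke-test approach works for the grant builder. Again this restates `grant_pattern` locally (same body as above) rather than importing the repo module:

```python
import re

def grant_pattern(subject: str, verb: str, ability: str) -> "re.Pattern":
    # Mirrors the builder above: '<subject> <verb>(s) <ability>'
    return re.compile(fr'{subject}\s+{verb}[s]?\s+{ability}', re.IGNORECASE)

p = grant_pattern('creatures you control', 'gain', 'hexproof')
print(bool(p.search("Until end of turn, creatures you control gain hexproof.")))  # True
print(bool(p.search("This creature gains hexproof.")))                            # False
```

Note that the subject string is interpolated into the regex as-is, so it can itself contain spaces or regex syntax; callers pass plain phrases like `'creatures you control'`.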

# =============================================================================
# MODULE EXPORTS
# =============================================================================

__all__ = [
    # Ownership
    'YOU_CONTROL', 'THEY_CONTROL', 'OPPONENT_CONTROL',
    'CREATURE_YOU_CONTROL', 'PERMANENT_YOU_CONTROL', 'ARTIFACT_YOU_CONTROL',
    'ENCHANTMENT_YOU_CONTROL',

    # Grant verbs
    'GAIN', 'HAS', 'HAVE', 'GET', 'GRANT_VERBS',

    # Targeting
    'TARGET_PLAYER', 'TARGET_OPPONENT', 'TARGET_CREATURE', 'TARGET_PERMANENT',
    'TARGET_ARTIFACT', 'TARGET_ENCHANTMENT', 'EACH_PLAYER', 'EACH_OPPONENT',
    'TARGET_YOU_CONTROL',

    # Protection abilities
    'HEXPROOF', 'SHROUD', 'INDESTRUCTIBLE', 'WARD', 'PROTECTION_FROM',
    'PROTECTION_ABILITIES', 'CANT_HAVE_PROTECTION', 'LOSE_PROTECTION',

    # Draw
    'DRAW_A_CARD', 'DRAW_CARDS', 'DRAW',

    # Tokens
    'CREATE_TOKEN', 'PUT_TOKEN',
    'CREATE_TREASURE', 'CREATE_FOOD', 'CREATE_CLUE', 'CREATE_BLOOD',

    # Counters
    'PLUS_ONE_COUNTER', 'MINUS_ONE_COUNTER', 'LOYALTY_COUNTER', 'PROLIFERATE',
    'ONE_OR_MORE_COUNTERS', 'ONE_OR_MORE_PLUS_ONE_COUNTERS', 'IF_HAD_COUNTERS', 'WITH_COUNTERS_ON_THEM',

    # Removal
    'SACRIFICE', 'SACRIFICED', 'DESTROY', 'EXILE', 'EXILED',
    'SACRIFICE_DRAW', 'SACRIFICE_COLON_DRAW', 'SACRIFICED_COMMA_DRAW',
    'EXILE_RETURN_BATTLEFIELD',

    # Discard
    'DISCARD_A_CARD', 'DISCARD_YOUR_HAND', 'YOU_DISCARD',
    'WHENEVER_YOU_DISCARD', 'IF_YOU_DISCARDED', 'WHEN_YOU_DISCARD', 'FOR_EACH_DISCARDED',
    'TARGET_PLAYER_DISCARDS', 'TARGET_OPPONENT_DISCARDS', 'EACH_PLAYER_DISCARDS',
    'EACH_OPPONENT_DISCARDS', 'THAT_PLAYER_DISCARDS',
    'ADDITIONAL_COST_DISCARD', 'ADDITIONAL_COST_DISCARD_SHORT', 'MADNESS',

    # Damage & life loss
    'DEALS_ONE_DAMAGE', 'EXACTLY_ONE_DAMAGE', 'LOSES_ONE_LIFE',

    # Cost reduction
    'COST_LESS', 'COST_LESS_TO_CAST', 'WITH_X_IN_COST', 'AFFINITY_FOR', 'SPELLS_COST', 'SPELLS_YOU_CAST_COST',

    # Monarch
    'BECOME_MONARCH', 'IS_MONARCH', 'WAS_MONARCH', 'YOU_ARE_MONARCH',
    'YOU_BECOME_MONARCH', 'CANT_BECOME_MONARCH',

    # Keywords
    'PARTNER_BASIC', 'PARTNER_WITH', 'PARTNER_SURVIVORS', 'PARTNER_FATHER_SON',
    'FLYING', 'VIGILANCE', 'TRAMPLE', 'HASTE', 'LIFELINK', 'DEATHTOUCH',
    'DOUBLE_STRIKE', 'FIRST_STRIKE', 'MENACE', 'REACH',
    'UNDYING', 'PERSIST', 'PHASING', 'FLASH', 'TOXIC',

    # Return
    'RETURN_TO_BATTLEFIELD', 'RETURN_IT_TO_BATTLEFIELD', 'RETURN_THAT_CARD_TO_BATTLEFIELD',
    'RETURN_THEM_TO_BATTLEFIELD', 'RETURN_THOSE_CARDS_TO_BATTLEFIELD',
    'RETURN_TO_HAND', 'RETURN_YOU_CONTROL_TO_HAND',

    # Scope
    'OTHER_CREATURES', 'ALL_CREATURES', 'ALL_PERMANENTS', 'ALL_SLIVERS',
    'EQUIPPED_CREATURE', 'ENCHANTED_CREATURE', 'ENCHANTED_PERMANENT', 'ENCHANTED_ENCHANTMENT',

    # Combat
    'ATTACK', 'ATTACKS', 'BLOCK', 'BLOCKS', 'COMBAT_DAMAGE',
    'WHENEVER_ATTACKS', 'WHEN_ATTACKS',

    # Type line
    'INSTANT', 'SORCERY', 'ARTIFACT', 'ENCHANTMENT', 'CREATURE', 'PLANESWALKER', 'LAND',
    'AURA', 'EQUIPMENT', 'VEHICLE', 'SAGA', 'NONCREATURE',

    # Builders
    'ownership_pattern', 'grant_pattern', 'token_creation_pattern',
    'kindred_grant_pattern', 'targeting_pattern',
]
code/tagging/scope_detection_utils.py (new file, 420 lines)
@@ -0,0 +1,420 @@
"""
|
||||
Scope Detection Utilities
|
||||
|
||||
Generic utilities for detecting the scope of card abilities (protection, phasing, etc.).
|
||||
Provides reusable pattern-matching logic to avoid duplication across modules.
|
||||
|
||||
Created as part of M2: Create Scope Detection Utilities milestone.
|
||||
"""
|
||||
|
||||
# Standard library imports
|
||||
import re
|
||||
from dataclasses import dataclass
|
||||
from typing import List, Optional, Set
|
||||
|
||||
# Local application imports
|
||||
from . import regex_patterns as rgx
|
||||
from . import tag_utils
|
||||
from code.logging_util import get_logger
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
@dataclass
|
||||
class ScopePatterns:
|
||||
"""
|
||||
Pattern collections for scope detection.
|
||||
|
||||
Attributes:
|
||||
opponent: Patterns that indicate opponent ownership
|
||||
self_ref: Patterns that indicate self-reference
|
||||
your_permanents: Patterns that indicate "you control"
|
||||
blanket: Patterns that indicate no ownership qualifier
|
||||
targeted: Patterns that indicate targeting (optional)
|
||||
"""
|
||||
opponent: List[re.Pattern]
|
||||
self_ref: List[re.Pattern]
|
||||
your_permanents: List[re.Pattern]
|
||||
blanket: List[re.Pattern]
|
||||
targeted: Optional[List[re.Pattern]] = None
|
||||
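A caller builds one `ScopePatterns` bundle per ability and hands it to the detection functions. The sketch below restates the dataclass locally for self-containment; the specific regexes are illustrative, not the repo's actual hexproof pattern set:

```python
import re
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ScopePatterns:  # mirrors the dataclass above
    opponent: List["re.Pattern"]
    self_ref: List["re.Pattern"]
    your_permanents: List["re.Pattern"]
    blanket: List["re.Pattern"]
    targeted: Optional[List["re.Pattern"]] = None

# Illustrative bundle for hexproof
hexproof_patterns = ScopePatterns(
    opponent=[re.compile(r'opponents? control\b.*hexproof', re.IGNORECASE)],
    self_ref=[re.compile(r'\bthis creature has hexproof', re.IGNORECASE)],
    your_permanents=[re.compile(r'(?:creatures?|permanents?) you control (?:have|gains?) hexproof', re.IGNORECASE)],
    blanket=[re.compile(r'\ball creatures? (?:have|gain) hexproof', re.IGNORECASE)],
)

print(bool(hexproof_patterns.your_permanents[0].search("Creatures you control have hexproof.")))  # True
```

Keeping the patterns in a dataclass (rather than loose function arguments) lets each ability module define its bundle once and reuse it for both `detect_scope` and `detect_multi_scope`.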

def detect_scope(
    text: str,
    card_name: str,
    ability_keyword: str,
    patterns: ScopePatterns,
    allow_multiple: bool = False,
    check_grant_verbs: bool = False,
    keywords: Optional[str] = None,
) -> Optional[str]:
    """
    Generic scope detection with priority ordering.

    Detection priority (prevents misclassification):
    0. Static keyword (in keywords field or simple list) → "Self"
    1. Opponent ownership → "Opponent Permanents"
    2. Self-reference → "Self"
    3. Your ownership → "Your Permanents"
    4. No ownership qualifier → "Blanket"

    Args:
        text: Card text
        card_name: Card name (for self-reference detection)
        ability_keyword: Ability keyword to look for (e.g., "hexproof", "phasing")
        patterns: ScopePatterns object with pattern collections
        allow_multiple: If True, returns Set[str] instead of a single scope
        check_grant_verbs: If True, checks for grant verbs before assuming "Self"
        keywords: Optional keywords field from card data (for static keyword detection)

    Returns:
        Scope string or None: "Self", "Your Permanents", "Blanket", "Opponent Permanents".
        If allow_multiple=True, returns Set[str] with all matching scopes.
    """
    if not text or not ability_keyword:
        return set() if allow_multiple else None

    text_lower = text.lower()
    ability_lower = ability_keyword.lower()
    card_name_lower = card_name.lower() if card_name else ''

    # Check if ability is mentioned in text
    if ability_lower not in text_lower:
        return set() if allow_multiple else None

    # Priority 0: Check if this is a static keyword ability
    # Static keywords appear in the keywords field or as simple comma-separated lists
    # without grant verbs (e.g., "Flying, first strike, protection from black")
    if check_static_keyword(ability_keyword, keywords, text):
        return {"Self"} if allow_multiple else "Self"

    scopes = set() if allow_multiple else None

    # Priority 1: Opponent ownership
    for pattern in patterns.opponent:
        if pattern.search(text_lower):
            if allow_multiple:
                scopes.add("Opponent Permanents")
                break
            else:
                return "Opponent Permanents"

    # Priority 2: Self-reference
    is_self = _check_self_reference(text_lower, card_name_lower, ability_lower, patterns.self_ref)

    # If check_grant_verbs is True, verify we don't have grant patterns before assuming Self
    if is_self and check_grant_verbs:
        if not _has_grant_verbs(text_lower):
            if allow_multiple:
                scopes.add("Self")
            else:
                return "Self"
    elif is_self:
        if allow_multiple:
            scopes.add("Self")
        else:
            return "Self"

    # Priority 3: Your ownership
    for pattern in patterns.your_permanents:
        if pattern.search(text_lower):
            if allow_multiple:
                scopes.add("Your Permanents")
                break
            else:
                return "Your Permanents"

    # Priority 4: Blanket (no ownership qualifier)
    for pattern in patterns.blanket:
        if pattern.search(text_lower):
            # Double-check no ownership was missed
            if not rgx.YOU_CONTROL.search(text_lower) and 'opponent' not in text_lower:
                if allow_multiple:
                    scopes.add("Blanket")
                    break
                else:
                    return "Blanket"

    return scopes if allow_multiple else None
||||
|
||||
|
||||
def detect_multi_scope(
|
||||
text: str,
|
||||
card_name: str,
|
||||
ability_keyword: str,
|
||||
patterns: ScopePatterns,
|
||||
check_grant_verbs: bool = False,
|
||||
keywords: Optional[str] = None,
|
||||
) -> Set[str]:
|
||||
"""
|
||||
Detect multiple scopes for cards with multiple effects.
|
||||
|
||||
Some cards grant abilities to multiple scopes:
|
||||
- Self-hexproof + grants ward to others
|
||||
- Target phasing + your permanents phasing
|
||||
|
||||
Args:
|
||||
text: Card text
|
||||
card_name: Card name
|
||||
ability_keyword: Ability keyword to look for
|
||||
patterns: ScopePatterns object
|
||||
        check_grant_verbs: If True, checks for grant verbs before assuming "Self"
        keywords: Optional keywords field for static keyword detection

    Returns:
        Set of scope strings
    """
    scopes = set()

    if not text or not ability_keyword:
        return scopes

    text_lower = text.lower()
    ability_lower = ability_keyword.lower()
    card_name_lower = card_name.lower() if card_name else ''

    # Check for static keyword first
    if check_static_keyword(ability_keyword, keywords, text):
        scopes.add("Self")
        # For static keywords, we usually don't have multiple scopes
        # But continue checking in case there are additional effects

    # Check if ability is mentioned
    if ability_lower not in text_lower:
        return scopes

    # Check opponent patterns
    if any(pattern.search(text_lower) for pattern in patterns.opponent):
        scopes.add("Opponent Permanents")

    # Check self-reference
    is_self = _check_self_reference(text_lower, card_name_lower, ability_lower, patterns.self_ref)

    if is_self:
        if check_grant_verbs:
            has_grant_pattern = _has_grant_verbs(text_lower)
            if not has_grant_pattern:
                scopes.add("Self")
        else:
            scopes.add("Self")

    # Check your permanents
    if any(pattern.search(text_lower) for pattern in patterns.your_permanents):
        scopes.add("Your Permanents")

    # Check blanket (no ownership)
    has_blanket = any(pattern.search(text_lower) for pattern in patterns.blanket)
    no_ownership = not rgx.YOU_CONTROL.search(text_lower) and 'opponent' not in text_lower

    if has_blanket and no_ownership:
        scopes.add("Blanket")

    # Optional: Check for targeting
    if patterns.targeted:
        if any(pattern.search(text_lower) for pattern in patterns.targeted):
            scopes.add("Targeted")

    return scopes


def _check_self_reference(
    text_lower: str,
    card_name_lower: str,
    ability_lower: str,
    self_patterns: List[re.Pattern]
) -> bool:
    """
    Check if text contains self-reference patterns.

    Args:
        text_lower: Lowercase card text
        card_name_lower: Lowercase card name
        ability_lower: Lowercase ability keyword
        self_patterns: List of self-reference patterns

    Returns:
        True if self-reference found
    """
    # Check provided self patterns
    for pattern in self_patterns:
        if pattern.search(text_lower):
            return True

    # Check for card name reference (if provided)
    if card_name_lower:
        card_name_escaped = re.escape(card_name_lower)
        card_name_pattern = re.compile(rf'\b{card_name_escaped}\b', re.IGNORECASE)

        if card_name_pattern.search(text_lower):
            # Make sure it's in a self-ability context
            self_context_patterns = [
                re.compile(rf'\b{card_name_escaped}\s+(?:has|gains?)\s+{ability_lower}', re.IGNORECASE),
                re.compile(rf'\b{card_name_escaped}\s+is\s+{ability_lower}', re.IGNORECASE),
            ]

            for pattern in self_context_patterns:
                if pattern.search(text_lower):
                    return True

    return False


def _has_grant_verbs(text_lower: str) -> bool:
    """
    Check if text contains grant verb patterns.

    Used to distinguish inherent abilities from granted abilities.

    Args:
        text_lower: Lowercase card text

    Returns:
        True if grant verbs found
    """
    grant_patterns = [
        re.compile(r'(?:have|gain|grant|give|get)[s]?\s+', re.IGNORECASE),
        rgx.OTHER_CREATURES,
        rgx.CREATURE_YOU_CONTROL,
        rgx.PERMANENT_YOU_CONTROL,
        rgx.EQUIPPED_CREATURE,
        rgx.ENCHANTED_CREATURE,
        rgx.TARGET_CREATURE,
    ]

    return any(pattern.search(text_lower) for pattern in grant_patterns)
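The grant-verb check above can be exercised in isolation. A minimal self-contained sketch, using stand-in patterns in place of the module's precompiled `rgx.*` regexes (the stand-in names are illustrative, not the real ones):

```python
import re

# Stand-ins for the module's precompiled rgx.* patterns (assumed, simplified).
GRANT_VERB = re.compile(r'(?:have|gain|grant|give|get)[s]?\s+', re.IGNORECASE)
OTHER_CREATURES = re.compile(r'other creatures', re.IGNORECASE)

def has_grant_verbs(text_lower: str) -> bool:
    # True when the text grants an ability to something, rather than having it inherently
    return any(p.search(text_lower) for p in (GRANT_VERB, OTHER_CREATURES))

print(has_grant_verbs("other creatures you control gain hexproof"))  # True
print(has_grant_verbs("hexproof"))                                   # False
```

A bare keyword line like "hexproof" has no grant verb, so it is treated as inherent rather than granted.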


def format_scope_tag(scope: str, ability: str) -> str:
    """
    Format a scope and ability into a metadata tag.

    Args:
        scope: Scope string (e.g., "Self", "Your Permanents")
        ability: Ability name (e.g., "Hexproof", "Phasing")

    Returns:
        Formatted tag string (e.g., "Self: Hexproof")
    """
    return f"{scope}: {ability}"
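For example, a trivial re-implementation for illustration:

```python
def format_scope_tag(scope: str, ability: str) -> str:
    # Combine scope and ability into a "Scope: Ability" metadata tag
    return f"{scope}: {ability}"

print(format_scope_tag("Your Permanents", "Indestructible"))  # Your Permanents: Indestructible
```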


def has_keyword(text: str, keywords: List[str]) -> bool:
    """
    Quick check if card text contains any of the specified keywords.

    Args:
        text: Card text
        keywords: List of keywords to search for

    Returns:
        True if any keyword found
    """
    if not text:
        return False

    text_lower = text.lower()
    return any(keyword.lower() in text_lower for keyword in keywords)
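A minimal re-implementation showing the substring semantics; note that plain `in` matching means `'Life'` also matches `'Lifelink'`, so callers should pass full keyword names:

```python
def has_keyword(text, keywords):
    # Case-insensitive substring search over the card text
    if not text:
        return False
    text_lower = text.lower()
    return any(keyword.lower() in text_lower for keyword in keywords)

print(has_keyword("Flying, vigilance", ["Vigilance"]))  # True
print(has_keyword("", ["Flying"]))                      # False
print(has_keyword("Lifelink", ["Life"]))                # True (substring caveat)
```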


def check_static_keyword(
    ability_keyword: str,
    keywords: Optional[str] = None,
    text: Optional[str] = None
) -> bool:
    """
    Check if card has ability as a static keyword (not granted to others).

    A static keyword is one that appears:
    1. In the keywords field, OR
    2. As a simple comma-separated list without grant verbs
       (e.g., "Flying, first strike, protection from black")

    Args:
        ability_keyword: Ability to check (e.g., "Protection", "Hexproof")
        keywords: Optional keywords field from card data
        text: Optional card text for fallback detection

    Returns:
        True if ability appears as static keyword
    """
    ability_lower = ability_keyword.lower()

    # Check keywords field first (most reliable)
    if keywords:
        keywords_lower = keywords.lower()
        if ability_lower in keywords_lower:
            return True

    # Fallback: Check if ability appears in simple comma-separated keyword list
    # Pattern: starts with keywords (Flying, First strike, etc.) without grant verbs
    # Example: "Flying, first strike, vigilance, trample, haste, protection from black"
    if text:
        text_lower = text.lower()

        # Check if ability appears in text but WITHOUT grant verbs
        if ability_lower in text_lower:
            # Look for grant verbs that would indicate this is NOT a static keyword
            grant_verbs = ['have', 'has', 'gain', 'gains', 'get', 'gets', 'grant', 'grants', 'give', 'gives']

            # Find the position of the ability in text
            ability_pos = text_lower.find(ability_lower)

            # Check the 50 characters before the ability for grant verbs
            # This catches patterns like "creatures gain protection" or "has hexproof"
            context_before = text_lower[max(0, ability_pos - 50):ability_pos]

            # If no grant verbs found nearby, it's likely a static keyword
            if not any(verb in context_before for verb in grant_verbs):
                # Additional check: is it part of a comma-separated list?
                # This helps with "Flying, first strike, protection from X" patterns
                context_before_30 = text_lower[max(0, ability_pos - 30):ability_pos]
                if ',' in context_before_30 or ability_pos < 10:
                    return True

    return False
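The 50-character context window is what separates static keywords from granted ones. A condensed, self-contained sketch of that heuristic (text-only path, without the keywords-field fast path):

```python
def is_static_keyword(ability: str, text: str) -> bool:
    """Heuristic: ability is static if no grant verb appears just before it."""
    grant_verbs = ['have', 'has', 'gain', 'gains', 'get', 'gets',
                   'grant', 'grants', 'give', 'gives']
    text_lower, ability_lower = text.lower(), ability.lower()
    pos = text_lower.find(ability_lower)
    if pos < 0:
        return False
    # Grant verb within 50 chars before the ability => granted, not static
    if any(v in text_lower[max(0, pos - 50):pos] for v in grant_verbs):
        return False
    # Accept leading keywords or comma-separated keyword lists
    return ',' in text_lower[max(0, pos - 30):pos] or pos < 10

print(is_static_keyword("protection", "Flying, protection from black"))                    # True
print(is_static_keyword("protection", "Creatures you control gain protection from red"))  # False
```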


def check_static_keyword_legacy(
    keywords: str,
    static_keyword: str,
    text: str,
    grant_patterns: Optional[List[re.Pattern]] = None
) -> bool:
    """
    LEGACY: Check if card has static keyword without granting it to others.

    Used for abilities like "Phasing" that can be both static and granted.

    Args:
        keywords: Card keywords field
        static_keyword: Keyword to search for (e.g., "phasing")
        text: Card text
        grant_patterns: Optional patterns to check for granting language

    Returns:
        True if static keyword found and not granted to others
    """
    if not keywords:
        return False

    keywords_lower = keywords.lower()

    if static_keyword.lower() not in keywords_lower:
        return False

    # If grant patterns provided, check if card grants to others
    if grant_patterns:
        text_no_reminder = tag_utils.strip_reminder_text(text.lower()) if text else ''
        grants_to_others = any(pattern.search(text_no_reminder) for pattern in grant_patterns)

        # Only return True if NOT granting to others
        return not grants_to_others

    return True

@@ -1,13 +1,59 @@
from typing import Dict, List, Final
"""
Tag Constants Module

Centralized constants for card tagging and theme detection across the MTG deckbuilder.
This module contains all shared constants used by the tagging system including:
- Card types and creature types
- Pattern groups and regex fragments
- Tag groupings and relationships
- Protection and ability keywords
- Magic numbers and thresholds
"""

from typing import Dict, Final, List

# =============================================================================
# TABLE OF CONTENTS
# =============================================================================
# 1. TRIGGERS & BASIC PATTERNS
# 2. TAG GROUPS & RELATIONSHIPS
# 3. PATTERN GROUPS & REGEX FRAGMENTS
# 4. PHRASE GROUPS
# 5. COUNTER TYPES
# 6. CREATURE TYPES
# 7. NON-CREATURE TYPES & SPECIAL TYPES
# 8. PROTECTION & ABILITY KEYWORDS
# 9. TOKEN TYPES
# 10. MAGIC NUMBERS & THRESHOLDS
# 11. DATAFRAME COLUMN REQUIREMENTS
# 12. TYPE-TAG MAPPINGS
# 13. DRAW-RELATED CONSTANTS
# 14. EQUIPMENT-RELATED CONSTANTS
# 15. AURA & VOLTRON CONSTANTS
# 16. LANDS MATTER PATTERNS
# 17. SACRIFICE & GRAVEYARD PATTERNS
# 18. CREATURE-RELATED PATTERNS
# 19. TOKEN-RELATED PATTERNS
# 20. REMOVAL & DESTRUCTION PATTERNS
# 21. SPELL-RELATED PATTERNS
# 22. MISC PATTERNS & EXCLUSIONS

# =============================================================================
# 1. TRIGGERS & BASIC PATTERNS
# =============================================================================

TRIGGERS: List[str] = ['when', 'whenever', 'at']

NUM_TO_SEARCH: List[str] = ['a', 'an', 'one', '1', 'two', '2', 'three', '3', 'four','4', 'five', '5',
                            'six', '6', 'seven', '7', 'eight', '8', 'nine', '9', 'ten', '10',
                            'x','one or more']
NUM_TO_SEARCH: List[str] = [
    'a', 'an', 'one', '1', 'two', '2', 'three', '3', 'four', '4', 'five', '5',
    'six', '6', 'seven', '7', 'eight', '8', 'nine', '9', 'ten', '10',
    'x', 'one or more'
]

# =============================================================================
# 2. TAG GROUPS & RELATIONSHIPS
# =============================================================================

# Constants for common tag groupings
TAG_GROUPS: Dict[str, List[str]] = {
    "Cantrips": ["Cantrips", "Card Draw", "Spellslinger", "Spells Matter"],
    "Tokens": ["Token Creation", "Tokens Matter"],
@@ -19,8 +65,11 @@ TAG_GROUPS: Dict[str, List[str]] = {
    "Spells": ["Spellslinger", "Spells Matter"]
}

# Common regex patterns
PATTERN_GROUPS: Dict[str, str] = {
# =============================================================================
# 3. PATTERN GROUPS & REGEX FRAGMENTS
# =============================================================================

PATTERN_GROUPS: Dict[str, str] = {
    "draw": r"draw[s]? a card|draw[s]? one card",
    "combat": r"attack[s]?|block[s]?|combat damage",
    "tokens": r"create[s]? .* token|put[s]? .* token",
@@ -30,7 +79,10 @@ PATTERN_GROUPS: Dict[str, str] = {
    "cost_reduction": r"cost[s]? \{[\d\w]\} less|affinity for|cost[s]? less to cast|chosen type cost|copy cost|from exile cost|from exile this turn cost|from your graveyard cost|has undaunted|have affinity for artifacts|other than your hand cost|spells cost|spells you cast cost|that target .* cost|those spells cost|you cast cost|you pay cost"
}

# Common phrase groups (lists) used across taggers
# =============================================================================
# 4. PHRASE GROUPS
# =============================================================================

PHRASE_GROUPS: Dict[str, List[str]] = {
    # Variants for monarch wording
    "monarch": [
@@ -52,11 +104,15 @@ PHRASE_GROUPS: Dict[str, List[str]] = {
        r"return .* to the battlefield"
    ]
}
# Common action patterns
CREATE_ACTION_PATTERN: Final[str] = r"create|put"

# Creature/Counter types
COUNTER_TYPES: List[str] = [r'\+0/\+1', r'\+0/\+2', r'\+1/\+0', r'\+1/\+2', r'\+2/\+0', r'\+2/\+2',
# =============================================================================
# 5. COUNTER TYPES
# =============================================================================

COUNTER_TYPES: List[str] = [
    r'\+0/\+1', r'\+0/\+2', r'\+1/\+0', r'\+1/\+2', r'\+2/\+0', r'\+2/\+2',
    '-0/-1', '-0/-2', '-1/-0', '-1/-2', '-2/-0', '-2/-2',
    'Acorn', 'Aegis', 'Age', 'Aim', 'Arrow', 'Arrowhead', 'Awakening',
    'Bait', 'Blaze', 'Blessing', 'Blight', 'Blood', 'Bloodline',
@@ -90,9 +146,15 @@ COUNTER_TYPES: List[str] = [r'\+0/\+1', r'\+0/\+2', r'\+1/\+0', r'\+1/\+2', r'\+
    'Task', 'Ticket', 'Tide', 'Time', 'Tower', 'Training', 'Trap',
    'Treasure', 'Unity', 'Unlock', 'Valor', 'Velocity', 'Verse',
    'Vitality', 'Void', 'Volatile', 'Vortex', 'Vow', 'Voyage', 'Wage',
    'Winch', 'Wind', 'Wish']
    'Winch', 'Wind', 'Wish'
]

CREATURE_TYPES: List[str] = ['Advisor', 'Aetherborn', 'Alien', 'Ally', 'Angel', 'Antelope', 'Ape', 'Archer', 'Archon', 'Armadillo',
# =============================================================================
# 6. CREATURE TYPES
# =============================================================================

CREATURE_TYPES: List[str] = [
    'Advisor', 'Aetherborn', 'Alien', 'Ally', 'Angel', 'Antelope', 'Ape', 'Archer', 'Archon', 'Armadillo',
    'Army', 'Artificer', 'Assassin', 'Assembly-Worker', 'Astartes', 'Atog', 'Aurochs', 'Automaton',
    'Avatar', 'Azra', 'Badger', 'Balloon', 'Barbarian', 'Bard', 'Basilisk', 'Bat', 'Bear', 'Beast', 'Beaver',
    'Beeble', 'Beholder', 'Berserker', 'Bird', 'Blinkmoth', 'Boar', 'Brainiac', 'Bringer', 'Brushwagg',
@@ -122,9 +184,15 @@ CREATURE_TYPES: List[str] = ['Advisor', 'Aetherborn', 'Alien', 'Ally', 'Angel',
    'Thopter', 'Thrull', 'Tiefling', 'Time Lord', 'Toy', 'Treefolk', 'Trilobite', 'Triskelavite', 'Troll',
    'Turtle', 'Tyranid', 'Unicorn', 'Urzan', 'Vampire', 'Varmint', 'Vedalken', 'Volver', 'Wall', 'Walrus',
    'Warlock', 'Warrior', 'Wasp', 'Weasel', 'Weird', 'Werewolf', 'Whale', 'Wizard', 'Wolf', 'Wolverine', 'Wombat',
    'Worm', 'Wraith', 'Wurm', 'Yeti', 'Zombie', 'Zubera']
    'Worm', 'Wraith', 'Wurm', 'Yeti', 'Zombie', 'Zubera'
]

NON_CREATURE_TYPES: List[str] = ['Legendary', 'Creature', 'Enchantment', 'Artifact',
# =============================================================================
# 7. NON-CREATURE TYPES & SPECIAL TYPES
# =============================================================================

NON_CREATURE_TYPES: List[str] = [
    'Legendary', 'Creature', 'Enchantment', 'Artifact',
    'Battle', 'Sorcery', 'Instant', 'Land', '-', '—',
    'Blood', 'Clue', 'Food', 'Gold', 'Incubator',
    'Junk', 'Map', 'Powerstone', 'Treasure',
@@ -136,23 +204,66 @@ NON_CREATURE_TYPES: List[str] = ['Legendary', 'Creature', 'Enchantment', 'Artifa
    'Shrine',
    'Plains', 'Island', 'Swamp', 'Forest', 'Mountain',
    'Cave', 'Desert', 'Gate', 'Lair', 'Locus', 'Mine',
    'Power-Plant', 'Sphere', 'Tower', 'Urza\'s']
    'Power-Plant', 'Sphere', 'Tower', 'Urza\'s'
]

OUTLAW_TYPES: List[str] = ['Assassin', 'Mercenary', 'Pirate', 'Rogue', 'Warlock']

ENCHANTMENT_TOKENS: List[str] = ['Cursed Role', 'Monster Role', 'Royal Role', 'Sorcerer Role',
                                 'Virtuous Role', 'Wicked Role', 'Young Hero Role', 'Shard']
ARTIFACT_TOKENS: List[str] = ['Blood', 'Clue', 'Food', 'Gold', 'Incubator',
                              'Junk','Map','Powerstone', 'Treasure']
# =============================================================================
# 8. PROTECTION & ABILITY KEYWORDS
# =============================================================================

PROTECTION_ABILITIES: List[str] = [
    'Protection',
    'Ward',
    'Hexproof',
    'Shroud',
    'Indestructible'
]

PROTECTION_KEYWORDS: Final[frozenset] = frozenset({
    'hexproof',
    'shroud',
    'indestructible',
    'ward',
    'protection from',
    'protection',
})

# =============================================================================
# 9. TOKEN TYPES
# =============================================================================

ENCHANTMENT_TOKENS: List[str] = [
    'Cursed Role', 'Monster Role', 'Royal Role', 'Sorcerer Role',
    'Virtuous Role', 'Wicked Role', 'Young Hero Role', 'Shard'
]

ARTIFACT_TOKENS: List[str] = [
    'Blood', 'Clue', 'Food', 'Gold', 'Incubator',
    'Junk', 'Map', 'Powerstone', 'Treasure'
]

# =============================================================================
# 10. MAGIC NUMBERS & THRESHOLDS
# =============================================================================

CONTEXT_WINDOW_SIZE: Final[int] = 70  # Characters to examine around a regex match

# =============================================================================
# 11. DATAFRAME COLUMN REQUIREMENTS
# =============================================================================

# Constants for DataFrame validation and processing
REQUIRED_COLUMNS: List[str] = [
    'name', 'faceName', 'edhrecRank', 'colorIdentity', 'colors',
    'manaCost', 'manaValue', 'type', 'creatureTypes', 'text',
    'power', 'toughness', 'keywords', 'themeTags', 'layout', 'side'
]

# Mapping of card types to their corresponding theme tags
# =============================================================================
# 12. TYPE-TAG MAPPINGS
# =============================================================================

TYPE_TAG_MAPPING: Dict[str, List[str]] = {
    'Artifact': ['Artifacts Matter'],
    'Battle': ['Battles Matter'],
@@ -166,7 +277,10 @@ TYPE_TAG_MAPPING: Dict[str, List[str]] = {
    'Sorcery': ['Spells Matter', 'Spellslinger']
}

# Constants for draw-related functionality
# =============================================================================
# 13. DRAW-RELATED CONSTANTS
# =============================================================================

DRAW_RELATED_TAGS: List[str] = [
    'Card Draw',          # General card draw effects
    'Conditional Draw',   # Draw effects with conditions/triggers
@@ -175,16 +289,18 @@ DRAW_RELATED_TAGS: List[str] = [
    'Loot',               # Draw + discard effects
    'Replacement Draw',   # Effects that modify or replace draws
    'Sacrifice to Draw',  # Draw effects requiring sacrificing permanents
    'Unconditional Draw'  # Pure card draw without conditions
]

# Text patterns that exclude cards from being tagged as unconditional draw
DRAW_EXCLUSION_PATTERNS: List[str] = [
    'annihilator',  # Eldrazi mechanic that can match 'draw' patterns
    'ravenous',     # Keyword that can match 'draw' patterns
]

# Equipment-related constants
# =============================================================================
# 14. EQUIPMENT-RELATED CONSTANTS
# =============================================================================

EQUIPMENT_EXCLUSIONS: List[str] = [
    'Bruenor Battlehammer',       # Equipment cost reduction
    'Nazahn, Revered Bladesmith', # Equipment tutor
@@ -223,7 +339,10 @@ EQUIPMENT_TEXT_PATTERNS: List[str] = [
    'unequip',  # Equipment removal
]

# Aura-related constants
# =============================================================================
# 15. AURA & VOLTRON CONSTANTS
# =============================================================================

AURA_SPECIFIC_CARDS: List[str] = [
    'Ardenn, Intrepid Archaeologist',  # Aura movement
    'Calix, Guided By Fate',           # Create duplicate Auras
@@ -267,7 +386,10 @@ VOLTRON_PATTERNS: List[str] = [
    'reconfigure'
]

# Constants for lands matter functionality
# =============================================================================
# 16. LANDS MATTER PATTERNS
# =============================================================================

LANDS_MATTER_PATTERNS: Dict[str, List[str]] = {
    'land_play': [
        'play a land',
@@ -849,4 +971,110 @@ TOPDECK_EXCLUSION_PATTERNS: List[str] = [
    'from the top of their library',
    'look at the top card of target player\'s library',
    'reveal the top card of target player\'s library'
]

# ==============================================================================
# Keyword Normalization (M1 - Tagging Refinement)
# ==============================================================================

# Keyword normalization map: variant -> canonical
# Maps Commander-specific and variant keywords to their canonical forms
KEYWORD_NORMALIZATION_MAP: Dict[str, str] = {
    # Commander variants
    'Commander ninjutsu': 'Ninjutsu',
    'Commander Ninjutsu': 'Ninjutsu',

    # Partner variants (already excluded but mapped for reference)
    'Partner with': 'Partner',
    'Choose a Background': 'Choose a Background',  # Keep distinct
    "Doctor's Companion": "Doctor's Companion",    # Keep distinct

    # Case normalization for common keywords (most are already correct)
    'flying': 'Flying',
    'trample': 'Trample',
    'vigilance': 'Vigilance',
    'haste': 'Haste',
    'deathtouch': 'Deathtouch',
    'lifelink': 'Lifelink',
    'menace': 'Menace',
    'reach': 'Reach',
}

# Keywords that should never appear in theme tags
# Already excluded during keyword tagging, but documented here
KEYWORD_EXCLUSION_SET: set[str] = {
    'partner',  # Already excluded in tag_for_keywords
}

# Keyword allowlist - keywords that should survive singleton pruning
# Seeded from top keywords and theme whitelist
KEYWORD_ALLOWLIST: set[str] = {
    # Evergreen keywords (top 50 from baseline)
    'Flying', 'Enchant', 'Trample', 'Vigilance', 'Haste', 'Equip', 'Flash',
    'Mill', 'Scry', 'Transform', 'Cycling', 'First strike', 'Reach', 'Menace',
    'Lifelink', 'Treasure', 'Defender', 'Deathtouch', 'Kicker', 'Flashback',
    'Protection', 'Surveil', 'Landfall', 'Crew', 'Ward', 'Morph', 'Devoid',
    'Investigate', 'Fight', 'Food', 'Partner', 'Double strike', 'Indestructible',
    'Threshold', 'Proliferate', 'Convoke', 'Hexproof', 'Cumulative upkeep',
    'Goad', 'Delirium', 'Prowess', 'Suspend', 'Affinity', 'Madness', 'Manifest',
    'Amass', 'Domain', 'Unearth', 'Explore', 'Changeling',

    # Additional important mechanics
    'Myriad', 'Cascade', 'Storm', 'Dredge', 'Delve', 'Escape', 'Mutate',
    'Ninjutsu', 'Overload', 'Rebound', 'Retrace', 'Bloodrush', 'Cipher',
    'Extort', 'Evolve', 'Undying', 'Persist', 'Wither', 'Infect', 'Annihilator',
    'Exalted', 'Phasing', 'Shadow', 'Horsemanship', 'Banding', 'Rampage',
    'Shroud', 'Split second', 'Totem armor', 'Living weapon', 'Undaunted',
    'Improvise', 'Surge', 'Emerge', 'Escalate', 'Meld', 'Afflict',
    'Aftermath', 'Embalm', 'Eternalize', 'Exert', 'Fabricate',
    'Assist', 'Jump-start', 'Mentor', 'Riot', 'Spectacle', 'Addendum',
    'Afterlife', 'Adapt', 'Enrage', 'Ascend', 'Learn', 'Boast', 'Foretell',
    'Squad', 'Encore', 'Daybound', 'Nightbound', 'Disturb', 'Cleave', 'Training',
    'Reconfigure', 'Blitz', 'Casualty', 'Connive', 'Hideaway', 'Prototype',
    'Read ahead', 'Living metal', 'More than meets the eye', 'Ravenous',
    'Toxic', 'For Mirrodin!', 'Backup', 'Bargain', 'Craft', 'Freerunning',
    'Plot', 'Spree', 'Offspring', 'Bestow', 'Monstrosity', 'Tribute',

    # Partner mechanics (distinct types)
    'Choose a Background', "Doctor's Companion",

    # Token types (frequently used)
    'Blood', 'Clue', 'Food', 'Gold', 'Treasure', 'Powerstone',

    # Common ability words
    'Landfall', 'Raid', 'Revolt', 'Threshold', 'Metalcraft', 'Morbid',
    'Bloodthirst', 'Battalion', 'Channel', 'Grandeur', 'Kinship', 'Sweep',
    'Radiance', 'Join forces', 'Fateful hour', 'Inspired', 'Heroic',
    'Constellation', 'Strive', 'Prowess', 'Ferocious', 'Formidable', 'Renown',
    'Tempting offer', 'Will of the council', 'Parley', 'Adamant', 'Devotion',
}

# ==============================================================================
# Metadata Tag Classification (M3 - Tagging Refinement)
# ==============================================================================

# Metadata tag prefixes - tags starting with these are classified as metadata
METADATA_TAG_PREFIXES: List[str] = [
    'Applied:',
    'Bracket:',
    'Diagnostic:',
    'Internal:',
]

# Specific metadata tags (full match) - additional tags to classify as metadata
# These are typically diagnostic, bracket-related, or internal annotations
METADATA_TAG_ALLOWLIST: set[str] = {
    # Bracket annotations
    'Bracket: Game Changer',
    'Bracket: Staple',
    'Bracket: Format Warping',

    # Cost reduction diagnostics (from Applied: namespace)
    'Applied: Cost Reduction',

    # Kindred-specific protection metadata (from M2)
    # Format: "{CreatureType}s Gain Protection"
    # These are auto-generated for kindred-specific protection grants
    # Example: "Knights Gain Protection", "Frogs Gain Protection"
    # Note: These are dynamically generated, so we match via prefix in classify_tag
}
@@ -13,18 +13,11 @@ The module is designed to work with pandas DataFrames containing card data and p
vectorized operations for efficient processing of large card collections.
"""
from __future__ import annotations

# Standard library imports
import re
from typing import List, Set, Union, Any, Tuple
from functools import lru_cache

from typing import Any, List, Set, Tuple, Union
import numpy as np

# Third-party imports
import pandas as pd

# Local application imports
from . import tag_constants


@@ -58,7 +51,6 @@ def _ensure_norm_series(df: pd.DataFrame, source_col: str, norm_col: str) -> pd.
    """
    if norm_col in df.columns:
        return df[norm_col]
    # Create normalized string series
    series = df[source_col].fillna('') if source_col in df.columns else pd.Series([''] * len(df), index=df.index)
    series = series.astype(str)
    df[norm_col] = series
@@ -120,8 +112,6 @@ def create_type_mask(df: pd.DataFrame, type_text: Union[str, List[str]], regex:

    if len(df) == 0:
        return pd.Series([], dtype=bool)

    # Use normalized cached series
    type_series = _ensure_norm_series(df, 'type', '__type_s')

    if regex:
@@ -160,8 +150,6 @@ def create_text_mask(df: pd.DataFrame, type_text: Union[str, List[str]], regex:

    if len(df) == 0:
        return pd.Series([], dtype=bool)

    # Use normalized cached series
    text_series = _ensure_norm_series(df, 'text', '__text_s')

    if regex:
@@ -192,10 +180,7 @@ def create_keyword_mask(df: pd.DataFrame, type_text: Union[str, List[str]], rege
        TypeError: If type_text is not a string or list of strings
        ValueError: If required 'keywords' column is missing from DataFrame
    """
    # Validate required columns
    validate_dataframe_columns(df, {'keywords'})

    # Handle empty DataFrame case
    if len(df) == 0:
        return pd.Series([], dtype=bool)

@@ -206,8 +191,6 @@ def create_keyword_mask(df: pd.DataFrame, type_text: Union[str, List[str]], rege
        type_text = [type_text]
    elif not isinstance(type_text, list):
        raise TypeError("type_text must be a string or list of strings")

    # Use normalized cached series for keywords
    keywords = _ensure_norm_series(df, 'keywords', '__keywords_s')

    if regex:
@@ -245,8 +228,6 @@ def create_name_mask(df: pd.DataFrame, type_text: Union[str, List[str]], regex:

    if len(df) == 0:
        return pd.Series([], dtype=bool)

    # Use normalized cached series
    name_series = _ensure_norm_series(df, 'name', '__name_s')

    if regex:
@@ -324,21 +305,14 @@ def create_tag_mask(df: pd.DataFrame, tag_patterns: Union[str, List[str]], colum
        Boolean Series indicating matching rows

    Examples:
        # Match cards with draw-related tags
        >>> mask = create_tag_mask(df, ['Card Draw', 'Conditional Draw'])
        >>> mask = create_tag_mask(df, 'Unconditional Draw')
    """
    if isinstance(tag_patterns, str):
        tag_patterns = [tag_patterns]

    # Handle empty DataFrame case
    if len(df) == 0:
        return pd.Series([], dtype=bool)

    # Create mask for each pattern
    masks = [df[column].apply(lambda x: any(pattern in tag for tag in x)) for pattern in tag_patterns]

    # Combine masks with OR
    return pd.concat(masks, axis=1).any(axis=1)
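A self-contained sketch of this per-pattern mask-and-OR approach (a simplified re-implementation with toy data, not the module itself):

```python
import pandas as pd

def create_tag_mask(df, tag_patterns, column='themeTags'):
    # One boolean Series per pattern: True when any tag in the row contains it
    if isinstance(tag_patterns, str):
        tag_patterns = [tag_patterns]
    if len(df) == 0:
        return pd.Series([], dtype=bool)
    masks = [df[column].apply(lambda tags: any(p in t for t in tags)) for p in tag_patterns]
    # OR the per-pattern masks together column-wise
    return pd.concat(masks, axis=1).any(axis=1)

df = pd.DataFrame({'themeTags': [['Card Draw'], ['Tokens Matter'], ['Conditional Draw', 'Loot']]})
print(create_tag_mask(df, ['Card Draw', 'Conditional Draw']).tolist())  # [True, False, True]
```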

def validate_dataframe_columns(df: pd.DataFrame, required_columns: Set[str]) -> None:
@@ -365,11 +339,7 @@ def apply_tag_vectorized(df: pd.DataFrame, mask: pd.Series[bool], tags: Union[st
    """
    if not isinstance(tags, list):
        tags = [tags]

    # Get current tags for masked rows
    current_tags = df.loc[mask, 'themeTags']

    # Add new tags
    df.loc[mask, 'themeTags'] = current_tags.apply(lambda x: sorted(list(set(x + tags))))

def apply_rules(df: pd.DataFrame, rules: List[dict]) -> None:
@@ -463,7 +433,6 @@ def create_numbered_phrase_mask(
        numbers = tag_constants.NUM_TO_SEARCH
    # Normalize verbs to list
    verbs = [verb] if isinstance(verb, str) else verb
    # Build patterns
    if noun:
        patterns = [fr"{v}\s+{num}\s+{noun}" for v in verbs for num in numbers]
    else:
@@ -490,13 +459,8 @@ def create_mass_damage_mask(df: pd.DataFrame) -> pd.Series[bool]:
    Returns:
        Boolean Series indicating which cards have mass damage effects
    """
    # Create patterns for numeric damage
    number_patterns = [create_damage_pattern(i) for i in range(1, 21)]

    # Add X damage pattern
    number_patterns.append(create_damage_pattern('X'))

    # Add patterns for damage targets
    target_patterns = [
        'to each creature',
        'to all creatures',
@@ -504,9 +468,385 @@ def create_mass_damage_mask(df: pd.DataFrame) -> pd.Series[bool]:
        'to each opponent',
        'to everything'
    ]

    # Create masks
    damage_mask = create_text_mask(df, number_patterns)
    target_mask = create_text_mask(df, target_patterns)

    return damage_mask & target_mask


# ==============================================================================
# Keyword Normalization (M1 - Tagging Refinement)
# ==============================================================================

def normalize_keywords(
    raw: Union[List[str], Set[str], Tuple[str, ...]],
    allowlist: Set[str],
    frequency_map: dict[str, int]
) -> list[str]:
    """Normalize keyword strings for theme tagging.

    Applies normalization rules:
    1. Case normalization (via normalization map)
    2. Canonical mapping (e.g., "Commander Ninjutsu" -> "Ninjutsu")
    3. Singleton pruning (unless allowlisted)
    4. Deduplication
    5. Exclusion of blacklisted keywords

    Args:
        raw: Iterable of raw keyword strings
        allowlist: Set of keywords that should survive singleton pruning
        frequency_map: Dict mapping keywords to their occurrence count

    Returns:
        Sorted, deduplicated list of normalized keywords

    Raises:
        ValueError: If raw is not a non-string iterable

    Examples:
        >>> normalize_keywords(
        ...     ['Commander Ninjutsu', 'Flying', 'Allons-y!'],
        ...     {'Flying', 'Ninjutsu'},
        ...     {'Commander Ninjutsu': 2, 'Flying': 100, 'Allons-y!': 1}
        ... )
        ['Flying', 'Ninjutsu']  # 'Allons-y!' pruned as singleton
    """
    if not hasattr(raw, '__iter__') or isinstance(raw, (str, bytes)):
        raise ValueError(f"raw must be iterable, got {type(raw)}")

    normalized_keywords: set[str] = set()

    for keyword in raw:
        if not isinstance(keyword, str):
            continue
        keyword = keyword.strip()
        if not keyword:
            continue
        if keyword.lower() in tag_constants.KEYWORD_EXCLUSION_SET:
            continue
        normalized = tag_constants.KEYWORD_NORMALIZATION_MAP.get(keyword, keyword)
        frequency = frequency_map.get(keyword, 0)
        is_singleton = frequency == 1
        is_allowlisted = normalized in allowlist or keyword in allowlist

        # Prune singletons that aren't allowlisted
        if is_singleton and not is_allowlisted:
            continue

        normalized_keywords.add(normalized)

    return sorted(list(normalized_keywords))
|
||||
|
||||
|
||||
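A minimal standalone sketch of this pipeline behaves like so; the toy normalization map and exclusion set below are illustrative stand-ins for the real `tag_constants` data, not the project's actual tables:

```python
# Toy stand-ins for tag_constants.KEYWORD_NORMALIZATION_MAP / KEYWORD_EXCLUSION_SET.
NORMALIZATION_MAP = {'Commander Ninjutsu': 'Ninjutsu'}
EXCLUSION_SET = {'partner'}

def normalize_keywords(raw, allowlist, frequency_map):
    out = set()
    for kw in raw:
        if not isinstance(kw, str) or not kw.strip():
            continue
        kw = kw.strip()
        if kw.lower() in EXCLUSION_SET:
            continue  # blacklisted keywords never survive
        normalized = NORMALIZATION_MAP.get(kw, kw)
        # Singletons are pruned unless either spelling is allowlisted
        if frequency_map.get(kw, 0) == 1 and normalized not in allowlist and kw not in allowlist:
            continue
        out.add(normalized)
    return sorted(out)

result = normalize_keywords(
    ['Commander Ninjutsu', 'Flying', 'Allons-y!', 'Partner'],
    {'Flying', 'Ninjutsu'},
    {'Commander Ninjutsu': 2, 'Flying': 100, 'Allons-y!': 1, 'Partner': 50},
)
print(result)  # ['Flying', 'Ninjutsu']
```

Note how 'Allons-y!' drops out as a non-allowlisted singleton while 'Commander Ninjutsu' survives as canonical 'Ninjutsu'.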
# ==============================================================================
# M3: Metadata vs Theme Tag Classification
# ==============================================================================

def classify_tag(tag: str) -> str:
    """Classify a tag as either 'metadata' or 'theme'.

    Metadata tags are diagnostic, bracket-related, or internal annotations that
    should not appear in theme catalogs or player-facing tag lists. Theme tags
    represent gameplay mechanics and deck archetypes.

    Classification rules (in order of precedence):
    1. Prefix match: Tags starting with METADATA_TAG_PREFIXES → metadata
    2. Exact match: Tags in METADATA_TAG_ALLOWLIST → metadata
    3. Kindred pattern: "{Type}s Gain Protection" → metadata
    4. Default: All other tags → theme

    Args:
        tag: Tag string to classify

    Returns:
        "metadata" or "theme"

    Examples:
        >>> classify_tag("Applied: Cost Reduction")
        'metadata'
        >>> classify_tag("Bracket: Game Changer")
        'metadata'
        >>> classify_tag("Knights Gain Protection")
        'metadata'
        >>> classify_tag("Card Draw")
        'theme'
        >>> classify_tag("Spellslinger")
        'theme'
    """
    # Prefix-based classification
    for prefix in tag_constants.METADATA_TAG_PREFIXES:
        if tag.startswith(prefix):
            return "metadata"

    # Exact match classification
    if tag in tag_constants.METADATA_TAG_ALLOWLIST:
        return "metadata"

    # Kindred protection metadata patterns: "{Type} Gain {Ability}"
    # Covers all protective abilities: Protection, Ward, Hexproof, Shroud, Indestructible
    # Examples: "Knights Gain Protection", "Spiders Gain Ward", "Merfolk Gain Ward"
    # Note: Checks for " Gain " pattern since some creature types like "Merfolk" don't end in 's'
    kindred_abilities = ["Protection", "Ward", "Hexproof", "Shroud", "Indestructible"]
    for ability in kindred_abilities:
        if " Gain " in tag and tag.endswith(ability):
            return "metadata"

    # Protection scope metadata patterns (M5): "{Scope}: {Ability}"
    # Indicates whether protection applies to self, your permanents, all permanents, or opponent's permanents
    # Examples: "Self: Hexproof", "Your Permanents: Ward", "Blanket: Indestructible"
    # These enable the deck builder to filter for board-relevant protection vs self-only
    protection_scopes = ["Self:", "Your Permanents:", "Blanket:", "Opponent Permanents:"]
    for scope in protection_scopes:
        if tag.startswith(scope):
            return "metadata"

    # Phasing scope metadata patterns: "{Scope}: Phasing"
    # Indicates whether phasing applies to self, your permanents, all permanents, or opponents
    # Examples: "Self: Phasing", "Your Permanents: Phasing", "Blanket: Phasing",
    #           "Targeted: Phasing", "Opponent Permanents: Phasing"
    # Similar to protection scopes, enables filtering for board-relevant phasing
    # "Opponent Permanents: Phasing" also triggers the Removal tag (removal-style phasing)
    if tag in ["Self: Phasing", "Your Permanents: Phasing", "Blanket: Phasing",
               "Targeted: Phasing", "Opponent Permanents: Phasing"]:
        return "metadata"

    # Default: treat as theme tag
    return "theme"

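The precedence rules reduce to a short standalone sketch; the prefix tuple here is an assumed subset of `METADATA_TAG_PREFIXES` (drawn from the tests below), not the project's full list, and the allowlist/scope checks are omitted for brevity:

```python
# Assumed prefix subset; the real METADATA_TAG_PREFIXES lives in tag_constants.
METADATA_TAG_PREFIXES = ('Applied:', 'Bracket:', 'Diagnostic:', 'Internal:')
KINDRED_ABILITIES = ('Protection', 'Ward', 'Hexproof', 'Shroud', 'Indestructible')

def classify_tag(tag: str) -> str:
    # Rule 1: prefix match wins first
    if tag.startswith(METADATA_TAG_PREFIXES):
        return 'metadata'
    # Rule 3: kindred grant pattern, e.g. "Knights Gain Protection", "Merfolk Gain Ward"
    if any(' Gain ' in tag and tag.endswith(a) for a in KINDRED_ABILITIES):
        return 'metadata'
    # Default: gameplay theme
    return 'theme'

print(classify_tag('Applied: Cost Reduction'))  # metadata
print(classify_tag('Knights Gain Protection'))  # metadata
print(classify_tag('Card Draw'))                # theme
```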
# --- Text Processing Helpers (M0.6) ---------------------------------------------------------
def strip_reminder_text(text: str) -> str:
    """Remove reminder text (content in parentheses) from card text.

    Reminder text often contains keywords and patterns that can cause false positives
    in pattern matching. This function strips all parenthetical content to focus on
    the actual game text.

    Args:
        text: Card text possibly containing reminder text in parentheses

    Returns:
        Text with all parenthetical content removed

    Example:
        >>> strip_reminder_text("Hexproof (This creature can't be the target of spells)")
        'Hexproof '
    """
    if not text:
        return text
    return re.sub(r'\([^)]*\)', '', text)

def extract_context_window(text: str, match_start: int, match_end: int,
                           window_size: Union[int, None] = None, include_before: bool = False) -> str:
    """Extract a context window around a regex match for validation.

    When pattern matching finds a potential match, we often need to examine
    the surrounding text to validate the match or check for additional keywords.
    This function extracts a window of text around the match position.

    Args:
        text: Full text to extract context from
        match_start: Start position of the regex match
        match_end: End position of the regex match
        window_size: Number of characters to include after the match.
            If None, uses CONTEXT_WINDOW_SIZE from tag_constants (default: 70).
        include_before: If True, also includes window_size characters before the
            match. If False (default), the window starts at the match.

    Returns:
        Substring of text containing the match plus surrounding context

    Example:
        >>> text = "Creatures you control have hexproof and vigilance"
        >>> match = re.search(r'creatures you control', text, re.IGNORECASE)
        >>> extract_context_window(text, match.start(), match.end(), window_size=19)
        'Creatures you control have hexproof and '
    """
    if not text:
        return text
    if window_size is None:
        from .tag_constants import CONTEXT_WINDOW_SIZE
        window_size = CONTEXT_WINDOW_SIZE

    # Calculate window boundaries
    if include_before:
        context_start = max(0, match_start - window_size)
    else:
        context_start = match_start

    context_end = min(len(text), match_end + window_size)

    return text[context_start:context_end]

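The windowing arithmetic can be checked in isolation; this sketch mirrors the helper above with the `tag_constants` default inlined (70 is the documented default, assumed here):

```python
import re

def extract_context_window(text, match_start, match_end, window_size=70, include_before=False):
    # Window starts at the match unless include_before widens it to the left;
    # it always extends window_size characters past the match, clamped to the text.
    context_start = max(0, match_start - window_size) if include_before else match_start
    context_end = min(len(text), match_end + window_size)
    return text[context_start:context_end]

text = "Creatures you control have hexproof and vigilance"
m = re.search(r'creatures you control', text, re.IGNORECASE)
window = extract_context_window(text, m.start(), m.end(), window_size=19)
print(repr(window))  # 'Creatures you control have hexproof and '
```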
# --- Enhanced Tagging Utilities (M3.5/M3.6) ----------------------------------------------------

def build_combined_mask(
    df: pd.DataFrame,
    text_patterns: Union[str, List[str], None] = None,
    type_patterns: Union[str, List[str], None] = None,
    keyword_patterns: Union[str, List[str], None] = None,
    name_list: Union[List[str], None] = None,
    exclusion_patterns: Union[str, List[str], None] = None,
    combine_with_or: bool = True
) -> pd.Series[bool]:
    """Build a combined boolean mask from multiple pattern types.

    This utility reduces boilerplate when creating complex masks by combining
    text, type, keyword, and name patterns into a single mask. Patterns are
    combined with OR by default, but can be combined with AND.

    Args:
        df: DataFrame to search
        text_patterns: Patterns to match in 'text' column
        type_patterns: Patterns to match in 'type' column
        keyword_patterns: Patterns to match in 'keywords' column
        name_list: List of exact card names to match
        exclusion_patterns: Text patterns to exclude from the final mask
        combine_with_or: If True, combine masks with OR (default).
            If False, combine with AND (requires all conditions)

    Returns:
        Boolean Series combining all specified patterns

    Example:
        >>> # Match cards with flying OR haste, excluding any whose text mentions 'Creature'
        >>> mask = build_combined_mask(
        ...     df,
        ...     keyword_patterns=['Flying', 'Haste'],
        ...     exclusion_patterns='Creature'
        ... )
    """
    if combine_with_or:
        result = pd.Series([False] * len(df), index=df.index)
    else:
        result = pd.Series([True] * len(df), index=df.index)
    masks = []

    if text_patterns is not None:
        masks.append(create_text_mask(df, text_patterns))

    if type_patterns is not None:
        masks.append(create_type_mask(df, type_patterns))

    if keyword_patterns is not None:
        masks.append(create_keyword_mask(df, keyword_patterns))

    if name_list is not None:
        masks.append(create_name_mask(df, name_list))

    if masks:
        if combine_with_or:
            for mask in masks:
                result |= mask
        else:
            for mask in masks:
                result &= mask

    if exclusion_patterns is not None:
        exclusion_mask = create_text_mask(df, exclusion_patterns)
        result &= ~exclusion_mask

    return result

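The OR-then-exclude composition can be sketched with toy data; `create_text_mask` below is a hypothetical simplification of the project's helper (which matches regex patterns against a dedicated column):

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['A', 'B', 'C'],
    'text': ['draw a card', 'target creature gains hexproof', 'destroy target permanent'],
})

def create_text_mask(df, patterns):
    # Stand-in: case-insensitive substring match over the 'text' column.
    if isinstance(patterns, str):
        patterns = [patterns]
    mask = pd.Series(False, index=df.index)
    for p in patterns:
        mask |= df['text'].str.contains(p, case=False, regex=False)
    return mask

# OR-combine two text patterns, then subtract an exclusion pattern
result = create_text_mask(df, ['draw', 'hexproof']) & ~create_text_mask(df, 'creature')
print(df.loc[result, 'name'].tolist())  # ['A']
```

Row B matches 'hexproof' but is removed by the 'creature' exclusion, matching the helper's `result &= ~exclusion_mask` step.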
def tag_with_logging(
    df: pd.DataFrame,
    mask: pd.Series[bool],
    tags: Union[str, List[str]],
    log_message: str,
    color: str = '',
    logger=None
) -> int:
    """Apply tags with standardized logging.

    This utility wraps the common pattern of applying tags and logging the count.
    It provides consistent formatting for log messages across the tagging module.

    Args:
        df: DataFrame to modify
        mask: Boolean mask indicating which rows to tag
        tags: Tag(s) to apply
        log_message: Description of what's being tagged (e.g., "flying creatures")
        color: Color identifier for context (optional)
        logger: Logger instance to use (optional, uses print if None)

    Returns:
        Count of cards tagged

    Example:
        >>> count = tag_with_logging(
        ...     df,
        ...     flying_mask,
        ...     'Flying',
        ...     'creatures with flying ability',
        ...     color='blue',
        ...     logger=logger
        ... )
        # Logs: "Tagged 42 blue creatures with flying ability"
    """
    count = mask.sum()
    if count > 0:
        apply_tag_vectorized(df, mask, tags)
        color_part = f'{color} ' if color else ''
        full_message = f'Tagged {count} {color_part}{log_message}'

        if logger:
            logger.info(full_message)
        else:
            print(full_message)

    return count

def tag_with_rules_and_logging(
    df: pd.DataFrame,
    rules: List[dict],
    summary_message: str,
    color: str = '',
    logger=None
) -> int:
    """Apply multiple tag rules with summarized logging.

    This utility combines apply_rules with logging, providing a summary of
    all cards affected across multiple rules.

    Args:
        df: DataFrame to modify
        rules: List of rule dicts (each with 'mask' and 'tags')
        summary_message: Overall description (e.g., "card draw effects")
        color: Color identifier for context (optional)
        logger: Logger instance to use (optional)

    Returns:
        Total count of unique cards affected by any rule

    Example:
        >>> rules = [
        ...     {'mask': flying_mask, 'tags': ['Flying']},
        ...     {'mask': haste_mask, 'tags': ['Haste', 'Aggro']}
        ... ]
        >>> count = tag_with_rules_and_logging(
        ...     df, rules, 'evasive creatures', color='red', logger=logger
        ... )
    """
    affected = pd.Series([False] * len(df), index=df.index)
    for rule in rules:
        mask = rule.get('mask')
        if callable(mask):
            mask = mask(df)
        if mask is not None and mask.any():
            tags = rule.get('tags', [])
            apply_tag_vectorized(df, mask, tags)
            affected |= mask

    count = affected.sum()
    color_part = f'{color} ' if color else ''
    full_message = f'Tagged {count} {color_part}{summary_message}'

    if logger:
        logger.info(full_message)
    else:
        print(full_message)

    return count

File diff suppressed because it is too large

@@ -4,7 +4,7 @@ from pathlib import Path

 import pytest

-from headless_runner import _resolve_additional_theme_inputs, _parse_theme_list
+from code.headless_runner import resolve_additional_theme_inputs as _resolve_additional_theme_inputs, _parse_theme_list


 def _write_catalog(path: Path) -> None:
182  code/tests/test_keyword_normalization.py  Normal file

@@ -0,0 +1,182 @@
"""Tests for keyword normalization (M1 - Tagging Refinement)."""
from __future__ import annotations

import pytest

from code.tagging import tag_utils, tag_constants


class TestKeywordNormalization:
    """Test suite for normalize_keywords function."""

    def test_canonical_mappings(self):
        """Test that variant keywords map to canonical forms."""
        raw = ['Commander Ninjutsu', 'Flying', 'Trample']
        allowlist = tag_constants.KEYWORD_ALLOWLIST
        frequency_map = {
            'Commander Ninjutsu': 2,
            'Flying': 100,
            'Trample': 50
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Ninjutsu' in result
        assert 'Flying' in result
        assert 'Trample' in result
        assert 'Commander Ninjutsu' not in result

    def test_singleton_pruning(self):
        """Test that singleton keywords are pruned unless allowlisted."""
        raw = ['Allons-y!', 'Flying', 'Take 59 Flights of Stairs']
        allowlist = {'Flying'}  # Only Flying is allowlisted
        frequency_map = {
            'Allons-y!': 1,
            'Flying': 100,
            'Take 59 Flights of Stairs': 1
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Flying' in result
        assert 'Allons-y!' not in result
        assert 'Take 59 Flights of Stairs' not in result

    def test_case_normalization(self):
        """Test that keywords are normalized to proper case."""
        raw = ['flying', 'TRAMPLE', 'vigilance']
        allowlist = {'Flying', 'Trample', 'Vigilance'}
        frequency_map = {
            'flying': 100,
            'TRAMPLE': 50,
            'vigilance': 75
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        # Case normalization happens via the map.
        # If not in the map, original case is preserved.
        assert len(result) == 3

    def test_partner_exclusion(self):
        """Test that partner keywords remain excluded."""
        raw = ['Partner', 'Flying', 'Trample']
        allowlist = {'Flying', 'Trample'}
        frequency_map = {
            'Partner': 50,
            'Flying': 100,
            'Trample': 50
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Flying' in result
        assert 'Trample' in result
        assert 'Partner' not in result  # Excluded
        assert 'partner' not in result

    def test_empty_input(self):
        """Test that empty input returns empty list."""
        result = tag_utils.normalize_keywords([], set(), {})
        assert result == []

    def test_whitespace_handling(self):
        """Test that whitespace is properly stripped."""
        raw = [' Flying ', 'Trample ', ' Vigilance']
        allowlist = {'Flying', 'Trample', 'Vigilance'}
        frequency_map = {
            'Flying': 100,
            'Trample': 50,
            'Vigilance': 75
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Flying' in result
        assert 'Trample' in result
        assert 'Vigilance' in result

    def test_deduplication(self):
        """Test that duplicate keywords are deduplicated."""
        raw = ['Flying', 'Flying', 'Trample', 'Flying']
        allowlist = {'Flying', 'Trample'}
        frequency_map = {
            'Flying': 100,
            'Trample': 50
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert result.count('Flying') == 1
        assert result.count('Trample') == 1

    def test_non_string_entries_skipped(self):
        """Test that non-string entries are safely skipped."""
        raw = ['Flying', None, 123, 'Trample', '']
        allowlist = {'Flying', 'Trample'}
        frequency_map = {
            'Flying': 100,
            'Trample': 50
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Flying' in result
        assert 'Trample' in result
        assert len(result) == 2

    def test_invalid_input_raises_error(self):
        """Test that non-iterable input raises ValueError."""
        with pytest.raises(ValueError, match="raw must be iterable"):
            tag_utils.normalize_keywords("not-a-list", set(), {})

    def test_allowlist_preserves_singletons(self):
        """Test that allowlisted keywords survive even if they're singletons."""
        raw = ['Myriad', 'Flying', 'Cascade']
        allowlist = {'Flying', 'Myriad', 'Cascade'}  # All allowlisted
        frequency_map = {
            'Myriad': 1,   # Singleton
            'Flying': 100,
            'Cascade': 1   # Singleton
        }

        result = tag_utils.normalize_keywords(raw, allowlist, frequency_map)

        assert 'Myriad' in result   # Preserved despite being singleton
        assert 'Flying' in result
        assert 'Cascade' in result  # Preserved despite being singleton


class TestKeywordIntegration:
    """Integration tests for keyword normalization in tagging flow."""

    def test_normalization_preserves_evergreen_keywords(self):
        """Test that common evergreen keywords are always preserved."""
        evergreen = ['Flying', 'Trample', 'Vigilance', 'Haste', 'Deathtouch', 'Lifelink']
        allowlist = tag_constants.KEYWORD_ALLOWLIST
        frequency_map = {kw: 100 for kw in evergreen}  # All common

        result = tag_utils.normalize_keywords(evergreen, allowlist, frequency_map)

        for kw in evergreen:
            assert kw in result

    def test_crossover_keywords_pruned(self):
        """Test that crossover-specific singletons are pruned."""
        crossover_singletons = [
            'Gae Bolg',          # Final Fantasy
            'Psychic Defense',   # Warhammer 40K
            'Allons-y!',         # Doctor Who
            'Flying'             # Evergreen (control)
        ]
        allowlist = {'Flying'}  # Only Flying allowed
        frequency_map = {
            'Gae Bolg': 1,
            'Psychic Defense': 1,
            'Allons-y!': 1,
            'Flying': 100
        }

        result = tag_utils.normalize_keywords(crossover_singletons, allowlist, frequency_map)

        assert result == ['Flying']  # Only evergreen survived
300  code/tests/test_metadata_partition.py  Normal file

@@ -0,0 +1,300 @@
"""Tests for M3 metadata/theme tag partition functionality.

Tests cover:
- Tag classification (metadata vs theme)
- Column creation and data migration
- Feature flag behavior
- Compatibility with missing columns
- CSV read/write with new schema
"""
import pandas as pd
import pytest

from code.tagging import tag_utils
from code.tagging.tagger import _apply_metadata_partition


class TestTagClassification:
    """Tests for classify_tag function."""

    def test_prefix_based_metadata(self):
        """Metadata tags identified by prefix."""
        assert tag_utils.classify_tag("Applied: Cost Reduction") == "metadata"
        assert tag_utils.classify_tag("Bracket: Game Changer") == "metadata"
        assert tag_utils.classify_tag("Diagnostic: Test") == "metadata"
        assert tag_utils.classify_tag("Internal: Debug") == "metadata"

    def test_exact_match_metadata(self):
        """Metadata tags identified by exact match."""
        assert tag_utils.classify_tag("Bracket: Game Changer") == "metadata"
        assert tag_utils.classify_tag("Bracket: Staple") == "metadata"

    def test_kindred_protection_metadata(self):
        """Kindred protection tags are metadata."""
        assert tag_utils.classify_tag("Knights Gain Protection") == "metadata"
        assert tag_utils.classify_tag("Frogs Gain Protection") == "metadata"
        assert tag_utils.classify_tag("Zombies Gain Protection") == "metadata"

    def test_theme_classification(self):
        """Regular gameplay tags are themes."""
        assert tag_utils.classify_tag("Card Draw") == "theme"
        assert tag_utils.classify_tag("Spellslinger") == "theme"
        assert tag_utils.classify_tag("Tokens Matter") == "theme"
        assert tag_utils.classify_tag("Ramp") == "theme"
        assert tag_utils.classify_tag("Protection") == "theme"

    def test_edge_cases(self):
        """Edge cases in tag classification."""
        # Empty string
        assert tag_utils.classify_tag("") == "theme"

        # Similar but not exact matches
        assert tag_utils.classify_tag("Apply: Something") == "theme"  # Wrong prefix
        assert tag_utils.classify_tag("Knights Have Protection") == "theme"  # Not "Gain"

        # Case sensitivity
        assert tag_utils.classify_tag("applied: Cost Reduction") == "theme"  # Lowercase


class TestMetadataPartition:
    """Tests for _apply_metadata_partition function."""

    def test_basic_partition(self, monkeypatch):
        """Basic partition splits tags correctly."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A', 'Card B'],
            'themeTags': [
                ['Card Draw', 'Applied: Cost Reduction'],
                ['Spellslinger', 'Bracket: Game Changer', 'Tokens Matter']
            ]
        })

        df_out, diag = _apply_metadata_partition(df)

        # Check theme tags
        assert df_out.loc[0, 'themeTags'] == ['Card Draw']
        assert df_out.loc[1, 'themeTags'] == ['Spellslinger', 'Tokens Matter']

        # Check metadata tags
        assert df_out.loc[0, 'metadataTags'] == ['Applied: Cost Reduction']
        assert df_out.loc[1, 'metadataTags'] == ['Bracket: Game Changer']

        # Check diagnostics
        assert diag['enabled'] is True
        assert diag['rows_with_tags'] == 2
        assert diag['metadata_tags_moved'] == 2
        assert diag['theme_tags_kept'] == 3

    def test_empty_tags(self, monkeypatch):
        """Handles empty tag lists."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A', 'Card B'],
            'themeTags': [[], ['Card Draw']]
        })

        df_out, diag = _apply_metadata_partition(df)

        assert df_out.loc[0, 'themeTags'] == []
        assert df_out.loc[0, 'metadataTags'] == []
        assert df_out.loc[1, 'themeTags'] == ['Card Draw']
        assert df_out.loc[1, 'metadataTags'] == []

        assert diag['rows_with_tags'] == 1

    def test_all_metadata_tags(self, monkeypatch):
        """Handles rows with only metadata tags."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Applied: Cost Reduction', 'Bracket: Game Changer']]
        })

        df_out, diag = _apply_metadata_partition(df)

        assert df_out.loc[0, 'themeTags'] == []
        assert df_out.loc[0, 'metadataTags'] == ['Applied: Cost Reduction', 'Bracket: Game Changer']

        assert diag['metadata_tags_moved'] == 2
        assert diag['theme_tags_kept'] == 0

    def test_all_theme_tags(self, monkeypatch):
        """Handles rows with only theme tags."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Card Draw', 'Ramp', 'Spellslinger']]
        })

        df_out, diag = _apply_metadata_partition(df)

        assert df_out.loc[0, 'themeTags'] == ['Card Draw', 'Ramp', 'Spellslinger']
        assert df_out.loc[0, 'metadataTags'] == []

        assert diag['metadata_tags_moved'] == 0
        assert diag['theme_tags_kept'] == 3

    def test_feature_flag_disabled(self, monkeypatch):
        """Feature flag disables partition."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '0')

        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Card Draw', 'Applied: Cost Reduction']]
        })

        df_out, diag = _apply_metadata_partition(df)

        # Should not create metadataTags column
        assert 'metadataTags' not in df_out.columns

        # Should not modify themeTags
        assert df_out.loc[0, 'themeTags'] == ['Card Draw', 'Applied: Cost Reduction']

        # Should indicate disabled
        assert diag['enabled'] is False

    def test_missing_theme_tags_column(self, monkeypatch):
        """Handles missing themeTags column gracefully."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A'],
            'other_column': ['value']
        })

        df_out, diag = _apply_metadata_partition(df)

        # Should return unchanged
        assert 'themeTags' not in df_out.columns
        assert 'metadataTags' not in df_out.columns

        # Should indicate error
        assert diag['enabled'] is True
        assert 'error' in diag

    def test_non_list_tags(self, monkeypatch):
        """Handles non-list values in themeTags."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A', 'Card B', 'Card C'],
            'themeTags': [['Card Draw'], None, 'not a list']
        })

        df_out, diag = _apply_metadata_partition(df)

        # Only the first row should be processed
        assert df_out.loc[0, 'themeTags'] == ['Card Draw']
        assert df_out.loc[0, 'metadataTags'] == []

        assert diag['rows_with_tags'] == 1

    def test_kindred_protection_partition(self, monkeypatch):
        """Kindred protection tags are moved to metadata."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Protection', 'Knights Gain Protection', 'Card Draw']]
        })

        df_out, diag = _apply_metadata_partition(df)

        assert 'Protection' in df_out.loc[0, 'themeTags']
        assert 'Card Draw' in df_out.loc[0, 'themeTags']
        assert 'Knights Gain Protection' in df_out.loc[0, 'metadataTags']

    def test_diagnostics_structure(self, monkeypatch):
        """Diagnostics contain expected fields."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Card Draw', 'Applied: Cost Reduction']]
        })

        df_out, diag = _apply_metadata_partition(df)

        # Check required diagnostic fields
        assert 'enabled' in diag
        assert 'total_rows' in diag
        assert 'rows_with_tags' in diag
        assert 'metadata_tags_moved' in diag
        assert 'theme_tags_kept' in diag
        assert 'unique_metadata_tags' in diag
        assert 'unique_theme_tags' in diag
        assert 'most_common_metadata' in diag
        assert 'most_common_themes' in diag

        # Check types
        assert isinstance(diag['most_common_metadata'], list)
        assert isinstance(diag['most_common_themes'], list)


class TestCSVCompatibility:
    """Tests for CSV read/write with new schema."""

    def test_csv_roundtrip_with_metadata(self, tmp_path, monkeypatch):
        """CSV roundtrip preserves both columns."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        csv_path = tmp_path / "test_cards.csv"

        # Create initial dataframe
        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Card Draw', 'Ramp']],
            'metadataTags': [['Applied: Cost Reduction']]
        })

        # Write to CSV
        df.to_csv(csv_path, index=False)

        # Read back
        df_read = pd.read_csv(
            csv_path,
            converters={'themeTags': pd.eval, 'metadataTags': pd.eval}
        )

        # Verify data preserved
        assert df_read.loc[0, 'themeTags'] == ['Card Draw', 'Ramp']
        assert df_read.loc[0, 'metadataTags'] == ['Applied: Cost Reduction']

    def test_csv_backward_compatible(self, tmp_path, monkeypatch):
        """Can read old CSVs without metadataTags."""
        monkeypatch.setenv('TAG_METADATA_SPLIT', '1')

        csv_path = tmp_path / "old_cards.csv"

        # Create old-style CSV without metadataTags
        df = pd.DataFrame({
            'name': ['Card A'],
            'themeTags': [['Card Draw', 'Applied: Cost Reduction']]
        })
        df.to_csv(csv_path, index=False)

        # Read back
        df_read = pd.read_csv(csv_path, converters={'themeTags': pd.eval})

        # Should read successfully
        assert 'themeTags' in df_read.columns
        assert 'metadataTags' not in df_read.columns
        assert df_read.loc[0, 'themeTags'] == ['Card Draw', 'Applied: Cost Reduction']

        # Apply partition
        df_partitioned, _ = _apply_metadata_partition(df_read)

        # Should now have both columns
        assert 'themeTags' in df_partitioned.columns
        assert 'metadataTags' in df_partitioned.columns
        assert df_partitioned.loc[0, 'themeTags'] == ['Card Draw']
        assert df_partitioned.loc[0, 'metadataTags'] == ['Applied: Cost Reduction']


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
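The split these tests exercise reduces to a per-row partition; `partition_tags` and its `classify` helper below are hypothetical stand-ins for `_apply_metadata_partition` and `classify_tag`, using an assumed prefix subset rather than the project's `METADATA_TAG_PREFIXES`:

```python
# Assumed prefix subset for illustration only.
METADATA_PREFIXES = ('Applied:', 'Bracket:')

def classify(tag):
    return 'metadata' if tag.startswith(METADATA_PREFIXES) else 'theme'

def partition_tags(tags):
    # Stable split: order within each output list follows the input order.
    themes = [t for t in tags if classify(t) == 'theme']
    metadata = [t for t in tags if classify(t) == 'metadata']
    return themes, metadata

themes, metadata = partition_tags(['Card Draw', 'Applied: Cost Reduction', 'Ramp'])
print(themes)    # ['Card Draw', 'Ramp']
print(metadata)  # ['Applied: Cost Reduction']
```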
169  code/tests/test_protection_grant_detection.py  Normal file

@@ -0,0 +1,169 @@
"""
Tests for protection grant detection (M2).

Tests the ability to distinguish between cards that grant protection
and cards that have inherent protection.
"""

import pytest
from code.tagging.protection_grant_detection import (
    is_granting_protection,
    categorize_protection_card
)


class TestGrantDetection:
    """Test grant verb detection."""

    def test_gains_hexproof(self):
        """Cards with 'gains hexproof' should be detected as granting."""
        text = "Target creature gains hexproof until end of turn."
        assert is_granting_protection(text, "")

    def test_gives_indestructible(self):
        """Cards with 'gives indestructible' should be detected as granting."""
        text = "This creature gives target creature indestructible."
        assert is_granting_protection(text, "")

    def test_creatures_you_control_have(self):
        """Mass grant pattern should be detected."""
        text = "Creatures you control have hexproof."
        assert is_granting_protection(text, "")

    def test_equipped_creature_gets(self):
        """Equipment grant pattern should be detected."""
        text = "Equipped creature gets +2/+2 and has indestructible."
        assert is_granting_protection(text, "")

class TestInherentDetection:
|
||||
"""Test inherent protection detection."""
|
||||
|
||||
def test_creature_with_hexproof_keyword(self):
|
||||
"""Creature with hexproof keyword should not be detected as granting."""
|
||||
text = "Hexproof (This creature can't be the target of spells or abilities.)"
|
||||
keywords = "Hexproof"
|
||||
assert not is_granting_protection(text, keywords)
|
||||
|
||||
def test_indestructible_artifact(self):
|
||||
"""Artifact with indestructible keyword should not be detected as granting."""
|
||||
text = "Indestructible"
|
||||
keywords = "Indestructible"
|
||||
assert not is_granting_protection(text, keywords)
|
||||
|
||||
def test_ward_creature(self):
|
||||
"""Creature with Ward should not be detected as granting (unless it grants to others)."""
|
||||
text = "Ward {2}"
|
||||
keywords = "Ward"
|
||||
assert not is_granting_protection(text, keywords)
|
||||
|
||||
|
||||
class TestMixedCases:
|
||||
"""Test cards that both grant and have protection."""
|
||||
|
||||
def test_creature_with_self_grant(self):
|
||||
"""Creature that grants itself protection should be detected."""
|
||||
text = "This creature gains indestructible until end of turn."
|
||||
keywords = ""
|
||||
assert is_granting_protection(text, keywords)
|
||||
|
||||
def test_equipment_with_inherent_and_grant(self):
|
||||
"""Equipment with indestructible that grants protection."""
|
||||
text = "Indestructible. Equipped creature has hexproof."
|
||||
keywords = "Indestructible"
|
||||
# Should be detected as granting because of "has hexproof"
|
||||
assert is_granting_protection(text, keywords)
|
||||
|
||||
|
||||
class TestExclusions:
|
||||
"""Test exclusion patterns."""
|
||||
|
||||
def test_cant_have_hexproof(self):
|
||||
"""Cards that prevent protection should not be tagged."""
|
||||
text = "Creatures your opponents control can't have hexproof."
|
||||
assert not is_granting_protection(text, "")
|
||||
|
||||
def test_loses_indestructible(self):
|
||||
"""Cards that remove protection should not be tagged."""
|
||||
text = "Target creature loses indestructible until end of turn."
|
||||
assert not is_granting_protection(text, "")
|
||||
|
||||
|
||||
class TestEdgeCases:
|
||||
"""Test edge cases and special patterns."""
|
||||
|
||||
def test_protection_from_color(self):
|
||||
"""Protection from [quality] in keywords without grant text."""
|
||||
text = "Protection from red"
|
||||
keywords = "Protection from red"
|
||||
assert not is_granting_protection(text, keywords)
|
||||
|
||||
def test_empty_text(self):
|
||||
"""Empty text should return False."""
|
||||
assert not is_granting_protection("", "")
|
||||
|
||||
def test_none_text(self):
|
||||
"""None text should return False."""
|
||||
assert not is_granting_protection(None, "")
|
||||
|
||||
|
||||
class TestCategorization:
|
||||
"""Test full card categorization."""
|
||||
|
||||
def test_shell_shield_is_grant(self):
|
||||
"""Shell Shield grants hexproof - should be Grant."""
|
||||
text = "Target creature gets +0/+3 and gains hexproof until end of turn."
|
||||
cat = categorize_protection_card("Shell Shield", text, "", "Instant")
|
||||
assert cat == "Grant"
|
||||
|
||||
def test_geist_of_saint_traft_is_mixed(self):
|
||||
"""Geist has hexproof and creates tokens - Mixed."""
|
||||
text = "Hexproof. Whenever this attacks, create a token."
|
||||
keywords = "Hexproof"
|
||||
cat = categorize_protection_card("Geist", text, keywords, "Creature")
|
||||
# Has hexproof keyword, so inherent
|
||||
assert cat in ("Inherent", "Mixed")
|
||||
|
||||
def test_darksteel_brute_is_inherent(self):
|
||||
"""Darksteel Brute has indestructible - should be Inherent."""
|
||||
text = "Indestructible"
|
||||
keywords = "Indestructible"
|
||||
cat = categorize_protection_card("Darksteel Brute", text, keywords, "Artifact")
|
||||
assert cat == "Inherent"
|
||||
|
||||
def test_scion_of_oona_is_grant(self):
|
||||
"""Scion of Oona grants shroud to other faeries - should be Grant."""
|
||||
text = "Other Faeries you control have shroud."
|
||||
keywords = "Flying, Flash"
|
||||
cat = categorize_protection_card("Scion of Oona", text, keywords, "Creature")
|
||||
assert cat == "Grant"
|
||||
|
||||
|
||||
class TestRealWorldCards:
|
||||
"""Test against actual card samples from baseline audit."""
|
||||
|
||||
def test_bulwark_ox(self):
|
||||
"""Bulwark Ox - grants hexproof and indestructible."""
|
||||
text = "Sacrifice: Creatures you control with counters gain hexproof and indestructible"
|
||||
assert is_granting_protection(text, "")
|
||||
|
||||
def test_bloodsworn_squire(self):
|
||||
"""Bloodsworn Squire - grants itself indestructible."""
|
||||
text = "This creature gains indestructible until end of turn"
|
||||
assert is_granting_protection(text, "")
|
||||
|
||||
def test_kaldra_compleat(self):
|
||||
"""Kaldra Compleat - equipment with indestructible that grants."""
|
||||
text = "Indestructible. Equipped creature gets +5/+5 and has indestructible"
|
||||
keywords = "Indestructible"
|
||||
assert is_granting_protection(text, keywords)
|
||||
|
||||
def test_ward_sliver(self):
|
||||
"""Ward Sliver - grants protection to all slivers."""
|
||||
text = "All Slivers have protection from the chosen color"
|
||||
assert is_granting_protection(text, "")
|
||||
|
||||
def test_rebbec(self):
|
||||
"""Rebbec - grants protection to artifacts."""
|
||||
text = "Artifacts you control have protection from each mana value"
|
||||
assert is_granting_protection(text, "")
|
||||
|
|
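These tests pin down the intended grant-vs-inherent behaviour without showing the detector itself. A minimal sketch that satisfies the cases listed (the grant verbs, exclusion phrases, and sentence-splitting here are assumptions, not the project's actual implementation; `keywords` is accepted for signature parity but unused in this sketch):

```python
import re

# Assumed grant verbs followed, within the same sentence, by a protection keyword.
GRANT_RE = re.compile(
    r"\b(gains?|gives?|grants?|have|has)\b[^.]*?"
    r"\b(hexproof|indestructible|shroud|ward|protection)\b",
    re.IGNORECASE,
)
# Assumed exclusion phrases: denying or removing protection is not a grant.
EXCLUDE_RE = re.compile(
    r"\b(can't have|cannot have|loses?)\b[^.]*?"
    r"\b(hexproof|indestructible|shroud|ward|protection)\b",
    re.IGNORECASE,
)

def is_granting_protection_sketch(text, keywords):
    """Return True if any sentence grants a protection ability to something."""
    if not text:
        return False
    # Drop reminder text in parentheses so keyword reminders don't match.
    cleaned = re.sub(r"\([^)]*\)", "", text)
    for sentence in cleaned.split("."):
        if EXCLUDE_RE.search(sentence):
            continue
        if GRANT_RE.search(sentence):
            return True
    return False
```

Evaluating sentence by sentence is what lets mixed cards like Kaldra Compleat ("Indestructible. Equipped creature ... has indestructible") count as granting while a bare keyword line does not.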
@@ -170,7 +170,7 @@ def _step5_summary_placeholder_html(token: int, *, message: str | None = None) -
     return (
         f'<div id="deck-summary" data-summary '
         f'hx-get="/build/step5/summary?token={token}" '
-        'hx-trigger="load, step5:refresh from:body" hx-swap="outerHTML">'
+        'hx-trigger="step5:refresh from:body" hx-swap="outerHTML">'
         f'<div class="muted" style="margin-top:1rem;">{_esc(text)}</div>'
        '</div>'
    )
@@ -159,11 +159,18 @@ def _read_csv_summary(csv_path: Path) -> Tuple[dict, Dict[str, int], Dict[str, i
             # Type counts/cards (exclude commander entry from distribution)
             if not is_commander:
                 type_counts[cat] = type_counts.get(cat, 0) + cnt
+            # M5: Extract metadata tags column if present
+            metadata_tags_raw = ''
+            metadata_idx = headers.index('MetadataTags') if 'MetadataTags' in headers else -1
+            if metadata_idx >= 0 and metadata_idx < len(row):
+                metadata_tags_raw = row[metadata_idx] or ''
+            metadata_tags_list = [t.strip() for t in metadata_tags_raw.split(';') if t.strip()]
             type_cards.setdefault(cat, []).append({
                 'name': name,
                 'count': cnt,
                 'role': role,
                 'tags': tags_list,
+                'metadata_tags': metadata_tags_list,  # M5: Include metadata tags
             })

         # Curve
@@ -900,7 +900,7 @@ def ideal_labels() -> Dict[str, str]:
         'removal': 'Spot Removal',
         'wipes': 'Board Wipes',
         'card_advantage': 'Card Advantage',
-        'protection': 'Protection',
+        'protection': 'Protective Effects',
    }

@@ -1181,6 +1181,9 @@ def _ensure_setup_ready(out, force: bool = False) -> None:
                 # Only flip phase if previous run finished
                 if st.get('phase') in {'themes','themes-fast'}:
                     st['phase'] = 'done'
+                # Also ensure percent is 100 when done
+                if st.get('finished_at'):
+                    st['percent'] = 100
                 with open(status_path, 'w', encoding='utf-8') as _wf:
                     json.dump(st, _wf)
             except Exception:
@@ -1463,16 +1466,17 @@ def _ensure_setup_ready(out, force: bool = False) -> None:
         except Exception:
             pass

-    # Unconditional fallback: if (for any reason) no theme export ran above, perform a fast-path export now.
-    # This guarantees that clicking Run Setup/Tagging always leaves themes current even when tagging wasn't needed.
+    # Conditional fallback: only run theme export if refresh_needed was True but somehow no export performed.
+    # This avoids repeated exports when setup is already complete and _ensure_setup_ready is called again.
     try:
-        if not theme_export_performed:
+        if not theme_export_performed and refresh_needed:
             _refresh_theme_catalog(out, force=False, fast_path=True)
     except Exception:
         pass
     else:  # If export just ran (either earlier or via fallback), ensure enrichment ran (safety double-call guard inside helper)
         try:
-            _run_theme_metadata_enrichment(out)
+            if theme_export_performed or refresh_needed:
+                _run_theme_metadata_enrichment(out)
         except Exception:
             pass

@@ -1907,7 +1911,7 @@ def _make_stages(b: DeckBuilder) -> List[Dict[str, Any]]:
         ("removal", "Confirm Removal", "add_removal"),
         ("wipes", "Confirm Board Wipes", "add_board_wipes"),
         ("card_advantage", "Confirm Card Advantage", "add_card_advantage"),
-        ("protection", "Confirm Protection", "add_protection"),
+        ("protection", "Confirm Protective Effects", "add_protection"),
     ]
     any_granular = any(callable(getattr(b, rn, None)) for _key, _label, rn in spell_categories)
     if any_granular:
@@ -309,7 +309,8 @@
           .catch(function(){ /* noop */ });
       } catch(e) {}
     }
-    setInterval(pollStatus, 3000);
+    // Poll every 10 seconds instead of 3 to reduce server load (only for header indicator)
+    setInterval(pollStatus, 10000);
    pollStatus();

    // Health indicator poller
@@ -1011,6 +1012,7 @@
       var role = (attr('data-role')||'').trim();
       var reasonsRaw = attr('data-reasons')||'';
       var tagsRaw = attr('data-tags')||'';
+      var metadataTagsRaw = attr('data-metadata-tags')||''; // M5: Extract metadata tags
       var reasonsRaw = attr('data-reasons')||'';
       var roleEl = panel.querySelector('.hcp-role');
       var hasFlip = !!card.querySelector('.dfc-toggle');
@@ -1115,6 +1117,14 @@
         tagsEl.style.display = 'none';
       } else {
         var tagText = allTags.map(displayLabel).join(', ');
+        // M5: Temporarily append metadata tags for debugging
+        if(metadataTagsRaw && metadataTagsRaw.trim()){
+          var metaTags = metadataTagsRaw.split(',').map(function(t){return t.trim();}).filter(Boolean);
+          if(metaTags.length){
+            var metaText = metaTags.map(displayLabel).join(', ');
+            tagText = tagText ? (tagText + ' | META: ' + metaText) : ('META: ' + metaText);
+          }
+        }
         tagsEl.textContent = tagText;
         tagsEl.style.display = tagText ? '' : 'none';
       }

@@ -462,11 +462,12 @@
     <!-- controls now above -->

     {% if allow_must_haves %}
-      {% include "partials/include_exclude_summary.html" with oob=False %}
+      {% set oob = False %}
+      {% include "partials/include_exclude_summary.html" %}
     {% endif %}
     <div id="deck-summary" data-summary
          hx-get="/build/step5/summary?token={{ summary_token }}"
-         hx-trigger="load, step5:refresh from:body"
+         hx-trigger="load once, step5:refresh from:body"
          hx-swap="outerHTML">
       <div class="muted" style="margin-top:1rem;">
         {% if summary_ready %}Loading deck summary…{% else %}Deck summary will appear after the build completes.{% endif %}
@@ -74,7 +74,7 @@
   {% set owned = (owned_set is defined and c.name and (c.name|lower in owned_set)) %}
   <span class="count">{{ cnt }}</span>
   <span class="times">x</span>
-  <span class="name dfc-anchor" title="{{ c.name }}" data-card-name="{{ c.name }}" data-count="{{ cnt }}" data-role="{{ c.role }}" data-tags="{{ (c.tags|map('trim')|join(', ')) if c.tags else '' }}"{% if overlaps %} data-overlaps="{{ overlaps|join(', ') }}"{% endif %}>{{ c.name }}</span>
+  <span class="name dfc-anchor" title="{{ c.name }}" data-card-name="{{ c.name }}" data-count="{{ cnt }}" data-role="{{ c.role }}" data-tags="{{ (c.tags|map('trim')|join(', ')) if c.tags else '' }}"{% if c.metadata_tags %} data-metadata-tags="{{ (c.metadata_tags|map('trim')|join(', ')) }}"{% endif %}{% if overlaps %} data-overlaps="{{ overlaps|join(', ') }}"{% endif %}>{{ c.name }}</span>
   <span class="flip-slot" aria-hidden="true">
     {% if c.dfc_land %}
       <span class="dfc-land-chip {% if c.dfc_adds_extra_land %}extra{% else %}counts{% endif %}" title="{{ c.dfc_note or 'Modal double-faced land' }}">DFC land{% if c.dfc_adds_extra_land %} +1{% endif %}</span>
@@ -127,7 +127,8 @@
         .then(update)
         .catch(function(){});
       }
-      setInterval(poll, 3000);
+      // Poll every 5 seconds instead of 3 to reduce server load
+      setInterval(poll, 5000);
       poll();
    })();
  </script>
(One file's diff suppressed because it is too large.)
@@ -99,6 +99,12 @@ services:
       WEB_AUTO_REFRESH_DAYS: "7" # Refresh cards.csv if older than N days; 0=never
       WEB_TAG_PARALLEL: "1" # 1=parallelize tagging
       WEB_TAG_WORKERS: "4" # Worker count when parallel tagging
+
+      # Tagging Refinement Feature Flags
+      TAG_NORMALIZE_KEYWORDS: "1" # 1=normalize keywords & filter specialty mechanics (recommended)
+      TAG_PROTECTION_GRANTS: "1" # 1=Protection tag only for cards granting shields (recommended)
+      TAG_METADATA_SPLIT: "1" # 1=separate metadata tags from themes in CSVs (recommended)
+
       THEME_CATALOG_MODE: "merge" # Use merged Phase B catalog builder (with YAML export)
       THEME_YAML_FAST_SKIP: "0" # 1=allow skipping per-theme YAML on fast path (rare; default always export)
       # Live YAML scan interval in seconds for change detection (dev convenience)
@@ -101,6 +101,12 @@ services:
       WEB_AUTO_REFRESH_DAYS: "7" # Refresh cards.csv if older than N days; 0=never
       WEB_TAG_PARALLEL: "1" # 1=parallelize tagging
       WEB_TAG_WORKERS: "4" # Worker count when parallel tagging
+
+      # Tagging Refinement Feature Flags
+      TAG_NORMALIZE_KEYWORDS: "1" # 1=normalize keywords & filter specialty mechanics (recommended)
+      TAG_PROTECTION_GRANTS: "1" # 1=Protection tag only for cards granting shields (recommended)
+      TAG_METADATA_SPLIT: "1" # 1=separate metadata tags from themes in CSVs (recommended)
+
       THEME_CATALOG_MODE: "merge" # Use merged Phase B catalog builder (with YAML export)
       THEME_YAML_FAST_SKIP: "0" # 1=allow skipping per-theme YAML on fast path (rare; default always export)
       # Live YAML scan interval in seconds for change detection (dev convenience)
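The compose hunks set the new feature flags as string env vars. A sketch of how such flags are commonly read on the Python side (the truthy-string set and helper name are assumptions, not the project's actual parser):

```python
import os

def flag_enabled(name: str, default: str = "1") -> bool:
    """Interpret a feature-flag env var; common truthy forms assumed."""
    return os.getenv(name, default).strip().lower() in ("1", "true", "yes", "on")

# e.g. gate the metadata split behind its flag:
# if flag_enabled("TAG_METADATA_SPLIT"):
#     ...partition metadata tags from theme tags...
```

Defaulting to "1" matches the recommended-on values shown in both compose files.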