Commit graph

13 commits

Author SHA1 Message Date
matt
345dfb3e01 perf: improve commander selection speed and fix color identity display 2025-10-19 13:29:47 -07:00
matt
bff64de370 fix: systematically handle numpy arrays from Parquet files across codebase
- Add ensure_theme_tags_list() utility to builder_utils for simpler numpy array handling
- Update phase3_creatures.py: 6 locations now use bu.ensure_theme_tags_list()
- Update phase4_spells.py: 9 locations now use bu.ensure_theme_tags_list()
- Update tagger.py: 2 locations use hasattr/list() for numpy compatibility
- Update extract_themes.py: 2 locations use hasattr/list() for numpy compatibility
- Fix build-similarity-cache.yml verification script to handle numpy arrays
- Enhance workflow debug output to show complete row data

Parquet files return numpy.ndarray objects for array columns, not Python lists.
The M4 migration added numpy support to canonical parse_theme_tags() in builder_utils,
but many parts of the codebase still used isinstance(list) checks that fail with arrays.
This commit systematically replaces all 19 instances with proper numpy array handling.

Fixes GitHub Actions workflow 'RuntimeError: No theme tags found' and verification failures.
2025-10-18 22:47:09 -07:00
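
The normalization described in this commit can be sketched as below. This is a minimal illustration, not the actual `ensure_theme_tags_list()` from `builder_utils`: the function body, its handling of strings and empty values, and the `FakeArray` stand-in are all assumptions. It relies on the duck-typed `tolist()` method that `numpy.ndarray` provides, so it does not need to import numpy.

```python
from typing import Any, List


def ensure_theme_tags_list(value: Any) -> List[str]:
    """Return theme tags as a plain Python list of non-empty strings.

    Hypothetical sketch: the real helper in builder_utils may differ.
    """
    if value is None:
        return []
    # Assumption: a bare string is treated as a single tag.
    if isinstance(value, str):
        return [value] if value.strip() else []
    # numpy.ndarray (and similar array types) expose tolist();
    # an isinstance(value, list) check alone would reject them.
    if hasattr(value, "tolist"):
        value = value.tolist()
    if isinstance(value, (list, tuple)):
        return [str(tag) for tag in value if tag is not None and str(tag).strip()]
    # Anything else (for example a NaN float for a missing cell) yields no tags.
    return []


class FakeArray:
    """Stand-in for an array column value, used only for illustration."""

    def __init__(self, items):
        self._items = list(items)

    def tolist(self):
        return list(self._items)


if __name__ == "__main__":
    print(ensure_theme_tags_list(FakeArray(["Tokens", "Lifegain"])))
    print(ensure_theme_tags_list(["Voltron", None, ""]))
```

Routing every call site through one helper of this kind keeps the array-versus-list decision in a single place, which matches the intent stated in the commit message.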
matt
29b5da4778 fix: correct DataFrame column filtering and enhance debug output
- Fix KeyError in generate_theme_catalog.py: use isCommander column correctly
- DataFrame.get() doesn't work like dict.get() - use column name directly
- Enhanced debug step to print full row data for better diagnostics
2025-10-18 22:32:54 -07:00
matt
a689400c47 fix: add Path wrapper in workflow debug step 2025-10-18 22:27:13 -07:00
matt
30dfca0b67 fix: remove CSV fallback from theme catalog generation, add Parquet debug step
- Remove CSV fallback logic (Parquet-only in M4 migration)
- Add better error messages when Parquet file missing or empty
- Add workflow debug step to inspect Parquet file after tagging
- Simplify build_theme_catalog function signature
2025-10-18 22:22:35 -07:00
matt
0e19824372 fix: use generate_theme_catalog script instead of non-existent function 2025-10-18 22:07:48 -07:00
matt
3694a5382d fix: ensure theme catalog is generated before similarity cache build 2025-10-18 21:57:45 -07:00
matt
8e8b788091 fix: add detailed tag validation to CI workflow 2025-10-18 21:56:23 -07:00
mwisnowski
74eb47e670 Change tagging step to run in parallel 2025-10-18 21:37:07 -07:00
matt
8435312c8f feat: migrate to unified Parquet format with instant GitHub setup and 4x faster tagging 2025-10-18 21:32:12 -07:00
matt
86752b351b feat: optimize cache workflow with orphan branch and age check
- Create/use orphan branch 'similarity-cache-data' for cache distribution
- Add age check to dockerhub-publish: only rebuild if cache >7 days old
- Use git add -f to force-add cache files (keeps .gitignore clean)
- Weekly scheduled builds will keep cache fresh automatically

This avoids rebuilding cache on every Docker publish while ensuring
cache is always reasonably fresh (<7 days old).
2025-10-17 17:11:04 -07:00
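
The age check in this commit amounts to a simple rule: rebuild only when the cache is missing or older than seven days. The sketch below expresses that rule in Python purely for illustration. The real check lives in the dockerhub-publish workflow and may be written differently; `cache_needs_rebuild` and `MAX_CACHE_AGE` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Threshold taken from the commit message: rebuild if the cache is >7 days old.
MAX_CACHE_AGE = timedelta(days=7)


def cache_needs_rebuild(
    last_built: Optional[datetime],
    now: Optional[datetime] = None,
    max_age: timedelta = MAX_CACHE_AGE,
) -> bool:
    """Return True when the cache is missing or strictly older than max_age.

    Hypothetical helper; both datetimes are expected to be timezone-aware.
    """
    if last_built is None:
        # No cache has been published yet, so a build is required.
        return True
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - last_built) > max_age


if __name__ == "__main__":
    reference = datetime(2025, 10, 17, tzinfo=timezone.utc)
    print(cache_needs_rebuild(reference - timedelta(days=3), reference))
    print(cache_needs_rebuild(reference - timedelta(days=8), reference))
```

Using a strict comparison means a cache that is exactly seven days old is still reused, which is one reasonable reading of ">7 days old".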
matt
fc911b818e fix: correct module path for all_cards.parquet generation in CI
Changed from non-existent code.web.services.card_loader to correct
code.file_setup.card_aggregator.CardAggregator module.

Fixes ModuleNotFoundError in build-similarity-cache workflow.
2025-10-17 16:41:44 -07:00
matt
c2960c808e Add card browser with similar cards and performance optimizations 2025-10-17 16:17:36 -07:00