fix: proper YAML folded scalar parsing + bundle checks#2

Open
guohongbin-git wants to merge 1 commit into topprismdata:main from guohongbin-git:fix/yaml-parser-and-bundle-checks

@guohongbin-git

Summary

Fixes four bugs in `scripts/run_tests.py` that caused widespread scoring errors:

Bug 1 — YAML Folded Scalars Not Parsed (Critical)

The naive `for line in frontmatter.splitlines()` parser treated `description: |` and `description: >-` as empty values, truncating descriptions and causing ~36 skills to be wrongly scored as BASIC/REJECT.

Fix: Add `_parse_yaml_fm()`, which uses PyYAML as the primary parser and falls back to a custom parser that properly handles literal (`|`) and folded (`>`) scalar blocks.

Bug 2 — .DS_Store False Positives on macOS

`any((skill_dir / "assets").iterdir())` returns True on macOS even when the directory contains only `.DS_Store`.

Fix: Add `_dir_has_real_files()`, which excludes hidden files.

Bug 3 — Angle Brackets Check Too Broad

Code examples containing `<T>` or `<div>` triggered the "angle brackets bug" warning, even though they were intentional content in folded scalar blocks.

Fix: Only flag angle brackets when the description is a raw single-line value (not a folded scalar block).
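One way to express that rule is to inspect the raw text on the `description:` line itself. The function name `has_angle_bracket_issue` is hypothetical, and the sketch assumes a top-level `description` key.

```python
import re

# Raw text that follows "description:" on the same line.
_DESC_LINE = re.compile(r"^description:[ \t]*(.*)$", re.MULTILINE)
# A block scalar header such as "|", ">", ">-", "|+", optionally with an indent digit.
_BLOCK_INDICATOR = re.compile(r"^[|>][0-9+-]{0,2}$")


def has_angle_bracket_issue(frontmatter: str) -> bool:
    """Flag angle brackets only in a raw single-line description value."""
    match = _DESC_LINE.search(frontmatter)
    if not match:
        return False
    raw = match.group(1).strip()
    if not raw or _BLOCK_INDICATOR.match(raw):
        # Empty value or block scalar: the content lives on later lines, so skip it.
        return False
    return "<" in raw or ">" in raw
```

Under this check, `description: Use <T> generics` is flagged, while the same text placed under `description: >-` is not.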

Bug 4 — Frontmatter Regex Fragile

`re.match(r"^---\n...")` fails if the `---` delimiter line has leading whitespace.

Fix: Use `re.search(r"^---\s*\n...")` and accept `...` as a YAML document end marker.
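As a rough illustration of a more tolerant delimiter match, the sketch below accepts whitespace around the opening `---` and either `---` or `...` as the closing marker. The full pattern is elided in the description above, so this regex and the name `extract_frontmatter` are assumptions and not the PR's code.

```python
import re
from typing import Optional

_FRONTMATTER = re.compile(
    r"\A\s*---[ \t]*\n"           # opening delimiter, tolerating surrounding whitespace
    r"(.*?)"                      # frontmatter body (lazy, may span lines)
    r"^(?:---|\.\.\.)[ \t]*$",    # closing delimiter: --- or ... alone on a line
    re.DOTALL | re.MULTILINE,
)


def extract_frontmatter(text: str) -> Optional[str]:
    """Return the raw frontmatter body, or None if no frontmatter is found."""
    match = _FRONTMATTER.search(text)
    return match.group(1) if match else None
```

Anchoring with `\A` keeps a `---` horizontal rule later in the body from being mistaken for frontmatter.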

Impact on Your Skill Collection

Before → After:

| Tier | Before | After |
| --- | --- | --- |
| POWERFUL | 0 | 2 |
| STANDARD | 23 | 79 |
| BASIC | 36 | 4 |
| REJECT | 7 | 3 |
| NO_FRONTMATTER | 6 | 3 |

The 36 skills previously in BASIC were all false positives. The fix unlocks accurate scoring for any skill using multi-line descriptions in YAML frontmatter.

- Add _parse_yaml_fm() with PyYAML fallback + custom parser for
  literal (|) and folded (>) scalars, so multi-line descriptions
  are read correctly instead of being truncated
- Replace naive line-split parser that broke on | and >- descriptions
- Add _dir_has_real_files() to exclude .DS_Store and other hidden
  files on macOS (was causing false has_assets=True)
- Make angle-brackets check only fire on raw single-line values,
  not on folded scalar blocks (code examples are intentional content)
- Use re.search instead of re.match for frontmatter delimiter to
  handle leading whitespace in --- lines
- Accept both --- and ... as YAML document end markers

Impact on your own skills:
  POWERFUL:  0 → 2
  STANDARD: 23 → 79
  BASIC:    36 → 4 (all were false positives)
  REJECT:    7 → 3