fix: skip HTML comments in checkBrokenLinks #82
checkBrokenLinks was scanning links inside HTML comments and flagging them as broken. Template examples in patterns/INDEX.md (which live in <!-- --> comment blocks) were producing false BROKEN_LINK warnings. Strips complete <!-- ... --> spans from content before line processing using the same regex approach as checkIndexSync. Unclosed <!-- stays as plain text.
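A minimal sketch of the stripping behavior described above. The regex and the helper name `stripCompleteComments` are assumptions for illustration; the actual pattern shared with checkIndexSync is not shown in this PR text.

```typescript
// Hypothetical helper, not the project's actual code.
// [\s\S] lets a match cross newlines; the lazy *? stops at the first closing -->.
function stripCompleteComments(content: string): string {
  return content.replace(/<!--[\s\S]*?-->/g, "");
}

// A complete comment is removed together with the example link inside it.
const withClosedComment = stripCompleteComments(
  "before <!-- [example](missing.md) --> after"
);

// An unclosed <!-- has no matching -->, so the regex never matches
// and the text is returned unchanged.
const withUnclosedComment = stripCompleteComments(
  "before <!-- [example](missing.md) after"
);
```

Because the pattern requires a closing `-->`, an unclosed opener falls through as ordinary text without any special-case code.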
theDakshJaitly left a comment
Thanks for tracking down the HTML-comment false positives. The behavior change is right, but this implementation strips multi-line comments before splitting the file into lines, which changes the reported line numbers for any real broken links after a comment block.
For example, a link after:
would be reported a few lines earlier than its actual source location because the comment lines were removed before i + 1 is assigned.
Could you preserve newline positions when removing complete HTML comments, or otherwise skip comment spans without changing the line numbering? A regression test with a real broken link after a multi-line comment would cover it.
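The line shift described in this review can be reproduced with a small sketch. The `linkLines` scanner and the sample content are hypothetical stand-ins, assuming the checker splits on newlines and reports `i + 1` as the line number.

```typescript
// Hypothetical scanner: returns the 1-based line number of every line
// that contains a markdown link.
function linkLines(content: string): number[] {
  const found: number[] = [];
  content.split("\n").forEach((line, i) => {
    if (/\[[^\]]*\]\([^)]*\)/.test(line)) found.push(i + 1);
  });
  return found;
}

const source = [
  "# Patterns",            // line 1
  "<!--",                  // line 2
  "[example](example.md)", // line 3, inside the comment
  "-->",                   // line 4
  "[real](missing.md)",    // line 5, the real broken link
].join("\n");

// Naive removal with an empty replacement: the comment's own newlines
// disappear, so everything after the comment moves up.
const naive = source.replace(/<!--[\s\S]*?-->/g, "");
const shifted = linkLines(naive);
```

In this sample the real link sits on line 5 of the source, but the scanner sees it on line 3 after the naive removal, two lines too early.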
Also, the PR is currently conflicting with main, so it needs an update before merge.
…nLinks Replace empty-string comment removal with newline-preserving padding so the line array stays the same length as the original file. Adds line number assertions to existing tests and two regression tests covering broken links after single-line and multi-line HTML comments.
I updated the code to introduce |
What
checkBrokenLinks now strips complete <!-- ... --> HTML comment spans before scanning for links. Previously, links inside HTML comments were flagged as broken, producing false BROKEN_LINK warnings in scaffold files like patterns/INDEX.md where template examples live in comment blocks.
Unclosed <!-- (no matching -->) is left as plain text, consistent with checkIndexSync's behavior.
Line numbering fix: The initial implementation stripped comments with an empty replacement, which collapsed line numbers for any real broken links appearing after a comment block. The replacement now preserves newline characters so the line array stays the same length as the original file, and reported line numbers match the actual source location.
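One way to get newline-preserving padding is a replacer callback that blanks every character of the comment except `\n`. This is a sketch under assumptions: the helper name `blankOutComments` and the choice of spaces as padding are illustrative, and the PR's actual replacement may differ.

```typescript
// Hypothetical helper, not the project's actual code.
// Each complete comment is replaced by a same-length string in which
// every character except "\n" becomes a space.
function blankOutComments(content: string): string {
  return content.replace(/<!--[\s\S]*?-->/g, (comment) =>
    comment.replace(/[^\n]/g, " ")
  );
}

const original = "intro\n<!--\n[example](example.md)\n-->\n[real](missing.md)";
const blanked = blankOutComments(original);

// The line array keeps its length, so index i still maps to source line i + 1.
const originalLineCount = original.split("\n").length;
const blankedLineCount = blanked.split("\n").length;

// 1-based line of the real link after blanking.
const realLinkLine = blanked.split("\n").indexOf("[real](missing.md)") + 1;
```

Padding with spaces also keeps column offsets stable on lines that mix a comment with real content, which matters only if columns are ever reported.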
Why
The patterns/INDEX.md template generated by mex setup includes example links inside HTML comments. After setup, running mex check on the scaffold produced 7 false BROKEN_LINK warnings from these comment examples, dropping the score to 79/100 even when the scaffold is otherwise correct.
Verified against Taegost/homelab-k8s (feat/mex-integration branch).
Changes
src/drift/checkers/broken-link.ts: Strip <!-- ... --> from content before line processing, replacing with newline-preserving padding so line numbers remain accurate.
test/checkers.test.ts: 4 new tests (single-line comments, multi-line comments, inline comments with surrounding content, unclosed comments). Added line assertions to 2 existing tests. 2 regression tests verifying correct line numbers for broken links after single-line and multi-line HTML comments.
CHANGELOG.md: Entry under [Unreleased].
🤖 Generated with Claude Code
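The regression cases listed under Changes could be exercised roughly as follows. Everything here is a hypothetical stand-in: `findBrokenLinks`, the `LinkIssue` shape, and the `exists` callback are assumptions, since the real checker's signature and test framework are not shown in this PR text.

```typescript
// Hypothetical issue shape and checker, for illustration only.
interface LinkIssue {
  line: number;
  target: string;
}

function findBrokenLinks(
  content: string,
  exists: (target: string) => boolean
): LinkIssue[] {
  // Blank out complete comments but keep newlines so line indexes stay aligned.
  const visible = content.replace(/<!--[\s\S]*?-->/g, (c) =>
    c.replace(/[^\n]/g, " ")
  );
  const issues: LinkIssue[] = [];
  visible.split("\n").forEach((text, i) => {
    // A fresh regex per line keeps lastIndex from leaking between lines.
    const linkPattern = /\[[^\]]*\]\(([^)\s]+)\)/g;
    let m: RegExpExecArray | null;
    while ((m = linkPattern.exec(text)) !== null) {
      if (!exists(m[1])) issues.push({ line: i + 1, target: m[1] });
    }
  });
  return issues;
}

const fixture = [
  "# Index",                              // line 1
  "<!-- [single](single-example.md) -->", // line 2, single-line comment
  "<!--",                                 // line 3
  "[multi](multi-example.md)",            // line 4, inside multi-line comment
  "-->",                                  // line 5
  "[real](missing.md)",                   // line 6, real broken link
  "<!-- [unclosed](unclosed-example.md)", // line 7, unclosed, stays plain text
].join("\n");

// Treat every target as missing so any link that is scanned gets reported.
const issues = findBrokenLinks(fixture, () => false);
```

In this sketch the two commented links are skipped, the real link is reported on line 6, and the link after the unclosed `<!--` is still reported on line 7 because that text is treated as ordinary content.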
The additions vs. the original: