fix: change readme md renderer behavior#1776
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
2 Skipped Deployments
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
📝 WalkthroughWalkthroughRefactors readme rendering to a single-pass pipeline: adds user-content prefixing utilities, centralises heading processing to compute semantic levels and generate prefixed slugs, consolidates link resolution and security attributes via a new processLink, and updates the renderer to intercept Markdown and raw HTML headings and anchors. Also unifies playground link collection with de-duplication and adjusts transform/tag handling and tests to cover mixed HTML/Markdown ordering, slug collisions, ID consistency and URL resolution. Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
test/unit/server/utils/readme.spec.ts (1)
579-584: Add a regression case foruser-content-slug collisions.This duplicate-slug test only covers identical heading text. Please add a case like
# Titleplus# user-content-titleto ensure generated IDs remain unique.🧪 Suggested additional test
it('handles duplicate raw HTML heading slugs', async () => { const md = '<h2>API</h2>\n\n<h2>API</h2>' const result = await renderReadmeHtml(md, 'test-pkg') expect(result.html).toContain('id="user-content-api"') expect(result.html).toContain('id="user-content-api-1"') }) + + it('keeps IDs unique when a heading slug already starts with user-content-', async () => { + const md = '# Title\n\n# user-content-title' + const result = await renderReadmeHtml(md, 'test-pkg') + const ids = result.toc.map(t => t.id) + expect(new Set(ids).size).toBe(ids.length) + }) })
| const htmlHeadingRe = /<h([1-6])(\s[^>]*)?>([\s\S]*?)<\/h\1>/gi | ||
| const htmlAnchorRe = /<a(\s[^>]*?)href=(["'])([^"']*)\2([^>]*)>([\s\S]*?)<\/a>/gi | ||
| renderer.html = function ({ text }: Tokens.HTML) { | ||
| let result = text.replace(htmlHeadingRe, (_, level, attrs, inner) => { | ||
| const depth = parseInt(level) | ||
| const plainText = decodeHtmlEntities(stripHtmlTags(inner).trim()) | ||
| const align = /\balign=(["'])(.*?)\1/i.exec(attrs)?.[2] | ||
| const preservedAttrs = align ? ` align="${align}"` : '' | ||
| return processHeading(depth, plainText, preservedAttrs).trimEnd() | ||
| }) |
There was a problem hiding this comment.
Guard against undefined attrs before regex execution.
On Line 486, if the HTML heading has no attributes (e.g., <h1>Title</h1>), the attrs capture group will be undefined. Calling .exec(attrs) on undefined will coerce it to the string "undefined", which could match unintended content.
🛡️ Proposed fix
let result = text.replace(htmlHeadingRe, (_, level, attrs, inner) => {
const depth = parseInt(level)
const plainText = decodeHtmlEntities(stripHtmlTags(inner).trim())
- const align = /\balign=(["'])(.*?)\1/i.exec(attrs)?.[2]
+ const align = attrs ? /\balign=(["'])(.*?)\1/i.exec(attrs)?.[2] : undefined
const preservedAttrs = align ? ` align="${align}"` : ''
return processHeading(depth, plainText, preservedAttrs).trimEnd()
})
🔗 Linked issue
resolves #1323
🧭 Context
📚 Description
This PR fixes README rendering issues caused by split processing between
markedandsanitizeHtml, and makes heading/link handling more consistent in mixed markdown + raw HTML content.What changed:
markedrendering.user-content-.align)title, even when placed beforehref)hrefand enforce saferel/targetwithout duplicates