docs: README important-first + architecture SVG + validated MI350/FlyDSL content #2
…face validated MI350/FlyDSL content

README restructure (important-first; Acknowledgements moved to the very end):
- Lead with what-it-is → Hardware Scope (MI350/gfx950) → "Validated on real silicon" → What's Here → Install/Query → Architecture → Maintenance/Quality → License → Acknowledgements.
- Embed a hand-authored architecture diagram (docs/architecture.svg) in the Architecture section.

Validated MI350/FlyDSL content (first-party, MI350X silicon):
- New source anchor sources/refs/ref-flydsl-kernel-profiling.md: the rocprofv3 ATT sweep + GitHub Pages dashboard (17 kernels, AITER/CK/hipBLASLt baselines, ROCm 7.2).
- New wiki page wiki/kernels/flydsl-flash-attention.md: generic vs gfx950 dual-wave SWP, PR arc #225→#334→#462→#629→#661 (layout MMA-atom API), measured ~0.92x vs CK-tile (register-pressure-capped occupancy).
- Augment wiki/languages/flydsl.md with the FA/atom-API note + a "Measured on MI350X" section.
- data/tags.yaml: add profiling/rocprofv3/kernel-profiling/register-pressure misc tags.
- index.md: list the new page + a silicon-validation pointer.
- Regenerated queries/*.md indices.

validate.py: 0 errors. tests/test_validate.py: pass. generate-indices.py: clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reviewer's Guide
Restructures the README to front-load hardware scope, MI350X silicon validation, and architecture details (including a new SVG diagram), and adds first-party MI350X-validated FlyDSL/flash-attention content plus corresponding tags and query index updates across the documentation set.
Hey - I've left some high-level feedback:
- In `kernel-flydsl-flash-attention.md`, the `performance_claims.value` field mixes a numeric and qualitative label (`"~0.92x (HEADROOM)"`); consider keeping this strictly machine-parseable (e.g., `0.92` or `"~0.92"`) and representing the HEADROOM bucket in a separate field or tag for downstream tooling.
- The MI350X/FlyDSL profiling summary is now spread across the README, `lang-flydsl`, `kernel-flydsl-flash-attention`, and `ref-flydsl-kernel-profiling`; you may want to pick one page (e.g., the new ref) as the single canonical source of detailed numbers and keep the others as shorter pointers to avoid future drift.
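The first review comment can be made concrete with a small sketch. It assumes the mixed value always looks like `"~0.92x (HEADROOM)"`; the `PerfClaim` type and `split_claim` helper are hypothetical names for illustration and are not part of this repository's schema or tooling.

```python
import re
from typing import NamedTuple, Optional


class PerfClaim(NamedTuple):
    """Hypothetical machine-parseable form of a performance claim."""
    ratio: float            # speedup ratio versus the baseline, e.g. 0.92
    approximate: bool       # True when the source value carried a leading "~"
    bucket: Optional[str]   # qualitative label, e.g. "HEADROOM", kept separately


# Accepts an optional "~", a decimal number, an optional trailing "x",
# and an optional parenthesised label.
_CLAIM_RE = re.compile(
    r"^\s*(?P<approx>~)?\s*(?P<ratio>\d+(?:\.\d+)?)\s*x?\s*"
    r"(?:\((?P<bucket>[A-Za-z_]+)\))?\s*$"
)


def split_claim(value: str) -> PerfClaim:
    """Split a mixed string into a numeric ratio and a separate bucket."""
    match = _CLAIM_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognised performance claim: {value!r}")
    return PerfClaim(
        ratio=float(match["ratio"]),
        approximate=match["approx"] is not None,
        bucket=match["bucket"],
    )


claim = split_claim("~0.92x (HEADROOM)")
print(claim.ratio, claim.approximate, claim.bucket)
```

Storing the ratio as a number and the bucket as its own field lets downstream tooling sort or filter on the ratio without string parsing, which is the outcome the comment asks for.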
Code Review
This pull request introduces comprehensive documentation and profiling data for FlyDSL kernels on real AMD Instinct MI350X silicon (gfx950, ROCm 7.2). Key additions include a new reference page for the FlyDSL kernel profiling sweep, a synthesized wiki page detailing the FlyDSL Flash Attention generic and dual-wave fast paths, and an architecture diagram (docs/architecture.svg) explaining the three-layer structure of the wiki. The README, index, and query indices have been updated accordingly. Feedback on the changes suggests quoting font family names with spaces in the SVG file, removing synthesized wiki pages from the sources metadata list in the new Flash Attention page to maintain architectural separation, and correcting a URL-encoding typo (seq%256==0) in the frontmatter metadata.
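The review does not show the exact frontmatter string behind the URL-encoding remark, so the following is only an illustration of why a literal `%` in a value such as `seq%256==0` is fragile once the text passes through percent-encoding or decoding. It uses the standard-library `urllib.parse` functions; the variable names are illustrative.

```python
from urllib.parse import quote, unquote

raw = "seq%256==0"  # the condition as a human would write it

# Percent-encoding escapes the literal "%" as "%25" (and "=" as "%3D").
encoded = quote(raw, safe="")
print(encoded)

# Decoding the encoded form recovers the original text.
round_tripped = unquote(encoded)
print(round_tripped)

# Decoding the *raw* text is lossy: "%25" is read as an escape for "%",
# which silently changes the condition.
misdecoded = unquote(raw)
print(misdecoded)
```

A string that has been encoded or decoded one time too many reads differently from the intended condition, which is the kind of slip worth checking for in metadata.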
What & why
Restructures the README so the important stuff leads and acknowledgements come last, adds an architecture SVG, and surfaces our first-party MI350X-validated MI350/FlyDSL content (which wasn't in the wiki yet).
README (important-first)
New order: what-it-is → Hardware Scope (MI350/gfx950) → Validated on real silicon → What's Here → Install/Query → Architecture (+ SVG) → Maintenance/Quality Gates → License → Acknowledgements & Citation (moved to the very end).
Architecture diagram
`docs/architecture.svg`: hand-authored three-layer flow (`sources/` → `wiki/` → `queries/`, gated by `data/` + `scripts/`), embedded in the README Architecture section. Renders on GitHub.

Validated MI350 / FlyDSL content (silicon-measured)
- `sources/refs/ref-flydsl-kernel-profiling.md`: new source anchor for the rocprofv3 ATT sweep + GitHub Pages dashboard: 17 FlyDSL kernels profiled on MI350X (ROCm 7.2) vs AITER/CK/hipBLASLt.
- `wiki/kernels/flydsl-flash-attention.md`: new page filling a real gap (FlyDSL FA had no page). Generic vs gfx950 dual-wave software-pipelined kernel, the #225→#334→#462→#629→#661 (layout MMA-atom API) arc, measured ~0.92× vs CK-tile, register-pressure-capped occupancy.
- `wiki/languages/flydsl.md`: adds the FA / MMA-atom-API note and a "Measured on MI350X" section; FA added to `kernel_types`/`related`/`sources`.
- `data/tags.yaml`: `profiling`, `rocprofv3`, `kernel-profiling`, `register-pressure` misc tags.
- `index.md`: lists the new page + a silicon-validation pointer.
- `queries/*.md`: regenerated indices.

Validation
- `python3 scripts/validate.py` → 0 errors (7541 pages)
- `python3 tests/test_validate.py` → all pass
- `python3 scripts/generate-indices.py` → clean (indices committed)
- `docs/architecture.svg` → well-formed XML

🤖 Generated with Claude Code
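The PR does not say how the SVG was checked for well-formedness. One minimal way to perform such a check, sketched here with Python's standard-library XML parser, is shown below; `is_well_formed_xml` is a hypothetical helper, not a script in this repository, and the sample markup is invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text: str) -> bool:
    """Return True if `text` parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Invented sample markup; note the multi-word font family name is quoted
# inside the attribute value.
good = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    "<text font-family=\"'Segoe UI', sans-serif\">wiki</text>"
    "</svg>"
)
bad = "<svg><g></svg>"  # the <g> element is never closed

print(is_well_formed_xml(good))
print(is_well_formed_xml(bad))
```

To check a file on disk, the same helper could be called on the contents read from `docs/architecture.svg`. Well-formedness only confirms the XML parses; it does not validate the SVG against the SVG specification.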
Summary by Sourcery
Restructure the README to foreground hardware scope, MI350X validation details, architecture overview, and usage, move acknowledgements to the end, and add an embedded architecture diagram. Add first-party MI350X profiling content for FlyDSL, including a new flash-attention kernel page, a profiling reference repo anchor, and updated FlyDSL language docs and indices. Extend tag vocabulary for profiling-related concepts and regenerate query indices to surface the new FlyDSL flash-attention and profiling content across hardware-feature, technique, language, and kernel-type views.
New Features:
Enhancements:
- … `data/` files in README to include hardware verification metadata and clarify optional `ROCM_WIKI_ROOT` usage.

Documentation:
Chores:
- Extend `data/tags.yaml` with profiling-related tags for use across the knowledge base.