feat: Generate AGENTS.MD from GitHub repository via LLM #14
Add new feature that fetches a GitHub repository's structure and key files via the GitHub API, then uses the LLM to generate a comprehensive AGENTS.MD file summarizing the repository.

Changes:
- extension.ts: add parseGitHubUrl, fetchGitHubRepoContent, generateAgentMdFromRepo functions and generateAgentMd command handler
- server.ts: mirror the same functions and command handler for standalone server
- media/webview.html: add Generate AGENTS.MD UI section with input, button, and result textarea
- client/index.html: add same UI section for standalone web mode
- client/js/app.js: add button handler and result message handler

Co-authored-by: raux <5723643+raux@users.noreply.github.com>
@copilot test and generate test cases for each functionality. Show screenshots.
Pull request overview
Adds a new “Generate AGENTS.MD from GitHub Repository” feature that fetches a target repo’s structure/key files via the GitHub API and uses the configured LLM to produce an AGENTS.MD summary, exposed in both the VS Code webview and the standalone server UI.
Changes:
- Added GitHub URL parsing + GitHub API fetching (repo metadata, tree, selected key files) and an LLM prompt to generate AGENTS.MD.
- Wired a new generateAgentMd command through both the extension and standalone server message handlers.
- Added UI controls and client-side handlers to trigger generation and display results.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/server.ts | Implements GitHub fetch helpers + AGENTS.MD LLM generation and exposes it via the server message handler. |
| src/extension.ts | Mirrors the server-side feature inside the VS Code extension message handler and adds the same helper/LLM logic. |
| media/webview.html | Adds a webview UI section to enter a GitHub URL, trigger generation, and display the generated AGENTS.MD. |
| client/index.html | Adds the same UI section for the standalone web client. |
| client/js/app.js | Adds the button click handler and message handler for generateAgentMdResult. |
| package-lock.json | Lockfile updates from dependency tooling/run. |
    await Promise.all(filesToFetch.map(async (filePath) => {
      try {
        const fileRes = await fetch(
          `https://api.github.com/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/contents/${encodeURIComponent(filePath)}?ref=${encodeURIComponent(defaultBranch)}`,
encodeURIComponent(filePath) will encode path separators (/ -> %2F), which breaks GitHub’s /contents/{path} endpoint for nested files (e.g., src/index.ts). Use an encoding approach that preserves / (e.g., encodeURI(filePath) or encode each segment and join with /).
Suggested change:

    - `https://api.github.com/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/contents/${encodeURIComponent(filePath)}?ref=${encodeURIComponent(defaultBranch)}`,
    + `https://api.github.com/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/contents/${filePath.split('/').map(encodeURIComponent).join('/')}?ref=${encodeURIComponent(defaultBranch)}`,
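To make the difference concrete, here is a minimal sketch comparing the two encodings on a nested path. `encodePath` and the sample path are hypothetical and not part of the PR; the sketch only shows what each call produces, not how the GitHub API responds to the result.

```typescript
// Hypothetical helper (not in the PR): encode each path segment separately
// so that "/" survives as a separator.
function encodePath(filePath: string): string {
  return filePath.split('/').map(encodeURIComponent).join('/');
}

// Made-up nested path containing a space.
const nested = 'src/my file.ts';

// Whole-path encoding escapes the separator as well.
const whole = encodeURIComponent(nested); // "src%2Fmy%20file.ts"

// Per-segment encoding keeps "/" and still escapes the space.
const perSegment = encodePath(nested); // "src/my%20file.ts"

console.log(whole);
console.log(perSegment);
```

For top-level files such as README.md the two forms are identical, which is likely why the issue only shows up for files in subdirectories.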
    await Promise.all(filesToFetch.map(async (filePath) => {
      try {
        const fileRes = await fetch(
          `https://api.github.com/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/contents/${encodeURIComponent(filePath)}?ref=${encodeURIComponent(defaultBranch)}`,
encodeURIComponent(filePath) encodes / characters, so requests to .../contents/... will fail for files in subdirectories (GitHub expects slashes in the path). Switch to an encoding method that preserves / (e.g., encodeURI(filePath) or encode path segments individually).
Suggested change:

    - `https://api.github.com/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/contents/${encodeURIComponent(filePath)}?ref=${encodeURIComponent(defaultBranch)}`,
    + `https://api.github.com/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/contents/${encodeURI(filePath)}?ref=${encodeURIComponent(defaultBranch)}`,
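The two options named in the comment are not quite equivalent. `encodeURI` preserves `/`, but it also leaves reserved characters such as `#` and `?` unescaped, so a path containing them would be read as a fragment or query string. A small sketch with a made-up path (an assumption chosen only to show the difference):

```typescript
// Hypothetical path, chosen only because it contains "#".
const trickyPath = 'notes/c#/intro.md';

// encodeURI keeps "/" but also leaves "#" (and "?") unescaped.
const viaEncodeURI = encodeURI(trickyPath);

// Per-segment encodeURIComponent keeps "/" and escapes "#".
const viaSegments = trickyPath.split('/').map(encodeURIComponent).join('/');

console.log(viaEncodeURI); // notes/c#/intro.md
console.log(viaSegments);  // notes/c%23/intro.md
```

If repository paths may contain such characters, encoding each segment is the safer of the two choices; for ordinary paths both produce the same string.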
    /** Parse owner and repo name from a GitHub URL */
    function parseGitHubUrl(url: string): { owner: string; repo: string } | null {
      // Support formats: https://github.com/owner/repo, github.com/owner/repo, owner/repo
      const match = url.trim().match(/(?:https?:\/\/)?(?:www\.)?github\.com\/([^/\s]+)\/([^/\s#?]+)|^([^/\s]+)\/([^/\s#?]+)$/);
      if (match) {
        const owner = match[1] || match[3];
        const repo = (match[2] || match[4]).replace(/\.git$/, '');
        return { owner, repo };
      }
      return null;
    }

    /** Fetch the repository tree and key file contents from GitHub API */
    async function fetchGitHubRepoContent(owner: string, repo: string): Promise<string> {
      const headers: Record<string, string> = {
This GitHub fetch + summarization logic is duplicated in both src/extension.ts and src/server.ts (parse URL, fetch repo/tree, select key files, truncate, etc.). Consider extracting the shared code into a common module to avoid future behavior drift and reduce maintenance overhead.
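One way to act on this is to move the helpers into a single module that both entry points import. The sketch below shows only `parseGitHubUrl`, copied from the diff, as it might appear in such a module. The module name in the comment is an assumption, and the calls at the end are illustrative inputs rather than tests from the PR.

```typescript
// Hypothetical shared module, e.g. src/github.ts (the file name is an assumption).
// In a real module this function would carry `export`, and both
// src/extension.ts and src/server.ts would import it instead of keeping copies.

/** Parse owner and repo name from a GitHub URL (body copied from the PR diff). */
function parseGitHubUrl(url: string): { owner: string; repo: string } | null {
  // Support formats: https://github.com/owner/repo, github.com/owner/repo, owner/repo
  const match = url.trim().match(/(?:https?:\/\/)?(?:www\.)?github\.com\/([^/\s]+)\/([^/\s#?]+)|^([^/\s]+)\/([^/\s#?]+)$/);
  if (match) {
    const owner = match[1] || match[3];
    const repo = (match[2] || match[4]).replace(/\.git$/, '');
    return { owner, repo };
  }
  return null;
}

// Illustrative inputs covering the three documented formats plus a rejection.
console.log(parseGitHubUrl('https://github.com/owner/repo'));  // { owner: 'owner', repo: 'repo' }
console.log(parseGitHubUrl('github.com/owner/repo.git'));      // { owner: 'owner', repo: 'repo' }
console.log(parseGitHubUrl('owner/repo'));                     // { owner: 'owner', repo: 'repo' }
console.log(parseGitHubUrl('not a url'));                      // null
```

With a single definition, a fix such as the path-encoding change only has to be made once.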
New feature: given a GitHub repository URL, fetch repo structure and key files via the GitHub API, then generate a comprehensive AGENTS.MD summarizing the codebase using the configured LLM.

Backend (src/extension.ts, src/server.ts)
- parseGitHubUrl(): accepts https://github.com/owner/repo, github.com/owner/repo, or owner/repo
- fetchGitHubRepoContent(): fetches repo metadata, recursive file tree, and contents of key files (README, package.json, entry points, config files) via GitHub's public API. Size-limited to 15 files, 8KB each.
- generateAgentMdFromRepo(): LLM call with a system prompt tailored for AGENTS.MD generation (architecture, setup, conventions, testing, dependencies, etc.)
- generateAgentMd command handler: validates URL → verifies LLM connection → fetches from GitHub → calls LLM → broadcasts generateAgentMdResult

Frontend (media/webview.html, client/index.html, client/js/app.js)
- Handles the generateAgentMdResult message to display generated content or errors

Follows the same patterns as the existing processAgentMd feature (LLM connection verification, retry logic, XML tag response parsing, dual UI support).
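The 15-file, 8KB limits stated for fetchGitHubRepoContent() could be enforced roughly as follows. This is a sketch under assumptions: the `RepoFile` type, the constant names, and `limitRepoFiles` are hypothetical, and "8KB" is interpreted as 8192 characters because the description does not say how size is measured or how the PR applies the cap.

```typescript
// Hypothetical shape for a fetched file; not taken from the PR.
interface RepoFile {
  path: string;
  content: string;
}

const MAX_FILES = 15;                // limit stated in the PR description
const MAX_CHARS_PER_FILE = 8 * 1024; // assumption: "8KB" counted in characters

/** Keep at most MAX_FILES files and truncate each to MAX_CHARS_PER_FILE. */
function limitRepoFiles(files: RepoFile[]): RepoFile[] {
  return files.slice(0, MAX_FILES).map((file) => ({
    path: file.path,
    // slice() returns the string unchanged when it is already short enough
    content: file.content.slice(0, MAX_CHARS_PER_FILE),
  }));
}

// Illustrative input: 20 oversized files.
const sampleFiles: RepoFile[] = Array.from({ length: 20 }, (_, i) => ({
  path: `file${i}.ts`,
  content: 'x'.repeat(10000),
}));
const limited = limitRepoFiles(sampleFiles);
console.log(limited.length, limited[0].content.length);
```

Capping both the file count and the per-file size bounds how much repository text ends up in the LLM prompt.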