Detailed technical documentation for IssueScout internals. Supported by Vexrail.
- Architecture Overview
- Project Structure
- Pages
- API Routes
- Services
- Models (MongoDB)
- Components
- Auth Flow
- Search System
- Recommendations System
- Health Score Algorithm
- Difficulty Estimation
- Caching Strategy
- Analytics
- Known Issues and Gotchas
- Deployment
```
User (Browser)
  |
  v
Next.js App Router (pages + API routes)
  |
  +-- NextAuth v5 (GitHub OAuth) --> GitHub OAuth App
  |
  +-- API Routes
  |     |
  |     +-- /api/issues ---------> GitHub GraphQL API (search)
  |     |                          + healthScore service (per-repo GraphQL)
  |     |                          + difficulty service (rule-based + OpenAI)
  |     |                          + CachedIssue (MongoDB, 24h TTL)
  |     |                          + IndexedRepo (MongoDB, permanent)
  |     |                          + SearchLog (MongoDB, permanent)
  |     |
  |     +-- /api/issues/enrich ---> Two-level cache enrichment pipeline
  |     |                          L1: CachedIssue (difficulty, 24h TTL)
  |     |                          L2: IndexedRepo (health, permanent, stale-while-revalidate 48h)
  |     |
  |     +-- /api/recommendations -> GitHub GraphQL API (per-language search)
  |     |                          + parallel language searches (Promise.all)
  |     |                          + two-level cache enrichment
  |     |                          + match score calculation
  |     |
  |     +-- /api/issues/bookmark -> Bookmark collection (MongoDB, permanent)
  |     +-- /api/onboarding ------> fetchUserProfile (GitHub GraphQL)
  |     +-- /api/user/preferences -> User collection (MongoDB)
  |     +-- /api/stats -----------> IndexedRepo.countDocuments()
  |     +-- /api/survey ----------> SurveyVote collection (MongoDB)
  |
  +-- MongoDB Atlas
        +-- users (preferences, onboarding state)
        +-- cachedissues (24h TTL, difficulty cache)
        +-- indexedrepos (permanent, repo health cache)
        +-- searchlogs (permanent, analytics)
        +-- bookmarks (permanent, issue snapshots)
        +-- surveyvotes (permanent, landing page survey)
```
Key design decisions:
- Each authenticated user's GitHub OAuth token is used for API requests (5K req/hr per user) instead of a single server token
- Two-phase progressive loading: issues appear instantly, enrichment fills in asynchronously
- Two-level caching: issue difficulty (24h TTL) + repo health (permanent with stale-while-revalidate at 48h)
- GitHub token is revoked server-side on sign-out
```
src/
├── app/
│   ├── layout.tsx  # Root layout (fonts, providers, header)
│   ├── globals.css  # Tailwind v4 theme, dark mode, prose styles
│   ├── page.tsx  # Landing page (/)
│   ├── explore/page.tsx  # Issue search + recommendations (/explore)
│   ├── issue/[id]/page.tsx  # Issue detail view (/issue/owner__repo__number)
│   ├── bookmarks/page.tsx  # Saved & archived bookmarks (/bookmarks)
│   ├── onboarding/page.tsx  # 2-step onboarding wizard (/onboarding)
│   ├── settings/page.tsx  # Edit preferences (/settings)
│   └── api/
│       ├── auth/[...nextauth]/  # NextAuth handlers
│       ├── issues/  # Search issues (GET)
│       ├── issues/[id]/  # Single issue detail (GET)
│       ├── issues/enrich/  # Batch enrichment (POST)
│       ├── issues/bookmark/  # Bookmark CRUD (GET, POST)
│       ├── onboarding/  # Profile detection + save prefs (GET, POST)
│       ├── recommendations/  # Personalized issues (GET)
│       ├── stats/  # IndexedRepo count (GET)
│       ├── survey/  # Landing page survey (GET, POST)
│       └── user/preferences/  # User preferences CRUD (GET, PUT)
├── components/
│   ├── Header.tsx  # Sticky nav with auth, theme toggle
│   ├── FilterBar.tsx  # Search filters (language, difficulty, labels, sort)
│   ├── IssueCard.tsx  # Issue card with stats, badges, bookmarks
│   ├── IssueCardSkeleton.tsx  # Loading skeleton
│   ├── DifficultyBadge.tsx  # Easy/Medium/Hard badge with AI indicator
│   ├── HealthScoreBadge.tsx  # 0-100 health score badge
│   ├── ThemeToggle.tsx  # Dark/light mode toggle
│   ├── SessionProvider.tsx  # NextAuth session wrapper
│   └── ui/  # 16 shadcn/ui components
├── lib/
│   ├── auth.ts  # NextAuth v5 config, JWT callbacks, user upsert, token revocation
│   ├── mongodb.ts  # Mongoose singleton connection
│   └── utils.ts  # cn() utility for classnames
├── models/
│   ├── User.ts  # User preferences + onboarding state
│   ├── CachedIssue.ts  # 24h TTL cache for difficulty
│   ├── IndexedRepo.ts  # Permanent repo health cache (stale-while-revalidate 48h)
│   ├── SearchLog.ts  # Permanent search analytics
│   ├── Bookmark.ts  # Saved issue snapshots
│   └── SurveyVote.ts  # Landing page survey votes
├── services/
│   ├── github.ts  # GitHub GraphQL queries + search builder
│   ├── healthScore.ts  # 7-factor repo health calculator
│   └── difficulty.ts  # Rule-based + GPT-4o-mini estimator
└── types/
    ├── index.ts  # All app types (GitHubIssue, EnrichedIssue, etc.)
    └── next-auth.d.ts  # NextAuth type augmentations
```
Client component. Hero section with search bar, language quick-filters (JS, TS, Python, Go, Rust, Java, Ruby, C++), feature cards, how-it-works section, stats strip with animated repos-indexed counter, open source survey banner, footer. If authenticated user hasn't completed onboarding, redirects to /onboarding.
Client component wrapped in <Suspense>. Requires authentication (shows sign-in prompt if unauthenticated). Two tabs:
Search tab: Full <FilterBar> with text search, language dropdown (16 languages), difficulty filter, label multi-select (18 labels in 4 groups), sort (6 options), claimed/unclaimed toggle. Results from /api/issues with server-side pagination ("Load More"). URL state sync with debounce (300ms for query, instant for dropdowns). Two-phase progressive loading: issues render immediately, enrichment badges fill in asynchronously.
For You tab (authenticated only): Shows user's preference tags (clickable language pills + topic badges). Lightweight inline filters (search input + difficulty dropdown). Fetches all 60 recommendations at once from /api/recommendations, client-side pagination showing 20 at a time with "Load More".
Client component. URL param format: owner__repo__number. Shows full issue body (rendered from bodyHTML with prose styling), health score badge, difficulty badge, labels. Sidebar with repo info, community health report breakdown, and "Getting Started" guide. Uses IndexedRepo for health (blocking refresh if stale/missing).
Client component. Requires auth. Two tabs: Saved (active bookmarks) and Archived. Each shows a grid of <IssueCard> with bookmark/archive action buttons.
Client component. Requires auth. Two steps:
- Languages: Auto-detected from GitHub profile (own repos weighted 2x, starred repos 1x) + popular languages list (21 options) + custom input.
- Frameworks: Auto-detected topics from starred repos + popular frameworks list (35 options) + custom input.
Saves to User.preferredLanguages and User.preferredFrameworks. Sets onboardingCompleted: true. Redirects to /explore?tab=recommended after completion.
Client component. Requires auth. Same language/framework selection UI as onboarding but editable at any time. "Re-scan GitHub" button to refresh auto-detected data.
Main search endpoint. Returns raw GitHub results without enrichment (fast path).
Query params:
| Param | Type | Default | Description |
|---|---|---|---|
| `q` | string | `""` | Free-text search |
| `language` | string | `""` | Programming language |
| `difficulty` | `easy\|medium\|hard\|all` | `"all"` | Difficulty filter (post-enrichment) |
| `sort` | string | `"newest"` | Sort order |
| `after` | string | `null` | Pagination cursor |
| `limit` | number | `60` | Results per page (max 100) |
| `labels` | comma-separated | `""` | Label filter (OR logic) |
| `showClaimed` | `"true"\|"false"` | `"false"` | Include assigned/linked-PR issues |
Batch enrichment endpoint. Two-level cache lookup:
- L1: Bulk `CachedIssue` check for difficulty
- L2: Bulk `IndexedRepo` check for health scores
- Missing repos get blocking `calculateHealthScore()` + stored in IndexedRepo
- Stale repos (>48h) return cached data immediately + fire-and-forget background refresh
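The stale-while-revalidate rule above can be sketched as follows. This is a minimal illustration, not the project's code: `IndexedRepoLike`, `isStale`, `resolveHealth`, and the `recompute` callback are hypothetical names, and only the 48h threshold and the blocking/background split come from the text.

```typescript
// Hypothetical shape; the real IndexedRepo model has more fields.
interface IndexedRepoLike {
  fullName: string;
  healthScore: number;
  lastEnrichedAt: Date;
}

// 48h staleness threshold, as stated in the description above.
const STALE_AFTER_MS = 48 * 60 * 60 * 1000;

function isStale(repo: IndexedRepoLike, now: Date = new Date()): boolean {
  return now.getTime() - repo.lastEnrichedAt.getTime() > STALE_AFTER_MS;
}

// `recompute` stands in for a health calculation that also stores its result.
async function resolveHealth(
  fullName: string,
  cached: IndexedRepoLike | undefined,
  recompute: (fullName: string) => Promise<number>,
  now: Date = new Date(),
): Promise<number> {
  if (!cached) {
    // Missing repo: block until the score has been computed.
    return recompute(fullName);
  }
  if (isStale(cached, now)) {
    // Stale repo: refresh in the background and ignore failures here.
    void recompute(fullName).catch(() => undefined);
  }
  // Fresh or stale: answer immediately from the cache.
  return cached.healthScore;
}
```

Passing `now` explicitly keeps the staleness check deterministic in tests.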
Single issue detail. URL param format: owner__repo__number. Returns full issue with bodyHTML, participants, timeline items, plus health score and difficulty. Uses IndexedRepo with blocking refresh if stale.
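A small sketch of parsing the `owner__repo__number` URL param. The helper name `parseIssueId` is hypothetical, and the sketch assumes the owner segment never contains `__`, so the first separator ends the owner and the last one starts the issue number.

```typescript
interface IssueRef {
  owner: string;
  repo: string;
  number: number;
}

// Returns null when the id does not have the owner__repo__number shape.
function parseIssueId(id: string): IssueRef | null {
  const first = id.indexOf("__");
  const last = id.lastIndexOf("__");
  // Need a non-empty owner and two distinct separators.
  if (first <= 0 || last <= first) return null;

  const owner = id.slice(0, first);
  const repo = id.slice(first + 2, last);
  const rawNumber = id.slice(last + 2);

  // Issue numbers are positive integers.
  if (!repo || !/^[1-9]\d*$/.test(rawNumber)) return null;
  return { owner, repo, number: Number(rawNumber) };
}
```

Splitting on the first and last separator, rather than on every `__`, keeps repository names that themselves contain underscores intact.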
GET: Fetch user's bookmarks. Query param status: active|archived|all (default active).
POST: Modify bookmarks. Body: { issueId, action: "add"|"remove"|"archive"|"unarchive", issueData? }. issueData required for "add" — stores full enriched issue snapshot.
GET: Returns auto-detected languages/topics from GitHub profile + any saved preferences.
POST: Saves onboarding preferences. Body: { languages: string[], frameworks: string[] }.
Returns up to 60 personalized issues. Auth required. Parallel language searches via Promise.all. Two-level cache enrichment. See Recommendations System.
GET: Returns current user preferences and onboarding status.
PUT: Updates preferences. Body: { languages?: string[], frameworks?: string[] }.
Returns { reposIndexed: number } — count of documents in the IndexedRepo collection.
GET: Returns { yes: number, no: number } vote counts.
POST: Records a vote. Body: { vote: "yes"|"no" }.
Three GraphQL queries:
- `SEARCH_ISSUES_QUERY`: Searches issues with reactions, repo metadata, labels, comments
- `REPO_HEALTH_QUERY`: Fetches repo health signals (CONTRIBUTING.md, license, CoC, activity, PRs)
- `USER_PROFILE_QUERY`: Fetches user's repos + starred repos for language/topic detection
Key functions:
`searchIssues(query, language, first, after, userToken, options)` - Builds a GitHub search query string with these qualifiers:
- Base: `state:open is:issue is:public archived:false` (always)
- Availability: `no:assignee -linked:pr` (default, toggleable)
- Labels: OR-logic `label:"l1","l2"` (defaults to good-first-issue variants)
- User text query (prepended)
- Language filter
- Difficulty proxy: easy -> `comments:0..5`, medium/hard -> no proxy
- Sort qualifier
fetchRepoHealth(owner, name, userToken) - Fetches repo metadata for health scoring.
fetchUserProfile(login, userToken) - Fetches user's top 20 repos (by stars) + last 30 starred repos. Returns top 10 languages (own repos weighted 2x) and top 10 topics.
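The 2x/1x language weighting described for `fetchUserProfile` could look roughly like this. `RepoLike` and `rankLanguages` are hypothetical names, and the alphabetical tie-break is an assumption added only to make the output deterministic.

```typescript
interface RepoLike {
  primaryLanguage: string | null;
}

// Own repos count twice as much as starred repos; returns the top N languages.
function rankLanguages(ownRepos: RepoLike[], starredRepos: RepoLike[], topN = 10): string[] {
  const scores = new Map<string, number>();

  const add = (repos: RepoLike[], weight: number): void => {
    for (const repo of repos) {
      if (!repo.primaryLanguage) continue; // skip repos without a detected language
      scores.set(repo.primaryLanguage, (scores.get(repo.primaryLanguage) ?? 0) + weight);
    }
  };

  add(ownRepos, 2);
  add(starredRepos, 1);

  return Array.from(scores.entries())
    // Highest score first; ties broken alphabetically (assumption).
    .sort(([aLang, aScore], [bLang, bScore]) => bScore - aScore || aLang.localeCompare(bLang))
    .slice(0, topN)
    .map(([language]) => language);
}
```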
| Field | Type | Default | Notes |
|---|---|---|---|
| `githubId` | String (unique) | - | GitHub user ID |
| `login` | String | - | GitHub username |
| `name` | String | `""` | Display name |
| `avatarUrl` | String | `""` | Avatar URL |
| `email` | String | `""` | |
| `languages` | String[] | `[]` | Auto-detected from GitHub |
| `frameworks` | String[] | `[]` | Auto-detected |
| `topics` | String[] | `[]` | Auto-detected from starred repos |
| `preferredLanguages` | String[] | `[]` | User-curated (onboarding + settings) |
| `preferredFrameworks` | String[] | `[]` | User-curated |
| `onboardingCompleted` | Boolean | `false` | Onboarding status |
Caches difficulty per issue. Health scores live in IndexedRepo.
| Field | Type | TTL |
|---|---|---|
| `issueId` | String (unique) | - |
| `difficulty` | `easy\|medium\|hard\|unknown` | - |
| `difficultyReason` | String | - |
| `difficultyUsedAI` | Boolean | - |
| `repoFullName` | String | - |
| `language` | String | - |
| `cachedAt` | Date | 86400s (24h) |
Permanent repo health cache. No TTL. Data older than 48h is returned immediately but triggers a background refresh.
| Field | Type | Notes |
|---|---|---|
| `fullName` | String (unique) | owner/name |
| `owner` | String | - |
| `name` | String | - |
| `healthScore` | Number | 0-100 |
| `healthDetails` | Mixed | Full breakdown |
| `stargazerCount` | Number | - |
| `forkCount` | Number | - |
| `primaryLanguage` | String | - |
| `description` | String | - |
| `lastEnrichedAt` | Date | Used for stale check |
Indexes: primaryLanguage+healthScore, lastEnrichedAt.
| Field | Type | Default |
|---|---|---|
| `userId` | String | `null` |
| `userLogin` | String | `null` |
| `query` | String | `""` |
| `programmingLanguage` | String | `""` |
| `difficulty` | String | `"all"` |
| `sort` | String | `"newest"` |
| `resultCount` | Number | `0` |
| `timestamp` | Date | `Date.now` |
Note: Field is programmingLanguage not language — MongoDB reserves language for text index language override.
| Field | Type | Default |
|---|---|---|
| `userId` | String | - |
| `issueId` | String | - |
| `issueData` | Mixed | Full enriched issue snapshot |
| `archived` | Boolean | `false` |
| `savedAt` | Date | `Date.now` |
| `archivedAt` | Date | `null` |
| Field | Type |
|---|---|
| `question` | String (indexed) |
| `vote` | `"yes"\|"no"` |
| Component | Key Props | Description |
|---|---|---|
| `Header` | none | Sticky nav bar. Logo, links, ThemeToggle, user dropdown or sign-in button. Clears cookies and revokes token on sign-out. |
| `FilterBar` | query, language, difficulty, sort, labels, showClaimed, callbacks | Full search control panel. Text input, language select, difficulty select, label multi-select popover (4 groups, 18 labels), sort select, claimed toggle. |
| `IssueCard` | issue, enriching?, onBookmarkToggle? | Card with repo info, title, body preview, labels, HealthScoreBadge, DifficultyBadge, footer stats with tooltips, bookmark buttons, match score. Skeleton badges while enriching. |
| `DifficultyBadge` | difficulty, reason?, usedAI? | Color-coded badge. Purple sparkle if AI was used. |
| `HealthScoreBadge` | score, details? | Score badge with breakdown tooltip. |
| `ThemeToggle` | none | Dark/light mode toggle. Persists to localStorage. |
| `GoogleAnalytics` | none | Conditionally loads GA4 gtag.js scripts. Only renders after cookie consent is accepted. Listens for cookie-consent-update custom event. |
| `CookieConsent` | none | Fixed bottom banner asking user to accept/decline cookies. Persists choice to localStorage (cookie-consent key). Dispatches cookie-consent-update event on accept. |
Located in src/components/ui/: avatar, badge, button, card, checkbox, command, dialog, dropdown-menu, input, popover, select, separator, sheet, skeleton, tabs, tooltip.
- User clicks "Sign in with GitHub" -> NextAuth redirects to GitHub OAuth
- User authorizes -> GitHub redirects back with code
- NextAuth exchanges code for access token
- JWT callback fires:
  - Stores `access_token`, `githubId`, `login`, `avatarUrl` on JWT
  - Upserts user in MongoDB (creates on first sign-in, updates on subsequent)
  - Uses `$setOnInsert` for defaults (doesn't overwrite existing preferences)
- Session callback copies JWT fields to the client-visible session
- Client checks `/api/user/preferences`; if `onboardingCompleted === false`, redirects to `/onboarding`
- On sign-out: GitHub OAuth token is revoked server-side via `DELETE /applications/{client_id}/token`. Utility cookies are cleared client-side.
Why not trust JWT for onboarding status: JWT tokens are static — onboardingCompleted stored in JWT at sign-in becomes stale after onboarding completes mid-session.
OAuth scopes: read:user, user:email
Every search query starts with base qualifiers:
state:open is:issue is:public archived:false
Then conditionally adds:
- `no:assignee -linked:pr`: default on, toggle off with "Show Claimed"
- `label:"good first issue","good-first-issue"`: default labels, customizable via multi-select
- User's free-text query
- `language:X`: from language dropdown
- `comments:0..5`: only when difficulty=easy (heuristic proxy)
- `sort:X`: maps to GitHub sort qualifiers
18 labels organized in 4 groups:
Beginner-Friendly (default: first two selected): good first issue, good-first-issue, beginner, beginner-friendly, easy, starter, first-timers-only, help wanted
Contribution Type: documentation, bug, enhancement, feature
Issue Type: frontend, backend, ui, ux, testing
Events: hacktoberfest
Labels use OR logic. Empty selection = no label filter.
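The qualifier rules above could be assembled along these lines. This is a sketch under assumptions, not the actual `searchIssues` builder: `SearchOptions` and `buildSearchQuery` are hypothetical names, and both the part ordering and the reading that `undefined` labels fall back to the defaults while an empty array disables the filter are interpretations of the text.

```typescript
interface SearchOptions {
  text?: string;
  language?: string;
  labels?: string[]; // undefined -> default labels, [] -> no label filter (assumption)
  showClaimed?: boolean;
  difficulty?: "easy" | "medium" | "hard" | "all";
  sortQualifier?: string; // e.g. "sort:created-desc"
}

const DEFAULT_LABELS = ["good first issue", "good-first-issue"];

function buildSearchQuery(options: SearchOptions = {}): string {
  const parts: string[] = [];

  // Free-text query goes first.
  const text = options.text?.trim();
  if (text) parts.push(text);

  // Base qualifiers are always present.
  parts.push("state:open is:issue is:public archived:false");

  // Hide claimed issues unless the user opted in.
  if (!options.showClaimed) parts.push("no:assignee -linked:pr");

  // Comma-separated quoted labels give OR semantics in GitHub search.
  const labels = options.labels ?? DEFAULT_LABELS;
  if (labels.length > 0) {
    parts.push(`label:${labels.map((label) => `"${label}"`).join(",")}`);
  }

  if (options.language) parts.push(`language:${options.language}`);

  // Only "easy" has a search-side proxy.
  if (options.difficulty === "easy") parts.push("comments:0..5");

  if (options.sortQualifier) parts.push(options.sortQualifier);

  return parts.join(" ");
}
```

For example, `buildSearchQuery({ labels: [], showClaimed: true })` yields only the base qualifiers.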
| UI Label | GitHub Qualifier | Client-side? |
|---|---|---|
| Newest First | sort:created-desc | No |
| Oldest First | sort:created-asc | No |
| Most Discussed | sort:comments-desc | No |
| Most Reactions | sort:reactions-desc | No |
| Recently Updated | sort:updated-desc | No |
| Best Community | - | Yes (sorts by healthScore) |
Difficulty is computed post-fetch. When a difficulty filter is active, the API over-fetches:
- Easy (with `comments:0..5` proxy): ~80% hit rate, up to 3 rounds x 60 issues
- Medium/Hard (no proxy): ~20-40% hit rate, up to 5 rounds x 60 issues
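A rough sketch of the over-fetch loop, assuming a hypothetical `fetchPage` callback that returns one page of already-enriched issues plus a cursor. Only the round limits (3 for easy, 5 otherwise) and the page size of 60 come from the text; `collectByDifficulty` and `maxRoundsFor` are illustrative names.

```typescript
type Difficulty = "easy" | "medium" | "hard";

interface Page<T> {
  items: T[];
  nextCursor: string | null;
}

const PAGE_SIZE = 60;

// Easy has a search-side proxy, so fewer rounds are needed.
function maxRoundsFor(target: Difficulty): number {
  return target === "easy" ? 3 : 5;
}

async function collectByDifficulty<T extends { difficulty: Difficulty }>(
  target: Difficulty,
  wanted: number,
  fetchPage: (cursor: string | null, pageSize: number) => Promise<Page<T>>,
): Promise<T[]> {
  const matches: T[] = [];
  let cursor: string | null = null;

  for (let round = 0; round < maxRoundsFor(target) && matches.length < wanted; round++) {
    const page: Page<T> = await fetchPage(cursor, PAGE_SIZE);
    for (const item of page.items) {
      if (item.difficulty === target) matches.push(item);
    }
    if (page.nextCursor === null) break; // no more results upstream
    cursor = page.nextCursor;
  }

  return matches.slice(0, wanted);
}
```

The loop stops as soon as enough matches are collected, so the round limit is a ceiling rather than a fixed cost.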
- Determine preferences: Use `user.preferredLanguages` (up to 5) + `user.preferredFrameworks`. Fallback to `fetchUserProfile()` (top 3 languages from GitHub).
- Fetch issues: For each of the top 3 languages, call `searchIssues("", language, 30)` in parallel via `Promise.all`. Up to 90 raw issues total.
- Enrich: Two-level cache (CachedIssue + IndexedRepo). Repo-level deduplication.
- Calculate match score (0-100):
| Factor | Points | Logic |
|---|---|---|
| Language rank | 20-50 | 1st = 50, 2nd = 35, 3rd = 20 |
| Health contribution | 0-30 | healthScore * 0.3 |
| Difficulty bonus | 0-10 | Easy = 10, Medium = 5, Hard = 0 |
| Framework match | 0-10 | +10 if repo description contains any user framework |
- Sort by match score descending, deduplicate, return top 60.
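The match score table can be expressed as a small function. This is a sketch: `MatchInput` and `matchScore` are hypothetical names, and the rounding, the case-insensitive framework match, and scoring `unknown` difficulty as 0 are assumptions not stated in the table.

```typescript
interface MatchInput {
  languageRank: number; // 0 = user's 1st language, 1 = 2nd, 2 = 3rd
  healthScore: number; // 0-100
  difficulty: "easy" | "medium" | "hard" | "unknown";
  repoDescription: string;
  frameworks: string[];
}

const LANGUAGE_POINTS = [50, 35, 20];

function matchScore(input: MatchInput): number {
  // Language rank: 50 / 35 / 20, nothing beyond the top 3.
  const languagePoints = LANGUAGE_POINTS[input.languageRank] ?? 0;

  // Health contributes up to 30 points.
  const healthPoints = input.healthScore * 0.3;

  // Easier issues get a small bonus.
  const difficultyPoints = input.difficulty === "easy" ? 10 : input.difficulty === "medium" ? 5 : 0;

  // +10 if the repo description mentions any of the user's frameworks.
  const description = input.repoDescription.toLowerCase();
  const frameworkPoints = input.frameworks.some((f) => description.includes(f.toLowerCase())) ? 10 : 0;

  return Math.round(languagePoints + healthPoints + difficultyPoints + frameworkPoints);
}
```

The four maxima (50 + 30 + 10 + 10) add up to the 0-100 range given above.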
7-factor score from 0-100. Calculated per repository via a dedicated GraphQL query.
| Factor | Max Points | Scoring |
|---|---|---|
| CONTRIBUTING.md | 15 | Checks both CONTRIBUTING.md and contributing.md |
| License | 10 | Any license present |
| Code of Conduct | 5 | Has code of conduct |
| Recent Activity | 20 | Last commit <30d = 20, <90d = 10, else 0 |
| Star Count | 15 | >=1000 = 15, >=100 = 12, >=10 = 8, >=1 = 4 |
| Response Time | 20 | Avg first-comment time on recent issues. <24h = 20, <72h = 15, <168h = 10, else 5 |
| PR Merge Rate | 15 | Merged PRs >=100 = 15, >=50 = 12, >=10 = 8, >=1 = 4 |
Community size label: large (stars>=1000 or forks>=100), medium (stars>=100 or forks>=20), small (else).
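The scoring table and community size rule translate fairly directly into code. The sketch below assumes the raw signals have already been extracted from the GraphQL response; `RepoSignals`, `tiered`, `healthScore`, and `communitySize` are hypothetical names, and how a repo with no response-time data is scored is not specified in the table, so the sketch requires a number.

```typescript
interface RepoSignals {
  hasContributing: boolean;
  hasLicense: boolean;
  hasCodeOfConduct: boolean;
  daysSinceLastCommit: number;
  stars: number;
  avgFirstResponseHours: number;
  mergedPrCount: number;
}

// Returns the points of the first tier whose minimum is met, else 0.
function tiered(value: number, tiers: Array<[number, number]>): number {
  for (const [min, points] of tiers) {
    if (value >= min) return points;
  }
  return 0;
}

function healthScore(s: RepoSignals): number {
  let score = 0;
  if (s.hasContributing) score += 15;
  if (s.hasLicense) score += 10;
  if (s.hasCodeOfConduct) score += 5;
  score += s.daysSinceLastCommit < 30 ? 20 : s.daysSinceLastCommit < 90 ? 10 : 0;
  score += tiered(s.stars, [[1000, 15], [100, 12], [10, 8], [1, 4]]);
  score +=
    s.avgFirstResponseHours < 24 ? 20 : s.avgFirstResponseHours < 72 ? 15 : s.avgFirstResponseHours < 168 ? 10 : 5;
  score += tiered(s.mergedPrCount, [[100, 15], [50, 12], [10, 8], [1, 4]]);
  return score;
}

function communitySize(stars: number, forks: number): "large" | "medium" | "small" {
  if (stars >= 1000 || forks >= 100) return "large";
  if (stars >= 100 || forks >= 20) return "medium";
  return "small";
}
```

The seven maxima (15 + 10 + 5 + 20 + 15 + 20 + 15) sum to 100, matching the stated range.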
Two-tier system:
Scans title, body, and labels for keyword signals:
- Easy (18 keywords): typo, documentation, readme, spelling, first-timers-only, beginner, simple, translation, etc.
- Medium (13): feature, refactor, component, test, bug, fix, implement, etc.
- Hard (12): architecture, security, performance, migration, api redesign, critical, etc.
Additional: body length (>2000 chars = +hard, <300 = +easy), label overrides.
Confidence = max category score / total signals.
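A minimal sketch of the rule-based tier and its confidence formula. The keyword lists are only the examples named above (the full lists are longer), label overrides are omitted because their rules are not described, and the plain substring matching and the easy-first tie-break are assumptions. `estimateByRules` is a hypothetical name.

```typescript
type Level = "easy" | "medium" | "hard";

interface RuleEstimate {
  difficulty: Level | "unknown";
  confidence: number; // max category score / total signals
}

// Subset of the keyword lists: only the examples named in the text.
const KEYWORDS: Record<Level, string[]> = {
  easy: ["typo", "documentation", "readme", "spelling", "first-timers-only", "beginner", "simple", "translation"],
  medium: ["feature", "refactor", "component", "test", "bug", "fix", "implement"],
  hard: ["architecture", "security", "performance", "migration", "api redesign", "critical"],
};

const LEVELS: Level[] = ["easy", "medium", "hard"];

function estimateByRules(title: string, body: string, labels: string[]): RuleEstimate {
  const text = `${title} ${body} ${labels.join(" ")}`.toLowerCase();
  const scores: Record<Level, number> = { easy: 0, medium: 0, hard: 0 };

  // One signal per keyword found anywhere in title, body, or labels.
  for (const level of LEVELS) {
    for (const keyword of KEYWORDS[level]) {
      if (text.includes(keyword)) scores[level] += 1;
    }
  }

  // Body length heuristic.
  if (body.length > 2000) scores.hard += 1;
  else if (body.length < 300) scores.easy += 1;

  const total = scores.easy + scores.medium + scores.hard;
  if (total === 0) return { difficulty: "unknown", confidence: 0 };

  // Pick the highest-scoring level; earlier levels win ties (assumption).
  let best: Level = "easy";
  for (const level of LEVELS) {
    if (scores[level] > scores[best]) best = level;
  }
  return { difficulty: best, confidence: scores[best] / total };
}
```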
Sends title + body (500 chars) + labels to GPT-4o-mini with temperature: 0.1, max_tokens: 100, response_format: json_object. Returns { difficulty, reason }.
Falls back to rule-based if API key is missing or call fails.
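The request parameters and the fallback behavior might be wired up like this. The sketch only builds the request body and parses the reply, with no network call. The prompt wording, `buildDifficultyRequest`, and `parseDifficultyResponse` are assumptions; the model name, `temperature`, `max_tokens`, JSON response format, and 500-character body limit are from the description above.

```typescript
type AiLevel = "easy" | "medium" | "hard";

interface DifficultyEstimate {
  difficulty: AiLevel | "unknown";
  reason: string;
  usedAI: boolean;
}

function isAiLevel(value: unknown): value is AiLevel {
  return value === "easy" || value === "medium" || value === "hard";
}

// Builds the chat-completions request body; the prompt text is illustrative.
function buildDifficultyRequest(title: string, body: string, labels: string[]) {
  return {
    model: "gpt-4o-mini",
    temperature: 0.1,
    max_tokens: 100,
    response_format: { type: "json_object" as const },
    messages: [
      {
        role: "system" as const,
        content: 'Classify the issue difficulty. Reply with JSON: {"difficulty": "easy|medium|hard", "reason": "..."}',
      },
      {
        role: "user" as const,
        // Body is truncated to 500 characters to keep the prompt small.
        content: `Title: ${title}\nLabels: ${labels.join(", ")}\nBody: ${body.slice(0, 500)}`,
      },
    ],
  };
}

// Returns the rule-based fallback when the reply is not usable.
function parseDifficultyResponse(content: string, fallback: DifficultyEstimate): DifficultyEstimate {
  try {
    const parsed: unknown = JSON.parse(content);
    if (typeof parsed === "object" && parsed !== null) {
      const { difficulty, reason } = parsed as Record<string, unknown>;
      if (isAiLevel(difficulty) && typeof reason === "string") {
        return { difficulty, reason, usedAI: true };
      }
    }
  } catch {
    // Malformed JSON: fall through to the fallback.
  }
  return fallback;
}
```

Validating the parsed reply before trusting it means a malformed or off-schema answer degrades to the rule-based result instead of surfacing an error.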
The usedAI flag is displayed as a purple sparkle on DifficultyBadge.
| Collection | TTL | What's Cached | Purpose |
|---|---|---|---|
| `CachedIssue` | 24 hours | Difficulty per issue | Avoid re-computing difficulty within a day |
| `IndexedRepo` | Permanent (stale at 48h) | Health score per repo | Repo health persists across issues, stale-while-revalidate |
| `SearchLog` | Permanent | Search queries + filters | Analytics |
| `Bookmark` | Permanent | Full enriched issue snapshot | User's saved issues |
| `User` | Permanent | Preferences, onboarding | User profile |
| `SurveyVote` | Permanent | Landing page votes | Community feedback |
MongoDB connection: Singleton pattern with global.mongooseCache to prevent multiple connections during Next.js hot reloading.
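The singleton pattern can be illustrated without Mongoose itself. In the sketch below, `connect` is a hypothetical stand-in for the real connection call and `getConnection` is an illustrative name; only the idea of caching on the global object under `mongooseCache` comes from the text.

```typescript
interface ConnectionCache<T> {
  conn: T | null;
  promise: Promise<T> | null;
}

// Module state is reset on hot reload, but the global object survives it.
const globalWithCache = globalThis as typeof globalThis & {
  mongooseCache?: ConnectionCache<unknown>;
};

function getConnection<T>(connect: () => Promise<T>): Promise<T> {
  if (!globalWithCache.mongooseCache) {
    globalWithCache.mongooseCache = { conn: null, promise: null };
  }
  const cache = globalWithCache.mongooseCache as ConnectionCache<T>;

  // Already connected: reuse the existing connection.
  if (cache.conn !== null) return Promise.resolve(cache.conn);

  // Connection in flight: share the same promise instead of connecting again.
  if (cache.promise === null) {
    cache.promise = connect().then((conn) => {
      cache.conn = conn;
      return conn;
    });
  }
  return cache.promise;
}
```

Caching the promise as well as the resolved connection prevents concurrent requests from each opening their own connection during startup.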
The hosted version uses Google Analytics 4 (GA4). The measurement ID is configured via the NEXT_PUBLIC_GA_MEASUREMENT_ID environment variable. If the variable is unset, no analytics scripts are loaded. Analytics are consent-gated:
- On first visit, `CookieConsent` banner appears at the bottom of the page
- User clicks Accept or Decline
- Choice is stored in `localStorage` (`cookie-consent` key); banner never shows again
- If accepted, `GoogleAnalytics` component loads the gtag.js scripts via Next.js `<Script strategy="afterInteractive">`
- If declined, no tracking scripts are ever loaded
Communication between components: CookieConsent dispatches a cookie-consent-update custom DOM event. GoogleAnalytics listens for this event and conditionally renders the <Script> tags. Both components read from the same localStorage key on mount.
Self-hosting: Leave NEXT_PUBLIC_GA_MEASUREMENT_ID unset for no analytics, or set your own GA4 measurement ID. You can also remove GoogleAnalytics and CookieConsent from src/app/layout.tsx entirely.
- `@octokit/graphql` reserves `query` as a parameter name; use `$searchQuery` instead
- `RepositoryOrderField` uses `STARGAZERS` not `STARGAZER_COUNT`
- Issue search does NOT support `stars:` or `forks:` qualifiers (repo search only)
- Issue search sort options: `sort:created`, `sort:comments`, `sort:updated`, `sort:reactions`, `sort:interactions` only
- Labels use OR logic: `label:"a","b","c"` matches ANY
- A `language` field conflicts with MongoDB text indexes; use `programmingLanguage` instead
- Mongoose `{ new: true }` is deprecated in v9; use `{ returnDocument: "after" }`
- JWT tokens are static, so mutable state (like `onboardingCompleted`) goes stale mid-session. Always verify against DB.
- Using NextAuth v5 beta (`5.0.0-beta.30`)
- Radix `Select` with `position="item-aligned"` hijacks scroll; use `position="popper"`
- "Load More" can return duplicates; deduplicated by `id`
- Empty `labels[]` vs `undefined` has different semantics
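The "Load More" duplicate handling noted above can be sketched as a merge that keeps the first occurrence of each `id`. `appendUnique` is a hypothetical helper name, not necessarily how the app implements it.

```typescript
// Appends incoming items to the existing list, skipping ids already present.
function appendUnique<T extends { id: string }>(existing: T[], incoming: T[]): T[] {
  const seen = new Set(existing.map((item) => item.id));
  const merged = [...existing];
  for (const item of incoming) {
    if (seen.has(item.id)) continue; // duplicate across pages or within the new page
    seen.add(item.id);
    merged.push(item);
  }
  return merged;
}
```

Returning a new array instead of mutating `existing` fits React state updates, where the previous list should stay untouched.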
The app auto-deploys to Vercel on push to main.
- Connect GitHub repo to Vercel
- Set environment variables (all from `.env.example`) for both Production and Preview
- Update GitHub OAuth App callback URL to production domain
- `NEXTAUTH_URL` is only needed for Production; Preview uses `VERCEL_URL` automatically
```bash
npm run build  # Next.js production build with Turbopack
npm start      # Start production server
npm run lint   # ESLint
```