A ruthless, high-performance aggregator that answers one question: "Who has the cheapest authentic NMN (and other longevity supplements) today?"
Architecture: Git-Scraper. Go scrapes vendor sites, runs the math engine, and writes a pre-computed data/analysis_report.json. Next.js reads that single file at build time and generates a static HTML page. $0/month infrastructure cost.
- Multi-vendor price comparison across Shopify, Magento, and LD+JSON storefronts.
- Bioavailability-adjusted pricing (True Cost) — liposomal, sublingual, and gel formulations receive a multiplier that lowers their effective $/gram. The multiplier value and label are exported in the JSON and displayed in the frontend's True Cost column as muted subtext (e.g.,
1.5x Lipo Bonus). - Synthetic Subscription Pricing — vendors whose Shopify APIs hide subscription prices (e.g., Renue By Science) are handled via a
globalSubscriptionDiscountfield indata/vendor_rules.json. The analyzer emits BOTH a one-time purchase entry and a synthetic "Subscribe & Save" entry (withis_subscription: true) for every valid variant. The frontend receives both rows and can toggle between purchase types. - Clean product names — the analyzer strips redundant vendor name prefixes from product titles (case-insensitive). E.g., vendor
"Nutricost"+ title"Nutricost Creatine Monohydrate"→"Creatine Monohydrate". - Multi-supplement tracking — NMN, NAD+, TMG, Resveratrol, and Creatine out of the box. Configurable via
--supplementsflag. - Cloudflare-safe — vendors behind Cloudflare (Jinfiniti, Wonderfeel) are flagged with
Cloudflare: truein the vendor config. The scraper skips them on--refreshand uses manually-maintained JSON instead. - Hybrid Catalog/Regex Engine — the analyzer uses a two-path architecture with active/gross mass disambiguation. ~80% of standard products are handled automatically by the regex extraction pipeline. The remaining ~20% of complex products (multi-ingredient, non-standard weights) are handled by immutable overrides in
data/vendor_rules.jsonthat bypass regex entirely. Overrides specifyforceActiveGrams(the pre-computed total active ingredient mass) and optionallyforceTypeandforceServingMg.activeGramsis the denominator for all cost calculations.grossGrams(the physical label weight) is resolved via a two-tier chain:variantGrossOverrides(manual per-variant override for titles lacking gram/kg patterns) > regex extraction from product/variant titles. No OCR. No image parsing. The same file supportsglobalSubscriptionDiscountfor synthetic subscription price generation. - Triage Engine — products whose mass was resolved by regex (no override) are scanned against a hardcoded
dirtyKeywordslist (flavors, blends, gummies, combos). A false-positive guard skips the"flavor"keyword when the target string contains"unflavored"— only that trigger is suppressed; the loop continues checking remaining keywords so that e.g."unflavored blend"is still correctly flagged by"blend". Servings sub-exception: before skipping the"flavor"match for an unflavored product, the engine checks if the target string also contains"serv". If it does, the product is flagged withreview_reason: "Detected 'unflavored' but uses 'servings' (needs manual math check)"— because servings-based sizing forces the regex to guess scoop size, making the computed mass mathematically unsafe. Only unflavored products with explicit gram/kg weights (e.g.,"Unflavored / 500 GMS") pass cleanly. Matches are flagged withneeds_review: trueandreview_reasonin the analysis output, and collected intodata/needs_review.jsonfor operator review. The triage is intentionally aggressive — it flags for human review, not rejection. - Pagination safety — Shopify scraper uses proper URL construction, product deduplication, and a hard page limit (50) to prevent infinite loops.
- Daily CI/CD — GitHub Actions workflow scrapes daily, commits changed JSON, and triggers a Vercel build.
go mod tidy
cd web
npm install
go run cmd/main.go -refresh
go run cmd/main.go
go run cmd/main.go -audit
go run cmd/main.go -refresh -audit
Scans all products that pass the supplement keyword filter and vendor blocklist, then reports any that lack enough data (mg, count, grams) for the analyzer to compute activeGrams. For each gap, prints the product handle, what data was extracted, what is missing, and a suggested vendor_rules.json override snippet. Use this after scraping to discover new products that need manual overrides.
go run cmd/main.go -cpuprofile cpu.prof
go tool pprof cpu.prof
go run cmd/main.go -pprof
# Then visit http://localhost:6060/debug/pprof/
The pprof HTTP server is off by default. Pass -pprof to enable it.
go run cmd/main.go --supplements=nmn,nad
go run cmd/main.go --supplements=creatine
go run cmd/main.go --supplements=tmg,resveratrol
Default value: nmn,nad,tmg,trimethylglycine,resveratrol,creatine
cd web
npm run build
Output lands in web/out/. This is a fully static site — no server required.
cd web
npm run dev
Opens at http://localhost:3000. Reads data/analysis_report.json from the repo root at build time.
Prerequisite: Run
go run cmd/main.goat least once to generatedata/analysis_report.jsonbefore building the frontend.
cmd/main.go CLI entry point. Flags: --refresh, --supplements, --audit, --pprof, --cpuprofile. No global state — constructs Analyzer struct with injected rules and supplements. scrapeAll() handles concurrent vendor fetching. saveReviewQueue() persists flagged entries.
internal/
config/vendors.go Vendor registry (name, URL, scraper type, cloudflare flag).
models/types.go Core structs: Vendor, Product, Variant, Analysis (with JSON tags, including ActiveGrams, GrossGrams, Multiplier, MultiplierLabel, IsSubscription, NeedsReview, and ReviewReason).
parser/analyzer.go Analyzer struct (holds Rules and Supplements, no globals). AnalyzeProduct() method implements Hybrid Catalog/Regex Engine. Mass extraction delegated to extractMass(). Gross weight delegated to extractGrossGrams(). Type classification via classifyType(). Bioavailability via bioavailabilityMultiplier(). Display name via buildDisplayName(). Dirty-data triage via triageDirtyData(). Cost metrics via buildAnalysis() — single helper for both one-time and subscription entries.
parser/audit.go AuditProduct() method on Analyzer. Gap detector using extractFloat/extractFloatFrom helpers. Prints override suggestions using forceActiveGrams/forceServingMg format.
parser/extract.go Shared regex helpers: extractFloat(re, s), extractFloatFrom(re, sources...), containsAny(s, substrs). Replaces ~13 instances of the 3-5 line regex→parse→check pattern.
rules/rules.go LoadRules() returns (Registry, error) — no global variable. Registry is a type alias for map[string]VendorConfig. ApplyRules(reg, vendorName, p) evaluates product-level blocklist only (returns true/false). No data enrichment.
scraper/client.go Shared HTTP infrastructure: DefaultClient (*http.Client), NewRequest(url), FetchBody(url). Eliminates duplicate client/header setup across scrapers.
scraper/router.go FetchFunc type + map-based registry. FetchProducts() dispatches via map lookup — no switch statement.
scraper/shopify.go Shopify products.json scraper with pagination safety. Uses shared DefaultClient/NewRequest.
scraper/magento.go Magento swatch-renderer JSON + bulk pricing scraper. All regexps compiled once at package level. Uses shared FetchBody.
scraper/ld+json.go Schema.org LD+JSON @graph scraper. Uses shared FetchBody.
storage/json_store.go Generic SaveJSON[T](path, data) and LoadJSON[T](path). VendorFilename() converts vendor name to file path.
data/
analysis_report.json ★ THE INTEGRATION POINT. Pre-computed Analysis array. Frontend reads ONLY this.
needs_review.json Triage Engine output. Subset of analysis_report.json entries where needs_review == true. Written by cmd/main.go after every run. Operator reviews this to decide which products need overrides in vendor_rules.json.
vendor_rules.json Blocklists and manual dosage overrides per vendor.
*.json Scraped raw product data (one file per vendor). NOT read by the frontend.
web/
app/layout.tsx Root layout. Dark theme, font loading, metadata.
app/page.tsx Main SSG page. Reads analysis_report.json, renders table. Zero parsing logic.
app/globals.css Tailwind v4 styles, rank badges, type badges, custom scrollbar.
components/
ProductTable.tsx Desktop table + mobile card layout. Sorting, filtering.
SupplementFilter.tsx Pill-style filter tabs (All, NMN, NAD+, TMG, Resveratrol, Creatine).
TypeBadge.tsx Colored badge for product type (Capsules, Powder, Tablets, Gel, etc.).
RankBadge.tsx Gold/silver/bronze for top 3, plain number for the rest.
lib/
data.ts Reads data/analysis_report.json. Maps snake_case → camelCase. Single file.
types.ts Analysis interface (camelCase). The only data type the frontend uses.
vendors.ts Vendor registry with base URLs.
next.config.ts Static export, remote image patterns for vendor CDNs.
package.json Next.js 15, React 19, Tailwind CSS v4.
tsconfig.json Path aliases (@/*).
.github/workflows/scrape.yml Daily cron job: scrape → diff → commit → deploy.
Each vendor can have:
blocklist: Product title substrings to reject at the product level (e.g."Bundle","Subscription"). Evaluated byApplyRules()before the product reaches the analyzer.variantBlocklist: Variant title substrings to reject at the variant level (e.g."30 SERV","Sample"). Evaluated inside the analyzer's variant loop — matched variants are skipped viacontinue. Use this to suppress ghost variants that share a product handle with valid variants.overrides: Keyed by product handle. Each override is aProductSpecwith immutable math fields:forceType(string): Product type override (e.g."Capsules","Powder","Tablets","Gel"). Bypasses string-matching type classification.forceActiveGrams(float): Pre-computed total active ingredient mass in grams. Mapped toActiveGramsin the Analysis output. When > 0, the regex mass-extraction pipeline is bypassed entirely. Formula:mg_per_serving × count / 1000. This is the denominator for all cost calculations.forceServingMg(float): Per-serving mg. Informational/documentation field — not consumed by the analyzer, but aids operators in verifying theforceActiveGramscalculation.variantOverrides(map[string]float64): Per-variant active ingredient grams, keyed by exact variant title string. When a variant title matches a key and the value is > 0, it takes highest priority — bypassing bothforceActiveGramsand the regex pipeline. Use this when a single product handle groups variants with drastically different active weights (e.g. Nutricost "500 GMS" vs "30 SERV" under one handle).variantGrossOverrides(map[string]float64): Per-variant gross (label) weight in grams, keyed by exact variant title string. When a variant title matches a key and the value is > 0, the regex label-weight extraction is bypassed for that variant. Use this for variants whose titles lack standard gram/kg patterns (e.g.,"30 SERV") where the physical container weight is known but not parseable.
globalSubscriptionDiscount: A float between 0 and 1 representing the fractional discount for subscription purchases (e.g.,0.10= 10% off). When set, the analyzer emits a second "Subscribe & Save" entry for every valid variant of that vendor's products, withis_subscription: trueand the discounted price. Used for vendors whose Shopify APIs do not expose subscription pricing directly.
Example:
{
"Nutricost": {
"blocklist": ["5-HTP", "Carnitine"],
"globalSubscriptionDiscount": 0.20,
"variantBlocklist": ["Unflavored / 30 SERV"],
"overrides": {
"nutricost-creatine-monohydrate-powder-500-grams": {
"forceType": "Powder",
"variantOverrides": {
"Frozen Lemonade / 30 SERV": 150.0,
"Fruit Punch / 500 GMS": 380.0
},
"variantGrossOverrides": {
"Frozen Lemonade / 30 SERV": 207.0
}
}
}
},
"Renue By Science": {
"blocklist": ["Cream", "Serum", "Pet"],
"globalSubscriptionDiscount": 0.10,
"overrides": {
"lipo-nmn-powdered-liposomal-nmn2": {
"forceType": "Capsules",
"forceActiveGrams": 22.5,
"forceServingMg": 250,
"variantOverrides": {
"60 Capsules": 15.0,
"90 Capsules": 22.5
}
}
}
}
}The GitHub Actions workflow (.github/workflows/scrape.yml) runs daily at 08:00 UTC:
- Checks out the repo.
- Runs
go run cmd/main.go -refresh. - Diffs
data/*.json. If unchanged, stops. - Commits changes as
"Auto-update product data [skip ci]". - Hits the Vercel deploy hook (requires
VERCEL_DEPLOY_HOOKsecret).
Manual trigger: use the Run workflow button on the Actions tab.
Jinfiniti and Wonderfeel are behind Cloudflare. Their Cloudflare: true flag causes the scraper to skip live fetching and load from data/<vendor>.json instead. To update their data:
- Manually visit the vendor site.
- Extract product info into the matching JSON file in
data/. - Commit and push.
Go Scraper (concurrent) → data/*.json (raw) → analyzer.go → data/analysis_report.json → Next.js (dumb renderer)
The Go backend is the single source of truth for all parsing, regex extraction, bioavailability math, multiplier assignment, vendor-name stripping, type classification, and synthetic subscription generation. Vendor scraping/loading runs concurrently: cmd/main.go spawns one goroutine per vendor, each scraping (or reading from disk) independently. Results are collected via a buffered channel and processed sequentially for analysis, sorting, and report generation. The analyzer implements a Hybrid Catalog/Regex Engine with active/gross mass disambiguation: activeGrams (active ingredient mass) is populated via the priority chain (variant override > forceActiveGrams > regex pipeline) and serves as the denominator for CostPerGram and EffectiveCost. grossGrams (label weight) is resolved via a two-tier chain: variantGrossOverrides (manual per-variant override for titles lacking gram/kg patterns) > reLabelGrams/reLabelKg regex on product/variant titles — it defaults to 0 for capsule products or when neither override nor regex yields a value. For "Pure Powder" products (no dirty keywords), if grossGrams was found and activeGrams was regex-resolved (not override), activeGrams is set equal to grossGrams. Products with forceActiveGrams overrides in vendor_rules.json bypass the regex mass-extraction pipeline entirely; all other products use the standard regex pipeline. The rePack (pack multiplier) regex always runs regardless of override source. ApplyRules() performs blocklist filtering only — it does not inject strings into product context. A Triage Engine runs after mass extraction: regex-resolved products are scanned against dirtyKeywords (flavors, blends, gummies, combos) and flagged with needs_review: true / review_reason. cmd/main.go extracts all flagged entries into data/needs_review.json for operator review. The frontend contains zero duplicated logic. It reads data/analysis_report.json and renders it. Each product may appear twice in the report — once as a one-time purchase (is_subscription: false) and once as a subscription (is_subscription: true) — enabling the frontend to toggle between purchase types.
Stack: Next.js 15 (App Router), React 19, Tailwind CSS v4, static export (output: "export").
How it works:
lib/data.tsreadsdata/analysis_report.jsonusingfs.readFileSyncduring the build step. It maps the Go backend's snake_case JSON fields (includingactive_grams,gross_grams,is_subscription) to camelCase TypeScript (includingactiveGrams,grossGrams,isSubscription) via amapEntry()function.app/page.tsxcallsloadReport(), attachesVendorInfometadata, and passes the result toProductTable. No parsing. No math.ProductTableis a client component that provides supplement filtering (pill tabs) and column sorting. Desktop shows a data table with separate Active and Gross columns; mobile (<768px) shows a card layout with Gross shown as subtext below Active when they differ. The Gross column displays "—" whengrossGramsis 0 or equalsactiveGrams. The True Cost column header has a hover tooltip(i)explaining: "Base Price ÷ Bioavailability Multiplier". When a product'smultiplier > 1, a muted subtext below the True Cost value shows the multiplier and its label (e.g.,(1.5x Lipo Bonus)).- The build produces a fully static
out/directory. No server, no client-side API calls.
Allowed frontend math: Only user-driven state calculations (e.g., a future "Monthly Cost" column based on dosage input). All product-level computation is pre-computed by Go.