A Next.js application for strict 1-to-1 comparison of massive datasets (QIDs, P-Values, and Titles). Designed to handle millions of records by leveraging MongoDB with a split-collection architecture.
- Batch Creation: Compare "File A" (Source) vs "File B" (Reference).
- Strict Alignment:
  - Preserves every row from File A (even duplicates).
  - Matches File B items exactly once.
  - Separates "Extra" File B items at the bottom.
- Custom Parsing:
  - CSV Support: Auto-detects QID columns.
  - Dynamic P-Values: Automatically detects and extracts columns like `P123`, `P_xyz` from File A headers.
  - "Diff Hist" Log Support: Parses custom log files (`... diff hist ... (Q123)`) for File B.
- Legacy Support: Hosts the previous V1 Vite application at `/v1`.
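
The parsing and alignment rules above can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions, not the app's actual code: the names `detectPColumns`, `extractQidFromLogLine`, `alignStrict`, `Item`, and `AlignedRow` are hypothetical, the header pattern is a guess based on the `P123` / `P_xyz` examples, and matching is assumed to be by exact QID.

```typescript
// Hypothetical row shape; the real app's fields may differ.
interface Item { qid: string; title?: string }
interface AlignedRow { index: number; a: Item | null; b: Item | null }

// Assumption: a P-value column is "P" followed by digits, or "P_" plus word characters.
function detectPColumns(headers: string[]): string[] {
  return headers.map((h) => h.trim()).filter((h) => /^P(\d+|_\w+)$/.test(h));
}

// Assumption: the QID is the first "(Q<digits>)" group on a log line.
function extractQidFromLogLine(line: string): string | null {
  const match = /\((Q\d+)\)/.exec(line);
  return match ? match[1] : null;
}

function alignStrict(fileA: Item[], fileB: Item[]): AlignedRow[] {
  // Queue of not-yet-matched File B positions per QID, in original order.
  const pending = new Map<string, number[]>();
  fileB.forEach((item, i) => {
    const queue = pending.get(item.qid);
    if (queue) queue.push(i);
    else pending.set(item.qid, [i]);
  });

  const used = new Set<number>();
  const rows: AlignedRow[] = [];

  // Every File A row is kept, in order; each takes at most one File B item.
  for (const a of fileA) {
    const next = pending.get(a.qid)?.shift();
    if (next !== undefined) used.add(next);
    rows.push({ index: rows.length, a, b: next === undefined ? null : fileB[next] });
  }

  // File B items that were never matched go to the bottom as "Extra" rows.
  fileB.forEach((b, i) => {
    if (!used.has(i)) rows.push({ index: rows.length, a: null, b });
  });
  return rows;
}
```

With this approach a duplicate QID in File A consumes a second File B item only if File B also contains that QID twice; otherwise the duplicate row is kept with an empty File B side.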
- Batch Metadata: Stores summary stats and configurations in the `Batch` collection.
- Batch Rows: Stores millions of individual rows in the `BatchRow` collection (indexed by `batchId` + `index`) to bypass MongoDB's 16MB document limit.
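
To make the split concrete, here is a minimal sketch of how rows might be shaped and batched before insertion. Only `batchId` and `index` come from the description above; every other field name, plus the helpers `toBatchRowDocs` and `chunk`, is an assumption made for illustration.

```typescript
// Hypothetical document shapes; only batchId and index are taken from the README.
interface BatchDoc {
  _id: string;
  name: string;
  totalRows: number;
  createdAt: Date;
}

interface BatchRowDoc {
  batchId: string;
  index: number;
  data: Record<string, unknown>;
}

// Compound index key pattern for BatchRow (batchId ascending, then index ascending).
const batchRowIndex = { batchId: 1, index: 1 } as const;

// One small document per row, so no single document approaches the 16MB limit.
function toBatchRowDocs(batchId: string, rows: Record<string, unknown>[]): BatchRowDoc[] {
  return rows.map((data, index) => ({ batchId, index, data }));
}

// Split a large array into fixed-size pieces so each bulk insert stays small.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new RangeError("size must be positive");
  const pieces: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    pieces.push(items.slice(start, start + size));
  }
  return pieces;
}
```

Each chunk would then be passed to a bulk insert (for example the driver's `insertMany`), and a page of results can be read back by filtering on `batchId` with a range on `index`, which the compound index serves directly.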
- Environment Setup: Create a `.env.local` file with your MongoDB connection string: `MONGODB_URI=mongodb+srv://user:pass@cluster.mongodb.net/my-db`
- Install Dependencies: `npm install`
- Run Development Server: `npm run dev`
Open http://localhost:3000 for the main app. Open http://localhost:3000/v1/index.html for the legacy V1 app.
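
If `MONGODB_URI` is missing, the failure may only show up on the first database call. A small guard like the one below makes it fail early with a clear message. The helper name `requireMongoUri` is hypothetical and not part of the project.

```typescript
// Hypothetical helper: validates the connection string before any DB code runs.
function requireMongoUri(env: Record<string, string | undefined>): string {
  const uri = env["MONGODB_URI"];
  if (!uri) {
    throw new Error("MONGODB_URI is not set; add it to .env.local");
  }
  // MongoDB connection strings use one of these two schemes.
  if (!uri.startsWith("mongodb://") && !uri.startsWith("mongodb+srv://")) {
    throw new Error("MONGODB_URI must start with mongodb:// or mongodb+srv://");
  }
  return uri;
}

// In server-side code this would be called as: requireMongoUri(process.env)
```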
This project is optimized for Vercel.
- Push code to GitHub.
- Import project in Vercel.
- Add `MONGODB_URI` to Vercel Environment Variables.
- Deploy.
The original Vite-based application (V1) has been compiled and included in public/v1.
- Source: `vite-backup/`
- Deployment: Served statically via the Next.js `public` folder.