haseebafeef/QS3-ref-unedits

Item Refiner (Batch System)

A Next.js application for strict 1-to-1 comparison of large datasets (QIDs, P-values, and titles). It is designed to handle millions of records by storing batch metadata and batch rows in separate MongoDB collections.

Features

🚀 Core Functionality

  • Batch Creation: Compare "File A" (Source) vs "File B" (Reference).
  • Strict Alignment:
    • Preserves every row from File A (even duplicates).
    • Matches each File B item at most once, so a File B item is never paired with two File A rows.
    • Lists unmatched ("Extra") File B items at the bottom.
  • Custom Parsing:
    • CSV Support: Auto-detects QID columns.
    • Dynamic P-Values: Automatically detects and extracts columns like P123, P_xyz from File A headers.
    • "Diff Hist" Log Support: Parses custom log files (... diff hist ... (Q123)) for File B.
  • Legacy Support: Hosts the previous V1 Vite application under /v1 (open /v1/index.html).
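
The parsing and alignment rules above can be sketched in TypeScript as follows. This is a minimal illustration, not the project's actual code: the function names (detectPColumns, parseDiffHistLine, alignStrict), the AlignedRow shape, and both regular expressions are assumptions inferred from the bullet points.

```typescript
// Hypothetical row shape; the project's real types are not shown in this README.
type AlignedRow = { index: number; a: string | null; b: string | null };

// Assumed pattern for dynamic P-value headers such as "P123" or "P_xyz".
function detectPColumns(headers: string[]): string[] {
  return headers.filter((h) => /^P(\d+|_\w+)$/.test(h.trim()));
}

// Assumed log shape: "... diff hist ... (Q123)". Returns the QID, or null if absent.
function parseDiffHistLine(line: string): string | null {
  const m = /diff hist.*\((Q\d+)\)/.exec(line);
  return m ? m[1] : null;
}

// Strict alignment over plain QID lists (one QID per row).
function alignStrict(fileA: string[], fileB: string[]): AlignedRow[] {
  // Count how many unmatched copies of each QID remain in File B.
  const remaining = new Map<string, number>();
  for (const qid of fileB) {
    remaining.set(qid, (remaining.get(qid) ?? 0) + 1);
  }

  const rows: AlignedRow[] = [];

  // Every File A row is kept, duplicates included; each consumes at most one File B copy.
  for (const qid of fileA) {
    const left = remaining.get(qid) ?? 0;
    if (left > 0) remaining.set(qid, left - 1);
    rows.push({ index: rows.length, a: qid, b: left > 0 ? qid : null });
  }

  // Whatever is still unmatched in File B goes to the bottom as "Extra" rows.
  for (const qid of fileB) {
    const left = remaining.get(qid) ?? 0;
    if (left > 0) {
      remaining.set(qid, left - 1);
      rows.push({ index: rows.length, a: null, b: qid });
    }
  }
  return rows;
}
```

For example, aligning File A = Q1, Q2, Q1, Q3 against File B = Q1, Q4, Q2 keeps all four File A rows in order (the second Q1 and Q3 have no partner) and appends Q4 as a single extra row. Counting the remaining File B copies per QID is what keeps a File B item from being matched twice.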

💾 Database Architecture

  • Batch Metadata: Stores summary stats and configurations in the Batch collection.
  • Batch Rows: Stores each row as its own document in the BatchRow collection (indexed by batchId + index), so a batch with millions of rows stays clear of MongoDB's 16 MB per-document limit.
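
As a rough sketch of the split-collection idea, the snippet below shapes rows as individual documents and groups them into chunks for bulk insertion. The interfaces, the field names other than batchId and index, and the helper names (toRowDocs, chunk) are hypothetical; the README does not show the real schema.

```typescript
// Hypothetical document shapes; only batchId and index are named in the README.
interface BatchDoc {
  _id: string;
  name: string;
  totalRows: number; // summary stat kept on the small metadata document
}

interface BatchRowDoc {
  batchId: string; // points back to the owning BatchDoc
  index: number;   // row position within the batch
  qidA: string | null;
  qidB: string | null;
}

// Compound index spec matching "indexed by batchId + index".
// With the MongoDB Node.js driver this could be passed to collection.createIndex(...).
const batchRowIndex = { batchId: 1, index: 1 } as const;

// Turn aligned pairs into one small document per row.
function toRowDocs(
  batchId: string,
  pairs: Array<[string | null, string | null]>
): BatchRowDoc[] {
  return pairs.map(([qidA, qidB], index) => ({ batchId, index, qidA, qidB }));
}

// Split documents into fixed-size chunks so each bulk insert stays small.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```

Each chunk could then be written with the driver's insertMany, and a page of rows read back by filtering on batchId and a range of index values, which is the access pattern the compound index supports.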

Getting Started

  1. Environment Setup: Create a .env.local file with your MongoDB connection string:

    MONGODB_URI=mongodb+srv://user:pass@cluster.mongodb.net/my-db
  2. Install Dependencies:

    npm install
  3. Run Development Server:

    npm run dev

    Open http://localhost:3000 for the main app. Open http://localhost:3000/v1/index.html for the legacy V1 app.
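
If the app fails to connect at startup, a small guard like the one below can surface a missing or malformed connection string early. This is only a sketch: requireMongoUri is a hypothetical helper, not a function from this repository, and it takes the environment as a parameter so it can be tried outside Next.js.

```typescript
// Hypothetical helper: validates the MONGODB_URI value from step 1.
function requireMongoUri(env: Record<string, string | undefined>): string {
  const uri = env["MONGODB_URI"];
  if (!uri) {
    throw new Error("MONGODB_URI is not set; add it to .env.local");
  }
  // MongoDB connection strings use the mongodb:// or mongodb+srv:// scheme.
  if (!/^mongodb(\+srv)?:\/\//.test(uri)) {
    throw new Error("MONGODB_URI must start with mongodb:// or mongodb+srv://");
  }
  return uri;
}
```

In server-side code it would typically be called as requireMongoUri(process.env).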

Deployment (Vercel)

This project is set up to deploy on Vercel.

  1. Push code to GitHub.
  2. Import project in Vercel.
  3. Add MONGODB_URI to Vercel Environment Variables.
  4. Deploy.

Legacy Support

The original Vite-based application (V1) has been compiled and included in public/v1.

  • Source: vite-backup/
  • Deployment: Served statically from the Next.js public folder.
