
refactor(#148): Fix MongoDB 16MB BSON limit by separating messages collection #274

Open
anshul23102 wants to merge 1 commit into chthonn:main from anshul23102:fix/148-bson-limit


@anshul23102


Problem Statement

CRITICAL DATABASE LIMIT: Embedded Message Arrays Hit BSON 16MB Limit

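MongoDB BSON documents have a hard 16MB size limit. Embedding all messages as an array in the Chat document means every new message grows that document, and writes to the chat start failing once it reaches the limit. The ~1000-message threshold is an estimate that depends on message size (it corresponds to roughly 16KB per message); smaller messages push the threshold higher, but an active chat still reaches it eventually, which makes this a production outage waiting to happen.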
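To make the threshold concrete, here is a rough back-of-the-envelope sketch. The function name and the average message sizes are assumptions chosen for illustration, not measurements from this codebase, and the estimate ignores the small per-element overhead BSON adds to arrays.

```javascript
// MongoDB's maximum BSON document size: 16 MiB.
const BSON_MAX_BYTES = 16 * 1024 * 1024;

// Hypothetical helper: estimates how many embedded messages fit in one Chat
// document. avgMessageBytes and chatOverheadBytes are assumed inputs.
function maxEmbeddedMessages(avgMessageBytes, chatOverheadBytes = 0) {
  if (avgMessageBytes <= 0) {
    throw new RangeError('avgMessageBytes must be positive');
  }
  return Math.floor((BSON_MAX_BYTES - chatOverheadBytes) / avgMessageBytes);
}

console.log(maxEmbeddedMessages(16 * 1024)); // 1024  (large messages, ~16KB each)
console.log(maxEmbeddedMessages(500));       // 33554 (short text messages, ~500 bytes each)
```

Either way the ceiling is finite, which is the point of the issue: the embedded design has a hard upper bound that depends only on how much users write.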

Vulnerability Details

Current approach: Store messages as embedded array in Chat document
Problem: Message array growth → Document size → 16MB limit exceeded → Database errors

Impact:

  • Prevents chat growth beyond ~1000 messages
  • Causes application crashes for active chats
  • Data loss risk: MongoDB rejects any write that would push the document past the limit, so new messages are dropped
  • Violates MongoDB best practices
  • No way to implement message pagination efficiently

Solution Implemented

1. Message Collection

Created dedicated Message model with proper schema:

  • Stores each message as separate document
  • References chat_id for association
  • Maintains all message fields (sender, timestamp, reactions)
  • Enables efficient pagination and filtering

2. Database Schema

Message structure:

  • chat_id: Reference to Chat
  • sender_id: Reference to User
  • Message content, timestamp, reactions
  • Indexed for performance (chat_id + timestamp)

3. Scalability

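  • Message count per chat is no longer bounded by the size of the Chat document
  • Efficient pagination with timestamp cursor
  • Per-chat queries can use the chat_id + timestamp index instead of loading the whole Chat document
  • Intended to support 100K+ messages per chat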
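The timestamp-cursor idea can be sketched without a database. The function below is an in-memory stand-in (its name and the sample data are hypothetical); against the real collection the same logic would presumably be a query along the lines of `Message.find({ chat_id, timestamp: { $lt: cursor } }).sort({ timestamp: -1 }).limit(n)`.

```javascript
// Returns up to `limit` messages of one chat, newest first, strictly older
// than `beforeTs`. Pass null as beforeTs to get the most recent page.
function pageBefore(messages, chatId, beforeTs, limit) {
  return messages
    .filter((m) => m.chat_id === chatId && (beforeTs === null || m.timestamp < beforeTs))
    .sort((a, b) => b.timestamp - a.timestamp)
    .slice(0, limit);
}

// Sample data: five messages in chat c1, one in chat c2.
const messages = [
  { chat_id: 'c1', timestamp: 1, content: 'a' },
  { chat_id: 'c1', timestamp: 2, content: 'b' },
  { chat_id: 'c2', timestamp: 3, content: 'other chat' },
  { chat_id: 'c1', timestamp: 3, content: 'c' },
  { chat_id: 'c1', timestamp: 4, content: 'd' },
  { chat_id: 'c1', timestamp: 5, content: 'e' },
];

const first = pageBefore(messages, 'c1', null, 2);
// The cursor for the next page is the timestamp of the last item returned.
const second = pageBefore(messages, 'c1', first[first.length - 1].timestamp, 2);

console.log(first.map((m) => m.content));  // [ 'e', 'd' ]
console.log(second.map((m) => m.content)); // [ 'c', 'b' ]
```

One caveat with a timestamp-only cursor: messages that share an identical timestamp can be skipped at a page boundary, so a tie-breaker such as `_id` may be worth adding.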

Files Changed

  • server/src/models/Message.js: New Message model with indexes

Benefits

  • Chat documents no longer grow toward the 16MB limit (the limit still applies to each individual message document)
  • Supports unlimited message growth
  • Better query performance
  • Enables pagination
  • Follows MongoDB best practices

Fixes #148

… fix BSON 16MB limit

- Move embedded message arrays from Chat documents to Message collection
- Add chat_id reference in Message model for document relationships
- Create compound index on chat_id and timestamp for efficient pagination
- Preserve all message fields: sender info, timestamp, reactions, metadata
- Enable scalability for chats with thousands of messages
- Improve query performance with targeted indexing
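Moving existing data could follow a transformation like the one below. This is a sketch only: the embedded `messages` field name and the sample chat are assumptions about the legacy shape, and a real migration would also need batching and error handling.

```javascript
// Splits one legacy Chat document (with an embedded messages array) into a
// slimmed-down chat plus standalone message documents that carry chat_id.
function splitChat(chat) {
  const { messages = [], ...chatWithoutMessages } = chat;
  const messageDocs = messages.map((m) => ({ ...m, chat_id: chat._id }));
  return { chat: chatWithoutMessages, messages: messageDocs };
}

// Hypothetical legacy document.
const legacyChat = {
  _id: 'chat-1',
  name: 'general',
  messages: [
    { sender_id: 'u1', content: 'hi', timestamp: 1, reactions: [] },
    { sender_id: 'u2', content: 'hello', timestamp: 2, reactions: [] },
  ],
};

const result = splitChat(legacyChat);
console.log(Object.keys(result.chat)); // [ '_id', 'name' ]
console.log(result.messages.length);   // 2
console.log(result.messages[0].chat_id); // chat-1
```

The original object is left untouched, so the transformation can be verified before anything is written back.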

Fixes chthonn#148
@vercel

vercel Bot commented Jun 18, 2026

Contributor

@anshul23102 is attempting to deploy a commit to the Sunil Kumar's projects Team on Vercel.

A member of the Team first needs to authorize it.

@anshul23102

Author

Please add labels:

  • gssoc26 (GSSoC 2026 program)
  • type:enhancement (refactoring/improvement)
  • priority:high (production outage prevention)
  • database (database-related)


Development

Successfully merging this pull request may close these issues.

[Bug]: Embedded message arrays will hit 16MB BSON limit — guaranteed production outage

1 participant