Adding migration verifier #173

Open
DakshD7 wants to merge 1 commit into mongodb:master from DakshD7:migrationVerifiercodeAdded

DakshD7 commented Jan 22, 2026

This PR adds migration verifier support, which allows data verification while mongosync is running without its embedded verifier. It also helps verify data when source and sink connectors between clusters are used as part of a rollback strategy.

The first page is shown on startup:

image

While the initial sync runs:

image

During the recheck, each namespace is listed at the bottom, coloured according to the final check:

image

Once a namespace is green, the data matches between source and destination:

image

Copilot AI left a comment

Pull request overview

This PR adds migration verifier monitoring functionality to Mongosync Insights, allowing users to visualize and track data verification progress when using the MongoDB migration-verifier tool. The feature provides a dashboard showing verification task status, generation history, namespace statistics, and mismatch details.

Changes:

  • Added migration verifier monitoring with real-time dashboard showing verification task status, failures, and mismatch details
  • Integrated verifier metrics endpoint with session-based authentication and auto-refresh capabilities
  • Enhanced home page with new form for verifier database connection configuration

Reviewed changes

Copilot reviewed 17 out of 40 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| migration_verifier.py | Core logic for gathering and visualizing verifier metrics from MongoDB |
| mongosync_insights.py | Integration of verifier routes and session management |
| verifier_metrics.html | Dashboard template with auto-refresh and error handling |
| home.html | Added verifier form with client-side validation |
| requirements.txt | Updated dependencies (no new packages added) |
| README.md, CONFIGURATION.md, etc. | Comprehensive documentation for new feature |

Comments suppressed due to low confidence (6)

migration/mongosync_insights/templates/VALIDATION.md:196

  • Documentation files (VALIDATION.md, README.md, HTTPS_SETUP.md, CONFIGURATION.md) should not be placed in the templates/ directory. These belong in the project root or a dedicated docs/ directory. The templates/ directory should only contain HTML template files.
# Connection String Validation

This document describes the connection string handling in Mongosync Insights.

## Overview

Mongosync Insights uses PyMongo's built-in validation for connection strings, which provides:
- URI format validation
- Connection testing
- Authentication verification

## Validation Process

### 1. Empty String Check

The application first checks if a connection string was provided:

```python
if not TARGET_MONGO_URI or not TARGET_MONGO_URI.strip():
    return error("Please provide a valid MongoDB connection string.")
```

### 2. PyMongo URI Parsing

PyMongo's parse_uri() function validates the connection string format and raises InvalidURI if the format is invalid. This checks:

  • Proper URI scheme (mongodb:// or mongodb+srv://)
  • Valid URI syntax
  • Proper host and port format
  • Valid URI components

### 3. Connection Test

The application attempts to connect to MongoDB using validate_connection(), which:

  • Creates a MongoDB client
  • Tests connectivity with a ping command
  • Validates authentication credentials
  • Raises PyMongoError if connection fails

## Display Sanitization

Connection strings are sanitized before display to protect credentials.

### `sanitize_for_display(connection_string)`

This function removes credentials from connection strings for safe display in the UI.

Example:

# Input
connection_string = "mongodb+srv://user:password@cluster.mongodb.net/mydb"

# Output
sanitized = "cluster.mongodb.net:27017 (database: mydb)"

Implementation:

  • Parses the connection string to extract hosts and database
  • Escapes HTML special characters
  • Returns only non-sensitive information
  • Returns "[Connection String Provided]" if parsing fails
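
The steps listed above can be sketched with the standard library alone. This is a minimal illustration, not the application's actual implementation: the function body and the `FALLBACK` constant are assumptions, and unlike the example output above it does not append a default port.

```python
import html
from urllib.parse import urlsplit

# Hypothetical constant; the text is the fallback message named in the list above.
FALLBACK = "[Connection String Provided]"


def sanitize_for_display(connection_string: str) -> str:
    """Return only hosts and database name, HTML-escaped, never credentials."""
    try:
        parts = urlsplit(connection_string)
    except ValueError:
        return FALLBACK
    if parts.scheme not in ("mongodb", "mongodb+srv") or not parts.netloc:
        return FALLBACK
    # An "@" after the authority means the password held an unencoded "/" or "?",
    # so the split is unreliable. Fall back rather than risk leaking part of it.
    if "@" in parts.path + parts.query + parts.fragment:
        return FALLBACK
    # Everything after the last "@" in the authority is the host list.
    hosts = parts.netloc.rpartition("@")[2]
    if not hosts:
        return FALLBACK
    database = parts.path.lstrip("/")
    text = f"{hosts} (database: {database})" if database else hosts
    return html.escape(text)


print(sanitize_for_display("mongodb+srv://user:password@cluster.mongodb.net/mydb"))
```

Falling back whenever parsing looks ambiguous trades some detail in the UI for the guarantee that no fragment of a password is displayed.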

## Error Handling

The application provides clear error messages for common issues:

### Invalid URI Format

Error Title: "Invalid Connection String"
Error Message: "The connection string format is invalid. Please check your MongoDB connection string and try again."

Common causes:

  • Incorrect URI scheme
  • Missing required components
  • Invalid characters in URI

### Connection Failed

Error Title: "Connection Failed"
Error Message: "Could not connect to MongoDB. Please verify your credentials, network connectivity, and that the cluster is accessible."

Common causes:

  • Incorrect username or password
  • Network connectivity issues
  • Firewall blocking connection
  • MongoDB server not running
  • Incorrect host or port

### Unexpected Error

Error Title: "Connection Error"
Error Message: "An unexpected error occurred. Please try again."

Common causes:

  • Timeout issues
  • DNS resolution failures
  • Unexpected server responses

## Logging

All connection attempts and errors are logged to insights.log:

logger.error(f"Invalid connection string format: {e}")
logger.error(f"Failed to connect: {e}")
logger.error(f"Unexpected error during connection validation: {e}")

Note: Connection strings with credentials are not logged to prevent credential exposure.
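
One way to enforce this rule is a logging filter that masks the user-info part of any MongoDB URI before a record is emitted. The sketch below is an assumption about how such a safeguard could look, not a description of the application's code; `redact_credentials`, `RedactingFilter`, and the logger name are hypothetical.

```python
import logging
import re

# Matches "mongodb://user:password@" or "mongodb+srv://user:password@".
_CREDENTIALS = re.compile(r"(mongodb(?:\+srv)?://)[^@/\s]+@")


def redact_credentials(text: str) -> str:
    """Replace the user-info part of any MongoDB URI in text with '***'."""
    return _CREDENTIALS.sub(r"\1***@", text)


class RedactingFilter(logging.Filter):
    """Rewrite each record so credentials never reach a handler."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so URIs passed as arguments are covered too.
        record.msg = redact_credentials(record.getMessage())
        record.args = ()
        return True


logger = logging.getLogger("insights_sketch")  # hypothetical logger name
logger.addFilter(RedactingFilter())
```

Attaching the filter to the logger means every call such as `logger.error(f"Failed to connect: {e}")` is covered, even when the exception text happens to include the URI.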

## Security Considerations

### Credential Protection

  1. Never displayed: Credentials are always removed before displaying connection information
  2. Not logged: Connection strings with passwords are never written to logs
  3. Sanitized output: Only host, port, and database name are shown in the UI

### HTTPS Recommended

For production deployments, always use HTTPS to protect connection strings in transit. See HTTPS_SETUP.md for setup instructions.

### Secure Cookies

Enable secure cookies when using HTTPS:

MI_SECURE_COOKIES=true

This ensures session cookies are only transmitted over encrypted connections.
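
As a rough illustration of what the flag controls, the sketch below reads `MI_SECURE_COOKIES` and sets the `Secure` attribute on a session cookie using only the standard library. How the application actually parses the variable and builds its cookie is not shown in this document, so the parsing rule, the cookie name `session_id`, and both helper functions are assumptions.

```python
import os
from http.cookies import SimpleCookie


def secure_cookies_enabled(environ=os.environ) -> bool:
    """Assumed rule: only the literal value 'true' (any case) enables the flag."""
    return environ.get("MI_SECURE_COOKIES", "").strip().lower() == "true"


def build_session_cookie(session_id: str, secure: bool) -> str:
    """Return the value of a Set-Cookie header for a session cookie."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id  # hypothetical cookie name
    cookie["session_id"]["httponly"] = True
    cookie["session_id"]["path"] = "/"
    if secure:
        # Browsers send a Secure cookie only over HTTPS connections.
        cookie["session_id"]["secure"] = True
    return cookie["session_id"].OutputString()


print(build_session_cookie("abc123", secure_cookies_enabled()))
```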

## Connection String Best Practices

### MongoDB Atlas

Use the SRV connection string format:

mongodb+srv://username:password@cluster.mongodb.net/

### Credentials in Environment Variables

For production, store the connection string in an environment variable:

export MI_CONNECTION_STRING="mongodb+srv://user:pass@cluster.mongodb.net/"
python3 mongosync_insights.py

This prevents credentials from being entered through the web UI.

### URL Encoding

Special characters in passwords must be URL-encoded:

  • @ becomes %40
  • : becomes %3A
  • / becomes %2F
  • ? becomes %3F
  • # becomes %23

Example:

# Password: p@ss:word
mongodb://user:p%40ss%3Aword@cluster.mongodb.net/
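
Rather than encoding by hand, the escaping can be done with `urllib.parse.quote_plus` from the Python standard library. The `build_uri` helper below is a hypothetical convenience for illustration only.

```python
from urllib.parse import quote_plus


def build_uri(username: str, password: str, host: str) -> str:
    """Percent-encode the credentials before placing them in the URI."""
    return f"mongodb://{quote_plus(username)}:{quote_plus(password)}@{host}/"


# Reproduces the example above for the password "p@ss:word".
print(build_uri("user", "p@ss:word", "cluster.mongodb.net"))
# prints: mongodb://user:p%40ss%3Aword@cluster.mongodb.net/
```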

## Troubleshooting

### "Invalid Connection String" Error

  1. Check the URI format starts with mongodb:// or mongodb+srv://
  2. Verify all components are properly formatted
  3. Ensure special characters in password are URL-encoded
  4. Check for typos in the connection string

### "Connection Failed" Error

  1. Verify credentials are correct
  2. Check network connectivity to MongoDB server
  3. Ensure MongoDB server is running
  4. Verify firewall allows outbound connections on MongoDB port
  5. For Atlas, ensure your IP address is on the project's IP access list

### Connection Hangs

  1. Check for network timeouts (default: 5 seconds)
  2. Verify DNS resolution for hostname
  3. Ensure no proxy blocking MongoDB traffic
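
To tell a network problem apart from an authentication problem, it can help to test plain TCP reachability first. The sketch below is a hypothetical helper, not part of the application. It covers DNS resolution and the TCP handshake for a single host and port, and does not perform the SRV lookup that `mongodb+srv://` URIs require. The 5 second timeout mirrors the default mentioned above.

```python
import socket


def can_reach(host: str, port: int = 27017, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    print(can_reach("localhost"))
```

If this returns `False`, the cause is most likely DNS, a firewall, or a proxy rather than the credentials.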

## Support

For connection issues:

  1. Check logs: insights.log
  2. Verify connection string format
  3. Test connection using MongoDB shell or Compass
  4. Review MongoDB server logs for authentication failures

migration/mongosync_insights/migration_verifier.py:606

  • This conditional expression has a logic error. The ternary operator at the end `if "dst:" in details_str else False` is applied to the entire expression rather than just the split operation. This causes the expression to evaluate to either the result of the complex check or `False`, which is then used in an `if` statement. The intended logic appears to be checking if "unique" is not in the second part after splitting by "dst:", but the current structure is confusing and potentially incorrect. Consider refactoring for clarity:
```python
has_dst = "dst:" in details_str
if "unique\": true" in details_str:
    if has_dst and "unique" not in details_str.split("dst:")[1][:50]:
        coll_details.append(f"Index '{idx_name}': unique constraint missing on {cluster}")
    else:
        coll_details.append(f"Index '{idx_name}' ({field_type}): property mismatch - {cluster}")
```
```python
                        if "unique\": true" in details_str and "unique" not in details_str.split("dst:")[1] if "dst:" in details_str else False:
                            coll_details.append(f"Index '{idx_name}': unique constraint missing on {cluster}")
```
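
The precedence point in this comment can be checked directly. In Python a conditional expression binds more loosely than `and`, so `A and B if C else False` parses as `(A and B) if C else False`. The sketch below compares the expression from the PR with an explicit rewrite; the function names and the sample `details_str` values are made up for illustration and do not come from the verifier's real output.

```python
def flagged_original(details_str: str) -> bool:
    # The condition exactly as written in the PR.
    return "unique\": true" in details_str and "unique" not in details_str.split("dst:")[1] if "dst:" in details_str else False


def flagged_explicit(details_str: str) -> bool:
    # Same logic with the implicit grouping spelled out.
    if "dst:" not in details_str:
        return False
    dst_part = details_str.split("dst:")[1]
    return "unique\": true" in details_str and "unique" not in dst_part


# Hypothetical sample strings, chosen to exercise each branch.
samples = [
    'src: {"unique": true} dst: {}',
    'src: {"unique": true} dst: {"unique": true}',
    'src: {"unique": true}',
]
for s in samples:
    print(flagged_original(s), flagged_explicit(s))
```

Both functions agree on every sample, and the third sample shows that the trailing `else False` also guards the `split("dst:")[1]` lookup against an `IndexError`. The explicit form makes that guard visible, which is the readability gain the review asks for.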

migration/mongosync_insights/mongosync_insights.py:339

  • The session data is retrieved twice in this function. Lines 337-338 duplicate the session retrieval already performed on lines 328-329. This is inefficient and could lead to inconsistency if the session changes between calls (though unlikely with the current timeout). Consider retrieving the session once and reusing it:
session_id = request.cookies.get(SESSION_COOKIE_NAME)
session_data = session_store.get_session(session_id)
connection_string = session_data.get('verifier_connection_string')
db_name = session_data.get('verifier_db_name', 'migration_verification_metadata')

    # Get database name from session
    session_id = request.cookies.get(SESSION_COOKIE_NAME)
    session_data = session_store.get_session(session_id)
    db_name = session_data.get('verifier_db_name', 'migration_verification_metadata')

migration/mongosync_insights/templates/migration_verifier.py:723

  • The file migration_verifier.py is duplicated in both the root migration/mongosync_insights/ directory and the migration/mongosync_insights/templates/ directory. This creates code duplication and maintenance issues. The templates/ directory should typically only contain HTML template files, not Python source code. Consider removing the duplicate and keeping the Python source file only in the main directory.

migration/mongosync_insights/templates/mongosync_plot_utils.py:44

  • Multiple Python source files (mongosync_plot_utils.py, mongosync_plot_logs.py, mongosync_insights.py, file_decompressor.py, connection_validator.py, app_config.py) are located in the templates/ directory. The templates/ directory should contain only template files (HTML). All Python source code should be moved to the parent directory or an appropriate subdirectory. This violates standard project structure conventions.

migration/mongosync_insights/templates/requirements.txt:31

  • The requirements.txt file should not be placed in the templates/ directory. This file belongs in the project root or the main application directory (migration/mongosync_insights/). Having it in templates/ violates standard Python project structure.
