DataBridge Core

Your finance team just spent 4 hours on VLOOKUP. This takes 5 seconds.

DataBridge Core is a Python toolkit for data reconciliation, profiling, ingestion, and Excel triage. Compare CSV files, find fuzzy matches, detect schema drift, scan Excel workbooks, and send results to Slack -- from the command line or Python.

pip install databridge-core

5-Second Demo

# Profile a file
databridge profile sales.csv

# Compare two sources -- find orphans, conflicts, match rate
databridge compare source.csv target.csv --keys id

# Fuzzy match names across systems
databridge fuzzy erp_accounts.csv gl_accounts.csv --column name --threshold 80

# Scan Excel files and classify by archetype (requires the triage extra)
pip install 'databridge-core[triage]'
databridge triage ./excel_files/
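
To give a feel for what a threshold such as `--threshold 80` means, here is a minimal sketch that scores name pairs on a 0-100 scale with Python's standard-library `difflib`. It is an illustration only, not DataBridge's implementation: the `[fuzzy]` extra lists rapidfuzz, whose scorers will produce different numbers, and `similarity` and `best_matches` are hypothetical helper names.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score, ignoring case and outer whitespace."""
    return 100.0 * SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def best_matches(left, right, threshold=80.0):
    """For each name in `left`, keep its best-scoring name in `right` if it meets the threshold."""
    matches = []
    for name in left:
        # Hypothetical strategy: take the single highest-scoring candidate.
        candidate = max(right, key=lambda other: similarity(name, other), default=None)
        if candidate is None:
            continue
        score = similarity(name, candidate)
        if score >= threshold:
            matches.append((name, candidate, round(score, 1)))
    return matches


if __name__ == "__main__":
    erp = ["Accounts Payable", "Cash and Equivalents", "Deferred Revenue"]
    gl = ["accounts payable ", "Cash & Equivalents", "Inventory"]
    for row in best_matches(erp, gl):
        print(row)
```

Raising the threshold trades recall for precision: fewer pairs are reported, but the ones that remain are closer to exact matches.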

Python API

from databridge_core import compare_hashes, profile_data, load_csv

# Profile your data
profile = profile_data("chart_of_accounts.csv")
print(f"{profile['rows']} rows, {profile['columns']} columns")
print(f"Potential keys: {profile['potential_key_columns']}")

# Compare two sources
result = compare_hashes("source.csv", "target.csv", key_columns="account_id")
stats = result["statistics"]
print(f"Match rate: {stats['match_rate_percent']}%")
print(f"Conflicts: {stats['conflicts']}, Orphans: {stats['total_orphans']}")
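
For readers new to hash-based reconciliation, the following sketch shows the general idea using only the standard library: hash every row's non-key columns, then compare the hashes by key. It is a conceptual illustration under stated assumptions, not how `compare_hashes` is implemented. The helper names, the returned dictionary keys, and the match-rate formula (matching keys divided by all distinct keys) are all hypothetical.

```python
import csv
import hashlib
import io


def row_hashes(csv_text: str, key: str) -> dict:
    """Map each key value to a SHA-256 digest of the row's remaining columns."""
    hashes = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        payload = "|".join(f"{col}={row[col]}" for col in sorted(row) if col != key)
        hashes[row[key]] = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return hashes


def compare(source_text: str, target_text: str, key: str) -> dict:
    """Classify keys as orphans (one side only) or conflicts (both sides, different hash)."""
    src, tgt = row_hashes(source_text, key), row_hashes(target_text, key)
    shared = src.keys() & tgt.keys()
    conflicts = sorted(k for k in shared if src[k] != tgt[k])
    all_keys = src.keys() | tgt.keys()
    matched = len(shared) - len(conflicts)
    return {
        "orphans_in_source": sorted(src.keys() - tgt.keys()),
        "orphans_in_target": sorted(tgt.keys() - src.keys()),
        "conflicts": conflicts,
        # Assumed definition: identical rows as a share of all distinct keys.
        "match_rate_percent": round(100.0 * matched / len(all_keys), 2) if all_keys else 100.0,
    }


SOURCE = "account_id,name,balance\n1000,Cash,500\n2000,Payables,120\n3000,Revenue,900\n"
TARGET = "account_id,name,balance\n1000,Cash,500\n2000,Payables,125\n4000,Inventory,75\n"

if __name__ == "__main__":
    print(compare(SOURCE, TARGET, key="account_id"))
```

In this toy data, account 2000 is a conflict (the balance differs), 3000 and 4000 are orphans, and only 1000 matches.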

Smart Excel Import

from databridge_core import smart_import_excel, detect_anchor_cell

# Auto-detect header rows, skip junk, clean column names
df = smart_import_excel("messy_report.xlsx")

# Find the real data start in a complex spreadsheet
anchor = detect_anchor_cell("messy_report.xlsx")
print(f"Data starts at: {anchor['cell']}")
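
As a rough intuition for anchor detection, the sketch below scans a grid of cell values for the first row that looks like a header: enough filled cells, all of them text, followed by a similarly filled row. This heuristic is an assumption made for illustration and is not the algorithm behind `detect_anchor_cell`. `find_anchor`, `column_letter`, and `min_filled` are hypothetical names.

```python
from string import ascii_uppercase


def column_letter(index: int) -> str:
    """Convert a zero-based column index to an Excel column label (0 -> A, 26 -> AA)."""
    label = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        label = ascii_uppercase[remainder] + label
    return label


def find_anchor(rows, min_filled=3):
    """Return the A1-style address of the first header-like row, or None if none is found."""

    def filled(row):
        return [i for i, cell in enumerate(row) if cell not in (None, "")]

    for r in range(len(rows) - 1):
        here, below = filled(rows[r]), filled(rows[r + 1])
        all_text = all(isinstance(rows[r][i], str) for i in here)
        if len(here) >= min_filled and len(below) >= min_filled and all_text:
            return f"{column_letter(here[0])}{r + 1}"
    return None


GRID = [
    ["Quarterly Report", None, None, None],   # title row, too sparse
    [None, None, None, None],                 # blank spacer
    [None, "Account", "Region", "Amount"],    # header row
    [None, "1000", "EMEA", 500.0],
    [None, "2000", "APAC", 120.0],
]

if __name__ == "__main__":
    print(find_anchor(GRID))  # B3
```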

Templates

from databridge_core.templates import TemplateService

svc = TemplateService(templates_dir="templates")
templates = svc.list_templates(domain="accounting")
rec = svc.get_template_recommendations(industry="manufacturing", statement_type="pl")

Slack Integration

from databridge_core.integrations import SlackClient

slack = SlackClient(bot_token="xoxb-...")
slack.send_message("#data-ops", "Reconciliation complete: 99.5% match rate")
# `result` is assumed to be the compare_hashes() output from the Python API example
slack.post_reconciliation_report("#data-ops", result)
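
If you would rather build the message text yourself, a small formatter over the statistics dictionary is enough. This sketch assumes the dictionary carries the three keys shown in the Python API example (`match_rate_percent`, `conflicts`, `total_orphans`). `format_summary` is a hypothetical helper and not part of the package.

```python
def format_summary(stats: dict) -> str:
    """Build a one-line reconciliation summary from a statistics dictionary."""
    return (
        f"Reconciliation complete: {stats['match_rate_percent']}% match rate, "
        f"{stats['conflicts']} conflicts, {stats['total_orphans']} orphans"
    )


if __name__ == "__main__":
    example = {"match_rate_percent": 99.5, "conflicts": 2, "total_orphans": 3}
    print(format_summary(example))
```

The returned string can then be passed as the message argument to `slack.send_message`.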

Excel Triage

from databridge_core.triage import scan_and_classify

result = scan_and_classify("./excel_files/", output_dir="./reports/")
print(f"Scanned {result['summary']['total_files']} files")
print(f"Archetypes: {result['summary']['archetype_counts']}")

Commands

| Command | Description |
|---|---|
| `databridge profile <file>` | Profile data: structure, quality, cardinality |
| `databridge compare <a> <b> --keys <col>` | Hash comparison: orphans, conflicts, match rate |
| `databridge fuzzy <a> <b> -c <col>` | Fuzzy match columns across two files |
| `databridge diff <a> <b>` | Text diff between two files |
| `databridge drift <old> <new>` | Detect schema drift between CSVs |
| `databridge transform <file> -c <col> --op upper` | Clean a column (upper/lower/strip/trim/remove_special) |
| `databridge merge <a> <b> --keys <col>` | Merge two CSVs on key columns |
| `databridge find "*.csv"` | Find files matching a pattern |
| `databridge parse <text>` | Parse tabular data from messy text |
| `databridge triage <dir>` | Scan Excel files and classify by archetype |
| `databridge smart-import <file>` | Smart Excel import with anchor detection |
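
To illustrate what "schema drift" refers to in the table above, this minimal sketch compares only the header rows of two CSVs and reports added, removed, and reordered columns. It is a simplified stand-in written for this README. The actual `databridge drift` command may check more than column names, and `header_drift` is a hypothetical function.

```python
import csv
import io


def header_drift(old_csv: str, new_csv: str) -> dict:
    """Compare the header rows of two CSV texts and summarize column changes."""
    old_cols = next(csv.reader(io.StringIO(old_csv)), [])
    new_cols = next(csv.reader(io.StringIO(new_csv)), [])
    # Order of the columns both files share, as seen from each side.
    common_old = [c for c in old_cols if c in new_cols]
    common_new = [c for c in new_cols if c in old_cols]
    return {
        "added": [c for c in new_cols if c not in old_cols],
        "removed": [c for c in old_cols if c not in new_cols],
        "reordered": common_old != common_new,
    }


OLD = "id,name,amount\n1,Cash,500\n"
NEW = "id,amount,name,currency\n1,500,Cash,USD\n"

if __name__ == "__main__":
    print(header_drift(OLD, NEW))
```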

Optional Extras

pip install 'databridge-core[fuzzy]'    # Fuzzy matching (rapidfuzz)
pip install 'databridge-core[pdf]'      # PDF text extraction (pypdf)
pip install 'databridge-core[ocr]'      # OCR image extraction (pytesseract)
pip install 'databridge-core[sql]'      # Database queries (sqlalchemy)
pip install 'databridge-core[triage]'   # Excel triage scanning (openpyxl)
pip install 'databridge-core[all]'      # Everything
pip install 'databridge-core[dev]'      # Development tools (pytest, ruff, build)
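
Before calling a feature that depends on an extra, it can help to check whether its dependency is importable. The sketch below uses the standard library's `importlib.util.find_spec`. The extra-to-module mapping is taken from the comments in the list above and assumes each package's import name equals the name shown. `available_extras` is a hypothetical helper, not a DataBridge function.

```python
from importlib.util import find_spec

# Assumed import names, copied from the comments in the extras list.
EXTRA_MODULES = {
    "fuzzy": "rapidfuzz",
    "pdf": "pypdf",
    "ocr": "pytesseract",
    "sql": "sqlalchemy",
    "triage": "openpyxl",
}


def available_extras(modules=EXTRA_MODULES) -> dict:
    """Return {extra_name: True/False} depending on whether its module can be found."""
    return {extra: find_spec(module) is not None for extra, module in modules.items()}


if __name__ == "__main__":
    for extra, present in available_extras().items():
        status = "installed" if present else f"missing (pip install 'databridge-core[{extra}]')"
        print(f"{extra}: {status}")
```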

Modules

| Module | Description | Extra Required |
|---|---|---|
| `reconciler` | Hash comparison, fuzzy matching, diffing, merging | - |
| `profiler` | Data profiling, schema drift detection, expectations | - |
| `ingestion` | CSV, JSON, PDF, OCR, smart Excel import | `[pdf]`, `[ocr]` |
| `linker` | Entity resolution and record linkage | - |
| `connectors` | Snowflake, database connectors | `[sql]` |
| `detection` | ERP detection, anomaly detection | - |
| `templates` | Industry hierarchy templates, skills, knowledge base | - |
| `integrations` | Slack client (BaseClient + SlackClient) | - |
| `triage` | Batch Excel scanning and archetype classification | `[triage]` |

Built for Finance

DataBridge Core is the open-source foundation of DataBridge AI -- a full platform for financial hierarchy management, dbt model generation, and enterprise data reconciliation with 336+ MCP tools.

How it works: Upload your Chart of Accounts. Get a production-ready financial hierarchy and dbt models. Zero config.

What's Next?

DataBridge Core provides the SDK foundation. For the full platform experience:

MCP Server (336+ tools): Headless AI-native data engine
Docker: docker run -p 786:786 ghcr.io/datanexum/databridge-mcp:latest
Claude Code Plugin: claude plugin install datanexum/databridge-plugin
Remote SSE: https://mcp.databridge.dataamplifier.io/sse

See the full documentation for details.

Changelog

See CHANGELOG.md for full version history.

License

MIT -- See LICENSE.
