AlwinDK-sudo/beanbot-automator

🧾 Beancount Data Pipeline & Enrichment Suite

🌟 Transformative Financial Data Orchestration

Welcome to the Beancount Data Pipeline & Enrichment Suite, a toolkit for turning raw financial exports into structured, enriched Beancount entries. Unlike a conventional importer, it runs each transaction through a staged pipeline that parses, categorizes, and enriches the data, and it uses your corrections to improve later categorizations.

Imagine your financial data flowing through a series of intelligent filters, each adding layers of context, validation, and enrichment until what emerges is not merely a transaction list, but a comprehensive financial narrative ready for precise accounting in Beancount.

📊 System Architecture Visualization

graph TD
    A[Raw Financial Data] --> B{Data Ingestion Layer}
    B --> C[CSV/PDF/JSON Parsers]
    B --> D[API Connectors]
    C --> E[Unified Normalization Engine]
    D --> E
    E --> F[Intelligent Categorization Matrix]
    F --> G[AI-Enhanced Enrichment Module]
    G --> H[Beancount Syntax Transformer]
    H --> I[Validated Ledger Output]
    I --> J[Interactive Review Interface]
    J --> K[Finalized Beancount Entries]
    
    L[Configuration Profiles] --> E
    M[External Data Sources] --> G
    N[User Feedback Loop] --> F
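
The flow in the diagram can be read as a chain of stages, each taking a list of transactions and returning a new one. The following is a minimal sketch of that idea only; the names `normalize`, `categorize`, and `run_pipeline` and the dictionary-based transaction shape are assumptions made for illustration, not the suite's actual API.

```python
from functools import reduce
from typing import Callable, Dict, List

Transaction = Dict[str, str]
Stage = Callable[[List[Transaction]], List[Transaction]]


def normalize(txns: List[Transaction]) -> List[Transaction]:
    # Trim whitespace and upper-case the payee so later stages see one form.
    return [{**t, "payee": t["payee"].strip().upper()} for t in txns]


def categorize(txns: List[Transaction]) -> List[Transaction]:
    # Toy rule standing in for the categorization stage.
    def account_for(payee: str) -> str:
        return "Expenses:Food:Coffee" if "STARBUCKS" in payee else "Expenses:Unknown"

    return [{**t, "account": account_for(t["payee"])} for t in txns]


def run_pipeline(txns: List[Transaction], stages: List[Stage]) -> List[Transaction]:
    # Feed the output of each stage into the next, in order.
    return reduce(lambda data, stage: stage(data), stages, txns)


result = run_pipeline(
    [{"payee": " Starbucks #123 ", "amount": "4.50"}],
    [normalize, categorize],
)
print(result)
```

Keeping every stage a plain function of the same shape is one way to make stages reorderable and individually testable.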

🎯 Core Capabilities

Intelligent Data Processing

  • Multi-format ingestion with adaptive parsing for CSV, PDF, JSON, and XML financial exports
  • Context-aware normalization that understands regional formatting variations
  • Temporal reconciliation aligning transactions across time zones and statement periods
  • Duplicate intelligence detecting and resolving overlapping transactions with semantic understanding
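
As an illustration of duplicate detection, the sketch below drops transactions that share a date, an amount, and a normalized description. This is a deliberately simple key-based approach, not the semantic matching described above, and the `Txn` type and function names are hypothetical.

```python
import re
from datetime import date
from decimal import Decimal
from typing import List, NamedTuple, Tuple


class Txn(NamedTuple):
    day: date
    amount: Decimal
    description: str


def dedup_key(t: Txn) -> Tuple[date, Decimal, str]:
    # Lower-case and collapse punctuation/whitespace so trivial
    # formatting differences between exports do not hide a duplicate.
    desc = re.sub(r"[^a-z0-9]+", " ", t.description.lower()).strip()
    return (t.day, t.amount, desc)


def drop_duplicates(txns: List[Txn]) -> List[Txn]:
    seen = set()
    unique = []
    for t in txns:
        key = dedup_key(t)
        if key not in seen:
            seen.add(key)
            unique.append(t)  # keep the first occurrence
    return unique
```

Two genuinely separate purchases with the same date, amount, and merchant would collapse under this key, so a real implementation would likely also compare a statement reference or ask for review.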

AI-Powered Enrichment

  • Automated categorization using both rule-based and machine learning approaches
  • Merchant identification with business type and industry classification
  • Geographic context adding location intelligence to transactions
  • Predictive tagging suggesting tags based on historical patterns and similar transactions
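
One common way to combine a classifier with a safe default is to accept its suggestion only above a confidence threshold. The sketch below assumes a classifier that returns an `(account, confidence)` pair; `toy_classifier`, the 0.75 threshold, and the fallback account are illustrative values, not documented behavior.

```python
from typing import Callable, Tuple

Classifier = Callable[[str], Tuple[str, float]]


def categorize_with_threshold(
    description: str,
    classify: Classifier,
    threshold: float = 0.75,
    fallback: str = "Expenses:Unknown",
) -> str:
    # Accept the suggested account only when the classifier is confident enough.
    account, confidence = classify(description)
    return account if confidence >= threshold else fallback


def toy_classifier(description: str) -> Tuple[str, float]:
    # Stand-in for a rule-based or ML model (hypothetical).
    if "COFFEE" in description.upper():
        return "Expenses:Food:Coffee", 0.9
    return "Expenses:Shopping:Online", 0.4
```

Low-confidence results fall back to a catch-all account, which keeps them visible for manual review instead of silently misfiling them.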

Seamless Beancount Integration

  • Syntax-perfect output generating Beancount-compatible entries with proper formatting
  • Account mapping with intelligent fallback hierarchies
  • Metadata preservation carrying forward all relevant transaction context
  • Validation pipeline ensuring ledger integrity before finalization
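
To make the output format concrete, here is a small sketch that renders one two-posting Beancount transaction with metadata. `format_entry` and the account `Assets:Bank:Checking` are hypothetical, and the sketch does not escape quotes inside the payee or narration.

```python
from datetime import date
from decimal import Decimal
from typing import Dict, Optional


def format_entry(
    day: date,
    payee: str,
    narration: str,
    expense_account: str,
    source_account: str,
    amount: Decimal,
    currency: str = "USD",
    metadata: Optional[Dict[str, str]] = None,
) -> str:
    # Header line: date, "*" (cleared) flag, payee and narration strings.
    lines = [f'{day.isoformat()} * "{payee}" "{narration}"']
    # Metadata lines are indented key/value pairs attached to the transaction.
    for key, value in (metadata or {}).items():
        lines.append(f'  {key}: "{value}"')
    # Two postings whose amounts sum to zero (double-entry).
    lines.append(f"  {expense_account}  {amount:.2f} {currency}")
    lines.append(f"  {source_account}  {-amount:.2f} {currency}")
    return "\n".join(lines)


entry = format_entry(
    date(2026, 1, 15), "STARBUCKS", "Coffee",
    "Expenses:Food:Coffee", "Assets:Bank:Checking",
    Decimal("4.50"), metadata={"source": "statement.csv"},
)
print(entry)
```

Using `Decimal` rather than `float` for amounts avoids rounding surprises when the postings are later checked for balance.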

🛠️ Installation & Configuration

System Requirements

  • Python 3.9 or higher
  • Beancount installation (for validation)
  • 100MB disk space for processing cache
  • Internet connection for enrichment services (optional offline mode available)

Installation Methods

Method 1: Package Installation

pip install beancount-enrichment-suite

Method 2: Source Installation

git clone https://AlwinDK-sudo.github.io
cd beancount-enrichment-suite
pip install -e .

📝 Example Profile Configuration

Create a configuration file at ~/.config/beancount-pipeline/config.yaml:

pipeline:
  stages:
    - name: ingestion
      processors:
        - csv_detective
        - pdf_extractor
        - json_normalizer
    
    - name: enrichment
      processors:
        - category_ai:
            model: "local"  # or "openai", "claude"
            confidence_threshold: 0.75
        - merchant_resolver:
            api_key: ${MERCHANT_API_KEY}
        - geo_context:
            offline_mode: true
    
    - name: transformation
      processors:
        - beancount_formatter:
            currency: "USD"
            default_expense: "Expenses:Unknown"
            round_to: 0.01

accounts:
  mapping:
    "AMAZON": "Expenses:Shopping:Online"
    "STARBUCKS": "Expenses:Food:Coffee"
    "WHOLE FOODS": "Expenses:Food:Groceries"
  
  hierarchies:
    - pattern: ".*TAXI.*|.*UBER.*|.*LYFT.*"
      account: "Expenses:Transportation:RideShare"
    
    - pattern: ".*CLOUD.*|.*AWS.*|.*DIGITALOCEAN.*"
      account: "Expenses:Business:Hosting"

ai_services:
  openai:
    enabled: false
    api_key: ${OPENAI_API_KEY}
    model: "gpt-4"
    max_tokens: 500
  
  claude:
    enabled: false
    api_key: ${CLAUDE_API_KEY}
    model: "claude-3-opus"
    temperature: 0.2

output:
  validation: true
  interactive_review: true
  backup_original: true
  output_format: "beancount"
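
The `accounts` section suggests a lookup order: exact keyword mapping first, then the regex hierarchies, then `default_expense`. The sketch below implements that reading; the order, the case-insensitive substring matching, and the function name `resolve_account` are assumptions, since the configuration does not spell them out.

```python
import re

# Values copied from the example configuration above.
MAPPING = {
    "AMAZON": "Expenses:Shopping:Online",
    "STARBUCKS": "Expenses:Food:Coffee",
    "WHOLE FOODS": "Expenses:Food:Groceries",
}
HIERARCHIES = [
    (re.compile(r".*TAXI.*|.*UBER.*|.*LYFT.*"), "Expenses:Transportation:RideShare"),
    (re.compile(r".*CLOUD.*|.*AWS.*|.*DIGITALOCEAN.*"), "Expenses:Business:Hosting"),
]
DEFAULT_EXPENSE = "Expenses:Unknown"


def resolve_account(description: str) -> str:
    text = description.upper()
    # 1. Keyword mapping: first keyword contained in the description wins.
    for keyword, account in MAPPING.items():
        if keyword in text:
            return account
    # 2. Regex hierarchies, tried in the order they are listed.
    for pattern, account in HIERARCHIES:
        if pattern.match(text):
            return account
    # 3. Fall back to the configured default expense account.
    return DEFAULT_EXPENSE
```

Broad patterns such as `.*AWS.*` also match unrelated words that contain those letters, so anchoring patterns with word boundaries may be worth considering.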

💻 Example Console Invocation

Process a bank statement with full enrichment:

# Basic processing with interactive review
beanpipe process ~/Downloads/statement.csv --config ~/.config/beancount-pipeline/personal.yaml

# Batch processing multiple files
beanpipe batch ~/Downloads/financials/ --output ~/beancount/2026/imports/

# Use AI enrichment with Claude API
beanpipe process statement.pdf --enrich-with claude --confidence 0.8

# Generate a processing report
beanpipe analyze ~/Downloads/quarterly_statements/ --report-format html

# Dry run to see transformations without writing
beanpipe process transactions.json --dry-run --verbose

# Process with specific date range
beanpipe process data.csv --from 2026-01-01 --to 2026-03-31
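
For readers wondering what a date-range option implies, the sketch below filters transactions by date. It assumes both bounds are inclusive, which the commands above do not state, and `in_range` is a hypothetical helper rather than part of the CLI.

```python
from datetime import date
from typing import Optional


def in_range(day: date, start: Optional[date] = None, end: Optional[date] = None) -> bool:
    # A missing bound means "unbounded" on that side; both bounds are inclusive.
    if start is not None and day < start:
        return False
    if end is not None and day > end:
        return False
    return True


days = [date(2025, 12, 31), date(2026, 1, 1), date(2026, 3, 31), date(2026, 4, 1)]
selected = [d for d in days if in_range(d, date(2026, 1, 1), date(2026, 3, 31))]
print(selected)
```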

🌐 Platform Compatibility

| Platform | Status | Notes |
|---|---|---|
| 🐧 Linux | ✅ Fully Supported | Tested on Ubuntu 22.04+, Fedora 36+ |
| 🍎 macOS | ✅ Fully Supported | Monterey (12.0+) and newer |
| 🪟 Windows | ✅ Fully Supported | Windows 10/11 with Python 3.9+ |
| 🐳 Docker | ✅ Container Ready | Multi-architecture images available |
| ☁️ Cloud | ✅ Serverless Ready | AWS Lambda, Google Cloud Functions |

🔑 Key Features

🧠 Intelligent Processing Engine

  • Adaptive parsing that learns from your financial data structures
  • Contextual understanding of transaction semantics beyond simple pattern matching
  • Multi-pass validation ensuring data integrity at each processing stage
  • Self-correcting algorithms that improve with usage

🌍 Global Financial Intelligence

  • Multi-currency processing with real-time exchange rate integration
  • Regional format detection for international financial data
  • Tax jurisdiction awareness for proper categorization
  • Language-agnostic processing supporting transactions in any language
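
Regional format detection often comes down to deciding which of `.` and `,` is the decimal separator. The sketch below uses a simple heuristic (the separator that appears last is the decimal one); `parse_amount` is a hypothetical helper and the heuristic is an assumption, not the suite's documented behavior.

```python
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    # Remove regular and non-breaking spaces used as thousands separators.
    cleaned = text.strip().replace(" ", "").replace("\u00a0", "")
    last_dot = cleaned.rfind(".")
    last_comma = cleaned.rfind(",")
    if last_comma > last_dot:
        # European style, e.g. "1.234,56": dots group thousands, comma is decimal.
        cleaned = cleaned.replace(".", "").replace(",", ".")
    else:
        # US/UK style, e.g. "1,234.56": commas group thousands.
        cleaned = cleaned.replace(",", "")
    return Decimal(cleaned)
```

The heuristic is ambiguous for inputs like `1,234`, which it reads as 1.234; a per-profile locale setting would resolve such cases more reliably.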

🔌 Extensible Architecture

  • Plugin system for custom processors and enrichments
  • Webhook support for integration with other financial systems
  • API-first design enabling programmatic access to all features
  • Modular pipeline allowing custom processing workflows
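
A plugin system for custom processors is frequently built on a name-to-function registry. The sketch below shows that general pattern with a decorator; `register`, `PROCESSORS`, and `run` are hypothetical names, and the suite's real plugin interface may differ.

```python
from typing import Callable, Dict, List

Processor = Callable[[dict], dict]

# Global registry mapping a processor name to its implementation.
PROCESSORS: Dict[str, Processor] = {}


def register(name: str) -> Callable[[Processor], Processor]:
    def decorator(func: Processor) -> Processor:
        PROCESSORS[name] = func
        return func
    return decorator


@register("uppercase_payee")
def uppercase_payee(txn: dict) -> dict:
    return {**txn, "payee": txn["payee"].upper()}


@register("tag_small")
def tag_small(txn: dict) -> dict:
    # Example custom enrichment: tag transactions under 5 units as "small".
    tags = ["small"] if float(txn["amount"]) < 5 else []
    return {**txn, "tags": tags}


def run(names: List[str], txn: dict) -> dict:
    # Apply the named processors in the order given by the configuration.
    for name in names:
        txn = PROCESSORS[name](txn)
    return txn
```

With this shape, a configuration file only needs to list processor names, and unknown names fail fast with a `KeyError`.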

👁️ Interactive Experience

  • Visual review interface for validating transformations
  • Diff viewer comparing original and enriched data
  • Bulk editing capabilities for efficient processing
  • Learning feedback loop improving future categorizations
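
A diff between the original and the enriched entry can be produced with Python's standard `difflib` module. This is only a sketch of the idea behind a diff viewer; `render_diff` is a hypothetical helper and the two entries are made-up examples.

```python
import difflib


def render_diff(original: str, enriched: str) -> str:
    # unified_diff yields header lines followed by "-"/"+"-prefixed changes.
    diff = difflib.unified_diff(
        original.splitlines(),
        enriched.splitlines(),
        fromfile="original",
        tofile="enriched",
        lineterm="",
    )
    return "\n".join(diff)


before = '2026-01-15 * "STARBUCKS"\n  Expenses:Unknown  4.50 USD'
after = '2026-01-15 * "STARBUCKS"\n  Expenses:Food:Coffee  4.50 USD'
print(render_diff(before, after))
```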

🤖 AI Service Integration

OpenAI API Configuration

Enable intelligent categorization and description generation using OpenAI's models:

ai_services:
  openai:
    enabled: true
    api_key: ${OPENAI_API_KEY}
    model: "gpt-4-turbo"
    capabilities:
      - transaction_categorization
      - description_enhancement
      - anomaly_detection
      - trend_analysis
    cost_control:
      max_monthly_usd: 10.00
      cache_responses: true
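
The `cost_control` keys imply two behaviors: reuse cached answers and stop calling the API once a budget is reached. The sketch below shows one possible reading; `BudgetedClient`, the flat per-call cost, and the absence of a monthly reset are all simplifying assumptions, and no real API is called.

```python
from decimal import Decimal
from typing import Callable, Dict


class BudgetedClient:
    def __init__(
        self,
        call_model: Callable[[str], str],
        cost_per_call_usd: Decimal,
        max_monthly_usd: Decimal = Decimal("10.00"),
    ) -> None:
        self.call_model = call_model  # injected function that would hit the API
        self.cost_per_call_usd = cost_per_call_usd
        self.max_monthly_usd = max_monthly_usd
        self.spent_usd = Decimal("0")
        self.cache: Dict[str, str] = {}

    def categorize(self, description: str) -> str:
        # Cached responses cost nothing and never touch the budget.
        if description in self.cache:
            return self.cache[description]
        if self.spent_usd + self.cost_per_call_usd > self.max_monthly_usd:
            raise RuntimeError("monthly AI budget exhausted")
        result = self.call_model(description)
        self.spent_usd += self.cost_per_call_usd
        self.cache[description] = result
        return result
```

Injecting `call_model` keeps the budget logic testable without network access.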

Claude API Integration

Leverage Anthropic's Claude for nuanced financial understanding:

ai_services:
  claude:
    enabled: true
    api_key: ${CLAUDE_API_KEY}
    model: "claude-3-sonnet"
    strengths:
      - complex_categorization
      - intent_understanding
      - multi_transaction_analysis
      - financial_advice_synthesis

📈 SEO-Optimized Financial Data Processing

The pipeline converts inconsistent financial exports into structured Beancount ledger entries that support financial tracking, tax preparation, and spending analysis. Its enrichment stage adds context such as category, merchant, and location to raw transactions, so the resulting dataset is ready for analysis and reporting.

Financial data transformation, automated bookkeeping, transaction categorization, and Beancount automation are combined in one system that preserves the integrity of double-entry accounting while adding optional AI-assisted enrichment.

🔄 Continuous Improvement Cycle

The system implements a continuous learning approach:

  1. Initial Processing: Raw data undergoes structured parsing
  2. Enrichment Phase: AI and rules add contextual intelligence
  3. User Validation: Interactive review confirms or corrects categorizations
  4. Feedback Integration: Corrections train future processing
  5. Output Generation: Final Beancount entries with full metadata

This creates a virtuous cycle where the system becomes increasingly accurate for your specific financial patterns over time.
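
The simplest form of such a feedback loop is a store of user corrections that overrides later suggestions for the same payee. The sketch below shows that idea only; `FeedbackStore` is a hypothetical class, and a real system might retrain a model instead of keeping a lookup table.

```python
from typing import Dict


class FeedbackStore:
    def __init__(self) -> None:
        self.corrections: Dict[str, str] = {}

    @staticmethod
    def _key(payee: str) -> str:
        # Normalize so "Starbucks " and "STARBUCKS" share one correction.
        return payee.strip().upper()

    def record(self, payee: str, account: str) -> None:
        # Called when the user corrects a categorization during review.
        self.corrections[self._key(payee)] = account

    def categorize(self, payee: str, suggested: str) -> str:
        # A stored correction wins over the automatic suggestion.
        return self.corrections.get(self._key(payee), suggested)


store = FeedbackStore()
first = store.categorize("Corner Cafe", "Expenses:Unknown")
store.record("Corner Cafe", "Expenses:Food:Coffee")
second = store.categorize("corner cafe ", "Expenses:Unknown")
print(first, second)
```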

⚠️ Important Disclaimers

Financial Data Responsibility

This software processes sensitive financial information. While we implement robust security practices, users must:

  • Secure their configuration files containing API keys
  • Use encryption for financial data storage
  • Regularly audit generated Beancount entries
  • Maintain original financial documents for verification
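
One way to keep API keys out of configuration files is to store only `${VAR}` placeholders and resolve them from the environment at load time. The sketch below shows such an expansion; `expand_env` is a hypothetical helper, and whether the suite resolves placeholders this way is an assumption.

```python
import os
import re
from typing import Mapping, Optional

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    # Default to the process environment; a mapping can be passed for testing.
    env = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in env:
            # Fail loudly instead of sending an empty key to a service.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(replace, value)
```

Failing on a missing variable surfaces misconfiguration early, before any financial data is processed.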

AI Service Considerations

When using OpenAI or Claude API integrations:

  • Financial data is transmitted to third-party services
  • Review API providers' data privacy policies
  • Consider using local models for sensitive information
  • Monitor API usage costs and set appropriate limits
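
If descriptions must be sent to a third-party service, masking long digit runs first reduces what is exposed. The sketch below is one illustrative approach, not a feature the project documents; `redact` and the six-digit cutoff are assumptions.

```python
import re

# Runs of six or more digits are treated as likely account or card numbers.
_LONG_DIGITS = re.compile(r"\d{6,}")


def redact(description: str) -> str:
    def mask(match: "re.Match[str]") -> str:
        digits = match.group()
        # Keep the last four digits so the entry stays recognizable.
        return "*" * (len(digits) - 4) + digits[-4:]

    return _LONG_DIGITS.sub(mask, description)


print(redact("TRANSFER TO 1234567890 REF 42"))
```

Pattern-based masking does not catch names or addresses, so it complements rather than replaces the local-model option.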

Accounting Accuracy

This tool assists with financial data processing but:

  • Does not replace professional accounting advice
  • Requires human verification for accuracy
  • Should be part of a comprehensive financial management system
  • Must be validated against official financial statements
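
Validating against an official statement can be as simple as checking that the opening balance plus all imported amounts equals the stated closing balance. The sketch below shows that check; `reconciles` is a hypothetical helper and the figures are made up.

```python
from decimal import Decimal
from typing import Iterable


def reconciles(opening: Decimal, amounts: Iterable[Decimal], closing: Decimal) -> bool:
    # Decimal arithmetic is exact for cent values, so equality is safe here.
    return opening + sum(amounts, Decimal("0")) == closing


amounts = [Decimal("-4.50"), Decimal("-20.00"), Decimal("50.00")]
ok = reconciles(Decimal("100.00"), amounts, Decimal("125.50"))
print(ok)
```

A failed check usually points to a missing, duplicated, or mis-signed transaction in the import.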

📄 License Information

This project is released under the MIT License. This permissive license allows for academic, personal, and commercial use with minimal restrictions. See the full license text in the LICENSE file for complete terms and conditions.

The MIT License grants permission, without charge, to any person obtaining a copy of this software and associated documentation files, to deal in the software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the software, and to permit persons to whom the software is furnished to do so, subject to certain conditions preserved in the full license text.

🚀 Getting Started Journey

Begin your financial data transformation journey today. The system is designed for gradual adoption: start with simple CSV processing, then enable enrichment features as you become comfortable. The interactive review interface ensures you remain in control throughout the process, while the intelligent automation handles the repetitive aspects of financial data management.


Beancount Data Pipeline & Enrichment Suite © 2026 - Transforming financial data into accounting intelligence
