Cobalt

A modular, open-source COBOL tooling ecosystem in Rust. Cobalt provides a collection of small, composable libraries that form the foundation for COBOL analysis, refactoring, and modernization tools.

🎯 Goals

  • Modular: Small, focused crates that work together
  • Fast: Built with Rust for performance
  • Composable: Use what you need, combine as needed
  • Open Source: MIT/Apache-2.0 licensed
  • Production Ready: Comprehensive error handling and testing

📦 Crates

cobol-lexer

Fast, modular lexer for COBOL source code supporting both fixed-format and free-format COBOL.

Features:

  • ✅ Free-format COBOL lexing
  • ✅ Fixed-format COBOL lexing with column-based parsing
  • ✅ Case-insensitive keyword recognition
  • ✅ Comprehensive token types (keywords, identifiers, literals, operators, punctuation)
  • ✅ Source location tracking (line, column, span)
  • ✅ Error reporting with precise location information
  • ✅ Supports continuation lines and comment handling

Status: ✅ Complete

📖 Documentation | Examples
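
To make "column-based parsing" concrete, here is a minimal standalone sketch of how a fixed-format line can be split by column. It is not the `cobol-lexer` implementation; the names `LineKind`, `FixedLine`, and `split_fixed_line` are hypothetical. It assumes the conventional reference format layout: columns 1-6 sequence area, column 7 indicator, columns 8-72 program text, columns 73-80 identification area.

```rust
/// How a fixed-format line should be treated, decided by the indicator in column 7.
#[derive(Debug, PartialEq)]
enum LineKind {
    Code,
    Comment,
    Continuation,
    Debug,
}

/// A fixed-format line reduced to its kind and its program text.
#[derive(Debug, PartialEq)]
struct FixedLine<'a> {
    kind: LineKind,
    /// Columns 8-72 (Area A and Area B), with trailing whitespace removed.
    text: &'a str,
}

/// Splits one source line by column. Assumes ASCII input so that byte
/// offsets match column numbers; other input yields empty text instead of panicking.
fn split_fixed_line(line: &str) -> FixedLine<'_> {
    // Column 7 is byte index 6; a short line counts as a blank indicator.
    let indicator = line.as_bytes().get(6).copied().unwrap_or(b' ');
    let kind = match indicator {
        b'*' | b'/' => LineKind::Comment,
        b'-' => LineKind::Continuation,
        b'D' | b'd' => LineKind::Debug,
        _ => LineKind::Code,
    };
    // Columns 73-80 are ignored, so the text ends at column 72.
    let end = line.len().min(72);
    let text = line.get(7..end).map(str::trim_end).unwrap_or("");
    FixedLine { kind, text }
}
```

A real lexer would then tokenize `text` and join continuation lines; this sketch stops at classification.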

cobol-ast

Abstract Syntax Tree (AST) data structures for COBOL programs.

Features:

  • ✅ Complete AST representation of all four COBOL divisions
  • ✅ Data Division structures (data items, PICTURE clauses, OCCURS, etc.)
  • ✅ Procedure Division statements (DISPLAY, MOVE, COMPUTE, IF, PERFORM, etc.)
  • ✅ Expression trees
  • ✅ Source span tracking for all nodes
  • ✅ Visitor pattern for AST traversal
  • ✅ Optional serialization support (serde)

Status: ✅ Core structures defined

📖 Documentation
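
The visitor pattern mentioned above can be illustrated with a small standalone sketch. The `Statement` enum, `Visitor` trait, and `walk_statement` function below are hypothetical stand-ins, not the actual `cobol-ast` types; they only show the general shape of a visitor with a default traversal that implementors can override.

```rust
/// A deliberately tiny statement tree (hypothetical, not the cobol-ast types).
#[derive(Debug)]
enum Statement {
    Display(String),
    StopRun,
    If {
        then_branch: Vec<Statement>,
        else_branch: Vec<Statement>,
    },
}

/// Visitors override `visit_statement`; the default just keeps walking.
trait Visitor {
    fn visit_statement(&mut self, stmt: &Statement) {
        walk_statement(self, stmt);
    }
}

/// Default traversal: descend into the children of compound statements.
fn walk_statement<V: Visitor + ?Sized>(visitor: &mut V, stmt: &Statement) {
    if let Statement::If { then_branch, else_branch } = stmt {
        for child in then_branch.iter().chain(else_branch.iter()) {
            visitor.visit_statement(child);
        }
    }
}

/// Example visitor: collects the operand of every DISPLAY, however deeply nested.
struct DisplayCollector {
    displayed: Vec<String>,
}

impl Visitor for DisplayCollector {
    fn visit_statement(&mut self, stmt: &Statement) {
        if let Statement::Display(text) = stmt {
            self.displayed.push(text.clone());
        }
        // Continue into nested statements.
        walk_statement(self, stmt);
    }
}
```

Separating `walk_statement` from the trait method lets a visitor choose whether to descend into children, which is a common design for AST visitors in Rust.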

cobol-parser

Recursive descent parser that converts tokens into a structured AST.

Features:

  • ✅ Parses all four COBOL divisions (Identification, Environment, Data, Procedure)
  • ✅ Data item definitions with PICTURE, VALUE, OCCURS clauses
  • ✅ Comprehensive statement support (DISPLAY, ACCEPT, MOVE, COMPUTE, IF, EVALUATE, PERFORM, etc.)
  • ✅ File operations (OPEN, CLOSE, READ, WRITE, REWRITE, DELETE)
  • ✅ String manipulation (STRING, UNSTRING)
  • ✅ Table operations (SEARCH, SORT)
  • ✅ Complex data structures (OCCURS DEPENDING ON, REDEFINES)
  • ✅ Subprogram support (CALL, LINKAGE SECTION)
  • ✅ Error recovery and detailed error messages
  • ✅ Handles whitespace and comments gracefully

Status: ✅ Comprehensive parsing implemented

📖 Documentation | Examples
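
For readers unfamiliar with recursive descent, the sketch below shows the technique on a toy arithmetic grammar of the kind found in a COMPUTE statement. It is a standalone illustration, not the `cobol-parser` code: the `Token`, `Expr`, and `Parser` types are hypothetical, and the real grammar is far larger. Each grammar rule becomes one method, and operator precedence falls out of which rule calls which.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Num(i64),
    Plus,
    Star,
    LParen,
    RParen,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

struct Parser<'a> {
    tokens: &'a [Token],
    pos: usize,
}

impl<'a> Parser<'a> {
    fn peek(&self) -> Option<&Token> {
        self.tokens.get(self.pos)
    }

    /// Consumes and returns the current token, if any.
    fn next(&mut self) -> Option<Token> {
        let token = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        token
    }

    /// expr := term ('+' term)*
    fn expr(&mut self) -> Result<Expr, String> {
        let mut left = self.term()?;
        while self.peek() == Some(&Token::Plus) {
            self.pos += 1;
            left = Expr::Add(Box::new(left), Box::new(self.term()?));
        }
        Ok(left)
    }

    /// term := factor ('*' factor)*
    fn term(&mut self) -> Result<Expr, String> {
        let mut left = self.factor()?;
        while self.peek() == Some(&Token::Star) {
            self.pos += 1;
            left = Expr::Mul(Box::new(left), Box::new(self.factor()?));
        }
        Ok(left)
    }

    /// factor := Num | '(' expr ')'
    fn factor(&mut self) -> Result<Expr, String> {
        match self.next() {
            Some(Token::Num(n)) => Ok(Expr::Num(n)),
            Some(Token::LParen) => {
                let inner = self.expr()?;
                match self.next() {
                    Some(Token::RParen) => Ok(inner),
                    other => Err(format!("expected ')', found {:?}", other)),
                }
            }
            other => Err(format!("expected a number or '(', found {:?}", other)),
        }
    }
}

/// Parses a complete token slice, rejecting leftover tokens.
fn parse(tokens: &[Token]) -> Result<Expr, String> {
    let mut parser = Parser { tokens, pos: 0 };
    let expr = parser.expr()?;
    match parser.peek() {
        None => Ok(expr),
        Some(extra) => Err(format!("unexpected trailing token {:?}", extra)),
    }
}
```

Because `expr` calls `term` and `term` calls `factor`, multiplication binds tighter than addition without any explicit precedence table.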

cobol-migration-analyzer

CLI tool for assessing COBOL systems for cloud migration and microservices transformation.

Features:

  • ✅ Cloud readiness analysis with detailed scoring
  • ✅ Microservices decomposition recommendations
  • ✅ Effort estimation with resource requirements
  • ✅ Technical debt assessment using real AST analysis
  • ✅ Multiple cloud platform support (AWS, Azure, GCP, Hybrid, Kubernetes)
  • ✅ Migration strategy recommendations (Lift-and-shift, Replatform, Refactor, Rebuild, Replace)
  • ✅ Real COBOL parsing integration (no mock data)
  • ✅ Executive summary generation
  • ✅ Comprehensive risk assessment

Status: ✅ Production ready

Usage:

cargo run --bin cobol-migrate -- \
  --input program.cbl \
  --platform aws \
  --strategy replatform \
  --output report.json

cobol-doc-gen

CLI tool that generates human-readable documentation from COBOL programs.

Features:

  • ✅ Extracts program structure and logic from real COBOL AST
  • ✅ Generates documentation in multiple formats (HTML, Markdown, JSON)
  • ✅ Comprehensive complexity metrics (cyclomatic complexity, nesting depth, maintainability index)
  • ✅ Cross-references and variable usage tracking
  • ✅ Paragraph and section flow analysis
  • ✅ PERFORM call analysis and call graphs
  • ✅ Technical debt calculation
  • ✅ Real COBOL parsing integration (no mock data)
  • ✅ Customizable templates with security validation

Status: ✅ Production ready

Usage:

cargo run --bin cobol-doc -- \
  --input program.cbl \
  --format html \
  --output docs/ \
  --include-source \
  --include-metrics
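
As background on the cyclomatic complexity metric listed above: McCabe's measure is the number of decision points plus one. The sketch below approximates it by counting branching keywords in raw source text. This is an assumption-laden simplification, not how `cobol-doc-gen` computes the metric (which works from the AST): the function name and the choice of keywords (`IF`, `UNTIL`, and `WHEN` other than `WHEN OTHER`) are illustrative, and words inside string literals or comments are not filtered out.

```rust
/// Approximates cyclomatic complexity as (decision points + 1).
/// Simplification: scans words in the raw text, so keywords inside
/// literals or comments would be miscounted.
fn cyclomatic_complexity(source: &str) -> usize {
    // Split on anything that cannot be part of a COBOL word, so that
    // END-IF stays one word and is not mistaken for IF.
    let words: Vec<String> = source
        .split(|c: char| !(c.is_ascii_alphanumeric() || c == '-'))
        .filter(|w| !w.is_empty())
        .map(|w| w.to_ascii_uppercase())
        .collect();

    let mut decisions = 0;
    for (i, word) in words.iter().enumerate() {
        match word.as_str() {
            "IF" | "UNTIL" => decisions += 1,
            // WHEN OTHER is the default branch and adds no new path.
            "WHEN" if words.get(i + 1).map(String::as_str) != Some("OTHER") => decisions += 1,
            _ => {}
        }
    }
    decisions + 1
}
```

Straight-line code scores 1; each additional branch or loop condition adds one.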

cobol-repl

Interactive REPL (Read-Eval-Print Loop) for exploring COBOL code.

Features:

  • ✅ Interactive COBOL code parsing and exploration
  • ✅ Load and parse COBOL files
  • ✅ Tokenize COBOL code
  • ✅ View AST structures
  • ✅ List and manage loaded programs
  • ✅ Command history support

Status: ✅ Core functionality ready

Usage:

cargo run --bin cobol-repl

cobol-linter

Static analysis tool for COBOL code quality and compliance.

Features:

  • ✅ Naming convention checks
  • ✅ Detection of discouraged constructs (GO TO statements)
  • ✅ Y2K-style date format warnings
  • ✅ COBOL 2014 compliance checks
  • ✅ Multiple output formats (text, JSON)
  • ✅ Severity-based filtering

Status: ✅ Production ready

Usage:

cargo run --bin cobol-linter -- program.cbl
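
To show what a naming convention check might look like, here is a standalone sketch; it is not the `cobol-linter` rule set. The `NameIssue` enum and `check_data_name` function are hypothetical. The character rules follow the usual definition of a COBOL user-defined word (letters, digits, and hyphens, with no hyphen at either end); the maximum length is a parameter because it differs between standards, and the uppercase rule is an assumed house style rather than a language requirement.

```rust
#[derive(Debug, PartialEq)]
enum NameIssue {
    Empty,
    TooLong,
    InvalidChar(char),
    EdgeHyphen,
    NotUppercase,
}

/// Returns every issue found in a data name, in a fixed order.
fn check_data_name(name: &str, max_len: usize) -> Vec<NameIssue> {
    if name.is_empty() {
        return vec![NameIssue::Empty];
    }
    let mut issues = Vec::new();
    if name.chars().count() > max_len {
        issues.push(NameIssue::TooLong);
    }
    // Report the first character that is not a letter, digit, or hyphen.
    if let Some(bad) = name
        .chars()
        .find(|c| !(c.is_ascii_alphanumeric() || *c == '-'))
    {
        issues.push(NameIssue::InvalidChar(bad));
    }
    if name.starts_with('-') || name.ends_with('-') {
        issues.push(NameIssue::EdgeHyphen);
    }
    // House-style rule (assumption): names are written in uppercase.
    if name.chars().any(|c| c.is_ascii_lowercase()) {
        issues.push(NameIssue::NotUppercase);
    }
    issues
}
```

Returning a list of issues rather than failing on the first one lets a linter report everything wrong with a name in a single pass.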

cobol-visualizer

AST visualization tool for COBOL programs.

Features:

  • ✅ Generate visual representations of COBOL AST
  • ✅ SVG output format
  • ✅ Program structure visualization
  • ✅ Division and section highlighting

Status: ✅ Core functionality ready

Usage:

cargo run --bin cobol-visualizer -- program.cbl output.svg
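
As a rough idea of how program structure can be turned into SVG, the sketch below draws one labelled box per entry in a list. It is a standalone illustration with hypothetical function names (`escape_xml`, `render_boxes`) and arbitrary layout constants, not the output format of `cobol-visualizer`.

```rust
/// Escapes the characters that are unsafe inside SVG text content.
fn escape_xml(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// Renders each label as a box, stacked vertically, and returns the SVG document.
fn render_boxes(labels: &[&str]) -> String {
    // Layout constants in pixels (arbitrary choices for the sketch).
    let (width, box_height, gap): (usize, usize, usize) = (260, 40, 10);
    let height = labels.len() * (box_height + gap) + gap;

    let mut svg = format!(
        r#"<svg xmlns="http://www.w3.org/2000/svg" width="{}" height="{}">"#,
        width + 2 * gap,
        height
    );
    for (i, label) in labels.iter().enumerate() {
        let y = gap + i * (box_height + gap);
        svg.push_str(&format!(
            r#"<rect x="{}" y="{}" width="{}" height="{}" fill="none" stroke="black"/>"#,
            gap, y, width, box_height
        ));
        svg.push_str(&format!(
            r#"<text x="{}" y="{}">{}</text>"#,
            gap + 8,
            y + 25,
            escape_xml(label)
        ));
    }
    svg.push_str("</svg>");
    svg
}
```

Calling `render_boxes` with the division names of a program and writing the result to a `.svg` file gives a simple stacked diagram.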

cobol-fmt

Auto-formatter for COBOL source code (like rustfmt or black).

Features:

  • ✅ AST-based formatting for better code quality
  • ✅ Configurable indentation (spaces/tabs, width)
  • ✅ Keyword case normalization (UPPER, lower, preserve)
  • ✅ Identifier case normalization
  • ✅ PICTURE clause formatting
  • ✅ Data item alignment by level number
  • ✅ Spacing around operators
  • ✅ Line length enforcement
  • ✅ Comment preservation
  • ✅ Traditional and modern style presets

Status: ✅ Core functionality ready

Usage:

cargo run --bin cobol-fmt -- program.cbl
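
Keyword case normalization can be sketched as follows. This standalone example works on raw text with a tiny hard-coded keyword list, whereas `cobol-fmt` is described as AST-based, so treat the `KeywordCase` enum, the `KEYWORDS` list, and both functions as hypothetical. The point it illustrates is that text inside string literals must be left untouched.

```rust
#[derive(Clone, Copy)]
enum KeywordCase {
    Upper,
    Lower,
    Preserve,
}

/// A tiny sample of reserved words; a real formatter would use the full list.
const KEYWORDS: &[&str] = &["DISPLAY", "MOVE", "TO", "STOP", "RUN", "IF", "END-IF"];

/// Re-cases a single word if it is a keyword; other words pass through unchanged.
fn apply_case(word: &str, case: KeywordCase) -> String {
    let is_keyword = KEYWORDS.iter().any(|k| k.eq_ignore_ascii_case(word));
    match (is_keyword, case) {
        (true, KeywordCase::Upper) => word.to_ascii_uppercase(),
        (true, KeywordCase::Lower) => word.to_ascii_lowercase(),
        _ => word.to_string(),
    }
}

/// Normalizes keyword case while copying string literals verbatim.
fn normalize_keywords(source: &str, case: KeywordCase) -> String {
    let mut out = String::with_capacity(source.len());
    let mut word = String::new();
    let mut quote: Option<char> = None; // the delimiter of the open literal, if any

    for ch in source.chars() {
        if let Some(q) = quote {
            // Inside a literal: copy as-is until the closing delimiter.
            out.push(ch);
            if ch == q {
                quote = None;
            }
            continue;
        }
        if ch.is_ascii_alphanumeric() || ch == '-' {
            word.push(ch);
            continue;
        }
        // Word boundary: flush the pending word, then emit the separator.
        out.push_str(&apply_case(&word, case));
        word.clear();
        if ch == '"' || ch == '\'' {
            quote = Some(ch);
        }
        out.push(ch);
    }
    out.push_str(&apply_case(&word, case));
    out
}
```

With `KeywordCase::Upper`, the word `move` inside a quoted literal is preserved while the statement keyword `move` becomes `MOVE`.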

cobol-dead-code

Dead code detector for COBOL programs.

Features:

  • ✅ Control Flow Graph (CFG) construction
  • ✅ Reachability analysis
  • ✅ Unused variable detection
  • ✅ Unused paragraph/section detection
  • ✅ Unreachable statement detection
  • ✅ JSON and text output formats
  • ✅ Filtering options (variables-only, procedures-only, unreachable-only)

Status: ✅ Core functionality ready

Usage:

cargo run --bin cobol-dead-code -- program.cbl
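
The reachability analysis behind unused-paragraph detection can be sketched as a breadth-first search over a control flow graph. The example below is standalone and hypothetical (the function name and graph representation are not taken from `cobol-dead-code`). It assumes the caller has already built a map from each paragraph to the paragraphs it can transfer control to, including any fall-through edges.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Returns the paragraphs that cannot be reached from `entry`, in sorted order.
/// `cfg` maps each paragraph name to its successors (PERFORM targets,
/// GO TO targets, and fall-through, as supplied by the caller).
fn unreachable_paragraphs<'a>(
    cfg: &BTreeMap<&'a str, Vec<&'a str>>,
    entry: &'a str,
) -> Vec<&'a str> {
    let mut seen: BTreeSet<&'a str> = BTreeSet::new();
    let mut queue: VecDeque<&'a str> = VecDeque::new();
    seen.insert(entry);
    queue.push_back(entry);

    // Standard breadth-first search: visit each reachable node once.
    while let Some(node) = queue.pop_front() {
        if let Some(successors) = cfg.get(node) {
            for &next in successors {
                if seen.insert(next) {
                    queue.push_back(next);
                }
            }
        }
    }

    // Anything defined in the graph but never visited is dead.
    cfg.keys().copied().filter(|p| !seen.contains(p)).collect()
}
```

Note that a paragraph called only from another dead paragraph is reported too, because reachability is computed from the entry point rather than from call counts.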

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/MarsZDF/cobalt.git
cd cobalt

# Build all crates
cargo build --all

Using the Lexer

use cobol_lexer::{tokenize, Format};

let source = r#"
   IDENTIFICATION DIVISION.
   PROGRAM-ID. HELLO-WORLD.
   PROCEDURE DIVISION.
       DISPLAY "Hello, World!".
       STOP RUN.
"#;

let tokens = tokenize(source, Format::FreeFormat)?;
for token in tokens {
    println!("{:?} at line {}", token.token_type, token.line);
}
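
The example above prints a line number for each token. To illustrate how a lexer can track such positions, here is a minimal standalone sketch that splits free-format text on whitespace and records a 1-based line and column for each piece. The `Token` struct and `tokenize_words` function are hypothetical and much simpler than what `cobol-lexer` produces (no token types, literals, or comments).

```rust
#[derive(Debug, PartialEq)]
struct Token {
    text: String,
    line: usize,   // 1-based line number
    column: usize, // 1-based column of the first character (byte-based, so exact for ASCII)
}

/// Splits the source on whitespace, remembering where each word started.
fn tokenize_words(source: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    for (line_idx, line) in source.lines().enumerate() {
        let mut start: Option<usize> = None; // byte offset where the current word began
        for (offset, ch) in line.char_indices() {
            if ch.is_whitespace() {
                // A word just ended: emit it with its start position.
                if let Some(s) = start.take() {
                    tokens.push(Token {
                        text: line[s..offset].to_string(),
                        line: line_idx + 1,
                        column: s + 1,
                    });
                }
            } else if start.is_none() {
                start = Some(offset);
            }
        }
        // Flush a word that runs to the end of the line.
        if let Some(s) = start {
            tokens.push(Token {
                text: line[s..].to_string(),
                line: line_idx + 1,
                column: s + 1,
            });
        }
    }
    tokens
}
```

Recording the start position when a word begins, rather than computing it afterwards, keeps the location exact even when lines have irregular indentation.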

Using the Parser

use cobol_parser::parse_source;
use cobol_ast::Program;

let source = r#"
   IDENTIFICATION DIVISION.
   PROGRAM-ID. HELLO-WORLD.
   PROCEDURE DIVISION.
       DISPLAY "Hello, World!".
       STOP RUN.
"#;

let program: Program = parse_source(source)?;
println!("Program ID: {:?}", program.identification.program_id);

Complete Pipeline Example

use cobol_lexer::{tokenize, Format};
use cobol_parser::parse;
use cobol_ast::{Program, Visitor};

let source = "*> your COBOL code"; // placeholder: substitute real COBOL source text

// Step 1: Tokenize
let tokens = tokenize(source, Format::FreeFormat)?;

// Step 2: Parse
let program: Program = parse(&tokens)?;

// Step 3: Analyze (using visitor pattern)
struct MyVisitor;
impl Visitor for MyVisitor {
    // Implement visitor methods
}

🏗️ Architecture

┌─────────────────────┐
│   COBOL Source      │
│  (.cbl, .cob, etc.) │
└──────────┬──────────┘
           │
           v
┌─────────────────────┐
│   cobol-lexer       │ Tokenizes source code
│                     │ (free-format ✅, fixed-format ✅)
└──────────┬──────────┘
           │
           v
┌─────────────────────┐
│  cobol-parser       │ Parses tokens into AST
│                     │ (comprehensive COBOL support ✅)
└──────────┬──────────┘
           │
           v
┌─────────────────────┐
│    cobol-ast        │ AST data structures
│                     │ (with visitor pattern ✅)
└──────────┬──────────┘
           │
           ├──────────┬──────────┬──────────┬──────────┬──────────┐
           │          │          │          │          │          │
           v          v          v          v          v          v
┌──────────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ cobol-       │ │ cobol-   │ │ cobol-   │ │ cobol-   │ │ cobol-   │ │ cobol-   │
│ migration-   │ │ doc-gen  │ │ repl     │ │ linter   │ │ visual-  │ │ fmt      │
│ analyzer ✅  │ │ ✅       │ │ ✅       │ │ ✅       │ │ izer ✅  │ │ ✅       │
└──────────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘
           │
           v
┌──────────────┐
│ cobol-       │
│ dead-code ✅ │
└──────────────┘

🧪 Development

Prerequisites

  • Rust 1.70+ (stable, beta, or nightly)
  • Cargo (comes with Rust)

Building

# Build all crates
cargo build --all

# Build a specific crate
cd cobol-lexer && cargo build

# Build with optimizations
cargo build --all --release

Testing

# Run all tests
cargo test --all

# Run tests for a specific crate
cd cobol-lexer && cargo test

# Run with output
cargo test --all -- --nocapture

Running Examples

# Run lexer example
cd cobol-lexer && cargo run --example basic_tokenize

# Run parser example
cd cobol-parser && cargo run --example basic_parse

Running CLI Tools

# Run migration analyzer
cargo run --bin cobol-migrate -- --help

# Run documentation generator
cargo run --bin cobol-doc -- --help

Linting and Formatting

# Format code
cargo fmt --all

# Run clippy
cargo clippy --all -- -D warnings

Benchmarks

cd cobol-lexer && cargo bench

🔧 Workspace Structure

cobalt/
├── Cargo.toml              # Workspace configuration
├── README.md               # This file
├── .github/
│   └── workflows/
│       └── ci.yml          # CI/CD pipeline
├── cobol-lexer/            # Lexer crate
│   ├── src/
│   ├── tests/
│   ├── examples/
│   └── benches/
├── cobol-ast/              # AST crate
│   ├── src/
│   └── tests/
├── cobol-parser/           # Parser crate
│   ├── src/
│   ├── tests/
│   └── examples/
├── cobol-migration-analyzer/  # Migration tool
│   └── src/
├── cobol-doc-gen/          # Documentation generator
│   └── src/
├── cobol-repl/             # Interactive REPL
│   └── src/
├── cobol-linter/           # Static analysis linter
│   └── src/
├── cobol-visualizer/       # AST visualization
│   └── src/
├── cobol-fmt/              # Code formatter
│   └── src/
└── cobol-dead-code/        # Dead code detector
    └── src/

🚦 CI/CD

We use GitHub Actions for continuous integration:

  • ✅ Tests on stable, beta, and nightly Rust
  • ✅ Tests on Linux, Windows, and macOS
  • ✅ Linting with clippy and rustfmt
  • ✅ Builds examples and documentation
  • ✅ All crates tested in the pipeline

See `.github/workflows/ci.yml` for details.

📝 Contributing

Contributions are welcome! The workflow follows standard Rust project practice:

  1. Fork the repository
  2. Create a feature branch (`git checkout -b feature/amazing-feature`)
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass (`cargo test --all`)
  6. Run clippy and fix warnings (`cargo clippy --all`)
  7. Format code (`cargo fmt --all`)
  8. Update documentation as needed
  9. Submit a pull request

Development Guidelines

  • Follow Rust naming conventions
  • Write comprehensive tests
  • Document public APIs with rustdoc
  • Handle errors explicitly (use `Result` types)
  • Keep crates focused and modular
  • Use workspace dependencies where appropriate
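
As an illustration of the "handle errors explicitly" guideline, the sketch below validates a COBOL level number and reports failures through a dedicated error type instead of panicking. The `LevelError` enum and `parse_level` function are hypothetical examples, not code from any Cobalt crate; the accepted values (01-49, 66, 77, 88) are the standard COBOL level numbers.

```rust
use std::fmt;

/// Reasons a level number can be rejected.
#[derive(Debug, PartialEq)]
enum LevelError {
    /// The text could not be read as a small unsigned number.
    Unparsable(String),
    /// The number parsed but is not a valid COBOL level.
    OutOfRange(u8),
}

impl fmt::Display for LevelError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LevelError::Unparsable(text) => write!(f, "'{}' is not a level number", text),
            LevelError::OutOfRange(n) => write!(
                f,
                "level {:02} is not valid (expected 01-49, 66, 77, or 88)",
                n
            ),
        }
    }
}

// Implementing std::error::Error lets callers box or chain this error.
impl std::error::Error for LevelError {}

/// Parses a level number, returning a typed error on failure.
fn parse_level(text: &str) -> Result<u8, LevelError> {
    let level: u8 = text
        .trim()
        .parse()
        .map_err(|_| LevelError::Unparsable(text.to_string()))?;
    match level {
        1..=49 | 66 | 77 | 88 => Ok(level),
        other => Err(LevelError::OutOfRange(other)),
    }
}
```

A typed error enum lets callers match on the failure kind, while the `Display` implementation provides a message suitable for end users.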

🗺️ Roadmap

Completed ✅

  • cobol-lexer - Complete lexer with free-format and fixed-format support
  • cobol-ast - Comprehensive AST structures for all COBOL constructs
  • cobol-parser - Full COBOL grammar support (EVALUATE, PERFORM VARYING, file I/O, string operations, etc.)
  • cobol-migration-analyzer - Production-ready migration assessment tool with real AST integration
  • cobol-doc-gen - Complete documentation generator with complexity metrics and cross-references
  • cobol-repl - Interactive REPL for COBOL exploration
  • cobol-linter - Static analysis tool with compliance checks
  • cobol-visualizer - AST visualization tool
  • cobol-fmt - Auto-formatter for COBOL source code
  • cobol-dead-code - Dead code detector with CFG analysis
  • Security hardening - Fixed unsafe operations across all crates
  • Parser integration - Real COBOL parsing in all analysis tools
  • Workspace setup and CI/CD

In Progress 🚧

  • Comprehensive testing framework with real COBOL programs
  • Performance benchmarking and optimization
  • Enhanced error messages and recovery strategies
  • Expand dead code detection (handle dynamic calls, ALTER statements)

Planned 📋

  • Security and Compliance Scanner - OWASP-style security checks, PCI-DSS, GDPR compliance
  • Enhanced dead code detection - Better handling of dynamic PERFORM calls
  • Language server support (LSP)
  • Refactoring tools
  • COBOL to Rust transpiler (experimental)

🤝 Acknowledgments

This project aims to modernize COBOL tooling by building on Rust's performance and safety guarantees. Special thanks to:

  • The Rust community for excellent tooling and documentation
  • COBOL maintainers for keeping legacy systems running
  • Contributors and users of this project


Built with ❤️ in Rust