Releases: bysiber/deepworm

v1.5.0 — Data Pipeline, Concurrency, CLI Helpers, Protocols

25 Feb 08:27

v1.5.0

New Modules

Data Pipeline (data_pipeline.py)

Composable ETL pipelines with stage-based processing.

  • StageStatus, PipelineStatus, and ErrorStrategy enums
  • Stage results with timing, retries, and success tracking
  • DataPipeline with add/remove/enable/disable stages, hooks, and error strategies
  • fan_out/fan_in for parallel processing
  • batch_process with error handling
  • Functional transforms: map_data, filter_data, reduce_data, group_by, flatten, distinct, chunk
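
The functional transforms named above can be illustrated with a small standalone sketch. The function names chunk, group_by, and distinct come from these notes, but the signatures and behavior below are assumptions for illustration, not DeepWorm's actual implementation.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
K = TypeVar("K", bound=Hashable)


def chunk(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items (assumed semantics)."""
    if size < 1:
        raise ValueError("size must be >= 1")
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


def group_by(items: Iterable[T], key: Callable[[T], K]) -> Dict[K, List[T]]:
    """Group items by key(item), preserving input order within each group."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)


def distinct(items: Iterable[T]) -> List[T]:
    """Drop duplicates, keeping the first occurrence of each hashable item."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique
```

Under these assumed semantics, `list(chunk([1, 2, 3, 4, 5], 2))` gives `[[1, 2], [3, 4], [5]]`, and `distinct([3, 1, 3, 2, 1])` gives `[3, 1, 2]`.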

Concurrency (concurrency.py)

Thread pool execution, task queues, and synchronization primitives.

  • AtomicCounter and AtomicValue (thread-safe)
  • TaskQueue with priority support
  • WorkerPool with configurable threads and submit/wait/results
  • parallel_map for concurrent processing
  • Debouncer, Throttle, and Once for execution control
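
As a rough illustration of what a thread-safe counter and a parallel map involve, here is a minimal sketch built only on the standard library. The names AtomicCounter and parallel_map are taken from these notes; the methods, parameters, and the count_concurrently helper are hypothetical and may differ from the real module.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AtomicCounter:
    """Counter whose updates are serialized by a lock (assumed design)."""

    def __init__(self, initial=0):
        self._value = initial
        self._lock = threading.Lock()

    def increment(self, delta=1):
        # The read-modify-write happens while holding the lock,
        # so concurrent increments cannot be lost.
        with self._lock:
            self._value += delta
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


def parallel_map(fn, items, max_workers=4):
    """Apply fn to each item on a thread pool, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, items))


def count_concurrently(workers, per_worker):
    """Hypothetical helper: increment one shared counter from several threads."""
    counter = AtomicCounter()

    def work():
        for _ in range(per_worker):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With the lock in place, `count_concurrently(8, 1000)` returns exactly 8000 regardless of how the threads interleave.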

CLI Helpers (cli_helpers.py)

CLI argument parsing, output formatting, and progress indicators.

  • Color enum with ANSI codes
  • Text formatting: truncate, pad, indent, tables, key-value, lists, sizes, durations
  • ProgressBar and Spinner with animations
  • parse_args with type coercion
  • format_help and draw_box with Unicode borders
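
The text-formatting helpers listed above are the kind of thing that can be sketched in a few lines. The functions below (truncate, format_size, format_duration) are illustrative stand-ins with assumed names, signatures, and output formats; they are not copied from cli_helpers.py.

```python
def truncate(text, width, ellipsis="..."):
    """Shorten text to at most `width` characters, ending with an ellipsis."""
    if width < len(ellipsis):
        raise ValueError("width is too small to hold the ellipsis")
    if len(text) <= width:
        return text
    return text[: width - len(ellipsis)] + ellipsis


def format_size(num_bytes):
    """Render a byte count with binary units (1 KiB = 1024 B)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            if unit == "B":
                return f"{int(value)} B"
            return f"{value:.1f} {unit}"
        value /= 1024


def format_duration(seconds):
    """Render whole seconds as a compact 'Xh Ym Zs' string."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if secs or not parts:  # always show something, even for 0 seconds
        parts.append(f"{secs}s")
    return " ".join(parts)
```

For example, `format_size(1536)` yields `"1.5 KiB"` and `format_duration(3725)` yields `"1h 2m 5s"` in this sketch.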

Protocols (protocols.py)

Type contracts and algebraic types.

  • Result type (Ok/Err) with unwrap, map, try_result
  • Option type (Some/Nothing) for nullable values
  • Either type (Left/Right) for disjoint unions
  • 6 Protocol interfaces: Serializable, Renderable, Validatable, Disposable, Configurable, Identifiable
  • Type guards and Lazy evaluation
  • Safe conversions: safe_int, safe_float, safe_bool, safe_str
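
To show the general shape of a Result type, here is a minimal sketch of Ok/Err with map and a try_result wrapper. The names mirror the ones in these notes, but the fields (value, error), the unwrap_or method, and the exact behavior are assumptions rather than the library's real API.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    """Successful outcome carrying a value."""
    value: T

    def is_ok(self):
        return True

    def map(self, fn):
        # Transform the wrapped value and keep it wrapped in Ok.
        return Ok(fn(self.value))

    def unwrap_or(self, default):
        return self.value


@dataclass(frozen=True)
class Err(Generic[E]):
    """Failed outcome carrying an error."""
    error: E

    def is_ok(self):
        return False

    def map(self, fn):
        # Errors pass through map untouched.
        return self

    def unwrap_or(self, default):
        return default


def try_result(fn, *args, **kwargs):
    """Call fn and capture any Exception as Err instead of raising."""
    try:
        return Ok(fn(*args, **kwargs))
    except Exception as exc:
        return Err(exc)
```

In this sketch, `try_result(int, "42").map(lambda n: n + 1)` evaluates to `Ok(43)`, while `try_result(int, "x").unwrap_or(0)` falls back to `0`.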

Stats

  • Tests: 1,999 passing
  • Public API: 464 exports
  • New exports: 109

v1.0.0

25 Feb 06:43

DeepWorm v1.0.0

🎉 First Major Release!

DeepWorm reaches v1.0.0 with 136 public API exports, 1,126 tests, and ~19,000 lines of code.

New Modules

Word Cloud & Frequency Analysis (wordcloud.py)

  • WordFrequency dataclass with count, frequency, rank, TF-IDF, weight
  • WordCloudData with markdown table, inline HTML cloud, CSV, size map output
  • generate_word_cloud() — 130+ built-in stop words, configurable max_words, min_length, min_count
  • compare_word_clouds() — compare frequency distributions across documents
  • tfidf_cloud() — multi-document TF-IDF word cloud analysis
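
For readers unfamiliar with TF-IDF, the following sketch shows one common formulation: term frequency as count divided by document length, and inverse document frequency as the natural log of (number of documents / documents containing the word). Which variant tfidf_cloud() uses is not stated in these notes, so the formula, the tokenize helper, and the tfidf function below are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (simplified)."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf(documents):
    """Return one {word: score} dict per document using tf * ln(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty documents
        scores.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores
```

With this formulation, a word that appears in every document scores 0, which is why TF-IDF tends to surface the terms that distinguish one document from the others.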

Document Revision Tracking (revisions.py)

  • Revision with SHA-256 content hashing, word/line counts
  • RevisionDiff with LCS-based diff algorithm, unified diff and markdown output
  • RevisionHistory with add/get/rollback/changelog/statistics
  • track_changes() for quick two-version comparison
  • merge_revisions() with chronological ordering and deduplication
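
The two techniques named above, SHA-256 content hashing and an LCS-based diff, can be sketched with the standard library alone. The helper names content_hash and lcs_diff are hypothetical, and the line-level diff below is a textbook dynamic-programming version, not necessarily how RevisionDiff is implemented.

```python
import hashlib


def content_hash(text):
    """Hex SHA-256 digest of the UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def lcs_diff(old, new):
    """Diff two lists of lines; returns (op, line) pairs with op in ' ', '-', '+'."""
    n, m = len(old), len(new)
    # table[i][j] holds the LCS length of old[i:] and new[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start, emitting kept, removed, and added lines.
    ops = []
    i = j = 0
    while i < n and j < m:
        if old[i] == new[j]:
            ops.append((" ", old[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            ops.append(("-", old[i]))
            i += 1
        else:
            ops.append(("+", new[j]))
            j += 1
    ops.extend(("-", line) for line in old[i:])
    ops.extend(("+", line) for line in new[j:])
    return ops
```

Comparing `["a", "b", "c"]` with `["a", "c", "d"]` produces `[(" ", "a"), ("-", "b"), (" ", "c"), ("+", "d")]`. The table costs O(n·m) time and memory, which is fine for documents but worth keeping in mind for very large inputs.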

Comprehensive Document Statistics (statistics.py)

  • TextStatistics — 25+ metrics: characters, words, sentences, paragraphs, vocabulary richness, hapax legomena
  • compute_statistics() — reading time (238 WPM), speaking time (150 WPM)
  • compare_statistics() — side-by-side document comparison with diffs
  • vocabulary_analysis() — frequency distribution, rare words, type-token ratio
  • section_statistics() — per-heading breakdown
  • reading_level() — Flesch-Kincaid Grade Level and ARI
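
The readability metrics above rest on published formulas, which the sketch below applies to raw counts. The 238 WPM default comes from these notes and the Flesch-Kincaid and ARI coefficients are the standard ones; the function names and signatures are assumptions, and counting words, sentences, and syllables (the harder part) is deliberately left out.

```python
def reading_time_minutes(word_count, wpm=238):
    """Estimated silent reading time in minutes at the given words per minute."""
    return word_count / wpm


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def automated_readability_index(characters, words, sentences):
    """Automated Readability Index (ARI) from raw counts."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
```

As a worked example, a 100-word text with 5 sentences and 150 syllables has a Flesch-Kincaid grade of 0.39 × 20 + 11.8 × 1.5 − 15.59 = 9.91, and a 476-word text takes about 2 minutes to read at 238 WPM.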

Table of Contents (toc.py)

  • TocEntry with auto-anchor slugification and depth tracking
  • TableOfContents with flat view, level filtering, max_depth
  • Output formats: markdown, numbered markdown (1, 1.1, 1.2), HTML
  • inject_toc() — marker-based or auto-placement insertion
  • merge_tocs() for combining multiple ToCs
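
Anchor slugification and numbered output can be sketched as follows. The slug rules here approximate GitHub-style anchors, and both slugify and numbered_toc are hypothetical helpers written for illustration; TocEntry and TableOfContents may work differently.

```python
import re


def slugify(heading):
    """Lowercase, drop punctuation, and turn whitespace runs into hyphens."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)
    return re.sub(r"\s+", "-", slug)


def numbered_toc(markdown, max_depth=3):
    """Build a numbered markdown ToC (1, 1.1, 1.2, ...) from ATX headings.

    Simplification: headings inside fenced code blocks are not skipped.
    """
    counters = [0] * 6
    lines = []
    pattern = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)
    for match in pattern.finditer(markdown):
        depth = len(match.group(1))
        if depth > max_depth:
            continue
        counters[depth - 1] += 1
        # Entering a new section resets the numbering of all deeper levels.
        for deeper in range(depth, 6):
            counters[deeper] = 0
        number = ".".join(str(count) for count in counters[:depth])
        title = match.group(2)
        indent = "  " * (depth - 1)
        lines.append(f"{indent}- {number} [{title}](#{slugify(title)})")
    return "\n".join(lines)
```

Given a document with headings `# Intro`, `## Setup`, `## Usage`, `# FAQ`, this produces entries numbered 1, 1.1, 1.2, and 2, each linking to its slug.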

Stats

  • 136 public API exports (+27 from v0.9.0)
  • 1,126 tests passing
  • ~19,000 lines of code
  • 51 test files
  • 65+ source modules

Full Changelog: v0.9.0...v1.0.0