Releases: bysiber/deepworm
v1.5.0 — Data Pipeline, Concurrency, CLI Helpers, Protocols
New Modules
Data Pipeline (data_pipeline.py)
Composable ETL pipelines with stage-based processing.
- StageStatus, PipelineStatus, and ErrorStrategy enums
- Stage results with timing, retries, and success tracking
- DataPipeline with add/remove/enable/disable stages, hooks, and error strategies
- fan_out/fan_in for parallel processing
- batch_process with error handling
- Functional transforms: map_data, filter_data, reduce_data, group_by, flatten, distinct, chunk
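The fluent add-stage style described above can be sketched in a few lines. This is an illustrative mock of the pattern, not deepworm's actual DataPipeline API; the class and method names here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class MiniPipeline:
    """Illustrative stage-based pipeline (not the deepworm API)."""
    stages: list[tuple[str, Callable[[Any], Any]]] = field(default_factory=list)

    def add_stage(self, name: str, fn: Callable[[Any], Any]) -> "MiniPipeline":
        self.stages.append((name, fn))
        return self  # return self to allow fluent chaining

    def run(self, data: Any) -> Any:
        for name, fn in self.stages:
            data = fn(data)  # each stage's output feeds the next stage
        return data

result = (
    MiniPipeline()
    .add_stage("split", str.split)
    .add_stage("upper", lambda words: [w.upper() for w in words])
    .run("hello pipeline world")
)
```

Returning `self` from the add method is what makes the chained, declarative construction shown here possible.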
Concurrency (concurrency.py)
Thread pool execution, task queues, and synchronization primitives.
- AtomicCounter and AtomicValue (thread-safe)
- TaskQueue with priority support
- WorkerPool with configurable threads, submit/wait/results
- parallel_map for concurrent processing
- Debouncer, Throttle, and Once for execution control
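A thread-safe counter of the kind AtomicCounter names can be built on a plain lock. The sketch below is a generic illustration under that assumption, not deepworm's implementation, exercised with the stdlib ThreadPoolExecutor.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class AtomicCounter:
    """Lock-based thread-safe counter (illustrative sketch)."""
    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> int:
        # The lock makes read-modify-write a single atomic step.
        with self._lock:
            self._value += 1
            return self._value

    @property
    def value(self) -> int:
        with self._lock:
            return self._value

counter = AtomicCounter()
with ThreadPoolExecutor(max_workers=8) as pool:
    # 1000 concurrent increments; none are lost thanks to the lock.
    list(pool.map(lambda _: counter.increment(), range(1000)))
```

Without the lock, the `+= 1` would be a non-atomic read-modify-write and concurrent increments could be lost.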
CLI Helpers (cli_helpers.py)
CLI argument parsing, output formatting, and progress indicators.
- Color enum with ANSI codes
- Text formatting: truncate, pad, indent, tables, key-value, lists, sizes, durations
- ProgressBar and Spinner with animations
- parse_args with type coercion
- format_help and draw_box with Unicode borders
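Size and duration formatting of the kind listed above can be approximated with stdlib-only helpers; these function names and formats are hypothetical stand-ins, not deepworm's actual formatters.

```python
def format_size(num_bytes: int) -> str:
    """Human-readable byte size, e.g. 1536 -> '1.5 KB' (illustrative)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if size < 1024 or unit == "TB":
            return f"{size:.1f} {unit}"
        size /= 1024  # step up to the next unit

def format_duration(seconds: float) -> str:
    """Seconds -> HH:MM:SS (illustrative)."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```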
Protocols (protocols.py)
Type contracts and algebraic types.
- Result type (Ok/Err) with unwrap, map, try_result
- Option type (Some/Nothing) for nullable values
- Either type (Left/Right) for disjoint unions
- 6 Protocol interfaces: Serializable, Renderable, Validatable, Disposable, Configurable, Identifiable
- Type guards and Lazy evaluation
- Safe conversions: safe_int, safe_float, safe_bool, safe_str
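A Result (Ok/Err) type with unwrap, map, and try_result, as named above, can be sketched generically. This illustrates the pattern only; it is not deepworm's protocols.py.

```python
from dataclasses import dataclass
from typing import Any, Callable, Generic, TypeVar

T = TypeVar("T")
E = TypeVar("E")

@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T
    def is_ok(self) -> bool: return True
    def unwrap(self) -> T: return self.value
    def map(self, fn: Callable[[T], Any]) -> "Ok":
        return Ok(fn(self.value))  # transform the wrapped value

@dataclass(frozen=True)
class Err(Generic[E]):
    error: E
    def is_ok(self) -> bool: return False
    def unwrap(self) -> Any: raise RuntimeError(self.error)
    def map(self, fn: Callable) -> "Err":
        return self  # errors short-circuit: map is a no-op

def try_result(fn: Callable, *args: Any):
    """Run fn, capturing any exception as an Err instead of raising."""
    try:
        return Ok(fn(*args))
    except Exception as exc:
        return Err(exc)
```

Because `map` on an Err is a no-op, a chain of transformations stops computing at the first failure without any explicit error checks in between.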
Stats
- Tests: 1999 passing
- Public API: 464 exports
- New exports: 109
DeepWorm v1.0.0
🎉 First Major Release!
DeepWorm reaches v1.0.0 with 136 public API exports, 1126 tests, and ~19,000 lines of code.
New Modules
Word Cloud & Frequency Analysis (wordcloud.py)
- WordFrequency dataclass with count, frequency, rank, TF-IDF, weight
- WordCloudData with markdown table, inline HTML cloud, CSV, and size-map output
- generate_word_cloud() — 130+ built-in stop words; configurable max_words, min_length, min_count
- compare_word_clouds() — compare frequency distributions across documents
- tfidf_cloud() — multi-document TF-IDF word cloud analysis
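The stop-word filtering and the max_words/min_length knobs follow a common counting pattern, sketched here with collections.Counter. The function name and the tiny stop-word set are illustrative, not the library's 130+ built-in list.

```python
import re
from collections import Counter

# Tiny illustrative subset; the real library ships a much larger list.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is"}

def word_frequencies(text: str, min_length: int = 2, max_words: int = 10):
    """Top word counts, skipping stop words and short tokens (sketch)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in STOP_WORDS
    )
    return counts.most_common(max_words)
```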
Document Revision Tracking (revisions.py)
- Revision with SHA-256 content hashing, word/line counts
- RevisionDiff with LCS-based diff algorithm, unified diff and markdown output
- RevisionHistory with add/get/rollback/changelog/statistics
- track_changes() for quick two-version comparison
- merge_revisions() with chronological ordering and deduplication
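Content hashing and unified diffs like these map directly onto stdlib hashlib and difflib. The helpers below are a hedged sketch, not the module's actual code; note that difflib uses a Ratcliff/Obershelp-style matcher rather than a classic LCS, but produces comparable unified output.

```python
import difflib
import hashlib

def content_hash(text: str) -> str:
    """SHA-256 fingerprint of a revision's content (illustrative)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def unified(old: str, new: str) -> str:
    """Unified diff between two revisions via stdlib difflib."""
    return "\n".join(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="rev1", tofile="rev2", lineterm="",
    ))

old, new = "alpha\nbeta\n", "alpha\ngamma\n"
diff = unified(old, new)  # shows "-beta" / "+gamma" hunks
```

Hashing the content gives cheap change detection: two revisions differ if and only if their digests differ.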
Comprehensive Document Statistics (statistics.py)
- TextStatistics — 25+ metrics: characters, words, sentences, paragraphs, vocabulary richness, hapax legomena
- compute_statistics() — reading time (238 WPM), speaking time (150 WPM)
- compare_statistics() — side-by-side document comparison with diffs
- vocabulary_analysis() — frequency distribution, rare words, type-token ratio
- section_statistics() — per-heading breakdown
- reading_level() — Flesch-Kincaid Grade Level and ARI
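The 238 WPM reading-time estimate and the type-token ratio are simple arithmetic; the sketch below uses illustrative names, not the module's API.

```python
def reading_time_minutes(text: str, wpm: int = 238) -> float:
    """Estimated reading time at the release's cited 238 words per minute."""
    return len(text.split()) / wpm

def type_token_ratio(text: str) -> float:
    """Vocabulary richness: unique words divided by total words."""
    words = text.lower().split()
    return len(set(words)) / len(words) if words else 0.0
```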
Table of Contents (toc.py)
- TocEntry with auto-anchor slugification and depth tracking
- TableOfContents with flat view, level filtering, max_depth
- Output formats: markdown, numbered markdown (1, 1.1, 1.2), HTML
- inject_toc() — marker-based or auto-placement insertion
- merge_tocs() for combining multiple ToCs
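Auto-anchor slugification, as in TocEntry, commonly follows the GitHub heading-anchor convention: lowercase, drop punctuation, hyphenate spaces. The sketch below assumes that convention and may differ from deepworm's exact rules.

```python
import re

def slugify(heading: str) -> str:
    """GitHub-style anchor slug (illustrative, assumed convention)."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)   # drop punctuation
    return re.sub(r"\s+", "-", slug)       # collapse spaces into hyphens
```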
Stats
- 136 public API exports (+27 from v0.9.0)
- 1,126 tests passing
- ~19,000 lines of code
- 51 test files
- 65+ source modules