zipfile with remove
type annotate magic metron field functions and make all params kwargs
use eslint outside of editor
update deps, new ruff rules. lint & format
commit e27050fbd42f0cf8e549871cc06c70f041672306
Author: AJ Slater <aj@slater.net>
Date: Thu Nov 7 21:36:49 2024 -0800
rename deserializeMeta class to TrapExceptionsMeta
fix type issues with field metaclass wrapper
Loguru's logger object isn't picklable into ProcessPoolExecutor
workers, so callers like codex couldn't get worker log output to
match their parent-process format. Adds a worker_log_config dict
({level, format, sink}) that runs through the executor initializer
and reconfigures loguru in each worker via init_logging. Also adds
enqueue=True to the default sink for thread-safe logging.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
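The initializer pattern described in this commit can be sketched as follows. This is a minimal illustration that uses the standard library `logging` module as a stand-in for loguru, and it omits the `sink` key; `make_executor` is a hypothetical helper name, and the body of `init_logging` here is an assumption, not the project's code.

```python
import logging
from concurrent.futures import ProcessPoolExecutor


def init_logging(worker_log_config: dict) -> None:
    """Reconfigure logging inside a worker from a plain, picklable dict."""
    root = logging.getLogger()
    # Drop any inherited handlers so worker output is not duplicated.
    for handler in list(root.handlers):
        root.removeHandler(handler)
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(worker_log_config["format"]))
    root.addHandler(handler)
    root.setLevel(worker_log_config["level"])


def make_executor(worker_log_config: dict) -> ProcessPoolExecutor:
    # Only the dict crosses the process boundary; each worker rebuilds its
    # own logger from it when the pool starts that worker.
    return ProcessPoolExecutor(
        initializer=init_logging,
        initargs=(worker_log_config,),
    )
```

Passing a dict rather than a logger object sidesteps the pickling problem, because the executor only has to serialize strings.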
* upgrade confuse to 2.2.0; replace AttrDict with typed Settings dataclass

  confuse 2.2.0 makes AttrDict properly generic, so per-key types resolve to `object` and consumers across the box mixins fail typecheck. Convert the validated AttrDict into a frozen `Settings` dataclass once in get_config() and propagate that typed object everywhere; confuse stays confined to comicbox/config.

  - New comicbox/config/settings.py defines `Settings` and `ComputedSettings` (frozen, slots).
  - get_config() returns Settings; new _build_settings() does the conversion. post_process_set_for_path() rebuilt around dataclasses.replace.
  - FrozenAttrDict deleted; the frozen dataclass enforces immutability.
  - process.py passes Settings through pickle directly so workers skip re-running confuse.
  - Drops dead `dest_path is None` checks now that the field is required.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* rename Settings to ComicboxSettings

  So that client programs that already define their own `Settings` type don't collide on import.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* flatten ComputedSettings into ComicboxSettings

  The hierarchical split was a confuse-template setup convenience, not a logical grouping: there's no API benefit to keeping client code chained through `cfg.computed.X`. Promote the six computed fields onto ComicboxSettings under a clearly labeled comment block. The confuse template's nested `computed` MappingTemplate is unchanged.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`_get_source_config_metadata` early-returned an empty list whenever the caller set `metadata_format`, because `fmt not in self._config.read` compared a string against a frozenset of `MetadataFormats` enums — always True. The conversion + correct membership check happens in the try block on the next lines, so the early return was both wrong and redundant. Adds tests/unit/test_sources.py covering the four behavioral cases: fmt-in-read, no-fmt, fmt-not-in-read, invalid-fmt. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
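The membership pitfall behind this fix is easy to reproduce. In the sketch below, `ExampleFormat` and `is_enabled` are hypothetical names, and the enum is assumed to be a plain `Enum` (a `str`-mixin enum would compare differently).

```python
from enum import Enum


class ExampleFormat(Enum):
    """Hypothetical stand-in for a metadata format enum."""

    COMIC_INFO = "cix"
    COMET = "comet"


read_formats = frozenset({ExampleFormat.COMIC_INFO})

# Buggy check: a plain str never equals a plain Enum member, so this is
# True for every string, including ones that name an enabled format.
buggy_rejects = "cix" not in read_formats


def is_enabled(fmt: str) -> bool:
    """Convert the string to an enum member first, then test membership."""
    try:
        member = ExampleFormat(fmt)  # lookup by value
    except ValueError:
        return False
    return member in read_formats
```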
read_config_sources used config.add() for the Mapping branch, which appends to the BOTTOM of confuse's source priority stack, below the config_default.yaml loaded by config.read() at the top of the function. So any caller passing a dict / Mapping override (e.g. `get_config({"comicbox": {"compute_pages": True}})`) silently got the default instead.

Switch to config.set() so Mapping args land on top, matching set_args() for the Namespace branch.

Surfaced by a downstream Codex migration that hit dead Mapping overrides; covered now by tests/unit/test_config_layering.py.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
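The layering problem can be pictured with `collections.ChainMap`, which also resolves keys from the first mapping to the last. This is only an analogy under that assumption: it does not call confuse, and `layered` and `DEFAULTS` are hypothetical names.

```python
from collections import ChainMap

DEFAULTS = {"compute_pages": False}


def layered(override: dict, *, on_top: bool) -> ChainMap:
    """Stack an override either above or below the defaults."""
    stack = ChainMap(dict(DEFAULTS))
    if on_top:
        # Analogous to putting the override at the top of the stack.
        stack.maps.insert(0, dict(override))
    else:
        # Analogous to appending at the bottom: defaults shadow it.
        stack.maps.append(dict(override))
    return stack


below = layered({"compute_pages": True}, on_top=False)
above = layered({"compute_pages": True}, on_top=True)
```

With the override at the bottom, `below["compute_pages"]` still resolves to the default, which mirrors the dead-override symptom described in the commit.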
The template arms for `read`, `write`, `export`, `delete_keys`, `read_ignore`, and `print` previously combined `frozenset` (a pass-through marker) and `Sequence(str)` (list-of-strings coercion). That works for the common YAML/CLI list path but rejects callers passing a `set` / `tuple` / `frozenset` literal, which is logically fine for fields whose post-compute value is always a frozenset.

Replaces the per-field unions with `OneOf((set, frozenset, tuple, list))` (`print` also accepts `str` for the historical phase-char form). The `_build_settings` boundary already calls `frozenset(...)` on these values, so any of the four containers normalize correctly.

Also adapts `compute_config`'s helpers. Subview iteration only supports dict/list source values, so user-supplied set/frozenset/tuple inputs would error before reaching the template. New `_raw_or_empty` pulls the Python value via `.get()` and explicitly rejects mappings with a clear error (dict iteration would silently accept dict input otherwise). `_parse_print` now accepts a phase-char string OR any iterable of phase chars.

Path-list fields (`paths`, `import_paths`, `metadata_cli`) keep their existing `Sequence(...)` form with element-type validation; that trade-off felt worth keeping.

14 new tests in tests/unit/test_config_container_inputs.py cover the four container types per field and assert mapping rejection.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
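A minimal sketch of the normalization boundary described above, assuming a hypothetical helper named `normalize_container`; the real `_raw_or_empty` and `_build_settings` may differ in detail.

```python
from collections.abc import Mapping

ACCEPTED_CONTAINERS = (set, frozenset, tuple, list)


def normalize_container(value: object) -> frozenset:
    """Coerce any accepted container to a frozenset; reject mappings."""
    if value is None:
        return frozenset()
    if isinstance(value, Mapping):
        # Iterating a dict yields its keys, which would silently "work".
        msg = "expected a set, frozenset, tuple, or list, not a mapping"
        raise TypeError(msg)
    if not isinstance(value, ACCEPTED_CONTAINERS):
        msg = f"unsupported container type: {type(value).__name__}"
        raise TypeError(msg)
    return frozenset(value)


as_list = normalize_container(["cix", "comet"])
as_tuple = normalize_container(("cix", "comet"))
as_set = normalize_container({"cix", "comet"})
```

Checking for `Mapping` before the container check keeps the error message specific to the case the commit calls out.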
Callers that only want a thumbnail (e.g. codex's CoverThread) don't need the full ComicInfo/CoverImage hint resolution. Parsing the metadata for every cover dominates the cost of cover extraction and emits a flood of debug-bucket Union ValidationErrors that look like real failures in DEBUG logs. When skip_metadata=True, bypass generate_cover_paths entirely and read archive index 0 directly. This drops per-call schema instantiation, Union resolution, and path normalization. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* compact news
* update deps
…122)

ClearingErrorStoreSchema previously split each schema's errors into two buckets: ignored ones logged at DEBUG, real ones at WARNING. The DEBUG bucket only ever held errors from ``_ignore_errors``, namely ``Field may not be null.`` (sparse-field tolerance) and ``Invalid input type.`` (Union variant misses), both of which are internal mechanics, not operator-actionable signal. Each Union miss emitted one ``ValidationError - {'_schema': ['Invalid input type.']}`` line per field per archive, drowning the genuinely useful per-source DEBUG messages emitted by ``_except_on_load``.

Filter ignored errors at split time, log only WARNINGs. Real schema failures still surface with full context (path, schema class, normalized message). Collapses the dual-bucket _split_*_errors methods into _filter_* + _log_warnings.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
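The filter-then-warn flow might look roughly like this. `filter_ignored` and `log_warnings` are hypothetical names loosely modeled on the `_filter_*` and `_log_warnings` methods mentioned above, and the error dict shape (field name to list of messages) is an assumption.

```python
import logging

logger = logging.getLogger("example.schema")

# The two messages the commit names as internal mechanics.
IGNORED_MESSAGES = frozenset({"Field may not be null.", "Invalid input type."})


def filter_ignored(errors: dict[str, list[str]]) -> dict[str, list[str]]:
    """Drop ignored messages; drop fields left with no real errors."""
    kept: dict[str, list[str]] = {}
    for field, messages in errors.items():
        real = [msg for msg in messages if msg not in IGNORED_MESSAGES]
        if real:
            kept[field] = real
    return kept


def log_warnings(path: str, errors: dict[str, list[str]]) -> int:
    """Log each remaining field at WARNING; return how many were logged."""
    real = filter_ignored(errors)
    for field, messages in real.items():
        logger.warning("%s: %s: %s", path, field, "; ".join(messages))
    return len(real)


sample = {
    "_schema": ["Invalid input type."],
    "year": ["Field may not be null.", "Not a valid integer."],
}
```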
* compact news
* update deps
* metron: drop broken URL slugs for genre, location, reprint, role, story, tag

  Metron has no public web pages for these types, only API endpoints, so URLs like https://metron.cloud/genre/3 always 404. Stop emitting them. The numeric Metron ID is still preserved on the identifier.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Shortens the import path for the helper from comicbox.enums.maps.age_rating to comicbox.enums.maps so downstream callers can reach it without drilling into the submodule. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Remove unused module/class constants: _COMMENT_ARCHIVE_TYPES, SUFFIXES, _LOG_FORMAT, comet.py IDENTIFIER_TAG/IS_VERSION_OF_TAG, comictagger.py IDENTIFIER_TAG/PAGES_TAG, XmlCountryField (and now-orphaned imports RarFile, ZipFile, CountryField).
- Fix latent bug in TrapExceptionsMeta: `attr_name in "deserialize"` was a substring check that wrapped any callable whose name was a substring of "deserialize" (e.g. "er", "ize", "ali"). Use the existing _WRAP_METHODS tuple instead so only the exact `deserialize` method is wrapped.
- Simplify _get_pdf_enabled() to a plain `import pdffile` probe; the except-arm stub import had no effect.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
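The substring pitfall is worth seeing side by side with the tuple check. The two helper functions below are hypothetical, written only to contrast the behaviors; `_WRAP_METHODS` is the name given in the commit, and its contents here are assumed.

```python
_WRAP_METHODS = ("deserialize",)  # assumed contents


def buggy_should_wrap(attr_name: str) -> bool:
    # `in` on a str is a substring test, so fragments match too.
    return attr_name in "deserialize"


def fixed_should_wrap(attr_name: str) -> bool:
    # `in` on a tuple compares whole elements for equality.
    return attr_name in _WRAP_METHODS


fragments = ("er", "ali", "serial")
buggy_hits = [name for name in fragments if buggy_should_wrap(name)]
fixed_hits = [name for name in fragments if fixed_should_wrap(name)]
```

A one-element tuple needs its trailing comma: `("deserialize")` without it is just a string, which would reintroduce the substring test.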
Consolidate the optional comicbox-pdffile integration into one module (comicbox/_pdf.py) and delete the hand-maintained pdffile_stub.py.

Previously six call sites each duplicated a `try: from pdffile import X / except: from pdffile_stub import X` block, and the stub class mirrored the real PDFFile API method-for-method, a silent drift risk every time upstream pdffile shipped.

Now:

- comicbox/_pdf.py is the single source of truth for PDF_ENABLED, PDFFile, and PAGE_FORMAT_VALUES. When pdffile is absent, PDFFile is None at runtime; type checkers see the real class via TYPE_CHECKING.
- Every call site that touches PDFFile is gated by `if PDF_ENABLED`.
- The `case PDFFile():` arm in box/archive/archive.py is lifted to an `if PDF_ENABLED and isinstance(archive, PDFFile):` guard above the match (the match form would fail when PDFFile is None).
- config/__init__.py reads PAGE_FORMAT_VALUES instead of iterating an empty stub Enum.

Verified with `pdffile` installed (307/307 tests pass) and in a fresh venv without it (PDF_ENABLED=False, CBZ archives still work, PDF files raise UnsupportedArchiveTypeError, CLI shows the "not installed" hint).

Net: -70 lines across 9 files.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
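A single-module optional import along these lines could be sketched as below. It runs whether or not `pdffile` is installed; `is_pdf_archive` is a hypothetical helper, and the TYPE_CHECKING branch the commit mentions is left out to keep the sketch short.

```python
# Probe the optional dependency once, in one place.
try:
    from pdffile import PDFFile

    PDF_ENABLED = True
except ImportError:
    PDFFile = None  # no stub class to keep in sync with upstream
    PDF_ENABLED = False


def is_pdf_archive(archive: object) -> bool:
    """Guard form: short-circuits before isinstance() can see None."""
    return PDF_ENABLED and isinstance(archive, PDFFile)


result = is_pdf_archive(object())
```

The `and` matters because `isinstance(archive, None)` raises `TypeError`, and a `case PDFFile():` pattern fails the same way when the name is bound to `None`.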
* compact news
* update deps
* update news and version to alpha 4
* update deps
* rename function path in NEWS
* bump alpha version to 3.0.0a5
AttrDict. Comicbox constructor now accepts this dataclass instead of an
AttrDict
`metron.cloud/{genre,location,reprint,role,story,tag}/...` URLs for Metron identifiers; those paths 404 because Metron has no public
web pages for those types (only API endpoints). The numeric Metron ID is
still preserved on the identifier.
metadata to the filesystem.
comicbox.process.iter_process_files() and
comicbox.process.aread_metadata() for reading large batches of files at
once.
`Comicbox.get_cover_page(skip_metadata=True)` skips metadata parsing for
callers that just need the first archive image as a thumbnail. Removes
per-call schema instantiation and Union resolution overhead.
validation errors (`Invalid input type.` from Union variant misses,
`Field may not be null.` from sparse fields). These were context-free
noise: ~50 lines per archive at DEBUG that read like real failures. Real
schema errors still log at WARNING with full context.
comicbox.enums.maps.to_metron_age_rating(value: str | Enum) ->
MetronAgeRatingEnum | None