v3.0.3 #134

Merged
ajslater merged 335 commits into main from develop
May 15, 2026
Conversation

@ajslater ajslater commented May 15, 2026

  • Fix small crashes with metron credits and comicbox with no path

    type annotate magic metron field functions and make all params kwargs
    use eslint outside of editor
    update deps, new ruff rules. lint & format
commit e27050fbd42f0cf8e549871cc06c70f041672306
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 7 21:36:49 2024 -0800

    rename deserializeMeta class to TrapExceptionsMeta
    fix type issues with field metaclass wrapper
    fix notes parsing for metron and many variations
    move notes parsing into another file.
    add comicinfo metron origin test
    rename modules to not shadow python builtins
    fix binary pdf files for new mupdf
ajslater and others added 29 commits May 1, 2026 14:02
Callers that only want a thumbnail (e.g. codex's CoverThread) don't
need the full ComicInfo/CoverImage hint resolution. Parsing the
metadata for every cover dominates the cost of cover extraction
and emits a flood of debug-bucket Union ValidationErrors that look
like real failures in DEBUG logs.

When skip_metadata=True, bypass generate_cover_paths entirely and
read archive index 0 directly. This drops per-call schema
instantiation, Union resolution, and path normalization.
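
A minimal sketch of the fast path described above, assuming a ZIP-based archive and assuming "archive index 0" means the first image entry in sorted name order. `read_cover`, `page_names`, `resolve_hint`, and `IMAGE_SUFFIXES` are hypothetical names used for illustration, not comicbox's API; only the `skip_metadata` flag comes from the commit message.

```python
import io
import zipfile

# Assumed set of page image types; the real list may differ.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".webp", ".gif")


def page_names(zf: zipfile.ZipFile) -> list[str]:
    """Return image entries in sorted order; index 0 is treated as the cover."""
    return sorted(n for n in zf.namelist() if n.lower().endswith(IMAGE_SUFFIXES))


def read_cover(zf: zipfile.ZipFile, *, skip_metadata: bool = False, resolve_hint=None) -> bytes:
    """Read the cover image, optionally bypassing metadata-based hints.

    resolve_hint is a hypothetical callable standing in for the expensive
    metadata parse; it returns the hinted cover's entry name or None.
    """
    pages = page_names(zf)
    if not pages:
        raise ValueError("archive contains no page images")
    if skip_metadata or resolve_hint is None:
        # Fast path: no metadata parsing at all, just the first page.
        return zf.read(pages[0])
    hinted = resolve_hint(zf)
    return zf.read(hinted if hinted in pages else pages[0])


# Usage: build a tiny in-memory archive and take the fast path.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as out:
    out.writestr("ComicInfo.xml", "<ComicInfo/>")
    out.writestr("b.jpg", b"page-b")
    out.writestr("a.jpg", b"page-a")
buf.seek(0)
with zipfile.ZipFile(buf) as demo:
    cover = read_cover(demo, skip_metadata=True)
```

In this sketch the metadata resolver is never called when `skip_metadata=True`, which is the property a thumbnail-only caller relies on.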

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* compact news

* update deps
…122)

ClearingErrorStoreSchema previously split each schema's errors into
two buckets: ignored ones logged at DEBUG, real ones at WARNING.
The DEBUG bucket only ever held errors from ``_ignore_errors`` —
``Field may not be null.`` (sparse-field tolerance) and
``Invalid input type.`` (Union variant misses) — both of which are
internal mechanics, not operator-actionable signal. Each Union miss
emitted one ``ValidationError - {'_schema': ['Invalid input type.']}``
line per field per archive, drowning the genuinely useful per-source
DEBUG messages emitted by ``_except_on_load``.

Filter ignored errors at split time, log only WARNINGs. Real schema
failures still surface with full context (path, schema class,
normalized message). Collapses the dual-bucket _split_*_errors
methods into _filter_* + _log_warnings.
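
The filter-then-warn shape might look roughly like the sketch below. It assumes errors arrive as a nested dict of field name to message list; `filter_errors`, `log_warnings`, and `IGNORED_MESSAGES` are illustrative names, not the actual `_filter_*` / `_log_warnings` methods. The two ignored message strings are the ones quoted above.

```python
import logging

# The two messages the commit describes as internal mechanics.
IGNORED_MESSAGES = frozenset({"Field may not be null.", "Invalid input type."})


def filter_errors(errors):
    """Recursively drop ignored messages and any keys left empty."""
    if isinstance(errors, dict):
        filtered = {}
        for key, value in errors.items():
            kept = filter_errors(value)
            if kept:
                filtered[key] = kept
        return filtered
    if isinstance(errors, (list, tuple)):
        return [msg for msg in errors if msg not in IGNORED_MESSAGES]
    return errors


def log_warnings(logger: logging.Logger, path: str, schema_name: str, errors: dict) -> dict:
    """Log only the errors that survive filtering, with their context."""
    kept = filter_errors(errors)
    if kept:
        logger.warning("%s: %s: %s", path, schema_name, kept)
    return kept
```

With this shape, a pure Union miss such as `{'_schema': ['Invalid input type.']}` filters down to an empty dict and produces no log line.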

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* compact news

* update deps

* metron: drop broken URL slugs for genre, location, reprint, role, story, tag

Metron has no public web pages for these types — only API endpoints — so
URLs like https://metron.cloud/genre/3 always 404. Stop emitting them.
The numeric Metron ID is still preserved on the identifier.
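
As a rough illustration of the change, assuming identifiers are plain dicts and that other types follow the same `/{type}/{id}` URL shape as the example above (both are assumptions, as are the names `metron_identifier` and `TYPES_WITHOUT_WEB_PAGES`):

```python
METRON_URL_BASE = "https://metron.cloud"

# Types the commit lists as API-only, with no public web page.
TYPES_WITHOUT_WEB_PAGES = frozenset(
    {"genre", "location", "reprint", "role", "story", "tag"}
)


def metron_identifier(id_type: str, metron_id: int) -> dict:
    """Build an identifier; emit a URL only for types that have web pages."""
    identifier = {"id": metron_id}  # the numeric ID is always preserved
    if id_type not in TYPES_WITHOUT_WEB_PAGES:
        identifier["url"] = f"{METRON_URL_BASE}/{id_type}/{metron_id}"
    return identifier
```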

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Shortens the import path for the helper from
comicbox.enums.maps.age_rating to comicbox.enums.maps so downstream
callers can reach it without drilling into the submodule.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Remove unused module/class constants: _COMMENT_ARCHIVE_TYPES, SUFFIXES,
  _LOG_FORMAT, comet.py IDENTIFIER_TAG/IS_VERSION_OF_TAG, comictagger.py
  IDENTIFIER_TAG/PAGES_TAG, XmlCountryField (and now-orphaned imports
  RarFile, ZipFile, CountryField).
- Fix latent bug in TrapExceptionsMeta: `attr_name in "deserialize"` was a
  substring check that wrapped any callable whose name was a substring of
  "deserialize" (e.g. "er", "serial", "ali"). Use the existing _WRAP_METHODS
  tuple instead so only the exact `deserialize` method is wrapped.
- Simplify _get_pdf_enabled() to a plain `import pdffile` probe; the
  except-arm stub import had no effect.
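
The substring pitfall is easy to reproduce. The sketch below is a simplified stand-in for the metaclass, not the actual comicbox code: `_trap`, `IntField`, and the choice to return `None` on error are assumptions made for illustration.

```python
import functools

_WRAP_METHODS = ("deserialize",)

# `in` on a str is a substring test; `in` on a tuple is a membership test.
assert "ali" in "deserialize"          # buggy check: would be wrapped
assert "ali" not in _WRAP_METHODS      # fixed check: left alone
assert "deserialize" in _WRAP_METHODS  # fixed check: still wrapped


def _trap(func):
    """Wrap func so exceptions are swallowed (assumed behavior for the sketch)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            return None
    return wrapper


class TrapExceptionsMeta(type):
    def __new__(mcs, name, bases, namespace):
        for attr_name, attr in list(namespace.items()):
            # Tuple membership: only the exact method names are wrapped.
            if callable(attr) and attr_name in _WRAP_METHODS:
                namespace[attr_name] = _trap(attr)
        return super().__new__(mcs, name, bases, namespace)


class IntField(metaclass=TrapExceptionsMeta):
    def deserialize(self, value):
        return int(value)

    def serial(self, value):  # name is a substring of "deserialize"
        return int(value)
```

Here `IntField().deserialize("x")` returns `None` because it is wrapped, while `IntField().serial("x")` still raises `ValueError`, which the substring check would have silently trapped.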

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Consolidate the optional comicbox-pdffile integration into one module
(comicbox/_pdf.py) and delete the hand-maintained pdffile_stub.py.

Previously six call sites each duplicated a `try: from pdffile import X /
except: from pdffile_stub import X` block, and the stub class mirrored
the real PDFFile API method-for-method — silent drift risk every time
upstream pdffile shipped.

Now:
- comicbox/_pdf.py is the single source of truth for PDF_ENABLED,
  PDFFile, and PAGE_FORMAT_VALUES. When pdffile is absent, PDFFile is
  None at runtime; type checkers see the real class via TYPE_CHECKING.
- Every call site that touches PDFFile is gated by `if PDF_ENABLED`.
- The `case PDFFile():` arm in box/archive/archive.py is lifted to an
  `if PDF_ENABLED and isinstance(archive, PDFFile):` guard above the
  match (the match form would fail when PDFFile is None).
- config/__init__.py reads PAGE_FORMAT_VALUES instead of iterating an
  empty stub Enum.

Verified with `pdffile` installed (307/307 tests pass) and in a fresh
venv without it (PDF_ENABLED=False, CBZ archives still work, PDF files
raise UnsupportedArchiveTypeError, CLI shows the "not installed" hint).

Net: -70 lines across 9 files.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* compact news

* update deps

* update news and version to alpha 4

* update deps

* rename function path in NEWS

* bump alpha version to 3.0.0a5
* require comicbox-pdffile 0.6.x for image-dominant page detection

Widens the optional ``[pdf]`` extra to require comicbox-pdffile 0.6.x.
The new minor release adds image-dominant page detection (
``PDFFile.classify_page``, ``PDFFile.read_image_if_dominant``,
``PDFFile.read_full_pixmap_jpeg``) used by browser readers to serve
scanned-comic PDF pages as plain ``<img>`` instead of routing through
pdf.js on the client.

comicbox itself doesn't use the new API — the bump is purely a pin
update so downstream callers (Codex, OPDS readers) can adopt it.

The ``[tool.uv.sources]`` block is transient: it points at the
pdffile PR branch so this CI can resolve dependencies before
0.6.x lands on PyPI. Drop it once 0.6.x publishes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* just use the released pdffile

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add bin/regenerate-pdf-test-pages.py — drives Comicbox.get_page_by_index
against tests/files/test_pdf.pdf to refresh tests/files/pdf/{N}.pdf when
pymupdf or pdffile change page-extraction output. Run on the next drift.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ajslater ajslater merged commit 9bbc50b into main May 15, 2026
3 checks passed

1 participant