comicbox 3 alpha 5#123

Merged
ajslater merged 11 commits into develop from pre-release
May 4, 2026
Owner

@ajslater ajslater commented May 2, 2026

  • Breaking Changes
    • get_config() now returns a ComicboxSettings dataclass, not a Confuse
      AttrDict. The Comicbox constructor now accepts this dataclass instead of
      an AttrDict.
  • Fixes
    • Stop emitting metron.cloud/{genre,location,reprint,role,story,tag}/...
      URLs for Metron identifiers — those paths 404 because Metron has no public
      web pages for those types (only API endpoints). The numeric Metron ID is
      still preserved on the identifier.
    • Guard against suspicious archive paths when extracting pages and
      metadata to the filesystem.
  • Performance
    • Reduce startup time for new instances of comicbox.
    • General performance improvements for reading metadata from many files.
    • Add multiprocessing and async methods
      comicbox.process.iter_process_files() and
      comicbox.process.aread_metadata() for reading large batches of files at
      once.
    • Comicbox.get_cover_page(skip_metadata=True) skips metadata parsing for
      callers that just need the first archive image as a thumbnail. Removes
      per-call schema instantiation and Union resolution overhead.
    • Drop the DEBUG-level emission of intentionally-ignored Marshmallow
      validation errors ("Invalid input type." from Union variant misses,
      "Field may not be null." from sparse fields). These were context-free
      noise: about 50 lines per archive at DEBUG that read like real failures.
      Real schema errors still log at WARNING with full context.
  • Features
    • Add Age Rating conversion function
      comicbox.enums.maps.to_metron_age_rating(value: str | Enum) ->
      MetronAgeRatingEnum | None

@ajslater ajslater changed the title from "comicbox 3 alpha 3" to "comicbox 3 alpha 4" May 3, 2026
@ajslater ajslater changed the title from "comicbox 3 alpha 4" to "comicbox 3 alpha 5" May 3, 2026
@ajslater ajslater merged commit 34652a5 into develop May 4, 2026
3 checks passed
ajslater added a commit that referenced this pull request May 4, 2026
* replace tag feature

* use const

* replace and delete options

* Squashed commit of the following:

zipfile with remove

* update deps

* update machine image for circleci

* update deps

* update deps and type hint

* expand linear yaml help to be more useful

* update deps

* update eslint

* update deps

* bump news and version

* update pdf pages. binary difference with new mupdf

* update docker images

* fix make install dependencies

* add jxl to image extensions

* fix ignoring macos resource forks

* resource fork test file

* update deps

* adjust news

* Squashed commit of the following:

    type annotate magic metron field functions and make all params kwargs
    use eslint outside of editor
    update deps, new ruff rules. lint & format

* add venv upgrade script

* ignore PERF203

* update deps and install pdffile

* update deps. appease typechecker. new eslint.config

* Squashed commit of the following:

commit e27050fbd42f0cf8e549871cc06c70f041672306
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 7 21:36:49 2024 -0800

    rename deserializeMeta class to TrapExceptionsMeta
    fix type issues with field metaclass wrapper

* add eslint-plugin-json-schema-validator

* update deps and lint

* use mdx instead of markdown

* remove unused import

* remove superfluous plugins. remove first level globs

* update deps

* Squashed commit of the following:

    fix notes parsing for metron and many variations
    move notes parsing into another file.
    add comicinfo metron origin test
    rename modules to not shadow python builtins
    fix binary pdf files for new mupdf

* bump version and news

* fix type errors

* format

* refactor dynamic class creation to appease typechecker

* add libmupdf docs

* Simplify Identifier URL construction for Metron pk ids.

* update deps

* fix story arc parsing. bump version

* update dockerfile with modern node

* Squashed commit of the following:

Comicbox 2.0

* Resolve circular import if not installed with [pdf] option.

* Make archive comments that aren't ComicBookInfo JSON log as debug comments
  more often.

* update package links

* add more aliases for comicvine sources

* ensure datetimes from archives are timezone aware

* update deps and bump version

* bump news

* drop version back appropriately

* fix alias tree builder

* update deps, typecheck with ty

* alphabetize comicbox fields

* uv_build

* update pyproject, eslint config, deps

* update deps

* update deps

* normalize Trade Paper Back into Trade Paperback

* update deps

* update deps

* Squashed commit of the following:

update to xmltodict 1.0. remove special code for xmltodict #text type conversion bugs
compact code for xml_fields that get cdata
remove cdata mixin from xml lists

* update deps

* pyright ignore

* fix age rating coercion for CIX

* add github issue code example.

* update deps

* update deps

* replace poetry with uv for run script

* update deps

* no support for python 3.14

* explicitly build with 3.13 trixie

* remove ruamel.yaml.clib from test docker

* update deps

* update deps

* new version. fix comicbox.json dump crash

* remove unused typing exceptions. add typing exceptions for ty foolishness

* update deps add ty to makefile

* python 3.14 support

* bump version and news

* update deps

* ignore ty type ignores

* update deps

* update deps

* Squashed commit of the following:

commit 259e561
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 19:51:31 2025 -0800

    use released pdffile

commit 4136a3b
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 19:41:28 2025 -0800

    use a proper base RenderModule and clean loads for tabs because it breaks yaml

commit 3426cf0
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 17:20:05 2025 -0800

    bump deps

commit 9fcaded
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 17:19:49 2025 -0800

    reduce complexity of dump

commit f96d27a
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 19:12:05 2025 -0800

    gate writing pdf metadata on delete all or data exists

commit 7415b82
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 19:08:26 2025 -0800

    optimize pdf writing by writing pdf data in the same context and only saving once

commit 2bd0f2c
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:57:26 2025 -0800

    rename legacy embedded variables to LEGACY_NESTED equivalents

commit 5222159
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:45:06 2025 -0800

    lint

commit 5d38acb
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:40:34 2025 -0800

    fix print test

commit 65410c7
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:37:18 2025 -0800

    fix most tests

commit 19d2dfe
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 17:40:51 2025 -0800

    fix pdf xml tests

commit f6bf854
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 16:50:08 2025 -0800

    fix tests for pdf_json

commit 590ffb8
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 14:44:51 2025 -0800

    fix accepting flexible datetimes from pdfs

commit e18925f
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 15:33:37 2025 -0800

    fix pdf tests using removed params

commit 55725b5
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 15:33:19 2025 -0800

    fix set subtraction

commit 2673e3a
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:57:35 2025 -0800

    add bpepple to news

commit 3de741d
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:56:30 2025 -0800

    update schemas doc for pdf embeds

commit 484737d
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:52:54 2025 -0800

    add bpepple to news

commit 0b6cdaf
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:47:31 2025 -0800

    bump version and news

commit bda414c
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:38:42 2025 -0800

    pdf write to embed files. pdf metadata keywords write tags.

commit 29fd04b
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:38:12 2025 -0800

    ty ignore

commit b795c49
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:34:20 2025 -0800

    add ty ignores

commit ce3ef91
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:33:43 2025 -0800

    update pdffile stub

commit fd2f4a0
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:31:38 2025 -0800

    update deps

commit 267d9d0
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:31:30 2025 -0800

    add alpha pdffile to sources

commit 041ce67
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:30:55 2025 -0800

    add pythondevmode to test script

* fix typing

* update deps

* Squashed commit of the following:

commit b31f22e6d178fcc1a5896c0dd7f680c26bc91657
Author: AJ Slater <aj@slater.net>
Date:   Mon Dec 1 20:03:13 2025 -0800

    typecheck with ty

* update deps

* complexipy & group deps

* reduce complexity

* update py7z library

* remove unused ty ignores

* ty fixes and ignores

* update deps

* update deps

* remove unused ty ignore

* update deps

* remove unused ty ignores

* use OneOf instead of list syntax sugar for confuse

* update deps

* Raw yaml datetimes (#102)

* use OneOf instead of list syntax sugar for confuse

* update deps

* let yaml have raw yaml datetimes instead of strings

* use simplejson decode errors

* bump news and version

* fix test script

* fix lint backend groups

* remove unused groups

* fix test script

* really fix test script

* use group lint in tests for jsonschema

* tweak dep version ranges

* update deps. use dockerfmt. ruff changes inline ifs to ors

* update dockerfile base

* update deps, remove unused ty warning ignore

* update deps add eslint plugins

* add mbake

* update deps

* fix tests for new pymupdf

* Squashed commit of the following:

commit 1fb394e109263188a16c4addeaab87bbdfdf882e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 17:09:25 2026 -0800

    generate-schema scripts

commit fc9b4f5c27db827ae1592010b01708865cf3733e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 17:09:08 2026 -0800

    format schemas

commit 9ccdf70d8c2318220c443714e509b6746f19a90e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 16:39:04 2026 -0800

    fix schema

commit 1a082c52887571cd258ebbc467846461c8e9686f
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 13:29:02 2026 -0800

    add marshmallow jsonschema

* bump version and news

* add script comment

* update deps

* ty ignores

* lots more type annotations. include py.typed sentinel

* remove unneeded ruff ignores

* prettier xml schema xsds

* convert to devenv

* update devenv and deps

* update devenv

* update devenv

* fix pytests. update pycountry

* fix cli help

* fix date serialization if already a string

* update devenv & deps

* import accepts quoted globs. bump version and news

* VALIDATE FEATURE

Squashed commit of the following:

commit 4f712ddc46859bb82eb6383d41a72502bf49f7be
Merge: 2b0b5db 06af8e3
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 14:01:25 2026 -0800

    Merge branch 'develop' into validate

commit 2b0b5db77d073da699cdf26e9481e5efd69ad424
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:32:39 2026 -0800

    better validate cli help

commit f78dd859c3c8c8adf44399f723de171da9d5467a
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:25:48 2026 -0800

    xsd printWidth to 120. fixes CoMet xsd.

commit d1563e96bbc944dc0669e4df0d647c44cce8c7dd
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:03:47 2026 -0800

    format test files with validator

commit 59350c9e3c13e9248368146e403a1cc05c755523
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:01:09 2026 -0800

    no available validator is a warning

commit f80fc325bc1cfecc9a9286f7538ac02eb6391ad6
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:00:40 2026 -0800

    use original schema definitions unreformatted

commit 8eb5d884136e215a19754f1d6ae2fdc9c0cd2cd3
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:51:35 2026 -0800

    fix symlink

commit bffef02777ba01b6c4f54ba36df7f433c45841da
Merge: 3547d24 6478b78
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:26:18 2026 -0800

    Merge branch 'develop' into validate

commit 3547d24639eed74841fb76b49aa49ab238b820a4
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:25:50 2026 -0800

    update deps

commit 29dba04deaf029466ca6794060c55b81d5c0a054
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:44:30 2026 -0800

    update deps

commit 273da7ab3e87d60eea56167199e466c61867c57c
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:43:55 2026 -0800

    only catch and warn on validation errors

commit 5ec0ad1928c709388facb054b3f6915285a4e4a8
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:35:41 2026 -0800

    move xmlschema and jsonschema into regular deps

commit 0887cf1e07daec89b59972e9cf8ffc59c143dba2
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:33:33 2026 -0800

    fix getting format from input files. change validation exception to warning

commit 4e7be5f44225522398a407c94d76c26fbd22a925
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:32:54 2026 -0800

    fix guess_format

commit 2342605b4b08ce641f056a38b5b634bae75bcfec
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:13:41 2026 -0800

    fix script for new location of validate_cli

commit b2ab1995e543204d15e83c00e8596681e52b70f7
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:12:08 2026 -0800

    move schema to schema_definitions

commit deacf119c2d0823af5c6405162d23b7e32f8fb37
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:23:05 2026 -0800

    better validation logging

commit c9a615f5885b53dd8e8b81c9d735808f4eaa7736
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:15:10 2026 -0800

    fix validation format assignment. validation info logging

commit 914d35d15f536a6a042cbe66346b2cb4a38d636a
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:05:04 2026 -0800

    basically working validation with definitions dir

commit 7e860e8f6110dd868dbec2f724a8bff1bd0a980d
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 21:22:04 2026 -0800

    ignore bad typecheck warnings

commit 3d5ae84354b772ce8fc08793a2c7db64e95c46ac
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 21:13:02 2026 -0800

    fix validate tests

commit 324a0c6fd9935c153fc5a172020a8c02b6f901d0
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 18:26:12 2026 -0800

    most tests pass. validate test fails. typecheck fails. schemas need moving into the package

commit 5c3d4cd77020b5318a0f45c8d72d432d50ad158e
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 17:16:27 2026 -0800

    update deps

commit 112a71aece1adba12e4d380359da3a167456af8c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 17:16:19 2026 -0800

    pin comicbox-pdffile

* bump NEWS

* PDF2CBZ extract images
Squashed commit of the following:

commit b6296ee49b49556b04adaefb12bed332f4fee857
Merge: 5bf0007 bdd3879
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 14:07:16 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit 5bf0007
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 14:44:53 2026 -0800

    bump news and version

commit 362123c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 14:39:19 2026 -0800

    update pdffile to released version

commit f09571c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 13:36:29 2026 -0800

    switch image_pdf to more powerful pdf_page_format

commit b1d2d1b
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 12:37:12 2026 -0800

    fix pdf cover compare test

commit 5aaeae0
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 12:36:39 2026 -0800

    move pdf format decision to _archive_readfile()

commit 2107241
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 10:32:34 2026 -0800

    update deps

commit 566e426
Merge: cdc2250 38bcfe2
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:58:28 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit cdc2250
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:57:57 2026 -0800

    fix cli help

commit 1190fe4
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:52:12 2026 -0800

    fix cli option collision

commit 63bf418
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:49:48 2026 -0800

    cli option for image_pdf

commit 1d7d852
Merge: db7061c f5f03b5
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:39:49 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit db7061c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:36:11 2026 -0800

    basic support for extract image from pdf

* move docker-compose.yaml to compose.yaml

* fix dockerfile for new devenv

* fix dockerfile for new devenv the kludgey way

* fix news

* update deps

* format dockerfile

* color and clarification for help

* fix colors for help

* update devenv

* fix prettierignore

* update devenv

* fix makefile

* v2.2.1 fix pdf datetimes

* update devenv

* update deps

* update devenv

* add ty ignores to match pyright ignores

* update devenv

* add node_root feature

* update devenv

* update devenv

* update devenv, deps and fix some ty typing

* update deepdiff and bump version

* fix news typo

* update pdfs for new pymupdf

* update deps & devenv

* update deps v2.2.3

* use usr/env for scripts

* gha workflow

* switch to github actions

* Add to_metron_age_rating() public conversion function (#108)

Provide a standalone function to convert any age rating enum or string
to a MetronAgeRatingEnum. Supports Marvel, DC, Generic, ComicInfo, and
Metron enums with fuzzy string matching (case/space-insensitive).
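
The case- and space-insensitive matching described above could look roughly like the sketch below. `DemoRating`, its members, and `to_demo_rating` are hypothetical stand-ins invented for illustration; they are not comicbox's enums or its actual lookup logic.

```python
from enum import Enum


class DemoRating(Enum):
    """Hypothetical target enum; the real MetronAgeRatingEnum may differ."""

    EVERYONE = "Everyone"
    TEEN_PLUS = "Teen Plus"
    MATURE = "Mature"


def _normalize(text: str) -> str:
    # Drop case and all whitespace so "teen plus" and "TeenPlus" compare equal.
    return "".join(text.split()).lower()


# Built once at import time: normalized label -> enum member.
_LOOKUP = {_normalize(member.value): member for member in DemoRating}


def to_demo_rating(value: "str | Enum") -> "DemoRating | None":
    """Return the matching DemoRating, or None when nothing matches."""
    raw = value.value if isinstance(value, Enum) else value
    return _LOOKUP.get(_normalize(str(raw)))
```

Returning `None` for unknown input, rather than raising, mirrors the `MetronAgeRatingEnum | None` return type given in the PR description.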

* add claude.md

* bump news and version

* when extracting pages make path absolute

* use python convenience method

* rename variable

* Fix path traversal vulnerability in archive extraction (#109)

Validate that resolved output paths stay within the destination
directory before writing, preventing zip-slip attacks from crafted
archive member names.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
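
A minimal sketch of the containment check described in #109, using only the standard library. The function name `safe_destination` is an assumption made for illustration and is not taken from the comicbox code.

```python
from pathlib import Path


def safe_destination(dest_dir: Path, member_name: str) -> Path:
    """Return the resolved output path, or raise if it escapes dest_dir."""
    root = dest_dir.resolve()
    # Resolving collapses ".." segments; an absolute member name replaces root.
    target = (root / member_name).resolve()
    if not target.is_relative_to(root):
        raise ValueError(f"unsafe archive member path: {member_name!r}")
    return target
```

A member such as `../../etc/passwd` resolves outside the destination and is rejected before any bytes are written, which is the zip-slip case the commit message names.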

* bump news

* Optimize for large-scale workloads (600K+ files) (#110)

Reduce per-file overhead for bulk metadata reading:
- Extension-hint archive detection: check file extension first to avoid
  unnecessary magic-byte disk reads (saves ~1.2M file opens for CBZ collections)
- Cache marshmallow schema instances by (class, exclude_keys) to eliminate
  ~4.8M schema constructions at scale
- Cache transform instances per Comicbox instance to avoid redundant creation
- Skip FrozenAttrDict re-wrapping when pre-built config is passed
- Skip redundant logger init when loglevel hasn't changed
- Remove always-on glom_debug=True from transform calls

Add parallelization API (comicbox/process.py):
- process_files() for ProcessPoolExecutor-based batch processing
- aread_metadata() async wrapper for event loop integration

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
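
The "cache schema instances by (class, exclude_keys)" idea in #110 can be illustrated with `functools.lru_cache`. This is a sketch under assumptions: `DemoSchema` and `get_schema` are hypothetical names, and the real change may use a different caching mechanism.

```python
from functools import lru_cache


class DemoSchema:
    """Hypothetical stand-in for an expensive-to-build schema class."""

    built = 0  # counts constructions so the cache effect is visible

    def __init__(self, exclude: frozenset = frozenset()):
        type(self).built += 1
        self.exclude = exclude


@lru_cache(maxsize=None)
def get_schema(schema_class: type, exclude_keys: frozenset = frozenset()):
    """Build each (class, exclude_keys) combination once, then reuse it."""
    # Both arguments must be hashable, hence frozenset rather than set.
    return schema_class(exclude=exclude_keys)
```

Because the cached instance is shared between callers, this only works if schema instances are treated as read-only after construction.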

* sort ignorefiles

* bump news

* fix datetime ordering bug

* Code quality pass: match statements, pathlib, immutable constants (#111)

* Targeted code quality pass: match statements, pathlib, immutable constants

- Convert isinstance if/elif chains to match statements in archive.py,
  archiveinfo.py, and time_fields.py
- Replace os.walk with Path.rglob in run.py, fixing a double-recursion
  bug where recurse() re-walked subdirectories already visited by os.walk
- Wrap _HANDLE_MERGE dict in MappingProxyType in mergedeep.py
- Replace accumulator loop with list comprehension in config/computed.py
- Replace loop-append with extend + generator in box/sources.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Sort ignore files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
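
The `os.walk` to `Path.rglob` change in #111 can be pictured with the sketch below: a single recursive glob visits each file once, so there is no second recursion step to get wrong. `find_comics` and the suffix list are illustrative assumptions, not the code in run.py.

```python
from pathlib import Path

# Assumed suffix list for the example only.
COMIC_SUFFIXES = frozenset({".cbz", ".cbr", ".cb7", ".cbt", ".pdf"})


def find_comics(root: Path) -> list[Path]:
    """Visit every file under root exactly once and keep comic archives."""
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in COMIC_SUFFIXES
    )
```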

* use pdf group for tests

* update deps

* iterprocess files

* fix print test

* Righttyper Typing with corrections (#112)

* righttyper

* raw types commit

* Fix righttyper auto-generated type annotations

Correct ~535 basedpyright errors and 10 ruff errors introduced by
righttyper's runtime type capture, which used overly-literal types.

Key changes:
- Replace PosixPath annotations with Path throughout
- Simplify overly-specific dict union types to dict[str, Any]
- Remove broken self: "Module.ClassName" annotations in mixins
- Rename/remove rt_T1 TypeVars (N815/N816)
- Move Callable import to TYPE_CHECKING block (TC003)
- Make boolean params keyword-only in tests/util.py (FBT001)
- Add pyright: ignore on marshmallow method override incompatibilities
- Fix _path override annotations in archive write/dump_files
- Widen function signatures to accept Path | str | None where needed
- Fix circular import in transforms/spec.py (was referencing xml_reprints.MetaSpec)
- Guard None.items() calls in metroninfo identifiers with early returns
- Clean up various unused imports left by annotation removals

Result: 0 errors, 259 tests passing, make fix/lint/typecheck all clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* sort ignore files

* massively typed

* remove righttyper. back to python 3.10 req

* update devenv. switch to bun

* remove quoted self typing

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove self types

* remove self typing

* typing attempt

* fix typing errors

* fix circular import

* reorg news

* update devenv

* add bun to dockerfile

* only copy bun deps first for dockerfile

* update devenv & deps

* switch back to main marshmallow-jsonschema now that it's back from the dead

* update devenv

* fix process pool runs to deliver exceptions back and not break on passing in the logger

* test the process module

* comments

* enhance news for iterfiles

* decomplexify box init

* decomplexify process iterfiles

* allow callers to configure subprocess loguru via picklable dict (#113)

Loguru's logger object isn't picklable into ProcessPoolExecutor
workers, so callers like codex couldn't get worker log output to
match their parent-process format. Adds a worker_log_config dict
({level, format, sink}) that runs through the executor initializer
and reconfigures loguru in each worker via init_logging. Also adds
enqueue=True to the default sink for thread-safe logging.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
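
The pattern in #113, passing plain data through the executor initializer instead of a logger object, might look like this. To stay dependency-free the sketch uses the standard `logging` module as a stand-in for loguru; `init_worker_logging` and `make_pool` are hypothetical names, and only the `level` and `format` keys from the commit message are modeled.

```python
import logging
import pickle
from concurrent.futures import ProcessPoolExecutor


def init_worker_logging(log_config: dict) -> None:
    """Runs once in each worker process; rebuilds logging from plain data."""
    logging.basicConfig(
        level=log_config["level"],
        format=log_config["format"],
        force=True,  # replace any handlers the worker inherited
    )


def make_pool(log_config: dict, max_workers: int = 2) -> ProcessPoolExecutor:
    """Create a pool whose workers configure logging before running tasks."""
    # Fail early in the parent if the config cannot cross the process boundary.
    pickle.dumps(log_config)
    return ProcessPoolExecutor(
        max_workers=max_workers,
        initializer=init_worker_logging,
        initargs=(log_config,),
    )
```

On platforms that start workers with spawn, `make_pool` should be called from code guarded by `if __name__ == "__main__":`.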

* update devenv

* Upgrade confuse to 2.2.0; replace AttrDict with typed Settings (#114)

* upgrade confuse to 2.2.0; replace AttrDict with typed Settings dataclass

confuse 2.2.0 makes AttrDict properly generic, so per-key types resolve
to `object` and consumers across the box mixins fail typecheck. Convert
the validated AttrDict into a frozen `Settings` dataclass once in
get_config() and propagate that typed object everywhere; confuse stays
confined to comicbox/config.

- New comicbox/config/settings.py defines `Settings` and
  `ComputedSettings` (frozen, slots).
- get_config() returns Settings; new _build_settings() does the
  conversion. post_process_set_for_path() rebuilt around
  dataclasses.replace.
- FrozenAttrDict deleted — frozen dataclass enforces immutability.
- process.py passes Settings through pickle directly so workers skip
  re-running confuse.
- Drops dead `dest_path is None` checks now that the field is required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* rename Settings to ComicboxSettings

So that client programs that already define their own `Settings` type
don't collide on import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* flatten ComputedSettings into ComicboxSettings

The hierarchical split was a confuse-template setup convenience, not a
logical grouping — there's no API benefit to keeping client code
chained through `cfg.computed.X`. Promote the six computed fields onto
ComicboxSettings under a clearly labeled comment block. The confuse
template's nested `computed` MappingTemplate is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix metadata_format hint silently dropping all api metadata (#115)

`_get_source_config_metadata` early-returned an empty list whenever the
caller set `metadata_format`, because `fmt not in self._config.read`
compared a string against a frozenset of `MetadataFormats` enums —
always True. The conversion + correct membership check happens in the
try block on the next lines, so the early return was both wrong and
redundant.

Adds tests/unit/test_sources.py covering the four behavioral cases:
fmt-in-read, no-fmt, fmt-not-in-read, invalid-fmt.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
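
The bug class fixed in #115 is easy to reproduce in isolation: a string is never equal to a plain `Enum` member, so a `not in` test against a set of members is always true. `DemoFormat` and its values are hypothetical stand-ins for the MetadataFormats enum.

```python
from enum import Enum


class DemoFormat(Enum):
    """Hypothetical stand-in for MetadataFormats; values are invented."""

    COMIC_INFO = "cix"
    METRON_INFO = "metron"


ENABLED_READ = frozenset({DemoFormat.COMIC_INFO})


def buggy_is_blocked(fmt: str) -> bool:
    # A str never equals a plain Enum member, so this is True for every input.
    return fmt not in ENABLED_READ


def is_blocked(fmt: str) -> bool:
    # Convert the hint to the enum first, then test membership.
    try:
        return DemoFormat(fmt) not in ENABLED_READ
    except ValueError:
        return True  # an unknown format name cannot be enabled
```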

* update deps

* bump news and version to 3.0.0

* fix Mapping config args silently dropped under config_default.yaml (#116)

read_config_sources used config.add() for the Mapping branch, which
appends to the BOTTOM of confuse's source priority stack — below the
config_default.yaml loaded by config.read() at the top of the
function. So any caller passing a dict / Mapping override (e.g.
`get_config({"comicbox": {"compute_pages": True}})`) silently got the
default instead. Switch to config.set() so Mapping args land on top,
matching set_args() for the Namespace branch.

Surfaced by a downstream Codex migration that hit dead Mapping
overrides; covered now by tests/unit/test_config_layering.py.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
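
The priority-stack behavior described in #116 can be modeled without confuse at all. `LayeredConfig` below is a toy model written for this note, not confuse's implementation; it only mirrors the documented difference that `add()` places a source at the lowest priority and `set()` at the highest.

```python
class LayeredConfig:
    """Toy model of a source stack; index 0 has the highest priority."""

    def __init__(self) -> None:
        self._sources: list[dict] = []

    def add(self, source: dict) -> None:
        # Lowest priority: only consulted when nothing above defines the key.
        self._sources.append(source)

    def set(self, source: dict) -> None:
        # Highest priority: shadows every source already loaded.
        self._sources.insert(0, source)

    def get(self, key: str):
        for source in self._sources:
            if key in source:
                return source[key]
        raise KeyError(key)
```

With defaults loaded first, an override passed through `add()` is shadowed by the default, while the same override passed through `set()` wins.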

* update deps

* widen set-like config fields to accept any non-mapping container (#117)

The template arms for `read`, `write`, `export`, `delete_keys`,
`read_ignore`, and `print` previously combined `frozenset` (a
pass-through marker) and `Sequence(str)` (list-of-strings coercion).
That works for the common YAML/CLI list path but rejects callers
passing a `set` / `tuple` / `frozenset` literal — which is logically
fine for fields whose post-compute value is always a frozenset.

Replaces the per-field unions with `OneOf((set, frozenset, tuple, list))`
(`print` also accepts `str` for the historical phase-char form). The
`_build_settings` boundary already calls `frozenset(...)` on these
values, so any of the four containers normalize correctly.

Also adapts `compute_config`'s helpers — Subview iteration only
supports dict/list source values, so user-supplied set/frozenset/tuple
inputs would error before reaching the template. New `_raw_or_empty`
pulls the Python value via `.get()` and explicitly rejects mappings
with a clear error (dict iteration would silently accept dict input
otherwise). `_parse_print` now accepts a phase-char string OR any
iterable of phase chars.

Path-list fields (`paths`, `import_paths`, `metadata_cli`) keep their
existing `Sequence(...)` form with element-type validation — that
trade-off felt worth keeping.

14 new tests in tests/unit/test_config_container_inputs.py cover the
four container types per field and assert mapping rejection.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
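
The normalization rule from #117 (accept set, frozenset, tuple, or list; reject mappings explicitly) can be sketched in a few lines. `to_frozenset` is a hypothetical helper name; the real code splits this work between the confuse template and `_build_settings`.

```python
from collections.abc import Mapping

_CONTAINER_TYPES = (set, frozenset, tuple, list)


def to_frozenset(value: object) -> frozenset:
    """Normalize an accepted container to a frozenset."""
    # Check mappings first: frozenset(a_dict) would silently keep only keys.
    if isinstance(value, Mapping):
        raise TypeError("mappings are not accepted for set-like fields")
    if not isinstance(value, _CONTAINER_TYPES):
        raise TypeError(
            f"expected set, frozenset, tuple or list; got {type(value).__name__}"
        )
    return frozenset(value)
```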

* reuse types tuple

* update deps

* 3.0.0 alpha version 0

* update compose for generic gha build

* ReadResults data structure for process functions

* compact news (#119)

* Add skip_metadata flag to get_cover_page (#120)

Callers that only want a thumbnail (e.g. codex's CoverThread) don't
need the full ComicInfo/CoverImage hint resolution. Parsing the
metadata for every cover dominates the cost of cover extraction
and emits a flood of debug-bucket Union ValidationErrors that look
like real failures in DEBUG logs.

When skip_metadata=True, bypass generate_cover_paths entirely and
read archive index 0 directly. This drops per-call schema
instantiation, Union resolution, and path normalization.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
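
To show why the skip_metadata path in #120 is cheap, here is a standalone sketch that pulls the first image out of a zip archive without touching any metadata file. It uses `zipfile` directly and does not call comicbox; `first_image`, the suffix list, and the "first name in sorted order" rule are assumptions for illustration.

```python
import zipfile

# Assumed image suffixes for the example only.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def first_image(archive: zipfile.ZipFile) -> "bytes | None":
    """Return the bytes of the first image by sorted name, or None."""
    names = sorted(
        name for name in archive.namelist()
        if name.lower().endswith(IMAGE_SUFFIXES)
    )
    # No XML or JSON member is opened or parsed on this path.
    return archive.read(names[0]) if names else None
```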

* v3.0.0a2 (#121)

* compact news

* update deps

* Drop DEBUG-bucket logging of intentionally-ignored validation errors (#122)

ClearingErrorStoreSchema previously split each schema's errors into
two buckets: ignored ones logged at DEBUG, real ones at WARNING.
The DEBUG bucket only ever held errors from ``_ignore_errors`` —
``Field may not be null.`` (sparse-field tolerance) and
``Invalid input type.`` (Union variant misses) — both of which are
internal mechanics, not operator-actionable signal. Each Union miss
emitted one ``ValidationError - {'_schema': ['Invalid input type.']}``
line per field per archive, drowning the genuinely useful per-source
DEBUG messages emitted by ``_except_on_load``.

Filter ignored errors at split time, log only WARNINGs. Real schema
failures still surface with full context (path, schema class,
normalized message). Collapses the dual-bucket _split_*_errors
methods into _filter_* + _log_warnings.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
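
The "filter ignored errors, log only the rest" step in #122 might be sketched as follows. The two ignored messages are quoted from the commit message; `filter_errors` and the flat `field -> messages` shape are simplifying assumptions (real Marshmallow error dicts can be nested).

```python
IGNORED_MESSAGES = frozenset({"Field may not be null.", "Invalid input type."})


def filter_errors(errors: dict) -> dict:
    """Drop ignored messages; omit fields left with no real errors."""
    kept = {}
    for field, messages in errors.items():
        real = [msg for msg in messages if msg not in IGNORED_MESSAGES]
        if real:
            kept[field] = real
    return kept
```

Whatever survives the filter is what would be logged at WARNING; an empty result means nothing is logged at all.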

* update devenv

* metron: drop URL slugs for types with no public web page (#124)

* compact news

* update deps

* metron: drop broken URL slugs for genre, location, reprint, role, story, tag

Metron has no public web pages for these types — only API endpoints — so
URLs like https://metron.cloud/genre/3 always 404. Stop emitting them.
The numeric Metron ID is still preserved on the identifier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Re-export to_metron_age_rating from comicbox.enums.maps (#125)

Shortens the import path for the helper from
comicbox.enums.maps.age_rating to comicbox.enums.maps so downstream
callers can reach it without drilling into the submodule.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update devenv

* Drop dead code surfaced by skylos scan (#126)

- Remove unused module/class constants: _COMMENT_ARCHIVE_TYPES, SUFFIXES,
  _LOG_FORMAT, comet.py IDENTIFIER_TAG/IS_VERSION_OF_TAG, comictagger.py
  IDENTIFIER_TAG/PAGES_TAG, XmlCountryField (and now-orphaned imports
  RarFile, ZipFile, CountryField).
- Fix latent bug in TrapExceptionsMeta: `attr_name in "deserialize"` was a
  substring check that wrapped any callable whose name was a substring of
  "deserialize" (e.g. "er", "ize", "ali"). Use the existing _WRAP_METHODS
  tuple instead so only the exact `deserialize` method is wrapped.
- Simplify _get_pdf_enabled() to a plain `import pdffile` probe; the
  except-arm stub import had no effect.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
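
The substring pitfall fixed in #126 is a general Python gotcha: `x in "some string"` tests for a substring, while `x in ("some string",)` tests for an exact match. The two helper functions are hypothetical; `_WRAP_METHODS` is the name used in the commit message, with contents assumed here.

```python
_WRAP_METHODS = ("deserialize",)  # assumed contents


def buggy_should_wrap(attr_name: str) -> bool:
    # str-in-str is a substring test, so fragments of the word match too.
    return attr_name in "deserialize"


def fixed_should_wrap(attr_name: str) -> bool:
    # Tuple membership compares whole strings only.
    return attr_name in _WRAP_METHODS
```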

* Replace pdffile_stub with single-shim optional integration (#127)

Consolidate the optional comicbox-pdffile integration into one module
(comicbox/_pdf.py) and delete the hand-maintained pdffile_stub.py.

Previously six call sites each duplicated a `try: from pdffile import X /
except: from pdffile_stub import X` block, and the stub class mirrored
the real PDFFile API method-for-method — silent drift risk every time
upstream pdffile shipped.

Now:
- comicbox/_pdf.py is the single source of truth for PDF_ENABLED,
  PDFFile, and PAGE_FORMAT_VALUES. When pdffile is absent, PDFFile is
  None at runtime; type checkers see the real class via TYPE_CHECKING.
- Every call site that touches PDFFile is gated by `if PDF_ENABLED`.
- The `case PDFFile():` arm in box/archive/archive.py is lifted to an
  `if PDF_ENABLED and isinstance(archive, PDFFile):` guard above the
  match (the match form would fail when PDFFile is None).
- config/__init__.py reads PAGE_FORMAT_VALUES instead of iterating an
  empty stub Enum.

Verified with `pdffile` installed (307/307 tests pass) and in a fresh
venv without it (PDF_ENABLED=False, CBZ archives still work, PDF files
raise UnsupportedArchiveTypeError, CLI shows the "not installed" hint).

Net: -70 lines across 9 files.
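
A rough sketch of the single-shim pattern described above, under the assumption that the module looks broadly like this; the real comicbox/_pdf.py may differ, and `is_pdf` is a hypothetical helper added only to show the guard. It runs whether or not pdffile is installed.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Type checkers always see the real class.
    from pdffile import PDFFile
else:
    try:
        from pdffile import PDFFile
    except ImportError:
        # Optional dependency is absent: no stub class, just None.
        PDFFile = None

PDF_ENABLED = PDFFile is not None


def is_pdf(archive: object) -> bool:
    """Hypothetical helper: guard before touching PDFFile at runtime."""
    # The PDF_ENABLED check short-circuits, so isinstance() never
    # receives None as its second argument.
    return PDF_ENABLED and isinstance(archive, PDFFile)
```

Call sites import the three names from one place and check `PDF_ENABLED` first, so there is no second class whose API can drift from upstream.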

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* remove unused ty ignore

* comicbox 3 alpha 5 (#123)

* compact news

* update deps

* update news and version to alpha 4

* update deps

* rename function path in NEWS

* bump alpha version to 3.0.0a5

* version 3.0.0

* massage news

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
ajslater added a commit that referenced this pull request May 9, 2026
* update deps

* update deps and type hint

* expand linear yaml help to be more useful

* update deps

* update eslint

* update deps

* bump news and version

* update pdf pages. binary difference with new mupdf

* update docker images

* fix make install dependencies

* add jxl to image extensions

* fix ignoring macos resource forks

* resource fork test file

* update deps

* adjust news

* Squashed commit of the following:

    type annotate magic metron field functions and make all params kwargs
    use eslint outside of editor
    update deps, new ruff rules. lint & format

* add venv upgrade script

* ignore PERF203

* update deps and install pdffile

* update deps. appease typechecker. new eslint.config

* Squashed commit of the following:

commit e27050fbd42f0cf8e549871cc06c70f041672306
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 7 21:36:49 2024 -0800

    rename deserializeMeta class to TrapExceptionsMeta
    fix type issues with field metaclass wrapper

* add eslint-plugin-json-schema-validator

* update deps and lint

* use mdx instead of markdown

* remove unused import

* remove superfluous plugins. remove first level globs

* update deps

* Squashed commit of the following:

    fix notes parsing for metron and many variations
    move notes parsing into another file.
    add comicinfo metron origin test
    rename modules to not shadow python builtins
    fix binary pdf files for new mupdf

* bump version and news

* fix type errors

* format

* refactor dynamic class creation to appease typechecker

* add libmupdf docs

* Simplify Identifier URL construction for Metron pk ids.

* update deps

* fix story arc parsing. bump version

* update dockerfile with modern node

* Squashed commit of the following:

Comicbox 2.0

* Resolve circular import if not installed with [pdf] option.

* Make archive comments that aren't ComicBookInfo JSON log as debug comments
  more often.

* update package links

* add more aliases for comicvine sources

* ensure datetimes from archives are timezone aware

* update deps and bump version

* bump news

* drop version back appropriately

* fix alias tree builder

* update deps, typecheck with ty

* alphabetize comicbox fields

* uv_build

* update pyproject, eslint config, deps

* update deps

* update deps

* normalize Trade Paper Back into Trade Paperback

* update deps

* update deps

* Squashed commit of the following:

update to xmltodict 1.0. remove special code for xmltodict #text type conversion bugs
compact code for xml_fields that get cdata
remove cdata mixin from xml lists

* update deps

* pyright ignore

* fix age rating coercion for CIX

* add github issue code example.

* update deps

* update deps

* replace poetry with uv for run script

* update deps

* no support for python 3.14

* explicitly build with 3.13 trixie

* remove ruamel.yaml.clib from test docker

* update deps

* update deps

* new version. fix comicbox.json dump crash

* remove unused typing exceptions. add typing exceptions for ty foolishness

* update deps add ty to makefile

* python 3.14 support

* bump version and news

* update deps

* ignore ty type ignores

* update deps

* update deps

* Squashed commit of the following:

commit 259e561
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 19:51:31 2025 -0800

    use released pdffile

commit 4136a3b
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 19:41:28 2025 -0800

    use a proper base RenderModule and clean loads for tabs because it breaks yaml

commit 3426cf0
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 17:20:05 2025 -0800

    bump deps

commit 9fcaded
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 17:19:49 2025 -0800

    reduce complexity of dump

commit f96d27a
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 19:12:05 2025 -0800

    gate writing pdf metadata on delete all or data exists

commit 7415b82
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 19:08:26 2025 -0800

    optimize pdf writing by writing pdf data in the same context and only saving once

commit 2bd0f2c
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:57:26 2025 -0800

    rename legacy embedded variables to LEGACY_NESTED equivalents

commit 5222159
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:45:06 2025 -0800

    lint

commit 5d38acb
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:40:34 2025 -0800

    fix print test

commit 65410c7
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:37:18 2025 -0800

    fix most tests

commit 19d2dfe
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 17:40:51 2025 -0800

    fix pdf xml tests

commit f6bf854
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 16:50:08 2025 -0800

    fix tests for pdf_json

commit 590ffb8
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 14:44:51 2025 -0800

    fix accepting flexible datetimes from pdfs

commit e18925f
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 15:33:37 2025 -0800

    fix pdf tests using removed params

commit 55725b5
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 15:33:19 2025 -0800

    fix set subtraction

commit 2673e3a
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:57:35 2025 -0800

    add bpepple to news

commit 3de741d
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:56:30 2025 -0800

    update schemas doc for pdf embeds

commit 484737d
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:52:54 2025 -0800

    add bpepple to news

commit 0b6cdaf
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:47:31 2025 -0800

    bump version and news

commit bda414c
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:38:42 2025 -0800

    pdf write to embed files. pdf metadata keywords write tags.

commit 29fd04b
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:38:12 2025 -0800

    ty ignore

commit b795c49
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:34:20 2025 -0800

    add ty ignores

commit ce3ef91
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:33:43 2025 -0800

    update pdffile stub

commit fd2f4a0
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:31:38 2025 -0800

    update deps

commit 267d9d0
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:31:30 2025 -0800

    add alpha pdffile to sources

commit 041ce67
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:30:55 2025 -0800

    add pythondevmode to test script

* fix typing

* update deps

* Squashed commit of the following:

commit b31f22e6d178fcc1a5896c0dd7f680c26bc91657
Author: AJ Slater <aj@slater.net>
Date:   Mon Dec 1 20:03:13 2025 -0800

    typecheck with ty

* update deps

* complexipy & group deps

* reduce complexity

* update py7z library

* remove unused ty ignores

* ty fixes and ignores

* update deps

* update deps

* remove unused ty ignore

* update deps

* remove unused ty ignores

* use OneOf instead of list syntax sugar for confuse

* update deps

* Raw yaml datetimes (#102)

* use OneOf instead of list syntax sugar for confuse

* update deps

* let yaml have raw yaml datetimes instead of strings

* use simplejson decode errors

* bump news and version

* fix test script

* fix lint backend groups

* remove unused groups

* fix test script

* really fix test script

* use group lint in tests for jsonschema

* tweak dep version ranges

* update deps. use dockerfmt. ruff changes inline ifs to ors

* update dockerfile base

* update deps, remove unused ty warning ignore

* update deps add eslint plugins

* add mbake

* update deps

* fix tests for new pymupdf

* Squashed commit of the following:

commit 1fb394e109263188a16c4addeaab87bbdfdf882e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 17:09:25 2026 -0800

    generate-schema scripts

commit fc9b4f5c27db827ae1592010b01708865cf3733e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 17:09:08 2026 -0800

    format schemas

commit 9ccdf70d8c2318220c443714e509b6746f19a90e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 16:39:04 2026 -0800

    fix schema

commit 1a082c52887571cd258ebbc467846461c8e9686f
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 13:29:02 2026 -0800

    add marshmallow jsonschema

* bump version and news

* add script comment

* update deps

* ty ignores

* lots more type annotations. include py.typed sentinel

* remove unneeded ruff ignores

* prettier xml schema xsds

* convert to devenv

* update devenv and deps

* update devenv

* update devenv

* fix pytests. update pycountry

* fix cli help

* fix date serialization if already a string

* update devenv & deps

* import accepts quoted globs. bump version and news

* VALIDATE FEATURE

Squashed commit of the following:

commit 4f712ddc46859bb82eb6383d41a72502bf49f7be
Merge: 2b0b5db 06af8e3
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 14:01:25 2026 -0800

    Merge branch 'develop' into validate

commit 2b0b5db77d073da699cdf26e9481e5efd69ad424
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:32:39 2026 -0800

    better validate cli help

commit f78dd859c3c8c8adf44399f723de171da9d5467a
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:25:48 2026 -0800

    xsd printWidth to 120. fixes CoMet xsd.

commit d1563e96bbc944dc0669e4df0d647c44cce8c7dd
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:03:47 2026 -0800

    format test files with validator

commit 59350c9e3c13e9248368146e403a1cc05c755523
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:01:09 2026 -0800

    no available validator is a warning

commit f80fc325bc1cfecc9a9286f7538ac02eb6391ad6
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:00:40 2026 -0800

    use original schema definitions unreformatted

commit 8eb5d884136e215a19754f1d6ae2fdc9c0cd2cd3
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:51:35 2026 -0800

    fix symlink

commit bffef02777ba01b6c4f54ba36df7f433c45841da
Merge: 3547d24 6478b78
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:26:18 2026 -0800

    Merge branch 'develop' into validate

commit 3547d24639eed74841fb76b49aa49ab238b820a4
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:25:50 2026 -0800

    update deps

commit 29dba04deaf029466ca6794060c55b81d5c0a054
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:44:30 2026 -0800

    update deps

commit 273da7ab3e87d60eea56167199e466c61867c57c
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:43:55 2026 -0800

    only catch and warn on validation errors

commit 5ec0ad1928c709388facb054b3f6915285a4e4a8
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:35:41 2026 -0800

    move xmlschema and jsonschema into regular deps

commit 0887cf1e07daec89b59972e9cf8ffc59c143dba2
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:33:33 2026 -0800

    fix getting format from input files. change validation exception to warning

commit 4e7be5f44225522398a407c94d76c26fbd22a925
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:32:54 2026 -0800

    fix guess_format

commit 2342605b4b08ce641f056a38b5b634bae75bcfec
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:13:41 2026 -0800

    fix script for new location of validate_cli

commit b2ab1995e543204d15e83c00e8596681e52b70f7
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:12:08 2026 -0800

    move schema to schema_definitions

commit deacf119c2d0823af5c6405162d23b7e32f8fb37
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:23:05 2026 -0800

    better validation logging

commit c9a615f5885b53dd8e8b81c9d735808f4eaa7736
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:15:10 2026 -0800

    fix validation format assignment. validation info logging

commit 914d35d15f536a6a042cbe66346b2cb4a38d636a
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:05:04 2026 -0800

    basically working validation with definitions dir

commit 7e860e8f6110dd868dbec2f724a8bff1bd0a980d
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 21:22:04 2026 -0800

    ignore bad typecheck warnings

commit 3d5ae84354b772ce8fc08793a2c7db64e95c46ac
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 21:13:02 2026 -0800

    fix validate tests

commit 324a0c6fd9935c153fc5a172020a8c02b6f901d0
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 18:26:12 2026 -0800

    most tests pass. validate test fails. typecheck fails. schemas need moving into the package

commit 5c3d4cd77020b5318a0f45c8d72d432d50ad158e
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 17:16:27 2026 -0800

    update deps

commit 112a71aece1adba12e4d380359da3a167456af8c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 17:16:19 2026 -0800

    pin comicbox-pdffile

* bump NEWS

* PDF2CBZ extract images
Squashed commit of the following:

commit b6296ee49b49556b04adaefb12bed332f4fee857
Merge: 5bf0007 bdd3879
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 14:07:16 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit 5bf0007
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 14:44:53 2026 -0800

    bump news and version

commit 362123c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 14:39:19 2026 -0800

    update pdffile to released version

commit f09571c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 13:36:29 2026 -0800

    switch image_pdf to more powerful pdf_page_format

commit b1d2d1b
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 12:37:12 2026 -0800

    fix pdf cover compare test

commit 5aaeae0
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 12:36:39 2026 -0800

    move pdf format decision to _archive_readfile()

commit 2107241
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 10:32:34 2026 -0800

    update deps

commit 566e426
Merge: cdc2250 38bcfe2
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:58:28 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit cdc2250
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:57:57 2026 -0800

    fix cli help

commit 1190fe4
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:52:12 2026 -0800

    fix cli option collision

commit 63bf418
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:49:48 2026 -0800

    cli option for image_pdf

commit 1d7d852
Merge: db7061c f5f03b5
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:39:49 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit db7061c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:36:11 2026 -0800

    basic support for extract image from pdf

* move docker-compose.yaml to compose.yaml

* fix dockerfile for new devenv

* fix dockerfile for new devenv the kludgey way

* fix news

* update deps

* format dockerfile

* color and clarification for help

* fix colors for help

* update devenv

* fix prettierignore

* update devenv

* fix makefile

* v2.2.1 fix pdf datetimes

* update devenv

* update deps

* update devenv

* add ty ignores to match pyright ignores

* update devenv

* add node_root feature

* update devenv

* update devenv

* update devenv, deps and fix some ty typing

* update deepdiff and bump version

* fix news typo

* update pdfs for new pymupdf

* update deps & devenv

* update deps v2.2.3

* use usr/env for scripts

* gha workflow

* switch to github actions

* Add to_metron_age_rating() public conversion function (#108)

Provide a standalone function to convert any age rating enum or string
to a MetronAgeRatingEnum. Supports Marvel, DC, Generic, ComicInfo, and
Metron enums with fuzzy string matching (case/space-insensitive).
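
A small sketch of case/space-insensitive matching of the kind described above. It only illustrates the string normalization; `DemoAgeRating`, its values, and `to_demo_age_rating` are hypothetical and are not the library's enums or its cross-publisher mapping.

```python
from __future__ import annotations

from enum import Enum


class DemoAgeRating(Enum):
    # Hypothetical values for illustration only.
    EVERYONE = "Everyone"
    TEEN_PLUS = "Teen Plus"
    MATURE = "Mature"


def _normalize(text: str) -> str:
    # Drop all whitespace and fold case so "teen plus",
    # "TeenPlus" and " TEEN  PLUS " compare equal.
    return "".join(text.split()).casefold()


_LOOKUP = {_normalize(member.value): member for member in DemoAgeRating}


def to_demo_age_rating(value: str | Enum) -> DemoAgeRating | None:
    """Return the matching member, or None when nothing matches."""
    raw = value.value if isinstance(value, Enum) else value
    return _LOOKUP.get(_normalize(str(raw)))
```

Returning None on a miss mirrors the `MetronAgeRatingEnum | None` return type of the public function.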

* add claude.md

* bump news and version

* when extracting pages make path absolute

* use python convenience method

* rename variable

* Fix path traversal vulnerability in archive extraction (#109)

Validate that resolved output paths stay within the destination
directory before writing, preventing zip-slip attacks from crafted
archive member names.
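
A minimal sketch of the containment check described above, assuming a helper of roughly this shape; `safe_output_path` is a hypothetical name and the actual fix may be structured differently.

```python
from pathlib import Path


def safe_output_path(dest_dir: Path, member_name: str) -> Path:
    """Resolve an archive member name and refuse paths outside dest_dir."""
    dest_root = dest_dir.resolve()
    # resolve() collapses ".." segments; joining an absolute member
    # name discards dest_root entirely, which the check below catches.
    candidate = (dest_root / member_name).resolve()
    # Compare path components rather than string prefixes so that a
    # sibling such as "out_evil" is not mistaken for "out".
    if dest_root not in candidate.parents:
        reason = f"Unsafe archive member path: {member_name!r}"
        raise ValueError(reason)
    return candidate
```

The check runs on the resolved path before any file is opened for writing.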

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* bump news

* Optimize for large-scale workloads (600K+ files) (#110)

Reduce per-file overhead for bulk metadata reading:
- Extension-hint archive detection: check file extension first to avoid
  unnecessary magic-byte disk reads (saves ~1.2M file opens for CBZ collections)
- Cache marshmallow schema instances by (class, exclude_keys) to eliminate
  ~4.8M schema constructions at scale
- Cache transform instances per Comicbox instance to avoid redundant creation
- Skip FrozenAttrDict re-wrapping when pre-built config is passed
- Skip redundant logger init when loglevel hasn't changed
- Remove always-on glom_debug=True from transform calls

Add parallelization API (comicbox/process.py):
- process_files() for ProcessPoolExecutor-based batch processing
- aread_metadata() async wrapper for event loop integration
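
The schema caching item above can be sketched roughly as follows. `DemoSchema` and `get_schema` are hypothetical; the only idea taken from the message is keying a cache on (class, exclude_keys) so repeated reads reuse one instance.

```python
from __future__ import annotations

from functools import lru_cache


class DemoSchema:
    """Hypothetical stand-in for a schema that is costly to construct."""

    instances_built = 0

    def __init__(self, exclude: frozenset[str] = frozenset()) -> None:
        DemoSchema.instances_built += 1
        self.exclude = exclude


@lru_cache(maxsize=None)
def get_schema(schema_class: type, exclude_keys: frozenset = frozenset()):
    # Both arguments are hashable, so lru_cache can key on them.
    # exclude_keys must be a frozenset, not a list or set.
    return schema_class(exclude=exclude_keys)
```

Sharing instances this way assumes the schema objects hold no per-file state.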

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* sort ignorefiles

* bump news

* fix datetime ordering bug

* Code quality pass: match statements, pathlib, immutable constants (#111)

* Targeted code quality pass: match statements, pathlib, immutable constants

- Convert isinstance if/elif chains to match statements in archive.py,
  archiveinfo.py, and time_fields.py
- Replace os.walk with Path.rglob in run.py, fixing a double-recursion
  bug where recurse() re-walked subdirectories already visited by os.walk
- Wrap _HANDLE_MERGE dict in MappingProxyType in mergedeep.py
- Replace accumulator loop with list comprehension in config/computed.py
- Replace loop-append with extend + generator in box/sources.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Sort ignore files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* use pdf group for tests

* update deps

* iterprocess files

* fix print test

* Righttyper Typing with corrections (#112)

* righttyper

* raw types commit

* Fix righttyper auto-generated type annotations

Correct ~535 basedpyright errors and 10 ruff errors introduced by
righttyper's runtime type capture, which used overly-literal types.

Key changes:
- Replace PosixPath annotations with Path throughout
- Simplify overly-specific dict union types to dict[str, Any]
- Remove broken self: "Module.ClassName" annotations in mixins
- Rename/remove rt_T1 TypeVars (N815/N816)
- Move Callable import to TYPE_CHECKING block (TC003)
- Make boolean params keyword-only in tests/util.py (FBT001)
- Add pyright: ignore on marshmallow method override incompatibilities
- Fix _path override annotations in archive write/dump_files
- Widen function signatures to accept Path | str | None where needed
- Fix circular import in transforms/spec.py (was referencing xml_reprints.MetaSpec)
- Guard None.items() calls in metroninfo identifiers with early returns
- Clean up various unused imports left by annotation removals

Result: 0 errors, 259 tests passing, make fix/lint/typecheck all clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* sort ignore files

* massively typed

* remove righttyper. back to python 3.10 req

* update devenv. switch to bun

* remove quoted self typing

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove self types

* remove self typing

* typing attempt

* fix typing errors

* fix circular import

* reorg news

* update devenv

* add bun to dockerfile

* only copy bun deps first for dockerfile

* update devenv & deps

* switch back to main marshmallow-jsonschema now that it's back from the dead

* update devenv

* fix process pool runs to deliver exceptions back and not break on passing in the logger

* test the process module

* comments

* enhance news for iterfiles

* decomplexify box init

* decomplexify process iterfiles

* allow callers to configure subprocess loguru via picklable dict (#113)

Loguru's logger object isn't picklable into ProcessPoolExecutor
workers, so callers like codex couldn't get worker log output to
match their parent-process format. Adds a worker_log_config dict
({level, format, sink}) that runs through the executor initializer
and reconfigures loguru in each worker via init_logging. Also adds
enqueue=True to the default sink for thread-safe logging.
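
A sketch of the initializer approach described above, with the standard library `logging` module swapped in for loguru so it runs without extra dependencies. The function names and the dict keys used here are illustrative assumptions, not the comicbox API.

```python
import logging
from concurrent.futures import ProcessPoolExecutor


def init_worker_logging(log_config: dict) -> None:
    # Runs once in each worker process. A plain dict pickles cleanly,
    # unlike a configured logger object.
    logging.basicConfig(
        level=log_config["level"],
        format=log_config["format"],
        force=True,  # replace any handlers inherited from the parent
    )


def work(n: int) -> int:
    logging.getLogger("worker").info("processing %s", n)
    return n * n


def run_batch(items: list, log_config: dict) -> list:
    with ProcessPoolExecutor(
        max_workers=2,
        initializer=init_worker_logging,
        initargs=(log_config,),
    ) as pool:
        return list(pool.map(work, items))


if __name__ == "__main__":
    config = {"level": "INFO", "format": "%(processName)s %(message)s"}
    print(run_batch([1, 2, 3], config))
```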

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update devenv

* Upgrade confuse to 2.2.0; replace AttrDict with typed Settings (#114)

* upgrade confuse to 2.2.0; replace AttrDict with typed Settings dataclass

confuse 2.2.0 makes AttrDict properly generic, so per-key types resolve
to `object` and consumers across the box mixins fail typecheck. Convert
the validated AttrDict into a frozen `Settings` dataclass once in
get_config() and propagate that typed object everywhere; confuse stays
confined to comicbox/config.

- New comicbox/config/settings.py defines `Settings` and
  `ComputedSettings` (frozen, slots).
- get_config() returns Settings; new _build_settings() does the
  conversion. post_process_set_for_path() rebuilt around
  dataclasses.replace.
- FrozenAttrDict deleted — frozen dataclass enforces immutability.
- process.py passes Settings through pickle directly so workers skip
  re-running confuse.
- Drops dead `dest_path is None` checks now that the field is required.
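
A minimal sketch of the frozen-dataclass approach in the list above. `DemoSettings` and its three fields are hypothetical stand-ins for the real dataclass; `slots=True` is left out so the sketch also runs on Python older than 3.10.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass, replace
from pathlib import Path


@dataclass(frozen=True)
class DemoSettings:
    # Hypothetical subset of fields, for illustration only.
    dest_path: Path
    read: frozenset[str] = frozenset()
    compute_pages: bool = False


def settings_for_path(base: DemoSettings, dest_path: Path) -> DemoSettings:
    # replace() builds a new instance; the frozen original is untouched.
    return replace(base, dest_path=dest_path)


if __name__ == "__main__":
    base = DemoSettings(dest_path=Path("."))
    try:
        base.compute_pages = True  # type: ignore[misc]
    except FrozenInstanceError as exc:
        print("immutable:", exc)
    print(settings_for_path(base, Path("out")))
```

Because the instance is a plain module-level dataclass, it can be pickled and handed to worker processes as-is.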

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* rename Settings to ComicboxSettings

So that client programs that already define their own `Settings` type
don't collide on import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* flatten ComputedSettings into ComicboxSettings

The hierarchical split was a confuse-template setup convenience, not a
logical grouping — there's no API benefit to keeping client code
chained through `cfg.computed.X`. Promote the six computed fields onto
ComicboxSettings under a clearly labeled comment block. The confuse
template's nested `computed` MappingTemplate is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix metadata_format hint silently dropping all api metadata (#115)

`_get_source_config_metadata` early-returned an empty list whenever the
caller set `metadata_format`, because `fmt not in self._config.read`
compared a string against a frozenset of `MetadataFormats` enums —
always True. The conversion + correct membership check happens in the
try block on the next lines, so the early return was both wrong and
redundant.

Adds tests/unit/test_sources.py covering the four behavioral cases:
fmt-in-read, no-fmt, fmt-not-in-read, invalid-fmt.
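
The membership bug described above can be reproduced with a small sketch. `DemoFormat` is a hypothetical plain Enum standing in for `MetadataFormats`, and both functions are illustrative rather than the library's code.

```python
from enum import Enum


class DemoFormat(Enum):
    # Hypothetical members and values.
    COMIC_INFO = "cix"
    COMET = "comet"


READ_FORMATS = frozenset({DemoFormat.COMIC_INFO})


def is_enabled_buggy(fmt: str) -> bool:
    # A str never equals a plain Enum member, so this is always False
    # and the negated form (`not in`) is always True.
    return fmt in READ_FORMATS


def is_enabled_fixed(fmt: str) -> bool:
    try:
        member = DemoFormat(fmt)  # convert first
    except ValueError:
        return False
    return member in READ_FORMATS
```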

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update deps

* bump news and version to 3.0.0

* fix Mapping config args silently dropped under config_default.yaml (#116)

read_config_sources used config.add() for the Mapping branch, which
appends to the BOTTOM of confuse's source priority stack — below the
config_default.yaml loaded by config.read() at the top of the
function. So any caller passing a dict / Mapping override (e.g.
`get_config({"comicbox": {"compute_pages": True}})`) silently got the
default instead. Switch to config.set() so Mapping args land on top,
matching set_args() for the Namespace branch.

Surfaced by a downstream Codex migration that hit dead Mapping
overrides; covered now by tests/unit/test_config_layering.py.
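
To illustrate the priority-stack behaviour described above without depending on confuse, here is a sketch that models the source stack with `collections.ChainMap`. This is an analogy only: `add_lowest` and `set_highest` are hypothetical helpers that mimic where the message says `config.add()` and `config.set()` place a source.

```python
from collections import ChainMap


def add_lowest(stack: ChainMap, source: dict) -> None:
    # Mimics the described add(): the source goes to the bottom,
    # so anything already present shadows it.
    stack.maps.append(source)


def set_highest(stack: ChainMap, source: dict) -> None:
    # Mimics the described set(): the source goes on top and wins.
    stack.maps.insert(0, source)


if __name__ == "__main__":
    stack = ChainMap({"compute_pages": False})  # defaults loaded first
    add_lowest(stack, {"compute_pages": True})
    print(stack["compute_pages"])  # default still wins
    set_highest(stack, {"compute_pages": True})
    print(stack["compute_pages"])  # override wins
```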

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update deps

* widen set-like config fields to accept any non-mapping container (#117)

The template arms for `read`, `write`, `export`, `delete_keys`,
`read_ignore`, and `print` previously combined `frozenset` (a
pass-through marker) and `Sequence(str)` (list-of-strings coercion).
That works for the common YAML/CLI list path but rejects callers
passing a `set` / `tuple` / `frozenset` literal — which is logically
fine for fields whose post-compute value is always a frozenset.

Replaces the per-field unions with `OneOf((set, frozenset, tuple, list))`
(`print` also accepts `str` for the historical phase-char form). The
`_build_settings` boundary already calls `frozenset(...)` on these
values, so any of the four containers normalize correctly.

Also adapts `compute_config`'s helpers — Subview iteration only
supports dict/list source values, so user-supplied set/frozenset/tuple
inputs would error before reaching the template. New `_raw_or_empty`
pulls the Python value via `.get()` and explicitly rejects mappings
with a clear error (dict iteration would silently accept dict input
otherwise). `_parse_print` now accepts a phase-char string OR any
iterable of phase chars.

Path-list fields (`paths`, `import_paths`, `metadata_cli`) keep their
existing `Sequence(...)` form with element-type validation — that
trade-off felt worth keeping.

14 new tests in tests/unit/test_config_container_inputs.py cover the
four container types per field and assert mapping rejection.
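
A sketch of the normalization boundary described above, assuming a helper of roughly this shape; `to_frozenset` is a hypothetical name and the real `_raw_or_empty` / `_build_settings` code may differ.

```python
from collections.abc import Mapping

_CONTAINER_TYPES = (set, frozenset, tuple, list)


def to_frozenset(value: object, field: str) -> frozenset:
    """Normalize any accepted container to a frozenset."""
    if value is None:
        return frozenset()
    if isinstance(value, Mapping):
        # Iterating a dict yields its keys, which would silently
        # "work"; reject it with a clear message instead.
        reason = f"{field}: mappings are not accepted, got {type(value).__name__}"
        raise TypeError(reason)
    if isinstance(value, _CONTAINER_TYPES):
        return frozenset(value)
    reason = f"{field}: expected set, frozenset, tuple or list"
    raise TypeError(reason)
```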

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* reuse types tuple

* update deps

* 3.0.0 alpha version 0

* update compose for generic gha build

* ReadResults data structure for process functions

* compact news (#119)

* Add skip_metadata flag to get_cover_page (#120)

Callers that only want a thumbnail (e.g. codex's CoverThread) don't
need the full ComicInfo/CoverImage hint resolution. Parsing the
metadata for every cover dominates the cost of cover extraction
and emits a flood of debug-bucket Union ValidationErrors that look
like real failures in DEBUG logs.

When skip_metadata=True, bypass generate_cover_paths entirely and
read archive index 0 directly. This drops per-call schema
instantiation, Union resolution, and path normalization.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v3.0.0a2 (#121)

* compact news

* update deps

* Drop DEBUG-bucket logging of intentionally-ignored validation errors (#122)

ClearingErrorStoreSchema previously split each schema's errors into
two buckets: ignored ones logged at DEBUG, real ones at WARNING.
The DEBUG bucket only ever held errors from ``_ignore_errors`` —
``Field may not be null.`` (sparse-field tolerance) and
``Invalid input type.`` (Union variant misses) — both of which are
internal mechanics, not operator-actionable signal. Each Union miss
emitted one ``ValidationError - {'_schema': ['Invalid input type.']}``
line per field per archive, drowning the genuinely useful per-source
DEBUG messages emitted by ``_except_on_load``.

Filter ignored errors at split time, log only WARNINGs. Real schema
failures still surface with full context (path, schema class,
normalized message). Collapses the dual-bucket _split_*_errors
methods into _filter_* + _log_warnings.
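
A simplified sketch of the filter-then-warn flow described above. It assumes a flat `{field: [messages]}` error dict, whereas real Marshmallow errors can be nested; `filter_errors` and `log_warnings` are hypothetical names.

```python
from __future__ import annotations

import logging

LOG = logging.getLogger(__name__)

# The two messages the commit describes as intentionally ignored.
IGNORED_MESSAGES = frozenset({"Field may not be null.", "Invalid input type."})


def filter_errors(errors: dict[str, list[str]]) -> dict[str, list[str]]:
    """Drop ignored messages; keep only fields with real errors left."""
    kept: dict[str, list[str]] = {}
    for field, messages in errors.items():
        real = [msg for msg in messages if msg not in IGNORED_MESSAGES]
        if real:
            kept[field] = real
    return kept


def log_warnings(path: str, errors: dict[str, list[str]]) -> None:
    # Nothing is emitted at DEBUG for the ignored messages.
    for field, messages in filter_errors(errors).items():
        LOG.warning("%s: %s: %s", path, field, "; ".join(messages))
```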

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update devenv

* metron: drop URL slugs for types with no public web page (#124)

* compact news

* update deps

* metron: drop broken URL slugs for genre, location, reprint, role, story, tag

Metron has no public web pages for these types — only API endpoints — so
URLs like https://metron.cloud/genre/3 always 404. Stop emitting them.
The numeric Metron ID is still preserved on the identifier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Re-export to_metron_age_rating from comicbox.enums.maps (#125)

Shortens the import path for the helper from
comicbox.enums.maps.age_rating to comicbox.enums.maps so downstream
callers can reach it without drilling into the submodule.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update devenv

* Drop dead code surfaced by skylos scan (#126)

- Remove unused module/class constants: _COMMENT_ARCHIVE_TYPES, SUFFIXES,
  _LOG_FORMAT, comet.py IDENTIFIER_TAG/IS_VERSION_OF_TAG, comictagger.py
  IDENTIFIER_TAG/PAGES_TAG, XmlCountryField (and now-orphaned imports
  RarFile, ZipFile, CountryField).
- Fix latent bug in TrapExceptionsMeta: `attr_name in "deserialize"` was a
  substring check that wrapped any callable whose name was a substring of
  "deserialize" (e.g. "er", "ize", "ali"). Use the existing _WRAP_METHODS
  tuple instead so only the exact `deserialize` method is wrapped.
- Simplify _get_pdf_enabled() to a plain `import pdffile` probe; the
  except-arm stub import had no effect.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Replace pdffile_stub with single-shim optional integration (#127)

Consolidate the optional comicbox-pdffile integration into one module
(comicbox/_pdf.py) and delete the hand-maintained pdffile_stub.py.

Previously six call sites each duplicated a `try: from pdffile import X /
except: from pdffile_stub import X` block, and the stub class mirrored
the real PDFFile API method-for-method — silent drift risk every time
upstream pdffile shipped.

Now:
- comicbox/_pdf.py is the single source of truth for PDF_ENABLED,
  PDFFile, and PAGE_FORMAT_VALUES. When pdffile is absent, PDFFile is
  None at runtime; type checkers see the real class via TYPE_CHECKING.
- Every call site that touches PDFFile is gated by `if PDF_ENABLED`.
- The `case PDFFile():` arm in box/archive/archive.py is lifted to an
  `if PDF_ENABLED and isinstance(archive, PDFFile):` guard above the
  match (the match form would fail when PDFFile is None).
- config/__init__.py reads PAGE_FORMAT_VALUES instead of iterating an
  empty stub Enum.

Verified with `pdffile` installed (307/307 tests pass) and in a fresh
venv without it (PDF_ENABLED=False, CBZ archives still work, PDF files
raise UnsupportedArchiveTypeError, CLI shows the "not installed" hint).

Net: -70 lines across 9 files.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* remove unused ty ignore

* comicbox 3 alpha 5 (#123)

* compact news

* update deps

* update news and version to alpha 4

* update deps

* rename function path in NEWS

* bump alpha version to 3.0.0a5

* version 3.0.0

* massage news

* bump version and news and update deps

* require comicbox-pdffile 0.6.x for image-dominant page detection (#131)

* require comicbox-pdffile 0.6.x for image-dominant page detection

Widens the optional ``[pdf]`` extra to require comicbox-pdffile 0.6.x.
The new minor release adds image-dominant page detection (
``PDFFile.classify_page``, ``PDFFile.read_image_if_dominant``,
``PDFFile.read_full_pixmap_jpeg``) used by browser readers to serve
scanned-comic PDF pages as plain ``<img>`` instead of routing through
pdf.js on the client.

comicbox itself doesn't use the new API — the bump is purely a pin
update so downstream callers (Codex, OPDS readers) can adopt it.

The ``[tool.uv.sources]`` block is transient: it points at the
pdffile PR branch so this CI can resolve dependencies before
0.6.x lands on PyPI. Drop it once 0.6.x publishes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* just use the released pdffile

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update deps

* regenerate pdf page fixtures for pdffile 0.6.x (#132)

Add bin/regenerate-pdf-test-pages.py — drives Comicbox.get_page_by_index
against tests/files/test_pdf.pdf to refresh tests/files/pdf/{N}.pdf when
pymupdf or pdffile change page-extraction output. Run on the next drift.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* bump pdffile to 0.6.1

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
ajslater added a commit that referenced this pull request May 10, 2026
* update deps and type hint

* expand linear yaml help to be more useful

* update deps

* update eslint

* update deps

* bump news and version

* update pdf pages. binary difference with new mupdf

* update docker images

* fix make install dependencies

* add jxl to image extensions

* fix ignoring macos resource forks

* resource fork test file

* update deps

* adjust news

* Squashed commit of the following:

    type annotate magic metron field functions and make all params kwargs
    use eslint outside of editor
    update deps, new ruff rules. lint & format

* add venv upgrade script

* ignore PERF203

* update deps and install pdffile

* update deps. appease typechecker. new eslint.config

* Squashed commit of the following:

commit e27050fbd42f0cf8e549871cc06c70f041672306
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 7 21:36:49 2024 -0800

    rename deserializeMeta class to TrapExceptionsMeta
    fix type issues with field metaclass wrapper

* add eslint-plugin-json-schema-validator

* update deps and lint

* use mdx instead of markdown

* remove unused import

* remove superfluous plugins. remove first level globs

* update deps

* Squashed commit of the following:

    fix notes parsing for metron and many variations
    move notes parsing into another file.
    add comicinfo metron origin test
    rename modules to not shadow python builtins
    fix binary pdf files for new mupdf

* bump version and news

* fix type errors

* format

* refactor dynamic class creation to appease typechecker

* add libmupdf docs

* Simplify Identifier URL construction for Metron pk ids.

* update deps

* fix story arc parsing. bump version

* update dockerfile with modern node

* Squashed commit of the following:

Comicbox 2.0

* Resolve circular import if not installed with \[pdf\] option.

* Make archive comments that aren't ComicBookInfo JSON log as debug comments
  more often.

* update package links

* add more aliases for comicvine sources

* ensure datetimes from archives are timezone aware

* update deps and bump version

* bump news

* drop version back appropriately

* fix alias tree builder

* update deps, typecheck with ty

* alphabetize comicbox fields

* uv_build

* update pyproject, eslint config, deps

* update deps

* update deps

* normalize Trade Paper Back into Trade Paperback

* update deps

* update deps

* Squashed commit of the following:

update to xmltodict 1.0. remove special code for xmltodict #text type conversion bugs
compact code for xml_fields that get cdata
remove cdata mixin from xml lists

* update deps

* pyright ignore

* fix age rating coercion for CIX

* add github issue code example.

* update deps

* update deps

* replace poetry with uv for run script

* update deps

* no support for python 3.14

* explicitly build with 3.13 trixie

* remove ruamel.yaml.clib from test docker

* update deps

* update deps

* new version. fix comicbox.json dump crash

* remove unused typing exceptions. add typing exceptions for ty foolishness

* update deps add ty to makefile

* python 3.14 support

* bump version and news

* update deps

* ignore ty type ignores

* update deps

* update deps

* Squashed commit of the following:

commit 259e561
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 19:51:31 2025 -0800

    use released pdffile

commit 4136a3b
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 19:41:28 2025 -0800

    use a proper base RenderModule and clean loads for tabs because it breaks yaml

commit 3426cf0
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 17:20:05 2025 -0800

    bump deps

commit 9fcaded
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 17:19:49 2025 -0800

    reduce complexity of dump

commit f96d27a
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 19:12:05 2025 -0800

    gate writing pdf metadata on delete all or data exists

commit 7415b82
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 19:08:26 2025 -0800

    optimize pdf writing by writing pdf data in the same context and only saving once

commit 2bd0f2c
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:57:26 2025 -0800

    rename legacy embedded variables to LEGACY_NESTED equivalents

commit 5222159
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:45:06 2025 -0800

    lint

commit 5d38acb
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:40:34 2025 -0800

    fix print test

commit 65410c7
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:37:18 2025 -0800

    fix most tests

commit 19d2dfe
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 17:40:51 2025 -0800

    fix pdf xml tests

commit f6bf854
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 16:50:08 2025 -0800

    fix tests for pdf_json

commit 590ffb8
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 14:44:51 2025 -0800

    fix accepting flexible datetimes from pdfs

commit e18925f
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 15:33:37 2025 -0800

    fix pdf tests using removed params

commit 55725b5
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 15:33:19 2025 -0800

    fix set subtraction

commit 2673e3a
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:57:35 2025 -0800

    add bpepple to news

commit 3de741d
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:56:30 2025 -0800

    update schemas doc for pdf embeds

commit 484737d
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:52:54 2025 -0800

    add bpepple to news

commit 0b6cdaf
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:47:31 2025 -0800

    bump version and news

commit bda414c
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:38:42 2025 -0800

    pdf write to embed files. pdf metadata keywords write tags.

commit 29fd04b
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:38:12 2025 -0800

    ty ignore

commit b795c49
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:34:20 2025 -0800

    add ty ignores

commit ce3ef91
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:33:43 2025 -0800

    update pdffile stub

commit fd2f4a0
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:31:38 2025 -0800

    update deps

commit 267d9d0
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:31:30 2025 -0800

    add alpha pdffile to sources

commit 041ce67
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:30:55 2025 -0800

    add pythondevmode to test script

* fix typing

* update deps

* Squashed commit of the following:

commit b31f22e6d178fcc1a5896c0dd7f680c26bc91657
Author: AJ Slater <aj@slater.net>
Date:   Mon Dec 1 20:03:13 2025 -0800

    typecheck with ty

* update deps

* complexipy & group deps

* reduce complexity

* update py7z library

* remove unused ty ignores

* ty fixes and ignores

* update deps

* update deps

* remove unused ty ignore

* update deps

* remove unused ty ignores

* use OneOf instead of list syntax sugar for confuse

* update deps

* Raw yaml datetimes (#102)

* use OneOf instead of list syntax sugar for confuse

* update deps

* let yaml have raw yaml datetimes instead of strings

* use simplejson decode errors

* bump news and version

* fix test script

* fix lint backend groups

* remove unused groups

* fix test script

* really fix test script

* use group lint in tests for jsonschema

* tweak dep version ranges

* update deps. use dockerfmt. ruff changes inline ifs to ors

* update dockerfile base

* update deps, remove unused ty warning ignore

* update deps add eslint plugins

* add mbake

* update deps

* fix tests for new pymupdf

* Squashed commit of the following:

commit 1fb394e109263188a16c4addeaab87bbdfdf882e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 17:09:25 2026 -0800

    generate-schema scripts

commit fc9b4f5c27db827ae1592010b01708865cf3733e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 17:09:08 2026 -0800

    format schemas

commit 9ccdf70d8c2318220c443714e509b6746f19a90e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 16:39:04 2026 -0800

    fix schema

commit 1a082c52887571cd258ebbc467846461c8e9686f
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 13:29:02 2026 -0800

    add marshmallow jsonschema

* bump version and news

* add script comment

* update deps

* ty ignores

* lots more type annotations. include py.typed sentinel

* remove unneeded ruff ignores

* prettier xml schema xsds

* convert to devenv

* update devenv and deps

* update devenv

* update devenv

* fix pytests. update pycountry

* fix cli help

* fix date serialization if already a string

* update devenv & deps

* import accepts quoted globs. bump version and news

* VALIDATE FEATURE

Squashed commit of the following:

commit 4f712ddc46859bb82eb6383d41a72502bf49f7be
Merge: 2b0b5db 06af8e3
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 14:01:25 2026 -0800

    Merge branch 'develop' into validate

commit 2b0b5db77d073da699cdf26e9481e5efd69ad424
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:32:39 2026 -0800

    better validate cli help

commit f78dd859c3c8c8adf44399f723de171da9d5467a
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:25:48 2026 -0800

    xsd printWidth to 120. fixes CoMet xsd.

commit d1563e96bbc944dc0669e4df0d647c44cce8c7dd
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:03:47 2026 -0800

    format test files with validator

commit 59350c9e3c13e9248368146e403a1cc05c755523
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:01:09 2026 -0800

    no available validator is a warning

commit f80fc325bc1cfecc9a9286f7538ac02eb6391ad6
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:00:40 2026 -0800

    use original schema definitions unreformatted

commit 8eb5d884136e215a19754f1d6ae2fdc9c0cd2cd3
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:51:35 2026 -0800

    fix symlink

commit bffef02777ba01b6c4f54ba36df7f433c45841da
Merge: 3547d24 6478b78
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:26:18 2026 -0800

    Merge branch 'develop' into validate

commit 3547d24639eed74841fb76b49aa49ab238b820a4
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:25:50 2026 -0800

    update deps

commit 29dba04deaf029466ca6794060c55b81d5c0a054
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:44:30 2026 -0800

    update deps

commit 273da7ab3e87d60eea56167199e466c61867c57c
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:43:55 2026 -0800

    only catch and warn on validation errors

commit 5ec0ad1928c709388facb054b3f6915285a4e4a8
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:35:41 2026 -0800

    move xmlschema and jsonschema into regular deps

commit 0887cf1e07daec89b59972e9cf8ffc59c143dba2
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:33:33 2026 -0800

    fix getting format from input files. change validation exception to warning

commit 4e7be5f44225522398a407c94d76c26fbd22a925
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:32:54 2026 -0800

    fix guess_format

commit 2342605b4b08ce641f056a38b5b634bae75bcfec
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:13:41 2026 -0800

    fix script for new location of validate_cli

commit b2ab1995e543204d15e83c00e8596681e52b70f7
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:12:08 2026 -0800

    move schema to schema_definitions

commit deacf119c2d0823af5c6405162d23b7e32f8fb37
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:23:05 2026 -0800

    better validation logging

commit c9a615f5885b53dd8e8b81c9d735808f4eaa7736
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:15:10 2026 -0800

    fix validation format assignment. validation info logging

commit 914d35d15f536a6a042cbe66346b2cb4a38d636a
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:05:04 2026 -0800

    basically working validation with definitions dir

commit 7e860e8f6110dd868dbec2f724a8bff1bd0a980d
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 21:22:04 2026 -0800

    ignore bad typecheck warnings

commit 3d5ae84354b772ce8fc08793a2c7db64e95c46ac
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 21:13:02 2026 -0800

    fix validate tests

commit 324a0c6fd9935c153fc5a172020a8c02b6f901d0
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 18:26:12 2026 -0800

    most tests pass. validate test fails. typecheck fails. schemas need moving into the package

commit 5c3d4cd77020b5318a0f45c8d72d432d50ad158e
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 17:16:27 2026 -0800

    update deps

commit 112a71aece1adba12e4d380359da3a167456af8c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 17:16:19 2026 -0800

    pin comicbox-pdffile

* bump NEWS

* PDF2CBZ extract images
Squashed commit of the following:

commit b6296ee49b49556b04adaefb12bed332f4fee857
Merge: 5bf0007 bdd3879
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 14:07:16 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit 5bf0007
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 14:44:53 2026 -0800

    bump news and version

commit 362123c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 14:39:19 2026 -0800

    update pdffile to released version

commit f09571c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 13:36:29 2026 -0800

    switch image_pdf to more powerful pdf_page_format

commit b1d2d1b
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 12:37:12 2026 -0800

    fix pdf cover compare test

commit 5aaeae0
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 12:36:39 2026 -0800

    move pdf format decision to _archive_readfile()

commit 2107241
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 10:32:34 2026 -0800

    update deps

commit 566e426
Merge: cdc2250 38bcfe2
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:58:28 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit cdc2250
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:57:57 2026 -0800

    fix cli help

commit 1190fe4
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:52:12 2026 -0800

    fix cli option collision

commit 63bf418
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:49:48 2026 -0800

    cli option for image_pdf

commit 1d7d852
Merge: db7061c f5f03b5
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:39:49 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit db7061c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:36:11 2026 -0800

    basic support for extract image from pdf

* move docker-compose.yaml to compose.yaml

* fix dockerfile for new devenv

* fix dockerfile for new devenv the kludgey way

* fix news

* update deps

* format dockerfile

* color and clarification for help

* fix colors for help

* update devenv

* fix prettierignore

* update devenv

* fix makefile

* v2.2.1 fix pdf datetimes

* update devenv

* update deps

* update devenv

* add ty ignores to match pyright ignores

* update devenv

* add node_root feature

* update devenv

* update devenv

* update devenv, deps and fix some ty typing

* update deepdiff and bump version

* fix news typo

* update pdfs for new pymupdf

* update deps & devenv

* update deps v2.2.3

* use usr/env for scripts

* gha workflow

* switch to github actions

* Add to_metron_age_rating() public conversion function (#108)

Provide a standalone function to convert any age rating enum or string
to a MetronAgeRatingEnum. Supports Marvel, DC, Generic, ComicInfo, and
Metron enums with fuzzy string matching (case/space-insensitive).
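A minimal sketch of the case/space-insensitive matching described above. `DemoMetronRating`, `to_demo_rating` and the member values are hypothetical stand-ins; the real `MetronAgeRatingEnum` members and the cross-publisher mapping tables are not reproduced here.

```python
from __future__ import annotations

from enum import Enum


class DemoMetronRating(Enum):
    """Hypothetical stand-in for MetronAgeRatingEnum; values are illustrative."""

    EVERYONE = "Everyone"
    TEEN = "Teen"
    MATURE = "Mature"


def _normalize(value: str) -> str:
    # Drop all whitespace and fold case so "  t e e n " matches "Teen".
    return "".join(value.split()).casefold()


_LOOKUP = {_normalize(member.value): member for member in DemoMetronRating}


def to_demo_rating(value: str | Enum) -> DemoMetronRating | None:
    """Return the matching rating, or None when nothing matches."""
    raw = value.value if isinstance(value, Enum) else value
    if not isinstance(raw, str):
        return None
    return _LOOKUP.get(_normalize(raw))
```

Returning `None` instead of raising keeps the helper usable in bulk conversions where unknown ratings are expected.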

* add claude.md

* bump news and version

* when extracting pages make path absolute

* use python convenience method

* rename variable

* Fix path traversal vulnerability in archive extraction (#109)

Validate that resolved output paths stay within the destination
directory before writing, preventing zip-slip attacks from crafted
archive member names.
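For illustration, one common way to express this containment check with `pathlib`. This is a sketch under assumptions: `safe_dest_path` is a hypothetical name, not the function in this change, and the actual fix may differ in detail.

```python
from pathlib import Path


def safe_dest_path(dest_dir: Path, member_name: str) -> Path:
    """Resolve member_name under dest_dir, refusing paths that escape it."""
    root = dest_dir.resolve()
    candidate = (root / member_name).resolve()
    # is_relative_to (Python 3.9+) is True only when candidate is root
    # itself or somewhere below it.
    if not candidate.is_relative_to(root):
        raise ValueError(f"suspicious archive path: {member_name!r}")
    return candidate
```

Resolving both sides before comparing matters: `..` segments and absolute member names are collapsed first, so the check sees the location that would actually be written.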

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* bump news

* Optimize for large-scale workloads (600K+ files) (#110)

Reduce per-file overhead for bulk metadata reading:
- Extension-hint archive detection: check file extension first to avoid
  unnecessary magic-byte disk reads (saves ~1.2M file opens for CBZ collections)
- Cache marshmallow schema instances by (class, exclude_keys) to eliminate
  ~4.8M schema constructions at scale
- Cache transform instances per Comicbox instance to avoid redundant creation
- Skip FrozenAttrDict re-wrapping when pre-built config is passed
- Skip redundant logger init when loglevel hasn't changed
- Remove always-on glom_debug=True from transform calls

Add parallelization API (comicbox/process.py):
- process_files() for ProcessPoolExecutor-based batch processing
- aread_metadata() async wrapper for event loop integration
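The schema-caching item above can be sketched roughly as follows. `DemoSchema` and `get_schema` are hypothetical stand-ins so the example runs without marshmallow; the point is only the `(class, exclude_keys)` cache key.

```python
from __future__ import annotations

from collections.abc import Iterable


class DemoSchema:
    """Hypothetical stand-in for a schema class that is costly to construct."""

    def __init__(self, exclude: frozenset[str] = frozenset()) -> None:
        self.exclude = exclude


_SCHEMA_CACHE: dict[tuple[type, frozenset[str]], object] = {}


def get_schema(schema_class: type, exclude_keys: Iterable[str] = ()) -> object:
    """Return one shared instance per (class, exclude_keys) combination."""
    # A frozenset makes the key hashable and independent of argument order.
    key = (schema_class, frozenset(exclude_keys))
    if key not in _SCHEMA_CACHE:
        _SCHEMA_CACHE[key] = schema_class(exclude=key[1])
    return _SCHEMA_CACHE[key]
```

Sharing instances like this assumes the schema objects are not mutated after construction.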

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* sort ignorefiles

* bump news

* fix datetime ordering bug

* Code quality pass: match statements, pathlib, immutable constants (#111)

* Targeted code quality pass: match statements, pathlib, immutable constants

- Convert isinstance if/elif chains to match statements in archive.py,
  archiveinfo.py, and time_fields.py
- Replace os.walk with Path.rglob in run.py, fixing a double-recursion
  bug where recurse() re-walked subdirectories already visited by os.walk
- Wrap _HANDLE_MERGE dict in MappingProxyType in mergedeep.py
- Replace accumulator loop with list comprehension in config/computed.py
- Replace loop-append with extend + generator in box/sources.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Sort ignore files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* use pdf group for tests

* update deps

* iterprocess files

* fix print test

* Righttyper Typing with corrections (#112)

* righttyper

* raw types commit

* Fix righttyper auto-generated type annotations

Correct ~535 basedpyright errors and 10 ruff errors introduced by
righttyper's runtime type capture, which used overly-literal types.

Key changes:
- Replace PosixPath annotations with Path throughout
- Simplify overly-specific dict union types to dict[str, Any]
- Remove broken self: "Module.ClassName" annotations in mixins
- Rename/remove rt_T1 TypeVars (N815/N816)
- Move Callable import to TYPE_CHECKING block (TC003)
- Make boolean params keyword-only in tests/util.py (FBT001)
- Add pyright: ignore on marshmallow method override incompatibilities
- Fix _path override annotations in archive write/dump_files
- Widen function signatures to accept Path | str | None where needed
- Fix circular import in transforms/spec.py (was referencing xml_reprints.MetaSpec)
- Guard None.items() calls in metroninfo identifiers with early returns
- Clean up various unused imports left by annotation removals

Result: 0 errors, 259 tests passing, make fix/lint/typecheck all clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* sort ignore files

* massively typed

* remove righttyper. back to python 3.10 req

* update devenv. switch to bun

* remove quoted self typing

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove self types

* remove self typing

* typing attempt

* fix typing errors

* fix circular import

* reorg news

* update devenv

* add bun to dockerfile

* only copy bun deps first for dockerfile

* update devenv & deps

* switch back to main marshmallow-jsonschema now that it's back from the dead

* update devenv

* fix process pool runs to deliver exceptions back and not break on passing in the logger

* test the process module

* comments

* enhance news for iterfiles

* decomplexify box init

* decomplexify process iterfiles

* allow callers to configure subprocess loguru via picklable dict (#113)

Loguru's logger object isn't picklable into ProcessPoolExecutor
workers, so callers like codex couldn't get worker log output to
match their parent-process format. Adds a worker_log_config dict
({level, format, sink}) that runs through the executor initializer
and reconfigures loguru in each worker via init_logging. Also adds
enqueue=True to the default sink for thread-safe logging.
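A sketch of the initializer pattern, using the standard-library `logging` module as a stand-in for loguru so it runs without extra dependencies. `init_worker_logging`, `read_one` and `read_all` are hypothetical names, not comicbox API.

```python
import logging
from concurrent.futures import ProcessPoolExecutor


def init_worker_logging(log_config: dict) -> None:
    """Runs once in each worker process; rebuilds logging from plain data."""
    logging.basicConfig(
        level=log_config.get("level", "INFO"),
        format=log_config.get("format", "%(levelname)s %(message)s"),
        force=True,  # replace any handlers configured earlier
    )


def read_one(path: str) -> str:
    logging.getLogger(__name__).info("reading %s", path)
    return path.upper()


def read_all(paths: list[str], log_config: dict) -> list[str]:
    # The dict is pickled and handed to every worker through initargs,
    # which sidesteps the unpicklable logger object entirely.
    with ProcessPoolExecutor(
        max_workers=2, initializer=init_worker_logging, initargs=(log_config,)
    ) as pool:
        return list(pool.map(read_one, paths))
```

A caller would invoke something like `read_all(["a.cbz", "b.cbz"], {"level": "DEBUG"})` from under an `if __name__ == "__main__":` guard, which the spawn start method requires.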

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update devenv

* Upgrade confuse to 2.2.0; replace AttrDict with typed Settings (#114)

* upgrade confuse to 2.2.0; replace AttrDict with typed Settings dataclass

confuse 2.2.0 makes AttrDict properly generic, so per-key types resolve
to `object` and consumers across the box mixins fail typecheck. Convert
the validated AttrDict into a frozen `Settings` dataclass once in
get_config() and propagate that typed object everywhere; confuse stays
confined to comicbox/config.

- New comicbox/config/settings.py defines `Settings` and
  `ComputedSettings` (frozen, slots).
- get_config() returns Settings; new _build_settings() does the
  conversion. post_process_set_for_path() rebuilt around
  dataclasses.replace.
- FrozenAttrDict deleted — frozen dataclass enforces immutability.
- process.py passes Settings through pickle directly so workers skip
  re-running confuse.
- Drops dead `dest_path is None` checks now that the field is required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* rename Settings to ComicboxSettings

So that client programs that already define their own `Settings` type
don't collide on import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* flatten ComputedSettings into ComicboxSettings

The hierarchical split was a confuse-template setup convenience, not a
logical grouping — there's no API benefit to keeping client code
chained through `cfg.computed.X`. Promote the six computed fields onto
ComicboxSettings under a clearly labeled comment block. The confuse
template's nested `computed` MappingTemplate is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix metadata_format hint silently dropping all api metadata (#115)

`_get_source_config_metadata` early-returned an empty list whenever the
caller set `metadata_format`, because `fmt not in self._config.read`
compared a string against a frozenset of `MetadataFormats` enums —
always True. The conversion + correct membership check happens in the
try block on the next lines, so the early return was both wrong and
redundant.
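The failure mode can be reproduced in a few lines. `DemoFormat` is a hypothetical stand-in for `MetadataFormats`, assumed here to be a plain `Enum` (not a `str` mixin), which matches the behavior described above.

```python
from enum import Enum


class DemoFormat(Enum):
    """Hypothetical stand-in for MetadataFormats."""

    COMIC_INFO = "comicinfo"
    METRON_INFO = "metroninfo"


read_formats = frozenset({DemoFormat.COMIC_INFO})
fmt = "comicinfo"

# A plain Enum member never compares equal to its string value,
# so this membership test is True for every string.
buggy_not_in_read = fmt not in read_formats

# Converting to the enum first gives the intended answer.
fixed_not_in_read = DemoFormat(fmt) not in read_formats
```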

Adds tests/unit/test_sources.py covering the four behavioral cases:
fmt-in-read, no-fmt, fmt-not-in-read, invalid-fmt.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update deps

* bump news and version to 3.0.0

* fix Mapping config args silently dropped under config_default.yaml (#116)

read_config_sources used config.add() for the Mapping branch, which
appends to the BOTTOM of confuse's source priority stack — below the
config_default.yaml loaded by config.read() at the top of the
function. So any caller passing a dict / Mapping override (e.g.
`get_config({"comicbox": {"compute_pages": True}})`) silently got the
default instead. Switch to config.set() so Mapping args land on top,
matching set_args() for the Namespace branch.
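A toy model of the priority stack, to make the ordering concrete. `DemoConfig` is hypothetical and far simpler than confuse; it only mirrors the add-at-bottom versus set-on-top behavior described above.

```python
class DemoConfig:
    """Toy source stack: index 0 is the highest priority."""

    def __init__(self) -> None:
        self.sources: list[dict] = []

    def add(self, source: dict) -> None:
        # Appended at the bottom: consulted only if nothing above has the key.
        self.sources.append(source)

    def set(self, source: dict) -> None:
        # Inserted at the top: wins over everything already loaded.
        self.sources.insert(0, source)

    def get(self, key: str) -> object:
        for source in self.sources:
            if key in source:
                return source[key]
        raise KeyError(key)


defaults = {"compute_pages": False}
override = {"compute_pages": True}

buggy = DemoConfig()
buggy.add(defaults)  # defaults are loaded first
buggy.add(override)  # lands below the defaults, so it is never seen

fixed = DemoConfig()
fixed.add(defaults)
fixed.set(override)  # lands on top of the defaults
```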

Surfaced by a downstream Codex migration that hit dead Mapping
overrides; covered now by tests/unit/test_config_layering.py.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update deps

* widen set-like config fields to accept any non-mapping container (#117)

The template arms for `read`, `write`, `export`, `delete_keys`,
`read_ignore`, and `print` previously combined `frozenset` (a
pass-through marker) and `Sequence(str)` (list-of-strings coercion).
That works for the common YAML/CLI list path but rejects callers
passing a `set` / `tuple` / `frozenset` literal — which is logically
fine for fields whose post-compute value is always a frozenset.

Replaces the per-field unions with `OneOf((set, frozenset, tuple, list))`
(`print` also accepts `str` for the historical phase-char form). The
`_build_settings` boundary already calls `frozenset(...)` on these
values, so any of the four containers normalize correctly.

Also adapts `compute_config`'s helpers — Subview iteration only
supports dict/list source values, so user-supplied set/frozenset/tuple
inputs would error before reaching the template. New `_raw_or_empty`
pulls the Python value via `.get()` and explicitly rejects mappings
with a clear error (dict iteration would silently accept dict input
otherwise). `_parse_print` now accepts a phase-char string OR any
iterable of phase chars.

Path-list fields (`paths`, `import_paths`, `metadata_cli`) keep their
existing `Sequence(...)` form with element-type validation — that
trade-off felt worth keeping.

14 new tests in tests/unit/test_config_container_inputs.py cover the
four container types per field and assert mapping rejection.
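A sketch of the normalization boundary, assuming a hypothetical helper named `to_frozenset`; the real `_raw_or_empty` and `_build_settings` are not reproduced here.

```python
from __future__ import annotations

from collections.abc import Mapping


def to_frozenset(value: object, field: str) -> frozenset[str]:
    """Normalize set/frozenset/tuple/list input; reject mappings explicitly."""
    if value is None:
        return frozenset()
    if isinstance(value, Mapping):
        # Iterating a dict yields its keys, so without this check a mapping
        # would be accepted silently.
        raise TypeError(f"{field}: expected a set, frozenset, tuple or list, got a mapping")
    if isinstance(value, (set, frozenset, tuple, list)):
        return frozenset(value)
    raise TypeError(f"{field}: unsupported container type {type(value).__name__}")
```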

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* reuse types tuple

* update deps

* 3.0.0 alpha version 0

* update compose for generic gha build

* ReadResults data structure for process functions

* compact news (#119)

* Add skip_metadata flag to get_cover_page (#120)

Callers that only want a thumbnail (e.g. codex's CoverThread) don't
need the full ComicInfo/CoverImage hint resolution. Parsing the
metadata for every cover dominates the cost of cover extraction
and emits a flood of debug-bucket Union ValidationErrors that look
like real failures in DEBUG logs.

When skip_metadata=True, bypass generate_cover_paths entirely and
read archive index 0 directly. This drops per-call schema
instantiation, Union resolution, and path normalization.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v3.0.0a2 (#121)

* compact news

* update deps

* Drop DEBUG-bucket logging of intentionally-ignored validation errors (#122)

ClearingErrorStoreSchema previously split each schema's errors into
two buckets: ignored ones logged at DEBUG, real ones at WARNING.
The DEBUG bucket only ever held errors from ``_ignore_errors`` —
``Field may not be null.`` (sparse-field tolerance) and
``Invalid input type.`` (Union variant misses) — both of which are
internal mechanics, not operator-actionable signal. Each Union miss
emitted one ``ValidationError - {'_schema': ['Invalid input type.']}``
line per field per archive, drowning the genuinely useful per-source
DEBUG messages emitted by ``_except_on_load``.

Filter ignored errors at split time, log only WARNINGs. Real schema
failures still surface with full context (path, schema class,
normalized message). Collapses the dual-bucket _split_*_errors
methods into _filter_* + _log_warnings.
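A simplified sketch of the filter step. `filter_errors` is a hypothetical name, and the error dict is assumed flat (field name to list of messages), whereas real marshmallow error dicts can nest.

```python
from __future__ import annotations

IGNORED_MESSAGES = frozenset({"Field may not be null.", "Invalid input type."})


def filter_errors(errors: dict[str, list[str]]) -> dict[str, list[str]]:
    """Drop ignored messages; keep only fields that still have real errors."""
    kept: dict[str, list[str]] = {}
    for field, messages in errors.items():
        real = [msg for msg in messages if msg not in IGNORED_MESSAGES]
        if real:
            kept[field] = real
    return kept
```

Whatever survives the filter is what would be logged at WARNING; an empty result means nothing is logged at all.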

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update devenv

* metron: drop URL slugs for types with no public web page (#124)

* compact news

* update deps

* metron: drop broken URL slugs for genre, location, reprint, role, story, tag

Metron has no public web pages for these types — only API endpoints — so
URLs like https://metron.cloud/genre/3 always 404. Stop emitting them.
The numeric Metron ID is still preserved on the identifier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Re-export to_metron_age_rating from comicbox.enums.maps (#125)

Shortens the import path for the helper from
comicbox.enums.maps.age_rating to comicbox.enums.maps so downstream
callers can reach it without drilling into the submodule.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update devenv

* Drop dead code surfaced by skylos scan (#126)

- Remove unused module/class constants: _COMMENT_ARCHIVE_TYPES, SUFFIXES,
  _LOG_FORMAT, comet.py IDENTIFIER_TAG/IS_VERSION_OF_TAG, comictagger.py
  IDENTIFIER_TAG/PAGES_TAG, XmlCountryField (and now-orphaned imports
  RarFile, ZipFile, CountryField).
- Fix latent bug in TrapExceptionsMeta: `attr_name in "deserialize"` was a
  substring check that wrapped any callable whose name was a substring of
  "deserialize" (e.g. "er", "ize", "ali"). Use the existing _WRAP_METHODS
  tuple instead so only the exact `deserialize` method is wrapped.
- Simplify _get_pdf_enabled() to a plain `import pdffile` probe; the
  except-arm stub import had no effect.
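The substring bug in the TrapExceptionsMeta item is easy to reproduce in isolation. The two helper functions below are hypothetical; only the `in` semantics are the point.

```python
_WRAP_METHODS = ("deserialize",)


def buggy_should_wrap(attr_name: str) -> bool:
    # With a str on the right-hand side, `in` is a substring test.
    return attr_name in "deserialize"


def fixed_should_wrap(attr_name: str) -> bool:
    # With a tuple on the right-hand side, `in` compares whole elements.
    return attr_name in _WRAP_METHODS
```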

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Replace pdffile_stub with single-shim optional integration (#127)

Consolidate the optional comicbox-pdffile integration into one module
(comicbox/_pdf.py) and delete the hand-maintained pdffile_stub.py.

Previously six call sites each duplicated a `try: from pdffile import X /
except: from pdffile_stub import X` block, and the stub class mirrored
the real PDFFile API method-for-method — silent drift risk every time
upstream pdffile shipped.

Now:
- comicbox/_pdf.py is the single source of truth for PDF_ENABLED,
  PDFFile, and PAGE_FORMAT_VALUES. When pdffile is absent, PDFFile is
  None at runtime; type checkers see the real class via TYPE_CHECKING.
- Every call site that touches PDFFile is gated by `if PDF_ENABLED`.
- The `case PDFFile():` arm in box/archive/archive.py is lifted to an
  `if PDF_ENABLED and isinstance(archive, PDFFile):` guard above the
  match (the match form would fail when PDFFile is None).
- config/__init__.py reads PAGE_FORMAT_VALUES instead of iterating an
  empty stub Enum.
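A sketch of what a single-shim module of this kind might look like. It is an illustration of the pattern, not the contents of comicbox/_pdf.py; `is_pdf_archive` is a hypothetical helper.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Type checkers always see the real class.
    from pdffile import PDFFile
else:
    try:
        from pdffile import PDFFile
    except ImportError:
        # Optional dependency is absent at runtime.
        PDFFile = None

PDF_ENABLED = PDFFile is not None


def is_pdf_archive(archive: object) -> bool:
    # Check the flag first: isinstance(x, None) would raise TypeError.
    return PDF_ENABLED and isinstance(archive, PDFFile)
```

The short-circuiting `and` is what makes the guard safe; a `case PDFFile():` pattern has no equivalent escape hatch when the name is bound to `None`.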

Verified with `pdffile` installed (307/307 tests pass) and in a fresh
venv without it (PDF_ENABLED=False, CBZ archives still work, PDF files
raise UnsupportedArchiveTypeError, CLI shows the "not installed" hint).

Net: -70 lines across 9 files.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* remove unused ty ignore

* comicbox 3 alpha 5 (#123)

* compact news

* update deps

* update news and version to alpha 4

* update deps

* rename function path in NEWS

* bump alpha version to 3.0.0a5

* version 3.0.0

* massage news

* bump version and news and update deps

* require comicbox-pdffile 0.6.x for image-dominant page detection (#131)

* require comicbox-pdffile 0.6.x for image-dominant page detection

Widens the optional ``[pdf]`` extra to require comicbox-pdffile 0.6.x.
The new minor release adds image-dominant page detection (
``PDFFile.classify_page``, ``PDFFile.read_image_if_dominant``,
``PDFFile.read_full_pixmap_jpeg``) used by browser readers to serve
scanned-comic PDF pages as plain ``<img>`` instead of routing through
pdf.js on the client.

comicbox itself doesn't use the new API — the bump is purely a pin
update so downstream callers (Codex, OPDS readers) can adopt it.

The ``[tool.uv.sources]`` block is transient: it points at the
pdffile PR branch so this CI can resolve dependencies before
0.6.x lands on PyPI. Drop it once 0.6.x publishes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* just use the released pdffile

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update deps

* regenerate pdf page fixtures for pdffile 0.6.x (#132)

Add bin/regenerate-pdf-test-pages.py — drives Comicbox.get_page_by_index
against tests/files/test_pdf.pdf to refresh tests/files/pdf/{N}.pdf when
pymupdf or pdffile change page-extraction output. Run on the next drift.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* bump pdffile to 0.6.1

* bump version and news to 3.0.2

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
ajslater added a commit that referenced this pull request May 15, 2026
* update pdf pages. binary difference with new mupdf

* update docker images

* fix make install dependencies

* add jxl to image extensions

* fix ignoring macos resource forks

* resource fork test file

* update deps

* adjust news

* Squashed commit of the following:

    type annotate magic metron field functions and make all params kwargs
    use eslint outside of editor
    update deps, new ruff rules. lint & format

* add venv upgrade script

* ignore PERF203

* update deps and install pdffile

* update deps. appease typechecker. new eslint.config

* Squashed commit of the following:

commit e27050fbd42f0cf8e549871cc06c70f041672306
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 7 21:36:49 2024 -0800

    rename deserializeMeta class to TrapExceptionsMeta
    fix type issues with field metaclass wrapper

* add eslint-plugin-json-schema-validator

* update deps and lint

* use mdx instead of markdown

* remove unused import

* remove superfluous plugins. remove first level globs

* update deps

* Squashed commit of the following:

    fix notes parsing for metron and many variations
    move notes parsing into another file.
    add comicinfo metron origin test
    rename modules to not shadow python builtins
    fix binary pdf files for new mupdf

* bump version and news

* fix type errors

* format

* refactor dynamic class creation to appease typechecker

* add libmupdf docs

* Simplify Identifier URL construction for Metron pk ids.

* update deps

* fix story arc parsing. bump version

* update dockerfile with modern node

* Squashed commit of the following:

Comicbox 2.0

* Resolve circular import if not installed with [pdf] option.

* Make archive comments that aren't ComicBookInfo JSON log as debug comments
  more often.

* update package links

* add more aliases for comicvine sources

* ensure datetimes from archives are timezone aware

* update deps and bump version

* bump news

* drop version back appropriately

* fix alias tree builder

* update deps, typecheck with ty

* alphabetize comicbox fields

* uv_build

* update pyproject, eslint config, deps

* update deps

* update deps

* normalize Trade Paper Back into Trade Paperback

* update deps

* update deps

* Squashed commit of the following:

update to xmltodict 1.0. remove special code for xmltodict #text type conversion bugs
compact code for xml_fields that get cdata
remove cdata mixin from xml lists

* update deps

* pyright ignore

* fix age rating coercion for CIX

* add github issue code example.

* update deps

* update deps

* replace poetry with uv for run script

* update deps

* no support for python 3.14

* explicitly build with 3.13 trixie

* remove ruamel.yaml.clib from test docker

* update deps

* update deps

* new version. fix comicbox.json dump crash

* remove unused typing exceptions. add typing exceptions for ty foolishness

* update deps add ty to makefile

* python 3.14 support

* bump version and news

* update deps

* ignore ty type ignores

* update deps

* update deps

* Squashed commit of the following:

commit 259e561
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 19:51:31 2025 -0800

    use released pdffile

commit 4136a3b
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 19:41:28 2025 -0800

    use a proper base RenderModule and clean loads for tabs because it breaks yaml

commit 3426cf0
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 17:20:05 2025 -0800

    bump deps

commit 9fcaded
Author: AJ Slater <aj@slater.net>
Date:   Sat Nov 22 17:19:49 2025 -0800

    reduce complexity of dump

commit f96d27a
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 19:12:05 2025 -0800

    gate writing pdf metadata on delete all or data exists

commit 7415b82
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 19:08:26 2025 -0800

    optimize pdf writing by writing pdf data in the same context and only saving once

commit 2bd0f2c
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:57:26 2025 -0800

    rename legacy embedded variables to LEGACY_NESTED equivalents

commit 5222159
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:45:06 2025 -0800

    lint

commit 5d38acb
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:40:34 2025 -0800

    fix print test

commit 65410c7
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 18:37:18 2025 -0800

    fix most tests

commit 19d2dfe
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 17:40:51 2025 -0800

    fix pdf xml tests

commit f6bf854
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 16:50:08 2025 -0800

    fix tests for pdf_json

commit 590ffb8
Author: AJ Slater <aj@slater.net>
Date:   Fri Nov 21 14:44:51 2025 -0800

    fix accepting flexible datetimes from pdfs

commit e18925f
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 15:33:37 2025 -0800

    fix pdf tests using removed params

commit 55725b5
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 15:33:19 2025 -0800

    fix set subtraction

commit 2673e3a
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:57:35 2025 -0800

    add bpepple to news

commit 3de741d
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:56:30 2025 -0800

    update schemas doc for pdf embeds

commit 484737d
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:52:54 2025 -0800

    add bpepple to news

commit 0b6cdaf
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:47:31 2025 -0800

    bump version and news

commit bda414c
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:38:42 2025 -0800

    pdf write to embed files. pdf metadata keywords write tags.

commit 29fd04b
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:38:12 2025 -0800

    ty ignore

commit b795c49
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:34:20 2025 -0800

    add ty ignores

commit ce3ef91
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:33:43 2025 -0800

    update pdffile stub

commit fd2f4a0
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:31:38 2025 -0800

    update deps

commit 267d9d0
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:31:30 2025 -0800

    add alpha pdffile to sources

commit 041ce67
Author: AJ Slater <aj@slater.net>
Date:   Thu Nov 20 14:30:55 2025 -0800

    add pythondevmode to test script

* fix typing

* update deps

* Squashed commit of the following:

commit b31f22e6d178fcc1a5896c0dd7f680c26bc91657
Author: AJ Slater <aj@slater.net>
Date:   Mon Dec 1 20:03:13 2025 -0800

    typecheck with ty

* update deps

* complexipy & group deps

* reduce complexity

* update py7z library

* remove unused ty ignores

* ty fixes and ignores

* update deps

* update deps

* remove unused ty ignore

* update deps

* remove unused ty ignores

* use OneOf instead of list syntax sugar for confuse

* update deps

* Raw yaml datetimes (#102)

* use OneOf instead of list syntax sugar for confuse

* update deps

* let yaml have raw yaml datetimes instead of strings

* use simplejson decode errors

* bump news and version

* fix test script

* fix lint backend groups

* remove unused groups

* fix test script

* really fix test script

* use group lint in tests for jsonschema

* tweak dep version ranges

* update deps. use dockerfmt. ruff changes inline ifs to ors

* update dockerfile base

* update deps, remove unused ty warning ignore

* update deps add eslint plugins

* add mbake

* update deps

* fix tests for new pymupdf

* Squashed commit of the following:

commit 1fb394e109263188a16c4addeaab87bbdfdf882e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 17:09:25 2026 -0800

    generate-schema scripts

commit fc9b4f5c27db827ae1592010b01708865cf3733e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 17:09:08 2026 -0800

    format schemas

commit 9ccdf70d8c2318220c443714e509b6746f19a90e
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 16:39:04 2026 -0800

    fix schema

commit 1a082c52887571cd258ebbc467846461c8e9686f
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 11 13:29:02 2026 -0800

    add marshmallow jsonschema

* bump version and news

* add script comment

* update deps

* ty ignores

* lots more type annotations. include py.typed sentinel

* remove unneeded ruff ignores

* prettier xml schema xsds

* convert to devenv

* update devenv and deps

* update devenv

* update devenv

* fix pytests. update pycountry

* fix cli help

* fix date serialization if already a string

* update devenv & deps

* import accepts quoted globs. bump version and news

* VALIDATE FEATURE

Squashed commit of the following:

commit 4f712ddc46859bb82eb6383d41a72502bf49f7be
Merge: 2b0b5db 06af8e3
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 14:01:25 2026 -0800

    Merge branch 'develop' into validate

commit 2b0b5db77d073da699cdf26e9481e5efd69ad424
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:32:39 2026 -0800

    better validate cli help

commit f78dd859c3c8c8adf44399f723de171da9d5467a
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:25:48 2026 -0800

    xsd printWidth to 120. fixes CoMet xsd.

commit d1563e96bbc944dc0669e4df0d647c44cce8c7dd
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:03:47 2026 -0800

    format test files with validator

commit 59350c9e3c13e9248368146e403a1cc05c755523
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:01:09 2026 -0800

    no available validator is a warning

commit f80fc325bc1cfecc9a9286f7538ac02eb6391ad6
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 13:00:40 2026 -0800

    use original schema definitions unreformatted

commit 8eb5d884136e215a19754f1d6ae2fdc9c0cd2cd3
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:51:35 2026 -0800

    fix symlink

commit bffef02777ba01b6c4f54ba36df7f433c45841da
Merge: 3547d24 6478b78
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:26:18 2026 -0800

    Merge branch 'develop' into validate

commit 3547d24639eed74841fb76b49aa49ab238b820a4
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 11:25:50 2026 -0800

    update deps

commit 29dba04deaf029466ca6794060c55b81d5c0a054
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:44:30 2026 -0800

    update deps

commit 273da7ab3e87d60eea56167199e466c61867c57c
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:43:55 2026 -0800

    only catch and warn on validation errors

commit 5ec0ad1928c709388facb054b3f6915285a4e4a8
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:35:41 2026 -0800

    move xmlschema and jsonschema into regular deps

commit 0887cf1e07daec89b59972e9cf8ffc59c143dba2
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:33:33 2026 -0800

    fix getting format from input files. change validation exception to warning

commit 4e7be5f44225522398a407c94d76c26fbd22a925
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:32:54 2026 -0800

    fix guess_format

commit 2342605b4b08ce641f056a38b5b634bae75bcfec
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:13:41 2026 -0800

    fix script for new location of validate_cli

commit b2ab1995e543204d15e83c00e8596681e52b70f7
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 01:12:08 2026 -0800

    move schema to schema_definitions

commit deacf119c2d0823af5c6405162d23b7e32f8fb37
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:23:05 2026 -0800

    better validation logging

commit c9a615f5885b53dd8e8b81c9d735808f4eaa7736
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:15:10 2026 -0800

    fix validation format assignment. validation info logging

commit 914d35d15f536a6a042cbe66346b2cb4a38d636a
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 22:05:04 2026 -0800

    basically working validation with definitions dir

commit 7e860e8f6110dd868dbec2f724a8bff1bd0a980d
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 21:22:04 2026 -0800

    ignore bad typecheck warnings

commit 3d5ae84354b772ce8fc08793a2c7db64e95c46ac
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 21:13:02 2026 -0800

    fix validate tests

commit 324a0c6fd9935c153fc5a172020a8c02b6f901d0
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 18:26:12 2026 -0800

    most tests pass. validate test fails. typecheck fails. schemas need moving into the package

commit 5c3d4cd77020b5318a0f45c8d72d432d50ad158e
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 17:16:27 2026 -0800

    update deps

commit 112a71aece1adba12e4d380359da3a167456af8c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 17:16:19 2026 -0800

    pin comicbox-pdffile

* bump NEWS

* PDF2CBZ extract images
Squashed commit of the following:

commit b6296ee49b49556b04adaefb12bed332f4fee857
Merge: 5bf0007 bdd3879
Author: AJ Slater <aj@slater.net>
Date:   Wed Feb 18 14:07:16 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit 5bf0007
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 14:44:53 2026 -0800

    bump news and version

commit 362123c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 14:39:19 2026 -0800

    update pdffile to released version

commit f09571c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 13:36:29 2026 -0800

    switch image_pdf to more powerful pdf_page_format

commit b1d2d1b
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 12:37:12 2026 -0800

    fix pdf cover compare test

commit 5aaeae0
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 12:36:39 2026 -0800

    move pdf format decision to _archive_readfile()

commit 2107241
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 10:32:34 2026 -0800

    update deps

commit 566e426
Merge: cdc2250 38bcfe2
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:58:28 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit cdc2250
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:57:57 2026 -0800

    fix cli help

commit 1190fe4
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:52:12 2026 -0800

    fix cli option collision

commit 63bf418
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:49:48 2026 -0800

    cli option for image_pdf

commit 1d7d852
Merge: db7061c f5f03b5
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:39:49 2026 -0800

    Merge branch 'develop' into pdf2cbz

commit db7061c
Author: AJ Slater <aj@slater.net>
Date:   Tue Feb 17 01:36:11 2026 -0800

    basic support for extract image from pdf

* move docker-compose.yaml to compose.yaml

* fix dockerfile for new devenv

* fix dockerfile for new devenv the kludgey way

* fix news

* update deps

* format dockerfile

* color and clarification for help

* fix colors for help

* update devenv

* fix prettierignore

* update devenv

* fix makefile

* v2.2.1 fix pdf datetimes

* update devenv

* update deps

* update devenv

* add ty ignores to match pyright ignores

* update devenv

* add node_root feature

* update devenv

* update devenv

* update devenv, deps and fix some ty typing

* update deepdiff and bump version

* fix news typo

* update pdfs for new pymupdf

* update deps & devenv

* update deps v2.2.3

* use usr/env for scripts

* gha workflow

* switch to github actions

* Add to_metron_age_rating() public conversion function (#108)

Provide a standalone function to convert any age rating enum or string
to a MetronAgeRatingEnum. Supports Marvel, DC, Generic, ComicInfo, and
Metron enums with fuzzy string matching (case/space-insensitive).
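
A minimal sketch of case- and space-insensitive matching as described above. `DemoRating`, `_normalize`, and `to_demo_rating` are hypothetical names for illustration only; the real function maps several publisher enums onto MetronAgeRatingEnum and may work differently.

```python
from enum import Enum
from typing import Optional, Union


class DemoRating(Enum):
    """Hypothetical target enum standing in for the real rating enum."""

    EVERYONE = "Everyone"
    TEEN_PLUS = "Teen Plus"


def _normalize(value: str) -> str:
    # Drop all whitespace and fold case so "teen plus" matches "TeenPlus".
    return "".join(value.split()).casefold()


_LOOKUP = {_normalize(member.value): member for member in DemoRating}


def to_demo_rating(value: Union[str, Enum]) -> Optional[DemoRating]:
    """Return the matching DemoRating, or None when nothing matches."""
    if isinstance(value, DemoRating):
        return value
    raw = value.value if isinstance(value, Enum) else value
    return _LOOKUP.get(_normalize(str(raw)))
```

Returning None for unknown input keeps the caller in charge of how to treat unmapped ratings.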

* add claude.md

* bump news and version

* when extracting pages make path absolute

* use python convenience method

* rename variable

* Fix path traversal vulnerability in archive extraction (#109)

Validate that resolved output paths stay within the destination
directory before writing, preventing zip-slip attacks from crafted
archive member names.
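
The containment check can be sketched as follows. `safe_extract_path` is a hypothetical helper written for this note, not the function comicbox ships; it only shows the general shape of a zip-slip guard.

```python
from pathlib import Path


def safe_extract_path(dest_dir: Path, member_name: str) -> Path:
    """Return the resolved output path, or raise if it escapes dest_dir."""
    root = dest_dir.resolve()
    # resolve() collapses ".." segments, and joining an absolute member name
    # discards root entirely, so both cases are caught by the check below.
    target = (root / member_name).resolve()
    if not target.is_relative_to(root):
        raise ValueError(f"suspicious archive path: {member_name!r}")
    return target
```

Comparing resolved paths rather than raw strings avoids being fooled by names such as `pages/../../evil.txt`.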

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* bump news

* Optimize for large-scale workloads (600K+ files) (#110)

Reduce per-file overhead for bulk metadata reading:
- Extension-hint archive detection: check file extension first to avoid
  unnecessary magic-byte disk reads (saves ~1.2M file opens for CBZ collections)
- Cache marshmallow schema instances by (class, exclude_keys) to eliminate
  ~4.8M schema constructions at scale
- Cache transform instances per Comicbox instance to avoid redundant creation
- Skip FrozenAttrDict re-wrapping when pre-built config is passed
- Skip redundant logger init when loglevel hasn't changed
- Remove always-on glom_debug=True from transform calls

Add parallelization API (comicbox/process.py):
- process_files() for ProcessPoolExecutor-based batch processing
- aread_metadata() async wrapper for event loop integration
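
As an illustration of the schema caching idea, here is a sketch keyed on (class, exclude_keys). `DemoSchema` and `get_schema` are hypothetical stand-ins; the actual cache in comicbox may be structured differently.

```python
from functools import lru_cache
from typing import Any


class DemoSchema:
    """Stand-in for a marshmallow Schema whose construction is assumed costly."""

    constructed = 0

    def __init__(self, exclude: frozenset = frozenset()) -> None:
        # Count constructions so the effect of the cache is observable.
        type(self).constructed += 1
        self.exclude = exclude


@lru_cache(maxsize=None)
def get_schema(schema_class: type, exclude_keys: frozenset = frozenset()) -> Any:
    """Build one instance per (class, exclude_keys) pair and reuse it."""
    return schema_class(exclude=exclude_keys)
```

The key must be hashable, which is why the excluded keys are passed as a frozenset rather than a list or set.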

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* sort ignorefiles

* bump news

* fix datetime ordering bug

* Code quality pass: match statements, pathlib, immutable constants (#111)

* Targeted code quality pass: match statements, pathlib, immutable constants

- Convert isinstance if/elif chains to match statements in archive.py,
  archiveinfo.py, and time_fields.py
- Replace os.walk with Path.rglob in run.py, fixing a double-recursion
  bug where recurse() re-walked subdirectories already visited by os.walk
- Wrap _HANDLE_MERGE dict in MappingProxyType in mergedeep.py
- Replace accumulator loop with list comprehension in config/computed.py
- Replace loop-append with extend + generator in box/sources.py
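
The double-recursion shape and its rglob replacement can be sketched like this. Both functions are hypothetical reconstructions for illustration, not the code from run.py.

```python
import os
from pathlib import Path


def walk_twice(root: Path) -> list:
    """Buggy shape: os.walk already descends, yet each subdir is re-walked."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        found.extend(Path(dirpath) / name for name in filenames)
        for dirname in dirnames:
            # Redundant: os.walk will visit this directory on its own.
            found.extend(walk_twice(Path(dirpath) / dirname))
    return found


def iter_files(root: Path) -> list:
    """Fixed shape: rglob descends once, so each file appears exactly once."""
    return sorted(path for path in root.rglob("*") if path.is_file())
```

With nested directories, `walk_twice` reports files in subdirectories more than once, while `iter_files` reports each file once.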

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Sort ignore files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* use pdf group for tests

* update deps

* iterprocess files

* fix print test

* Righttyper Typing with corrections (#112)

* righttyper

* raw types commit

* Fix righttyper auto-generated type annotations

Correct ~535 basedpyright errors and 10 ruff errors introduced by
righttyper's runtime type capture, which used overly-literal types.

Key changes:
- Replace PosixPath annotations with Path throughout
- Simplify overly-specific dict union types to dict[str, Any]
- Remove broken self: "Module.ClassName" annotations in mixins
- Rename/remove rt_T1 TypeVars (N815/N816)
- Move Callable import to TYPE_CHECKING block (TC003)
- Make boolean params keyword-only in tests/util.py (FBT001)
- Add pyright: ignore on marshmallow method override incompatibilities
- Fix _path override annotations in archive write/dump_files
- Widen function signatures to accept Path | str | None where needed
- Fix circular import in transforms/spec.py (was referencing xml_reprints.MetaSpec)
- Guard None.items() calls in metroninfo identifiers with early returns
- Clean up various unused imports left by annotation removals

Result: 0 errors, 259 tests passing, make fix/lint/typecheck all clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* sort ignore files

* massively typed

* remove righttyper. back to python 3.10 req

* update devenv. switch to bun

* remove quoted self typing

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove self types

* remove self typing

* typing attempt

* fix typing errors

* fix circular import

* reorg news

* update devenv

* add bun to dockerfile

* only copy bun deps first for dockerfile

* update devenv & deps

* switch back to main marshmallow-jsonschema now that it's back from the dead

* update devenv

* fix process pool runs to deliver exceptions back and not break on passing in the logger

* test the process module

* comments

* enhance news for iterfiles

* decomplexify box init

* decomplexify process iterfiles

* allow callers to configure subprocess loguru via picklable dict (#113)

Loguru's logger object isn't picklable into ProcessPoolExecutor
workers, so callers like codex couldn't get worker log output to
match their parent-process format. Adds a worker_log_config dict
({level, format, sink}) that runs through the executor initializer
and reconfigures loguru in each worker via init_logging. Also adds
enqueue=True to the default sink for thread-safe logging.
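
The initializer pattern can be sketched with the standard library. This example deliberately uses stdlib `logging` instead of loguru so it stays self-contained; `init_worker_logging`, `read_one`, and `run` are hypothetical names, and the dict keys shown are assumptions rather than the real worker_log_config schema.

```python
import logging
from concurrent.futures import ProcessPoolExecutor


def init_worker_logging(log_config: dict) -> None:
    """Run once per worker process; log_config is a plain, picklable dict."""
    logging.basicConfig(
        level=log_config.get("level", "INFO"),
        format=log_config.get("format", "%(levelname)s %(message)s"),
        force=True,  # replace any handlers already configured in the worker
    )


def read_one(path: str) -> str:
    """Placeholder for per-file work that logs from inside the worker."""
    logging.getLogger("worker").info("reading %s", path)
    return path.upper()


def run(paths: list, log_config: dict) -> list:
    """Fan the work out, configuring logging in each worker via initializer."""
    with ProcessPoolExecutor(
        max_workers=2,
        initializer=init_worker_logging,
        initargs=(log_config,),
    ) as executor:
        return list(executor.map(read_one, paths))
```

Passing a dict instead of a logger object works because only picklable values can cross the process boundary. On platforms that spawn workers, call `run()` under an `if __name__ == "__main__":` guard.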

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update devenv

* Upgrade confuse to 2.2.0; replace AttrDict with typed Settings (#114)

* upgrade confuse to 2.2.0; replace AttrDict with typed Settings dataclass

confuse 2.2.0 makes AttrDict properly generic, so per-key types resolve
to `object` and consumers across the box mixins fail typecheck. Convert
the validated AttrDict into a frozen `Settings` dataclass once in
get_config() and propagate that typed object everywhere; confuse stays
confined to comicbox/config.

- New comicbox/config/settings.py defines `Settings` and
  `ComputedSettings` (frozen, slots).
- get_config() returns Settings; new _build_settings() does the
  conversion. post_process_set_for_path() rebuilt around
  dataclasses.replace.
- FrozenAttrDict deleted — frozen dataclass enforces immutability.
- process.py passes Settings through pickle directly so workers skip
  re-running confuse.
- Drops dead `dest_path is None` checks now that the field is required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* rename Settings to ComicboxSettings

So that client programs that already define their own `Settings` type
don't collide on import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* flatten ComputedSettings into ComicboxSettings

The hierarchical split was a confuse-template setup convenience, not a
logical grouping — there's no API benefit to keeping client code
chained through `cfg.computed.X`. Promote the six computed fields onto
ComicboxSettings under a clearly labeled comment block. The confuse
template's nested `computed` MappingTemplate is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix metadata_format hint silently dropping all api metadata (#115)

`_get_source_config_metadata` early-returned an empty list whenever the
caller set `metadata_format`, because `fmt not in self._config.read`
compared a string against a frozenset of `MetadataFormats` enums —
always True. The conversion + correct membership check happens in the
try block on the next lines, so the early return was both wrong and
redundant.
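
The failure mode can be reproduced in isolation. `DemoFormat` is a hypothetical enum, and the sketch assumes the real enum does not mix in `str`, since a str-based enum member would compare equal to its value.

```python
from enum import Enum


class DemoFormat(Enum):
    """Hypothetical stand-in for the metadata formats enum."""

    COMIC_INFO = "cix"
    METRON_INFO = "mix"


enabled = frozenset({DemoFormat.COMIC_INFO})
fmt = "cix"

# A plain str never equals a plain Enum member, so this is True for any fmt.
string_missing = fmt not in enabled
# Converting to the enum first makes the membership test meaningful.
enum_missing = DemoFormat(fmt) not in enabled
```

Here `string_missing` is True even though the format is enabled, which matches the always-True early return described above.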

Adds tests/unit/test_sources.py covering the four behavioral cases:
fmt-in-read, no-fmt, fmt-not-in-read, invalid-fmt.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update deps

* bump news and version to 3.0.0

* fix Mapping config args silently dropped under config_default.yaml (#116)

read_config_sources used config.add() for the Mapping branch, which
appends to the BOTTOM of confuse's source priority stack — below the
config_default.yaml loaded by config.read() at the top of the
function. So any caller passing a dict / Mapping override (e.g.
`get_config({"comicbox": {"compute_pages": True}})`) silently got the
default instead. Switch to config.set() so Mapping args land on top,
matching set_args() for the Namespace branch.
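
The layering problem can be modeled with a toy priority stack. `SourceStack` is a hypothetical stdlib-only model of the add-at-bottom versus set-on-top behavior described above; it is not confuse itself.

```python
class SourceStack:
    """Toy priority stack: sources earlier in the list win on lookup."""

    def __init__(self) -> None:
        self.sources = []

    def add(self, source: dict) -> None:
        # Lowest priority: consulted only when nothing above defines the key.
        self.sources.append(source)

    def set(self, source: dict) -> None:
        # Highest priority: overrides everything already loaded.
        self.sources.insert(0, source)

    def get(self, key: str):
        for source in self.sources:
            if key in source:
                return source[key]
        raise KeyError(key)
```

In this model, a caller override passed through `add()` after the defaults are loaded is shadowed by them, while the same override passed through `set()` wins.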

Surfaced by a downstream Codex migration that hit dead Mapping
overrides; covered now by tests/unit/test_config_layering.py.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update deps

* widen set-like config fields to accept any non-mapping container (#117)

The template arms for `read`, `write`, `export`, `delete_keys`,
`read_ignore`, and `print` previously combined `frozenset` (a
pass-through marker) and `Sequence(str)` (list-of-strings coercion).
That works for the common YAML/CLI list path but rejects callers
passing a `set` / `tuple` / `frozenset` literal — which is logically
fine for fields whose post-compute value is always a frozenset.

Replaces the per-field unions with `OneOf((set, frozenset, tuple, list))`
(`print` also accepts `str` for the historical phase-char form). The
`_build_settings` boundary already calls `frozenset(...)` on these
values, so any of the four containers normalize correctly.

Also adapts `compute_config`'s helpers — Subview iteration only
supports dict/list source values, so user-supplied set/frozenset/tuple
inputs would error before reaching the template. New `_raw_or_empty`
pulls the Python value via `.get()` and explicitly rejects mappings
with a clear error (dict iteration would silently accept dict input
otherwise). `_parse_print` now accepts a phase-char string OR any
iterable of phase chars.
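
The normalization boundary can be sketched as follows. `to_frozenset` is a hypothetical helper for illustration; the real `_raw_or_empty` and `_build_settings` functions may differ in detail.

```python
from collections.abc import Mapping

_CONTAINER_TYPES = (set, frozenset, tuple, list)


def to_frozenset(value: object) -> frozenset:
    """Normalize a set-like config value, rejecting mappings explicitly."""
    if value is None:
        return frozenset()
    if isinstance(value, Mapping):
        # Iterating a dict yields its keys, which would silently "work".
        raise TypeError("expected a set, frozenset, tuple or list, not a mapping")
    if not isinstance(value, _CONTAINER_TYPES):
        raise TypeError(f"unsupported container type: {type(value).__name__}")
    return frozenset(value)
```

Checking for mappings before the container check keeps the error message specific to the mistake that is easiest to make.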

Path-list fields (`paths`, `import_paths`, `metadata_cli`) keep their
existing `Sequence(...)` form with element-type validation — that
trade-off felt worth keeping.

14 new tests in tests/unit/test_config_container_inputs.py cover the
four container types per field and assert mapping rejection.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* reuse types tuple

* update deps

* 3.0.0 alpha version 0

* update compose for generic gha build

* ReadResults data structure for process functions

* compact news (#119)

* Add skip_metadata flag to get_cover_page (#120)

Callers that only want a thumbnail (e.g. codex's CoverThread) don't
need the full ComicInfo/CoverImage hint resolution. Parsing the
metadata for every cover dominates the cost of cover extraction
and emits a flood of debug-bucket Union ValidationErrors that look
like real failures in DEBUG logs.

When skip_metadata=True, bypass generate_cover_paths entirely and
read archive index 0 directly. This drops per-call schema
instantiation, Union resolution, and path normalization.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v3.0.0a2 (#121)

* compact news

* update deps

* Drop DEBUG-bucket logging of intentionally-ignored validation errors (#122)

ClearingErrorStoreSchema previously split each schema's errors into
two buckets: ignored ones logged at DEBUG, real ones at WARNING.
The DEBUG bucket only ever held errors from ``_ignore_errors`` —
``Field may not be null.`` (sparse-field tolerance) and
``Invalid input type.`` (Union variant misses) — both of which are
internal mechanics, not operator-actionable signal. Each Union miss
emitted one ``ValidationError - {'_schema': ['Invalid input type.']}``
line per field per archive, drowning the genuinely useful per-source
DEBUG messages emitted by ``_except_on_load``.

Filter ignored errors at split time, log only WARNINGs. Real schema
failures still surface with full context (path, schema class,
normalized message). Collapses the dual-bucket _split_*_errors
methods into _filter_* + _log_warnings.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update devenv

* metron: drop URL slugs for types with no public web page (#124)

* compact news

* update deps

* metron: drop broken URL slugs for genre, location, reprint, role, story, tag

Metron has no public web pages for these types — only API endpoints — so
URLs like https://metron.cloud/genre/3 always 404. Stop emitting them.
The numeric Metron ID is still preserved on the identifier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Re-export to_metron_age_rating from comicbox.enums.maps (#125)

Shortens the import path for the helper from
comicbox.enums.maps.age_rating to comicbox.enums.maps so downstream
callers can reach it without drilling into the submodule.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update devenv

* Drop dead code surfaced by skylos scan (#126)

- Remove unused module/class constants: _COMMENT_ARCHIVE_TYPES, SUFFIXES,
  _LOG_FORMAT, comet.py IDENTIFIER_TAG/IS_VERSION_OF_TAG, comictagger.py
  IDENTIFIER_TAG/PAGES_TAG, XmlCountryField (and now-orphaned imports
  RarFile, ZipFile, CountryField).
- Fix latent bug in TrapExceptionsMeta: `attr_name in "deserialize"` was a
  substring check that wrapped any callable whose name was a substring of
  "deserialize" (e.g. "er", "ize", "ali"). Use the existing _WRAP_METHODS
  tuple instead so only the exact `deserialize` method is wrapped.
- Simplify _get_pdf_enabled() to a plain `import pdffile` probe; the
  except-arm stub import had no effect.
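
The substring pitfall in the TrapExceptionsMeta item can be shown in a few lines. `wraps_buggy` and `wraps_fixed` are hypothetical helpers; only `_WRAP_METHODS` is a name taken from the commit message, and its contents here are assumed.

```python
_WRAP_METHODS = ("deserialize",)  # assumed contents, for illustration


def wraps_buggy(attr_name: str) -> bool:
    # `in` against a str is a substring test, not an equality test.
    return attr_name in "deserialize"


def wraps_fixed(attr_name: str) -> bool:
    # `in` against a tuple compares whole elements.
    return attr_name in _WRAP_METHODS
```

Note that a one-element tuple needs its trailing comma; without it `("deserialize")` is just a string and the substring behavior returns.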

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Replace pdffile_stub with single-shim optional integration (#127)

Consolidate the optional comicbox-pdffile integration into one module
(comicbox/_pdf.py) and delete the hand-maintained pdffile_stub.py.

Previously six call sites each duplicated a `try: from pdffile import X /
except: from pdffile_stub import X` block, and the stub class mirrored
the real PDFFile API method-for-method — silent drift risk every time
upstream pdffile shipped.

Now:
- comicbox/_pdf.py is the single source of truth for PDF_ENABLED,
  PDFFile, and PAGE_FORMAT_VALUES. When pdffile is absent, PDFFile is
  None at runtime; type checkers see the real class via TYPE_CHECKING.
- Every call site that touches PDFFile is gated by `if PDF_ENABLED`.
- The `case PDFFile():` arm in box/archive/archive.py is lifted to an
  `if PDF_ENABLED and isinstance(archive, PDFFile):` guard above the
  match (the match form would fail when PDFFile is None).
- config/__init__.py reads PAGE_FORMAT_VALUES instead of iterating an
  empty stub Enum.
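
A single-shim optional import of this kind might look like the sketch below. The names `PDF_ENABLED` and `PDFFile` come from the commit message; `_pdffile` and `is_pdf_archive` are hypothetical, and the real comicbox/_pdf.py may be organized differently.

```python
from typing import TYPE_CHECKING, Any

try:
    import pdffile as _pdffile  # optional dependency, may be absent
except ImportError:
    _pdffile = None

PDF_ENABLED: bool = _pdffile is not None

if TYPE_CHECKING:
    # Type checkers always see the real class.
    from pdffile import PDFFile
else:
    # At runtime the name is None when the optional package is missing.
    PDFFile = getattr(_pdffile, "PDFFile", None)


def is_pdf_archive(archive: Any) -> bool:
    """Guard isinstance so it is never called with PDFFile set to None."""
    return PDF_ENABLED and PDFFile is not None and isinstance(archive, PDFFile)
```

The explicit guard matters because `isinstance(x, None)` and a `case None():` class pattern both raise TypeError at runtime.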

Verified with `pdffile` installed (307/307 tests pass) and in a fresh
venv without it (PDF_ENABLED=False, CBZ archives still work, PDF files
raise UnsupportedArchiveTypeError, CLI shows the "not installed" hint).

Net: -70 lines across 9 files.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* remove unused ty ignore

* comicbox 3 alpha 5 (#123)

* compact news

* update deps

* update news and version to alpha 4

* update deps

* rename function path in NEWS

* bump alpha version to 3.0.0a5

* version 3.0.0

* massage news

* bump version and news and update deps

* require comicbox-pdffile 0.6.x for image-dominant page detection (#131)

* require comicbox-pdffile 0.6.x for image-dominant page detection

Widens the optional ``[pdf]`` extra to require comicbox-pdffile 0.6.x.
The new minor release adds image-dominant page detection (
``PDFFile.classify_page``, ``PDFFile.read_image_if_dominant``,
``PDFFile.read_full_pixmap_jpeg``) used by browser readers to serve
scanned-comic PDF pages as plain ``<img>`` instead of routing through
pdf.js on the client.

comicbox itself doesn't use the new API — the bump is purely a pin
update so downstream callers (Codex, OPDS readers) can adopt it.

The ``[tool.uv.sources]`` block is transient: it points at the
pdffile PR branch so this CI can resolve dependencies before
0.6.x lands on PyPI. Drop it once 0.6.x publishes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* just use the released pdffile

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* update deps

* regenerate pdf page fixtures for pdffile 0.6.x (#132)

Add bin/regenerate-pdf-test-pages.py — drives Comicbox.get_page_by_index
against tests/files/test_pdf.pdf to refresh tests/files/pdf/{N}.pdf when
pymupdf or pdffile change page-extraction output. Run on the next drift.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* bump pdffile to 0.6.1

* bump version and news to 3.0.2

* update deps

* fix initializing pdf vars with no path

* make transforming metron credits more durable

* bump news

* bump version to v3.0.3

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>