archive gets an _optimize_in_place_on_disk flag; container copy_skipped_files turns into a hydrate_optimized paths; zip deletes files it would replace on disk and closes the file before appending the new files.
…aise if it calls optimize.
…into handler and use it exclusively
commit 83a5253785fccc471a6dbd75b4d1eba3074c9e8c
Author: AJ Slater <aj@slater.net>
Date: Wed Apr 29 02:18:19 2026 -0700
bump version and news
commit 9b03aed
Author: AJ Slater <aj@slater.net>
Date: Tue Apr 28 20:44:04 2026 -0700
update treestamp logging
commit 1b7bdc7
Author: AJ Slater <aj@slater.net>
Date: Tue Apr 28 20:32:59 2026 -0700
fix implementation of timestamp logging to log after, not before
commit 8d25937
Author: AJ Slater <aj@slater.net>
Date: Tue Apr 28 20:25:28 2026 -0700
Log timestamp load/dump and surface INFO at default verbose=1 (#111)
Treestamps 4 dropped its own load/dump prints, and picopt's new logger
mapped INFO to verbose>=2, so default runs went silent for messages the
old termcolor Printer always showed (config-style force_verbose=True).
- _VERBOSE_LEVEL: bump so verbose=1 (the argparse default) emits INFO,
matching the old printer's "force_verbose" tier.
- walk.py: log "Loading timestamps for: …" before Grovestamps init and
"Dumping timestamps for: …" before dumpf(). INFO renders cyan via the
existing LEVEL_STYLES.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
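To illustrate the verbosity change described above, here is a minimal sketch of a table that maps a verbose count to a minimum log level, so that verbose=1 emits INFO. It uses the stdlib ``logging`` levels rather than loguru. The table contents for 0 and 2 and the names ``VERBOSE_LEVEL`` and ``level_for`` are assumptions for illustration, not picopt's actual ``_VERBOSE_LEVEL``.

```python
import logging

# Hypothetical table: verbosity count -> minimum level that is emitted.
# Only the verbose=1 -> INFO row is stated in the commit message above;
# the other rows are assumptions.
VERBOSE_LEVEL = {
    0: logging.ERROR,
    1: logging.INFO,
    2: logging.DEBUG,
}


def level_for(verbose: int) -> int:
    """Clamp the verbosity count into the table's range and return its level."""
    lowest, highest = min(VERBOSE_LEVEL), max(VERBOSE_LEVEL)
    clamped = max(lowest, min(verbose, highest))
    return VERBOSE_LEVEL[clamped]
```

With a table like this, a default run (verbose=1) would show INFO messages such as the timestamp load/dump lines, while a quieter setting would hide them.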
commit 4415376
Author: AJ Slater <aj@slater.net>
Date: Tue Apr 28 18:41:23 2026 -0700
Pre-walk file count for a determinate progress bar (#110)
* Pre-walk file count for a determinate progress bar
Without a total the bar showed only an indeterminate spinner + count.
Add ``Walk._count_total`` that mirrors ``walk_file``'s recursion gate
(symlinks, timestamp filenames, ignore patterns, recurse flag) so each
non-recursing visit contributes one mark — matching the events the
scheduler dispatches through Reporter — and pass it as ``total=`` to
``make_progress``.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Use os.scandir for the pre-walk count
DirEntry caches ``is_dir`` / ``is_symlink`` from the directory listing,
so deep trees skip an extra ``stat`` per entry — meaningful on slow or
network filesystems.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
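The pre-walk count described above can be sketched roughly as follows. This is not ``Walk._count_total`` itself: the function name, the ``skip_names`` parameter, and the choice to skip directories when recursion is off are assumptions made for the example. It only shows how ``os.scandir`` lets the count rely on the ``DirEntry`` type information instead of a separate ``stat`` per entry.

```python
import os


def count_files(
    root: str,
    *,
    recurse: bool = True,
    skip_names: frozenset[str] = frozenset(),
) -> int:
    """Count regular entries under root, skipping symlinks and named files."""
    total = 0
    with os.scandir(root) as entries:
        for entry in entries:
            # Skip ignored names (e.g. timestamp files) and symlinks.
            if entry.name in skip_names or entry.is_symlink():
                continue
            # is_dir() can usually be answered from the directory listing.
            if entry.is_dir(follow_symlinks=False):
                if recurse:
                    total += count_files(
                        entry.path, recurse=recurse, skip_names=skip_names
                    )
                continue
            total += 1
    return total
```

The returned number is what would be handed to the progress bar as its ``total=``, so the count has to apply the same filters as the real walk or the bar will finish early or late.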
commit 6a46d43
Author: AJ Slater <aj@slater.net>
Date: Tue Apr 28 17:48:59 2026 -0700
Replace termcolor Printer with loguru + rich logger (#109)
The old Printer class wrote dots and messages directly to stdout from
both the main process and worker processes. This change discards it for
a centralized logger modeled on nudebomb's progress branch:
- New picopt/log/ package: shared rich Console, loguru sink, a streaming
CharStreamColumn progress bar, a Stats + render() summary, and a
Reporter that bundles them and dispatches each ReportStats outcome.
- All call sites converted: Printer.{saved,converted,lost,error,skip,
warn,config,...} → logger.* and progress.mark_*. Worker-side dot and
lifecycle calls dropped — workers can't reach the parent's live region,
so per-file progress is now driven from the scheduler when each result
comes back.
- Centralized MARKS table in picopt/log/styles.py drives the streaming
chars, the loguru sink colors, the summary table row colors, and the
--help epilogue legend, so the same outcome reads identically
everywhere. Style choices mirror the old termcolor palette so longtime
users see the same colors for the same outcomes.
- Scheduler now takes a Reporter; Totals removed. report.py is a pure
data class (ReportStats) with no printer dependency.
- doctor.py and the cli help epilogue rewritten with rich (rich.markup
escape() for path strings).
- Grovestamps now constructed with verbose=0 so treestamps's internal
printer doesn't bypass the rich Live region.
- loguru~=0.7 added; termcolor dropped.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
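The idea of one table driving every rendering of an outcome can be sketched as below. The outcome names, characters, colors, and helper functions are all hypothetical; the real MARKS table in ``picopt/log/styles.py`` is not shown in this log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mark:
    char: str   # single character streamed to the progress line
    style: str  # style name shared by the log sink, summary, and legend
    label: str  # human-readable description for the legend


# Hypothetical entries, for illustration only.
MARKS = {
    "saved": Mark(".", "green", "optimized and saved"),
    "skipped": Mark("s", "dim", "skipped"),
    "error": Mark("X", "red", "error"),
}


def stream_chars(outcomes: list[str]) -> str:
    """Render a run of outcomes as the streamed progress characters."""
    return "".join(MARKS[outcome].char for outcome in outcomes)


def legend_lines() -> list[str]:
    """Render the help legend from the same table."""
    return [f"{m.char}  {m.label} ({m.style})" for m in MARKS.values()]
```

Because both helpers read from ``MARKS``, changing a character or style in one place changes it in the progress stream and the legend together.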
commit 03cdcb0
Author: AJ Slater <aj@slater.net>
Date: Tue Apr 28 17:34:02 2026 -0700
move set_jpeg_xmp into jpeg plugin
commit b283374
Author: AJ Slater <aj@slater.net>
Date: Tue Apr 28 17:06:50 2026 -0700
update devenv. treestamps 4, rich 15
) (#115)
Confuse 2.2.0 made ``AttrDict`` a properly-parameterized generic; an
unparameterized ``AttrDict`` resolves to ``AttrDict[str, object]``, so every
downstream access (``cfg.bigger``, ``int(cfg.jobs)``, ``cfg.paths`` flowing
into ``Iterable[str]``, etc.) now reads as ``object`` and stops
type-checking. We had ~31 basedpyright errors after the bump, all rooted at
AttrDict consumers.
Fix: keep using confuse for what it's good at (YAML / env / CLI parsing,
type coercion, requiredness) and convert the validated ``AttrDict`` into a
typed frozen dataclass once. Every downstream module then takes
``PicoptSettings`` instead of ``AttrDict``.
- ``picopt/config/settings.py``: new ``PicoptSettings`` (and nested
``ComputedSettings`` / ``IgnorePatterns``) frozen dataclasses mirroring the
existing ``MappingTemplate`` schema. ``Sequence(...)`` fields are
``tuple[X, ...]`` because the dataclass is frozen. ``handler_stages`` is
typed ``dict[Any, Any]`` to avoid an import cycle with
``picopt.plugins.base``; consumers know its concrete shape.
- ``picopt/config/__init__.py``: ``PicoptConfig.get_config`` now returns
``PicoptSettings``. New ``_settings_from_attrdict`` does the field-by-field
copy (typed ``Any`` so it can read AttrDict attributes directly; the
conversion *is* the schema escape hatch).
- All consumers (``picopt/path.py``, ``picopt/report.py``,
``picopt/walk/*``, ``picopt/plugins/base/{handler,container}.py``) replace
``from confuse(.templates) import AttrDict`` with
``from picopt.config.settings import PicoptSettings``.
- ``Walk._init_timestamps`` builds a shallow ``Mapping[str, Any]`` for
Grovestamps' ``program_config=`` so we don't have to serialize the whole
frozen dataclass (especially the computed sub-fields, which carry
``re.Pattern`` and class-keyed ``dict`` objects).
- ``is_path_ignored`` switched from ``ignore and bool(...)`` to
``ignore is not None and bool(...)``; its return type is now honestly
``bool``, not ``bool | re.Pattern | None``.
- ``Walk._dump_timestamps`` joins the dumpf return paths via ``str(p)``
(treestamps' ``dumpf`` returns ``tuple[Path, ...]``).
pyproject pin: ``confuse~=2.1.0,<2.2.0`` → ``confuse~=2.2.0``.
Verification: ruff clean, basedpyright clean (0 errors, was 31),
``uv run pytest --deselect …test_timestamp_parents`` 149 passed (deselected
test is a pre-existing treestamps v4 regression), ``radon cc`` all A/B, CLI
smoke run with env var + CLI flag still layers correctly.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
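A minimal sketch of the pattern described above: copy a loosely typed, attribute-style config object into a frozen dataclass once, then pass only the dataclass downstream. The ``Settings`` class, its three fields, ``settings_from_loose``, and the ``is_path_ignored`` signature are simplified stand-ins chosen for illustration; they are not the real ``PicoptSettings`` schema or picopt's actual function signatures.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Settings:
    """Hypothetical, much smaller stand-in for a typed settings object."""

    bigger: bool
    jobs: int
    paths: tuple[str, ...]  # tuple rather than list: the dataclass is frozen


def settings_from_loose(ad: Any) -> Settings:
    """Field-by-field copy; ``Any`` is the single untyped escape hatch."""
    return Settings(
        bigger=bool(ad.bigger),
        jobs=int(ad.jobs),
        paths=tuple(str(p) for p in ad.paths),
    )


def is_path_ignored(ignore: re.Pattern[str] | None, path: str) -> bool:
    # ``ignore and bool(...)`` could return the pattern or None itself;
    # the explicit None check makes the result a real bool.
    return ignore is not None and bool(ignore.search(path))
```

After the conversion, a type checker sees ``settings.jobs`` as ``int`` rather than ``object``, which is the point of doing the copy in one place.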
Two warnings flagged by `make typecheck complexity`:
- ``picopt/config/__init__.py``: confuse 2.2.0 already types
``config.get(MappingTemplate)`` as ``AttrDict[str, object]``, so the
defensive ``isinstance(ad, AttrDict)`` check after it was statically
always-true (reportUnnecessaryIsInstance). Drop it and the now-unused
``AttrDict`` import.
- ``picopt/walk/scheduler.py``: ``_submit_ready_job`` was rank C in
radon. Split into three:
* ``_drop_cancelled_ready_job`` — counter cleanup for a leaf of a
cancelled subtree (UnpackJob/RepackJob skip; only leaf jobs
decrement the parent counter).
* ``_track_submitted_job`` — record an in-flight future under the
right map and update node state.
* ``_submit_ready_job`` — orchestrate: pop, cancelled-skip, submit,
track.
``_submit_ready_job`` is now rank A; the new helpers are A/B; behavior
unchanged.
Verified: ``make typecheck`` 0/0/0; ``make complexity`` no findings;
``uv run ruff check picopt/`` clean; ``uv run pytest`` 150 passed,
6 skipped.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch picopt's CLI to RawDescriptionRichHelpFormatter so the help output
is colorized. A PicoptHelpFormatter subclass adds one extra highlight regex
matching every registered format string (PNG, ZIP, WEBP, …) under the
`metavar` named group, so format names mentioned inside help text share the
color of their corresponding FORMATS / EXTRA_FORMATS / CONVERT_TO metavars.
Rewrite the dot-color-key / doctor-mode epilog as Rich markup instead of
capturing ANSI from a Console. rich-argparse renders descriptions and
epilogs through `console.use_theme(Theme(self.styles))`, so the epilog can
use [argparse.groups], [argparse.prog], and [argparse.args] to keep its
section header, program name, and `doctor` subcommand visually consistent
with how the rest of the help is rendered.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
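The extra highlight regex mentioned above could be built along these lines. This sketch only constructs and exercises the pattern with the stdlib ``re`` module; the function name and the sample format list are assumptions, and how PicoptHelpFormatter actually registers the pattern with rich-argparse is not shown in this log.

```python
import re


def format_highlight_regex(format_names: list[str]) -> str:
    """Build one regex matching any format name under the ``metavar`` group."""
    # Longest names first so a longer name is never shadowed by a shorter
    # one that happens to be its prefix.
    ordered = sorted(format_names, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    # Word boundaries keep "PNG" from matching inside a longer word.
    return rf"\b(?P<metavar>{alternation})\b"
```

Naming the group ``metavar`` is what lets a highlighter that styles by group name give format names in help text the same color as the metavars.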
detect_format eagerly iterated every frame of every animated image to
populate info["durations"], which only WebPMuxAnimatedLossless ever
consumed -- and that handler already had a webpmux-based fallback
(_read_durations) for when PIL's frame iteration was unreliable.
Removing the eager extraction speeds up format detection on animated images
(animated PNG ~11x, animated GIF ~4x in local microbenchmarks) and
simplifies the webp.walk path to always use the webpmux subprocess
fallback, which is the more reliable source per the original "PIL
frequently fails to populate per-frame durations on WebP" comment.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PdfDetector was opening every file and reading 1024 bytes for every
detection attempt -- including non-PDF files where it would read, fail the
magic check, and return None. On a directory of non-image non-archive
files, this was ~14us per file (88% of the non-PIL detector chain cost).
PathInfo now lazily reads up to 4 KB of file head into _header_bytes on
first access. PdfDetector consumes path_info.header_bytes() instead of
reopening the file.
Local benchmark of running the full non-PIL detector chain on non-image
files (pyproject.toml, README.md, uv.lock, Makefile):
~16us before -> ~2us after (8x).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
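The lazy header read can be sketched as follows. This is a simplified stand-in, not picopt's ``PathInfo`` or ``PdfDetector``: the ``reads`` counter, the ``looks_like_pdf`` helper, and the exact magic check are assumptions added so the caching behavior is visible.

```python
from __future__ import annotations

from pathlib import Path

HEADER_SIZE = 4096  # "up to 4 KB of file head", per the commit message


class PathInfo:
    """Hypothetical simplification: caches the file head on first access."""

    def __init__(self, path: str | Path) -> None:
        self.path = Path(path)
        self._header_bytes: bytes | None = None
        self.reads = 0  # illustration only: counts actual file opens

    def header_bytes(self) -> bytes:
        if self._header_bytes is None:
            with self.path.open("rb") as f:
                self._header_bytes = f.read(HEADER_SIZE)
            self.reads += 1
        return self._header_bytes


def looks_like_pdf(info: PathInfo) -> bool:
    # Assumed check: look for the PDF magic within the first 1024 bytes
    # of the cached head instead of reopening the file.
    return b"%PDF-" in info.header_bytes()[:1024]
```

Any other detector that needs the file head can call ``header_bytes()`` on the same object and share the single read.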
…#121)
PIL's verify() closes its internal fp for some Path-opened formats (notably
GIF), which then breaks lazy attrs like is_animated and n_frames. The
previous code worked around this by opening the file twice -- once to
verify, again to read format/info.
Reading the lazy attrs before verify() preserves the safety check (verify
still runs and propagates corruption errors via the same exception path)
while collapsing to a single Image.open.
Confirmed on Pillow 12.2 against GIF (still + animated), TIFF (single +
multi-frame), MPO, animated PNG, animated WebP, JPEG, BMP, PPM.
_extract_image_info per-file (warm cache, microbenchmark):
test_animated_gif.gif    165us -> 95us
test_animated_png.png    145us -> 113us
test_animated_webp.webp   66us -> 30us
test_png.png              74us -> 35us
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
handler time. Speeds up other animated images.