
Restore append_list dedup lost in d788e9f refactor #767

Merged
jjjake merged 2 commits into master from fix-append-list-dedupe-regression on May 4, 2026
jjjake (Owner) commented May 4, 2026

The prepare_metadata refactor in d788e9f dropped the duplicate check when appending to list fields via append_list=True. Repeated modify_metadata calls with append_list=True would accumulate duplicate values. This restores the original pre-refactor behavior: skip appending if the value is already present in the list.
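The restored behavior can be illustrated with a minimal sketch. This is not the library's actual code: `append_unique` is a hypothetical helper written only to show the "skip if already present" rule described above, and the handling of a missing or scalar field is an assumption.

```python
def append_unique(existing, value):
    """Return a list with `value` appended unless it is already present.

    Hypothetical helper for illustration; not part of internetarchive.
    """
    if existing is None:
        items = []                 # assumption: a missing field starts as an empty list
    elif isinstance(existing, list):
        items = list(existing)     # copy so the caller's list is not mutated
    else:
        items = [existing]         # assumption: a scalar field is promoted to a list
    if value not in items:         # the duplicate check that the refactor dropped
        items.append(value)
    return items


# Repeating the same append is idempotent once the check is in place.
subjects = ["cats"]
for _ in range(3):
    subjects = append_unique(subjects, "dogs")
print(subjects)  # ['cats', 'dogs']
```

Without the membership check, the loop above would leave three copies of "dogs" in the list, which is the regression this pull request fixes.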

jjjake merged commit e8ef059 into master May 4, 2026
23 checks passed
pull Bot pushed a commit to sysfce2/internetarchive that referenced this pull request May 16, 2026
Previously --glob accepted a single string and split on `|` to form
multiple patterns. Users would naturally reach for `--glob A --glob B`,
which argparse silently truncated to the last value -- files were
missed without warning. This change accepts all four forms consistently:

    --glob "a"               -> match a
    --glob "a|b"             -> match a or b   (unchanged)
    --glob a --glob b        -> match a or b   (new)
    --glob "a|b" --glob c    -> match a, b, c  (new)

The same applies to --exclude on `ia download` and --glob on
`ia delete` / `ia list`.
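The silent truncation described above is standard argparse behavior for the default `store` action, and can be reproduced in isolation. The parser below is a standalone sketch, not the actual `ia` CLI definition:

```python
import argparse

# With the default "store" action, a repeated flag overwrites the earlier value.
parser = argparse.ArgumentParser()
parser.add_argument("--glob")

args = parser.parse_args(["--glob", "*.mp4", "--glob", "*.xml"])
print(args.glob)  # *.xml
```

No error or warning is raised, so the first pattern is dropped without the user noticing.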

Implementation:

- Added _flatten_pipe_patterns() in item.py to normalize a glob arg
  (str or list[str], elements optionally `|`-separated) into a flat
  list. Item.get_files() now uses it for both glob_pattern and
  exclude_pattern, so API callers passing mixed forms like
  glob_pattern=['*.mp4|*.xml', '*.jpg'] now work.

- Switched the three CLI flags to `nargs=1, action="extend"` -- the
  same pattern already used by --format, --source, --exclude-source.
  args.glob / args.exclude are now list[str] | None; downstream code
  was already accepting both shapes via Item.get_files().

- ia_list's local pipe-split was updated to flatten the new list shape
  via itertools.chain.
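The two pieces above can be sketched together as follows. This is an approximation under stated assumptions, not the code from item.py: `flatten_pipe_patterns` is a hypothetical stand-in for `_flatten_pipe_patterns()`, and dropping empty segments is an assumption. The `nargs=1, action="extend"` combination is standard argparse (Python 3.8+).

```python
import argparse


def flatten_pipe_patterns(patterns):
    """Normalize a str or list[str] of `|`-separated globs into a flat list.

    Hypothetical stand-in for _flatten_pipe_patterns(); the real helper may differ.
    """
    if patterns is None:
        return []
    if isinstance(patterns, str):
        patterns = [patterns]
    flat = []
    for entry in patterns:
        # Assumption: empty segments (e.g. from "a||b") are dropped.
        flat.extend(part for part in entry.split("|") if part)
    return flat


# Each occurrence of --glob yields a one-element list that is appended to the
# accumulated list, so repeated flags are kept instead of overwritten.
parser = argparse.ArgumentParser()
parser.add_argument("--glob", nargs=1, action="extend")

for argv in (
    ["--glob", "a"],
    ["--glob", "a|b"],
    ["--glob", "a", "--glob", "b"],
    ["--glob", "a|b", "--glob", "c"],
):
    args = parser.parse_args(argv)
    print(argv, "->", flatten_pipe_patterns(args.glob))
```

When the flag is omitted, `args.glob` is `None`, which is why the helper treats `None` as an empty pattern list.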

Tests cover the four call styles in tests/cli/test_ia_download.py
using the in-process ia_call + IaRequestsMock + --dry-run pattern
(offline). API tests in tests/test_item.py extend
test_get_files_by_glob{,_with_exclude} to cover the new mixed-form
inputs.

Also includes carried-over 5.9.0 release scaffolding (changelog
entries for jjjake#753, jjjake#767, jjjake#768 and version bump to 5.9.0.dev1) that was
present in the working tree before branching from master.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>