Restore append_list dedup lost in d788e9f refactor #767
Merged
pull Bot pushed a commit to sysfce2/internetarchive that referenced this pull request on May 16, 2026
Previously --glob accepted a single string and split on `|` to form
multiple patterns. Users would naturally reach for `--glob A --glob B`,
which argparse silently truncated to the last value -- files were
missed without warning. This change accepts all four forms consistently:
--glob "a" -> match a
--glob "a|b" -> match a or b (unchanged)
--glob a --glob b -> match a or b (new)
--glob "a|b" --glob c -> match a, b, c (new)
The same applies to --exclude on `ia download` and --glob on
`ia delete` / `ia list`.
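The truncation described above can be reproduced with plain argparse. This is a minimal sketch, not the project's actual CLI setup: the two parsers are hypothetical stand-ins, with the first using the default `store` action and the second using `nargs=1, action="extend"`.

```python
import argparse

# Hypothetical "before" parser: the default "store" action means a
# repeated flag silently keeps only the last value.
old = argparse.ArgumentParser()
old.add_argument("--glob")
old_args = old.parse_args(["--glob", "A", "--glob", "B"])

# Hypothetical "after" parser: each occurrence yields a one-element list
# that is extended onto the destination, so nothing is dropped.
new = argparse.ArgumentParser()
new.add_argument("--glob", nargs=1, action="extend")
new_args = new.parse_args(["--glob", "a|b", "--glob", "c"])
no_flag = new.parse_args([])

print(old_args.glob)  # B
print(new_args.glob)  # ['a|b', 'c']
print(no_flag.glob)   # None
```

Note that the `|`-separated element is still unsplit at this stage; argparse only collects the raw strings, and splitting is a separate step.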
Implementation:
- Added _flatten_pipe_patterns() in item.py to normalize a glob arg
(str or list[str], elements optionally `|`-separated) into a flat
list. Item.get_files() now uses it for both glob_pattern and
exclude_pattern, so API callers passing mixed forms like
glob_pattern=['*.mp4|*.xml', '*.jpg'] now work.
- Switched the three CLI flags to `nargs=1, action="extend"` -- the
same pattern already used by --format, --source, --exclude-source.
args.glob / args.exclude are now list[str] | None; downstream code
was already accepting both shapes via Item.get_files().
- ia_list's local pipe-split was updated to flatten the new list shape
via itertools.chain.
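The normalization step described in the list above might look roughly like the following. This is a sketch under stated assumptions, not the code added to item.py: the function name `flatten_pipe_patterns` is a stand-in, and stripping whitespace and skipping empty segments are assumptions the commit message does not confirm.

```python
from typing import List, Optional, Union


def flatten_pipe_patterns(patterns: Optional[Union[str, List[str]]]) -> List[str]:
    """Normalize a str or list[str] of optionally `|`-separated globs.

    Hypothetical stand-in for the helper named in the commit message.
    """
    if patterns is None:
        return []
    if isinstance(patterns, str):
        patterns = [patterns]
    flat: List[str] = []
    for entry in patterns:
        for part in entry.split("|"):
            part = part.strip()  # assumption: surrounding whitespace is ignored
            if part:             # assumption: empty segments are dropped
                flat.append(part)
    return flat


print(flatten_pipe_patterns("a|b"))                     # ['a', 'b']
print(flatten_pipe_patterns(["*.mp4|*.xml", "*.jpg"]))  # ['*.mp4', '*.xml', '*.jpg']
print(flatten_pipe_patterns(None))                      # []
```

Accepting both a bare string and a list in one helper lets the CLI pass its accumulated list while existing API callers keep passing a single string.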
Tests cover the four call styles in tests/cli/test_ia_download.py
using the in-process ia_call + IaRequestsMock + --dry-run pattern
(offline). API tests in tests/test_item.py extend
test_get_files_by_glob{,_with_exclude} to cover the new mixed-form
inputs.
Also includes carried-over 5.9.0 release scaffolding (changelog
entries for jjjake#753, jjjake#767, jjjake#768 and version bump to 5.9.0.dev1) that was
present in the working tree before branching from master.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `prepare_metadata` refactor in d788e9f dropped the duplicate check when appending to list fields via `append_list=True`. Repeated `modify_metadata` calls with `append_list=True` would accumulate duplicate values. This restores the original pre-refactor behavior: skip appending if the value is already present in the list.
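The restored behavior can be illustrated in isolation. The helper below is a hypothetical sketch, not the library's `prepare_metadata` code: `append_unique` and its handling of scalar or missing fields are assumptions made for the example.

```python
from typing import Any, List


def append_unique(existing: Any, new_value: Any) -> List[Any]:
    """Append new_value to a list field unless it is already present.

    Hypothetical helper; scalar and None handling are assumptions.
    """
    if existing is None:
        values: List[Any] = []
    elif isinstance(existing, list):
        values = list(existing)  # copy so the caller's list is not mutated
    else:
        values = [existing]      # promote a scalar field to a list
    if new_value not in values:  # the duplicate check the refactor dropped
        values.append(new_value)
    return values


subjects = ["maps"]
subjects = append_unique(subjects, "atlases")
subjects = append_unique(subjects, "atlases")  # repeated call is a no-op
print(subjects)  # ['maps', 'atlases']
```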