curl-prgrs: fix bugs and edge cases #18
Conversation
- print_progress: guard against Content-Length: 0 division by zero.
- initialize_variables: dependency precheck honors $CURL (was hardcoded
to "curl" even when CURL=scurl was set). Use the repo-wide `has`
helper from has.sh.
- shutdown: preserve specific exit codes written by curl_exit (81,
114, 115, 116, real curl codes) instead of always overriding with
the generic 110 when ERR fires. Skip temp-dir cleanup in
backgrounded subshells (BASHPID != $$) so the parent can still read
statusfile. Defensive ${var-} for very-early failures.
- Open FD 4 (initialize_terminal) before installing the trap that
writes diagnostics to >&4.
- scan_curl_arguments: replaces remove_argument_for_header_request.
Single pass over argv. Strips the --name=value form of
--continue-at/--output, plus body-affecting flags (-O, --remote-name,
-J, --remote-header-name, --remote-name-all) that must not run during
the HEAD probe. Also detects --compressed/--tr-encoding/
--compressed-ssh and disables the strict 114/115 size-equality
checks since transparent decompression makes the on-disk size differ
from the server-reported Content-Length.
- handle_unknown_content_length: add CURL_PRGRS_ALLOW_UNKNOWN_LENGTH
opt-in for chunked-transfer / dynamic responses that omit
Content-Length. CURL_PRGRS_MAX_FILE_SIZE_BYTES still bounds the
download.
- HEAD probe: explicitly set curl_prgrs_size_check_enabled=no since
expected_header_size=8000 is a placeholder, not a real prediction,
and would otherwise produce false 114 mismatches when real headers
exceed 8000 bytes.
- Drop dead `last_err` variable.
- Drop the CURL_PRGRS_ALLOW_UNKNOWN_LENGTH opt-in. Missing/non-numeric
  Content-Length now always falls through to the unknown-length code
  path (no progress bar, no 114/115 checks); the 81 max-filesize cap
  still bounds the download. curl_exit 116 is no longer reachable.
- Collapse the redundant curl_prgrs_size_check_enabled flag into
  curl_prgrs_length_known. The two were always set together in the
  unknown-length path, and when --compressed/--tr-encoding/
  --compressed-ssh make the on-disk size differ from the reported
  Content-Length, the progress bar would also be misleading, so it is
  now suppressed in that case as well. One flag, one source of truth.
- Close the max-filesize race: the TUF endless-data mitigation (81)
  only ran inside the 1-second polling loop, so a download that
  completed between iterations could leave an over-limit file on disk
  and exit 0. Extract enforce_max_filesize() and call it once more
  after the loop exits on curl's termination.
- Add has checks for stecho and stcat (previously assumed present but
  never validated). Drop has tput: neither curl-prgrs nor
  draw_progress_bar actually uses tput.
- Propagate SIGTERM/SIGINT/SIGHUP to curl. curl_pid is assigned inside
  the backgrounded curl_download subshell and never reaches the parent,
  so signals to the script were orphaning the download. The subshell
  now writes curl_pid to a shared pidfile; the parent's shutdown reads
  it and kills curl directly. Also capture curl_download_pid in main()
  so the subshell itself is killed.
- Map signals to conventional exit codes (130/143/129 for
  SIGINT/SIGTERM/SIGHUP) when the statusfile has no more specific
  recording. Previously a SIGTERM mid-download would inherit the 0
  written by the preceding HEAD probe, so callers saw "success" despite
  the user aborting.
- Strip body-writing / method-changing flags (-d/--data*, -F/--form*,
  -T/--upload-file, -X/--request, -G/--get) from the HEAD probe
  arguments. Without this, -X GET would override the --head we add
  (turning the probe into a full download), and -d would send a request
  body with the HEAD.
- Remove the dead temp_dir_auto_generated flag; it was always true and
  never set elsewhere.
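The signal-to-exit-code mapping above can be sketched as follows. The helper name is hypothetical; in the real script, shutdown consults the statusfile first and only falls back to the conventional 128+N mapping when no more specific code was recorded.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map the trapped signal name to its conventional
# 128+N exit code, falling back to the generic 110 used elsewhere.
signal_to_exit_code() {
  case "$1" in
    SIGINT)  echo 130 ;;   # 128 + 2
    SIGTERM) echo 143 ;;   # 128 + 15
    SIGHUP)  echo 129 ;;   # 128 + 1
    *)       echo 110 ;;   # generic failure fallback
  esac
}
```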
Remove the '&' that backgrounded the main curl_download. The only
benefit it provided was making 'wait' interruptible by signal traps
immediately, instead of waiting up to ~1s for the current 'sleep 1' in
the polling loop to return. That 1-second trap-firing latency is an
acceptable abort delay. Running curl_download in the parent shell lets
curl_pid be directly accessible from the shutdown handler, which
removes the need for:
- the curl_pidfile helper (and its write/read/validate dance)
- the curl_download_pid tracking in main
- the wait "$curl_download_pid" / wait_exit_code dance
- shutdown's multi-PID processes_list loop
A failure in curl_download now propagates via set -e -> ERR trap ->
shutdown, which reads the specific status from the statusfile, so the
explicit 'exit "$wait_exit_code"' is also gone. Net -27 LOC and one
shutdown path instead of two (parent + subshell). The HEAD probe is
still a $(...) subshell, so the BASHPID guard on temp-dir cleanup is
retained.
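A minimal sketch of the resulting in-parent download shape, combining this change with the enforce_max_filesize extraction above. out_file, max_bytes, and the stand-in command are illustrative; the real script runs the actual curl command line and also writes a statusfile.

```shell
#!/usr/bin/env bash
# Illustrative stand-ins; not the real script's variable names.
out_file=$(mktemp)
max_bytes=100

enforce_max_filesize() {
  local size
  size=$(stat -c %s -- "$out_file" 2>/dev/null || echo 0)  # GNU stat
  if [ "$size" -gt "$max_bytes" ]; then
    kill "$curl_pid" 2>/dev/null || true
    return 81          # TUF endless-data mitigation code
  fi
}

curl_download() {      # runs in the parent shell: no trailing '&'
  "$@" &               # only the transfer itself is backgrounded,
  curl_pid=$!          # so curl_pid is visible to the shutdown handler
  while kill -0 "$curl_pid" 2>/dev/null; do
    sleep 1
    enforce_max_filesize || return $?
  done
  enforce_max_filesize # re-check after the loop: closes the race where
}                      # the download finishes between poll iterations
```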
The HEAD probe used '$(...)' command substitution to capture curl's
--write-out output into curl_prgrs_content_length. That spawned a
subshell in which curl_pid was local, so signal handling differed
from the main-download path and required the BASHPID-vs-$$ gate on
temp-dir cleanup.
Redirect curl_download's stdout into a small file in the temp dir
instead, and read it with the 'read' builtin (in the parent shell,
no subshell). 'read' returns 1 when the file has no trailing
newline (curl --write-out omits it), so '|| true' keeps set -e
happy; the captured content is still assigned. Also guard against
a missing file with test -f.
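The no-subshell capture can be sketched like this; the temp-dir layout and file name are illustrative, and printf stands in for curl's redirected --write-out output.

```shell
#!/usr/bin/env bash
set -e
tmp=$(mktemp -d)
# Simulate curl --write-out writing a value with no trailing newline.
printf '%s' 1048576 > "$tmp/content_length"

curl_prgrs_content_length=""
if test -f "$tmp/content_length"; then
  # read returns 1 at EOF-without-newline, but the assignment sticks;
  # '|| true' keeps set -e from treating that as a failure.
  read -r curl_prgrs_content_length < "$tmp/content_length" || true
fi
```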
Refactor the two curl_pid dead-or-alive guards from the
':=""' / non-empty / kill-0 triple-nested form to a single
'test -n "${curl_pid-}" && kill -0 ...' expression.
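The collapsed guard reads as one short-circuit expression; wrapping it in a function here is only for illustration.

```shell
#!/usr/bin/env bash
# True only when curl_pid is set, non-empty, and names a live process.
# ${curl_pid-} expands to "" when unset, so this is safe under set -u.
curl_is_alive() {
  test -n "${curl_pid-}" && kill -0 "$curl_pid" 2>/dev/null
}
```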
- Fix TUF bypass. Previously -o/--output/-O/--remote-name/-J/
--remote-header-name/--remote-name-all were only stripped from the
HEAD probe. The main download still received them, so a caller
could pass -O and curl would write the body to a URL-derived
filename that our stat-based 81 max-filesize watchdog never saw,
defeating the TUF endless-data mitigation entirely.
scan_curl_arguments now builds two arrays -- header_arguments
and download_arguments -- and strips output-writing flags from
both. main() pins --output "$CURL_OUT_FILE" ahead of user args so
curl always writes to the file we stat. Verified: passing -O with
a max=100 KB and a 10 MB response now exits 81 instead of
silently downloading the full body.
- Bound Content-Length with sanitize-string. A server-supplied
Content-Length was format-checked (is_whole_number) but not
length-capped, so a 50-digit Content-Length would cause bash's
`[ X -gt Y ]` to error on int64 overflow, DoSing via the ERR
trap. sanitize-string -- 18 caps to ~1 EB, safely within int64.
Verified: 50-nines Content-Length now completes gracefully (115
mismatch against the truncated body) instead of crashing.
- Also strip --write-out / -w from the HEAD probe so a caller's
--write-out cannot win over our '%header{Content-Length}'.
- Uppercase signal tokens (SIGINT/SIGTERM/SIGHUP/ERR/EXIT) to match
bash's conventional signal names, everywhere: trap commands,
shutdown()'s if/case labels.
- Add `--` end-of-options on stat, which is caller-controlled via
$CURL_OUT_FILE. (safe-rm, kill, mktemp, sanitize-string already
use it or take no untrusted positional.)
- Add missing has checks for stat, sleep, sanitize-string.
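A compressed sketch of the two-array split described above. Flag coverage is deliberately abridged: the real scan_curl_arguments also handles --name=value forms, --continue-at, the body-writing flags, and the compression detection, and main() separately pins --output "$CURL_OUT_FILE" ahead of the user arguments.

```shell
#!/usr/bin/env bash
set -u
scan_curl_arguments() {
  header_arguments=() download_arguments=()
  while [ $# -gt 0 ]; do
    case "$1" in
      -O|--remote-name|-J|--remote-header-name|--remote-name-all)
        ;;                              # stripped from BOTH (TUF fix)
      -o|--output)
        shift ;;                        # drop flag and its value from both
      -X|--request|-w|--write-out)      # the HEAD probe must not see these
        download_arguments+=("$1" "$2"); shift ;;
      *)
        header_arguments+=("$1"); download_arguments+=("$1") ;;
    esac
    shift
  done
}
```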
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 33fb83b300
```
-G|--get)
  keep_head=false
```
Preserve query-building flags for the HEAD probe
Removing -G/--get from header_arguments makes the HEAD probe target a different request than the actual download when callers use -G --data ... (curl documents -G as “Put the post data in the URL and use GET”). In that case the probe can read a Content-Length for the wrong resource, and the later size checks (114/115) can fail a valid download because they compare against an unrelated length.