ext/curl: add CURLOPT_SEEKFUNCTION#22230
Open
GrahamCampbell wants to merge 1 commit into
Open
Conversation
1b5ebea to
ca985de
Compare
Expose libcurl's CURLOPT_SEEKFUNCTION as a userland callable so a request body streamed via CURLOPT_READFUNCTION can be rewound and resent when libcurl needs to replay it: on a redirect, on multi-pass authentication (NTLM/Negotiate), or when a reused connection drops after bytes have been sent. Without a seek callback those transfers fail with CURLE_SEND_FAIL_REWIND, the gap behind bug #47204 and bug #80518. The callback receives the CurlHandle, the offset and the origin (SEEK_SET, SEEK_CUR or SEEK_END) and returns one of CURL_SEEKFUNC_OK, CURL_SEEKFUNC_FAIL or CURL_SEEKFUNC_CANTSEEK. The implementation mirrors the existing callback options: a seek FCC on php_curl_handlers, a curl_seek trampoline that validates the return value the way curl_prereqfunction does, registration through the HANDLE_CURL_OPTION_CALLABLE macro, and handling in init, setopt, copy (curl_copy_handle), reset (curl_reset), free and the cycle collector (curl_get_gc). The option and constants have existed since libcurl 7.18.0, below the 7.61.0 configure floor, so no version guard is needed. Tests cover the happy-path rewind across a redirect, copy-handle inheritance, callable validation, and the callback error paths.
ca985de to
6c293e4
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Today PHP lets you stream a request body from userland through CURLOPT_READFUNCTION, but it never gives libcurl a matching seek callback for that body. That's fine right up until libcurl needs to rewind the upload and send it again, which happens more often than you'd think: on a 307 or 308 redirect, during multi-pass authentication like NTLM or Negotiate, or when a reused keep-alive connection dies after some bytes have already gone out. With no way to rewind, the transfer just fails with CURLE_SEND_FAIL_REWIND (curl error 65). This is the gap behind the very old https://bugs.php.net/bug.php?id=47204, open since 2009, and the more recent https://bugs.php.net/bug.php?id=80518.
libcurl has supported CURLOPT_SEEKFUNCTION since 7.18.0; we just never exposed it for the read-callback body. The only seek callback we register internally is the one on the curl_mime/CURLFile path. So every userland HTTP client has had to work around this itself. Guzzle catches error 65 (and the older errno-0 "silent" variant of the same problem) and rewinds the body in PHP before re-issuing the request, and Symfony's HttpClient forces CURLOPT_FORBID_REUSE for NTLM because, as its own comment puts it, reusing those connections needs seeking capability that only string bodies have. Letting people set a seek callback means libcurl can just rewind and resend on its own.
So this exposes CURLOPT_SEEKFUNCTION. You give it a callable that receives the CurlHandle, the offset and the origin (SEEK_SET, SEEK_CUR or SEEK_END), and returns CURL_SEEKFUNC_OK when it has repositioned the body, CURL_SEEKFUNC_FAIL to fail the transfer, or CURL_SEEKFUNC_CANTSEEK when it can't seek and wants to let libcurl deal with it. In practice it looks like this:
The implementation follows the existing callback options as closely as I could. php_curl_handlers gets a seek fcc field next to the other callbacks, a curl_seek trampoline forwards the call into userland and validates the return value the same way curl_prereqfunction does, registration goes through the usual HANDLE_CURL_OPTION_CALLABLE macro (which also points CURLOPT_SEEKDATA at the handle), and the callback is duplicated in curl_copy_handle and released with everything else. When no callback is set, or a callback misbehaves, the trampoline defaults to CURL_SEEKFUNC_CANTSEEK so libcurl never ends up resending from the wrong offset. No version guards are needed, since the option and its return constants have all been around since libcurl 7.18.0, comfortably below our 7.61.0 floor.
Finally, I've implemented a proof of concept cleanup of the relevant Guzzle code that would benefit from this feature at guzzle/guzzle@c2690b5.