Describe the bug
@jakepetroules tells me (and I've looked at the code which seems to confirm) that at the moment, Subprocess makes the flawed assumption that killing the child process will EOF the file descriptors inherited into the child.
This assumption is not correct.
Consider the following scenario:
- Parent Process
P uses Subprocess to spawn a child process A streaming its output
A starts up and starts another child B (e.g. using fork()) inheriting over the file descriptor
A never writes anything to stdout
B sleeps for 1 hour and then writes hello world and exits
P now wants to abort its subprocess so it uses cancellation
- Cancellation will now kill
A even if A isn't responsive
- Subprocess however is still reading from the pipe inherited into
A and is waiting for EOF
- EOF won't arrive for an hour because that file desriptor is now in
B which is a process completely unknown to P
What needs to happen is what AsyncProcess does: The reading/writing needs to be cancelled too.
To Reproduce
Three (one for stdin, one for stdout and one for stderr) test cases similar to
func testWeDoNotHangIfStandardInputRemainsOpenButProcessExits() async throws {
// This tests an odd situation: The child exits but stdin is still not closed, mostly happens if we inherit a
// pipe that we still have another writer to.
var sleepPidToKill: CInt?
defer {
if let sleepPidToKill {
self.logger.debug(
"killing our sleep grand-child",
metadata: ["pid": "\(sleepPidToKill)"]
)
kill(sleepPidToKill, SIGKILL)
} else {
XCTFail("didn't find the pid of sleep to kill")
}
}
do { // We create a scope here to make sure we can leave the scope without hanging
let (stdinStream, stdinStreamProducer) = AsyncStream.makeStream(of: ByteBuffer.self)
let exe = ProcessExecutor(
executable: "/bin/sh",
[
"-c",
#"""
# This construction attempts to emulate a simple `sleep 12345678 < /dev/null` but some shells (eg. dash)
# won't allow stdin inheritance for background processes...
exec 2>&- # close stderr
exec 2<&0 # duplicate stdin into fd 2 (so we can inherit it into sleep
( # this creates a child of our child which gets inherited our fds
exec 0<&2 # map the duplicated fd 2 as our stdin
exec 2>&- # close the duplicated fd2
exec sleep 12345678 # sleep (this will now have the origin stdin as its stdin)
) & # uber long sleep that will inherit our stdin pipe
exec 2>&- # close duplicated 2
read -r line
echo "$line" # write back the line
echo "$!" # write back the sleep
exec >&-
exit 0
"""#
],
standardInput: stdinStream
)
stdinStreamProducer.yield(ByteBuffer(string: "GO\n"))
stdinStreamProducer.yield(ByteBuffer(repeating: 0x42, count: 16 * 1024 * 1024))
async let resultAsync = exe.runWithExtendedInfo()
async let stderrAsync = Array(exe.standardError)
var stdoutLines = await exe.standardOutput.splitIntoLines().makeAsyncIterator()
let lineGo = try await stdoutLines.next()
XCTAssertEqual(ByteBuffer(string: "GO"), lineGo)
let linePid = try await stdoutLines.next().map(String.init(buffer:))
let sleepPid = try XCTUnwrap(linePid.flatMap { CInt($0) })
self.logger.debug("found our sleep grand-child", metadata: ["pid": "\(sleepPid)"])
sleepPidToKill = sleepPid
let stderrBytes = try await stderrAsync
XCTAssertEqual([], stderrBytes)
let result = try await resultAsync
XCTAssertEqual(.exit(0), result.exitReason)
XCTAssertNotNil(result.standardInputWriteError)
XCTAssertEqual(ChannelError.ioOnClosedChannel, result.standardInputWriteError as? ChannelError)
stdinStreamProducer.finish()
}
}
should do the trick
Expected behavior
When a Subprocess.run gets cancelled, it should return control back to the caller within a short and reasonable, user-controllable amount of time. It must cancel the read/writes that communicate on the file descriptors inherited down.
Describe the bug
@jakepetroules tells me (and I've looked at the code which seems to confirm) that at the moment, Subprocess makes the flawed assumption that killing the child process will EOF the file descriptors inherited into the child.
This assumption is not correct.
Consider the following scenario:
PusesSubprocessto spawn a child processAstreaming its outputAstarts up and starts another childB(e.g. usingfork()) inheriting over the file descriptorAnever writes anything to stdoutBsleeps for 1 hour and then writeshello worldand exitsPnow wants to abort its subprocess so it uses cancellationAeven ifAisn't responsiveAand is waiting for EOFBwhich is a process completely unknown toPWhat needs to happen is what
AsyncProcessdoes: The reading/writing needs to be cancelled too.To Reproduce
Three (one for stdin, one for stdout and one for stderr) test cases similar to
should do the trick
Expected behavior
When a Subprocess.run gets cancelled, it should return control back to the caller within a short and reasonable, user-controllable amount of time. It must cancel the read/writes that communicate on the file descriptors inherited down.