Allow running processes in such a way that stop signals are preserved#3461
Allow running processes in such a way that stop signals are preserved#3461rabidaudio wants to merge 1 commit intoconvox:masterfrom
Conversation
|
I wasn't totally correct when I said import signal
import sys
import time
import os
signals = [
signal.SIGINT, signal.SIGTERM,
signal.SIGQUIT, signal.SIGTRAP, signal.SIGUSR1,
signal.SIGHUP, signal.SIGSEGV, signal.SIGUSR2,
signal.SIGILL,
]
def handler(signum, frame):
s = [s for s in signals if s.value == signum]
print(f"{os.getpid()} got signal: {s}")
sys.exit(0)
def main():
for s in signals:
signal.signal(s, handler)
print("running....")
while True:
time.sleep(5)
if __name__ == "__main__":
main()FROM python:3
ENV PYTHONUNBUFFERED=true
COPY test.py test.py
CMD ["python3", "test.py"]
$ sh -c 'python3 test.py' &
[1] 15280
running....
$ kill 15280
15280 got shutdown: 15
[1]+ Done sh -c 'python3 test.py'... and without $ docker build -t signal-test .
$ docker run -d -t signal-test python3 test.py
e174b1b026812cfe8c10223de854210cd183b8de0bc36cbc24f3cd356dfc27a4
$ docker stop e174b1b026812cfe8c10223de854210cd183b8de0bc36cbc24f3cd356dfc27a4
e174b1b026812cfe8c10223de854210cd183b8de0bc36cbc24f3cd356dfc27a4
$ docker logs e174b1b026812cfe8c10223de854210cd183b8de0bc36cbc24f3cd356dfc27a4
running....
1 got shutdown: 15...but running $ docker run -d -t signal-test sh -c 'python3 test.py'
e605023a84f3215c77bb31619b5a2879e867d389c781c84b38b61f134b717095
$ docker stop e605023a84f3215c77bb31619b5a2879e867d389c781c84b38b61f134b717095
# hangs for 30 seconds waiting for process to exit due to SIGTERM, then sends SIGKILL
e605023a84f3215c77bb31619b5a2879e867d389c781c84b38b61f134b717095
$ docker logs e605023a84f3215c77bb31619b5a2879e867d389c781c84b38b61f134b717095
running.... |
|
Okay I feel a little silly now. The problem is on Docker Linux environments RUN ln -f $(which bash) $(which sh)This then works for timers too. |
@rabidaudio would that have to be done in each app's Dockerfile? |
|
@Twsouza yes, we put this at the beginning of all of our dockerfiles now |
One-off processes, including
convox run,convox exec, and timers don't pass stop signals to processes properly.For exec and non-detached runs, this makes sense, as Convox runs a
sleep {timeout}command in the container and then attaches a secondary process to it (a ladocker exec). But for timers and detached runs, SIGTERM is never received by the process. The problem stems from the use of["sh", "-c", command], assh -cdoes not propagate signals to apps.This violates 12 factor as processes aren't given the opportunity to run any cleanup tasks. While apps should also be robust against the occasional SIGTERM, Convox should do it's best to allow more graceful stoppage.
Our use-case revolves around spinning up ETL pipelines on regular intervals, but really any worker process use-case that that wants to use Convox's autoscaling features to create processes on-demand will benefit from this.
This change allows you to disable the shell wrapper for detached runs. Simply removing
sh -cwould likely be a breaking change for users who have come to expect their commands to be run in a shell, so instead I put the behavior behind a flag. I called is flag "bare" but I don't love the name, perhaps[--no]-shell,--shell=falseor similar would be better. Alternatively, we could use a rack parameter, or we could make this the default behavior going forward with a version check. Looking for guidance on this.I only added support to the
awsprovider, as thek8sprovider (andlocalby extension) already behave this way.Note this MR does not fix the same issue for timers, as I wasn't sure of a good backwards-compatible solution.
By the way, detached runs don't actually support the timeout option. This could be resolved by using the unix timeout command. I could add that to this MR also.