Commit ead324c

[Docker] Fix ssh zombie processes issue (#3295)
SSHTunnel.open() starts the ssh process with the -f option, which turns it into a daemon (it calls daemon(3) after authentication). SSHTunnel.close() then uses `ssh -O exit` to ask that daemon to exit (the processes communicate via the control socket, the -S option). Once the daemon exits, PID 1 is expected to reap it.

Before #3165, PID 1 was bash, which handles SIGCHLD and properly wait(2)s for its children. That pull request added exec, making `dstack server` PID 1. dstack does nothing with SIGCHLD (it neither handles nor explicitly ignores it), so the default disposition (the signal is ignored, but terminated children are not discarded) leads to an ever-growing number of unreaped zombies.

Fixes: #3291
1 parent 14db582 commit ead324c

1 file changed: +2 -2 lines changed

docker/server/entrypoint.sh

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@ fi
 DB_PATH="${HOME}/.dstack/server/data/sqlite.db"
 mkdir -p "$(dirname "$DB_PATH")"
 if [[ -z "${LITESTREAM_REPLICA_URL}" ]]; then
-    exec dstack server --host 0.0.0.0
+    dstack server --host 0.0.0.0
 else
     if [[ ! -f "$DB_PATH" ]]; then
         echo "Attempting Litestream restore..."
@@ -23,5 +23,5 @@ else
             fi
         fi
     fi
-    exec litestream replicate -exec "dstack server --host 0.0.0.0" "$DB_PATH" "$LITESTREAM_REPLICA_URL"
+    litestream replicate -exec "dstack server --host 0.0.0.0" "$DB_PATH" "$LITESTREAM_REPLICA_URL"
 fi
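
The effect of dropping `exec` can be checked outside a container with a small sketch. It is not part of the commit and only illustrates the PID behavior the commit message relies on: with `exec`, the command takes over the shell's PID (inside the container, PID 1); without it, the command runs as a child and bash keeps its own PID. The variable names are illustrative, and the trailing `true` is there only so that bash cannot silently exec the last command of the `-c` string.

```shell
#!/usr/bin/env bash
# With exec: bash is replaced by sh, so both lines show the same PID.
with_exec=$(bash -c 'echo "$$"; exec sh -c "echo \$\$"')

# Without exec: sh runs as a child of bash, so the two PIDs differ
# and bash remains the parent process.
without_exec=$(bash -c 'echo "$$"; sh -c "echo \$\$"; true')

echo "with exec (shell PID, command PID):"
echo "$with_exec"
echo "without exec (shell PID, command PID):"
echo "$without_exec"
```

In the entrypoint this means that, after the change, bash stays PID 1 and `dstack server` (or `litestream`) runs as its child, so orphaned ssh daemons are reparented to a process that reaps them.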
