Commit 6c825da

[Docker] Fix ssh zombie processes issue
SSHTunnel.open() starts the ssh process with the -f option, which causes it to become a daemon (it calls daemon(3) after authentication). SSHTunnel.close() then uses `ssh -O exit` to ask that daemon process to exit (the processes communicate via the control socket, -S option). The daemon exits, and PID 1 is then expected to reap it.

Before #3165, PID 1 was bash, which handles SIGCHLD and properly wait(2)s for its children. That pull request added exec, making `dstack server` PID 1. dstack does nothing with SIGCHLD (it neither handles nor explicitly ignores it), so the default disposition (ignore the signal but do not discard children) leads to an ever-growing number of unreaped zombies.

Fixes: #3291
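The failure mode described above can be reproduced outside a container. The following is a minimal sketch, not code from dstack: it assumes Linux with /proc mounted, and the helper names `zombie_children` and `demo_unreaped_child` are hypothetical. A wrapper bash starts a short-lived child and then uses exec to replace itself with `sleep`, which stands in for a process that never calls wait(2).

```shell
#!/usr/bin/env bash
# Sketch only: assumes Linux with /proc available. Helper names are hypothetical.

# Count direct children of PID $1 that are in state Z (zombie).
# /proc/<pid>/stat starts with: pid (comm) state ppid ...
# Processes whose comm contains spaces are misparsed and therefore skipped.
zombie_children() {
    local parent=$1 n=0 f pid comm state ppid rest
    for f in /proc/[0-9]*/stat; do
        read -r pid comm state ppid rest 2>/dev/null < "$f" || continue
        if [[ $ppid == "$parent" && $state == Z ]]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# The wrapper bash backgrounds `sleep 0.2`, then exec replaces bash with
# `sleep 3`, which keeps the same PID but never reaps its inherited child.
demo_unreaped_child() {
    bash -c 'sleep 0.2 & exec sleep 3' &
    local parent=$!
    sleep 1                          # give the short-lived child time to exit
    zombie_children "$parent"        # count zombies under the non-reaping parent
    kill "$parent" 2>/dev/null
    wait "$parent" 2>/dev/null || true
}

echo "zombie children under a non-reaping parent: $(demo_unreaped_child)"
```

Once the non-reaping parent is killed, the zombie is reparented and reaped by the system's init or subreaper. Inside a container there is no such fallback when the non-reaping process is itself PID 1.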
1 parent 14db582 commit 6c825da

1 file changed

Lines changed: 2 additions & 2 deletions

docker/server/entrypoint.sh

@@ -10,7 +10,7 @@ fi
 DB_PATH="${HOME}/.dstack/server/data/sqlite.db"
 mkdir -p "$(dirname "$DB_PATH")"
 if [[ -z "${LITESTREAM_REPLICA_URL}" ]]; then
-exec dstack server --host 0.0.0.0
+dstack server --host 0.0.0.0
 else
 if [[ ! -f "$DB_PATH" ]]; then
 echo "Attempting Litestream restore..."
@@ -23,5 +23,5 @@ else
 fi
 fi
 fi
-exec litestream replicate -exec "dstack server --host 0.0.0.0" "$DB_PATH" "$LITESTREAM_REPLICA_URL"
+litestream replicate -exec "dstack server --host 0.0.0.0" "$DB_PATH" "$LITESTREAM_REPLICA_URL"
 fi
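
The effect of dropping `exec` can be illustrated with a small sketch. It is not part of the commit, and the function names `pids_without_exec` and `pids_with_exec` are hypothetical. Each function prints two PIDs: that of a wrapper bash (standing in for the entrypoint script) and that of the command it launches (standing in for `dstack server`).

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not taken from the dstack repository.

# Without exec, the wrapper shell forks the command and stays alive as its
# parent, so the two PIDs differ. The trailing ":" keeps bash from turning
# the last command of the -c string into an implicit exec.
pids_without_exec() {
    bash -c 'echo $$; sh -c "echo \$\$"; :'
}

# With exec, the command replaces the wrapper shell and inherits its PID,
# so both lines print the same number.
pids_with_exec() {
    bash -c 'echo $$; exec sh -c "echo \$\$"'
}

echo "without exec: $(pids_without_exec | tr '\n' ' ')"
echo "with exec:    $(pids_with_exec | tr '\n' ' ')"
```

In a container the wrapper shell is PID 1. Without `exec` it remains PID 1, which per the commit message means bash is again the process that handles SIGCHLD and reaps orphaned children, while the server runs as its child.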