Commit fedd403

Enhance Docker setup documentation by providing detailed explanations for the backend/Dockerfile and entrypoint.sh scripts. The updates clarify the build process, environment setup, dependency management with Poetry, and the entrypoint logic for handling service commands. This aims to improve understanding and usability for developers working with the SecuLite platform.
1 parent 48c4181 commit fedd403

1 file changed

Lines changed: 215 additions & 2 deletions

docs/platform_plan/05_docker_setup.md

@@ -458,13 +458,226 @@ With this, the definition of all services, volumes, and networks for the `docker
458458

459459
## 4. `backend/Dockerfile` Details
460460

461-
*(This section will detail the steps: base Python image, working directory, environment variables like `PYTHONUNBUFFERED`, installing dependencies from `requirements.txt`, copying app code, and user setup. CMD or ENTRYPOINT will refer to `entrypoint.sh`.)*
461+
This Dockerfile is responsible for creating the image for the Django backend, Celery worker, and Celery beat services. It sets up the Python environment, installs dependencies, and copies the application code.
462+
463+
```dockerfile
# Stage 1: Base Python image and core dependencies
FROM python:3.11-slim-bullseye AS base

# PYTHONDONTWRITEBYTECODE stops Python from writing .pyc files;
# PYTHONUNBUFFERED sends stdout/stderr straight to the terminal (and Docker logs)
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# Set the working directory in the container
WORKDIR /app

# Install system dependencies required for Python packages (e.g., psycopg2, Pillow)
# Keep this minimal; add more as identified by package installation failures.
# postgresql-client provides the psql binary that entrypoint.sh uses to wait for the database.
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    build-essential \
    libpq-dev \
    postgresql-client \
    # For Pillow (if image processing is needed)
    # libjpeg-dev zlib1g-dev libtiff-dev libfreetype6-dev liblcms2-dev libwebp-dev \
    # For other common Python packages
    # libffi-dev libssl-dev \
    && apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Install Poetry (Python package manager)
# Using a specific version for reproducibility
ARG POETRY_VERSION=1.7.1
RUN pip install "poetry==${POETRY_VERSION}"

# Assumption: install into the system interpreter instead of a Poetry-managed virtualenv,
# so that commands such as gunicorn and celery are on PATH when entrypoint.sh runs `exec "$@"`.
ENV POETRY_VIRTUALENVS_CREATE false

# Copy only the files necessary for installing dependencies with Poetry
# (paths are relative to the build context, assumed to be the repository root)
COPY ./backend/poetry.lock ./backend/pyproject.toml /app/

# Install project dependencies using Poetry
# --no-root: Do not install the project itself, only its dependencies
# --no-dev: Do not install development dependencies (for a leaner production image)
RUN poetry install --no-root --no-dev

# Stage 2: Application code and runtime setup (can be same as base or a new stage)
# For simplicity here, we continue from the base stage.
# FROM base AS runtime

# Copy the rest of the backend application code into the working directory
COPY ./backend /app/

# Create a non-root user for running the application for better security
# ARG APP_USER=appuser
# RUN useradd -ms /bin/bash ${APP_USER}
# RUN chown -R ${APP_USER}:${APP_USER} /app
# USER ${APP_USER}
# Note: If using a non-root user, ensure entrypoint.sh and Gunicorn/Celery can run as this user,
# and that file permissions (especially for media/static volumes) are handled correctly.
# For simplicity in this initial plan, we will run as root, but a non-root user is recommended for production.

# Expose the port Gunicorn will run on (same as in docker-compose.yml)
EXPOSE 8000

# The entrypoint script will handle migrations and then execute the CMD
# CMD is defined in the docker-compose.yml for each service (backend, worker, beat)
ENTRYPOINT ["/app/entrypoint.sh"]
```
523+
524+
**Key Steps and Explanations:**
525+
526+
1. **`FROM python:3.11-slim-bullseye AS base`**:
527+
* Starts from an official Python 3.11 slim image based on Debian Bullseye. Slim images are smaller than the full Debian images.
528+
* `AS base`: Names this stage `base` so that later stages can reference it in a multi-stage build. This initial version has only one active stage, so the name is not used yet.
529+
530+
2. **`ENV PYTHONDONTWRITEBYTECODE 1` and `ENV PYTHONUNBUFFERED 1`**:
531+
* `PYTHONDONTWRITEBYTECODE`: Prevents Python from writing `.pyc` files to disk, which keeps bytecode caches out of the image and out of any bind-mounted source directories.
532+
* `PYTHONUNBUFFERED`: Ensures that Python output (e.g., print statements, logs) is sent directly to the terminal without being buffered, which is important for Docker logging.
533+
534+
3. **`WORKDIR /app`**:
535+
* Sets the default working directory for subsequent `RUN`, `CMD`, `ENTRYPOINT`, `COPY`, and `ADD` instructions.
536+
537+
4. **`RUN apt-get update && apt-get install -y ...`**:
538+
* Installs necessary system-level dependencies. `build-essential` and `libpq-dev` are common for Django projects using PostgreSQL (for compiling `psycopg2`).
539+
* Includes commented-out examples for other common libraries like those needed for Pillow or `libffi`.
540+
* `--no-install-recommends` reduces unnecessary packages.
541+
* `apt-get clean && rm -rf /var/lib/apt/lists/*` cleans up apt cache to keep the image size down.
542+
543+
5. **`ARG POETRY_VERSION=1.7.1` and `RUN pip install "poetry==${POETRY_VERSION}"`**:
544+
* Installs Poetry, a dependency management tool for Python. Using a specific version ensures reproducibility.
545+
546+
6. **`COPY ./backend/poetry.lock ./backend/pyproject.toml /app/`**:
547+
* Copies only the `poetry.lock` and `pyproject.toml` files first. This leverages Docker's layer caching: if these files have not changed, Docker reuses the cached layer for the next step (dependency installation), which speeds up builds when only application code changes.
548+
549+
7. **`RUN poetry install --no-root --no-dev`**:
550+
* Installs Python dependencies defined in `pyproject.toml` using the versions specified in `poetry.lock`.
551+
* `--no-root`: Skips installing the project package itself (which Poetry would otherwise install in editable mode), so only its dependencies are installed.
552+
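* `--no-dev`: Excludes development dependencies (like testing tools) to keep the production image lean. For a development-specific stage, you might omit `--no-dev`. Note that Poetry 1.2 and later mark `--no-dev` as deprecated in favor of `--without dev` or `--only main`; the pinned 1.7.1 still accepts it.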
553+
554+
8. **`COPY ./backend /app/`**:
555+
* Copies the entire `./backend` directory (containing your Django project) into the `/app` directory in the image.
556+
557+
9. **Non-Root User (Commented Out for Initial Simplicity)**:
558+
* The commented-out section shows how to create and switch to a non-root user (`appuser`). This is a security best practice for production.
559+
* *Decision*: For the initial setup, we'll proceed with the root user to simplify volume permissions and Gunicorn/Celery execution, but this should be revisited and implemented before any production deployment.
560+
561+
10. **`EXPOSE 8000`**:
562+
* Documents that the container will listen on port 8000 at runtime. This doesn't actually publish the port; publishing is done in `docker-compose.yml`.
563+
564+
11. **`ENTRYPOINT ["/app/entrypoint.sh"]`**:
565+
* Specifies the `entrypoint.sh` script (detailed in Section 5) as the command to be executed when the container starts.
566+
* The actual command to run the application (Gunicorn, Celery worker, Celery beat) will be passed as an argument to this entrypoint script from the `command` directive in the `docker-compose.yml` file for each respective service.
567+
568+
This Dockerfile provides a solid foundation for building the backend application image. It prioritizes caching, dependency management with Poetry, and includes placeholders for security best practices like using a non-root user.
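
The `COPY` paths in the Dockerfile (`./backend/poetry.lock`, `./backend/pyproject.toml`, `./backend`) are resolved relative to the build context, which suggests the image is meant to be built from the repository root. A small pre-build check along the following lines could catch a wrong context or a non-executable entrypoint early. This is only a sketch: the function name `check_build_context` is hypothetical and not part of the project.

```shell
#!/bin/sh
# Hypothetical helper: verify that a directory looks like a valid build
# context for backend/Dockerfile before running `docker build`.
check_build_context() {
    ctx="${1:-.}"
    status=0
    for f in backend/Dockerfile backend/pyproject.toml backend/poetry.lock backend/entrypoint.sh; do
        if [ ! -f "$ctx/$f" ]; then
            echo "missing: $f"
            status=1
        fi
    done
    # COPY preserves file modes, so the script must already be executable on the host.
    if [ -f "$ctx/backend/entrypoint.sh" ] && [ ! -x "$ctx/backend/entrypoint.sh" ]; then
        echo "not executable: backend/entrypoint.sh"
        status=1
    fi
    return "$status"
}
```

It could be combined with the build itself, for example `check_build_context . && docker build -f backend/Dockerfile -t seculite-backend .`, where the image tag `seculite-backend` is likewise an assumption.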
462569

463570
---
464571

465572
## 5. `backend/entrypoint.sh` Details
466573

467-
*(This script will handle: waiting for DB to be ready (optional), applying Django migrations, collecting static files, and starting Gunicorn/Daphne. Example commands will be provided.)*
574+
The `entrypoint.sh` script is executed when any container using the `backend` image starts (i.e., `backend`, `worker`, `beat` services). Its primary purpose is to handle setup tasks like waiting for the database to be ready, applying Django database migrations, and then executing the main command passed to the container (e.g., Gunicorn, Celery worker/beat).
575+
576+
This script should be placed in the `backend/` directory and made executable (`chmod +x backend/entrypoint.sh`).
577+
578+
```bash
#!/bin/sh

# Exit immediately if a command exits with a non-zero status.
set -e

# Function to check if PostgreSQL is ready
wait_for_postgres() {
    echo "Waiting for PostgreSQL to be ready..."
    # The PGPASSWORD environment variable should be set for the psql command to work without a password prompt.
    # It's typically set in the .env file and loaded by docker-compose.
    until psql -h "${DB_HOST:-db}" -U "${DB_USER:-postgres}" -d "${DB_NAME:-postgres}" -c '\q' 2>/dev/null; do
        echo "PostgreSQL is unavailable - sleeping"
        sleep 1
    done
    echo "PostgreSQL is up - executing command"
}

# Determine the command to run based on the first argument passed to the script
# This allows the same entrypoint to be used for Gunicorn, Celery worker, Celery beat, etc.
case "$1" in
    "gunicorn")
        echo "Starting Gunicorn server..."
        # Wait for DB only if it's the Gunicorn server (which might need DB for startup/migrations)
        if [ "${WAIT_FOR_DB:-true}" = "true" ]; then
            wait_for_postgres
        fi

        echo "Applying database migrations..."
        poetry run python manage.py migrate --noinput

        echo "Collecting static files..."
        poetry run python manage.py collectstatic --noinput --clear
        # chown -R appuser:appuser /app/staticfiles_collected # If using non-root user
        # chown -R appuser:appuser /app/media # If using non-root user
        ;;
    "celery_worker")
        echo "Starting Celery worker..."
        # Celery workers also might need DB to be ready if they interact with Django ORM on startup or for specific tasks.
        if [ "${WAIT_FOR_DB_CELERY:-true}" = "true" ]; then
            wait_for_postgres
        fi
        # The command for celery worker is passed as "celery_worker" then the actual celery command
        # We shift to remove "celery_worker" and then execute the rest.
        shift
        ;; # The actual celery command is executed by "exec" below
    "celery_beat")
        echo "Starting Celery beat..."
        # Celery beat might need DB for DatabaseScheduler.
        if [ "${WAIT_FOR_DB_CELERY_BEAT:-true}" = "true" ]; then
            wait_for_postgres
        fi
        # Similar to worker, shift and execute the rest.
        shift
        ;; # The actual celery beat command is executed by "exec" below
    *)
        # If no specific known command, just run what was passed.
        # This allows running arbitrary commands like `poetry run python manage.py shell`
        # "$*" joins the arguments into one string for display; "$@" is kept for exec below.
        echo "Running command as is: $*"
        ;;
esac

# Execute the command passed into the Docker container (e.g., Gunicorn, Celery command, or manage.py command)
# `exec "$@"` replaces the shell process with the command, so signals are passed correctly.
exec "$@"
```
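
The `wait_for_postgres` loop above retries indefinitely, so a misconfigured host name or a missing `psql` binary would leave the container waiting forever. One possible variant, sketched here under stated assumptions, caps the number of attempts. The function name `wait_for` and the variables `WAIT_MAX_ATTEMPTS` and `WAIT_INTERVAL` are hypothetical and not settings defined by the project.

```shell
#!/bin/sh
# Sketch of a bounded retry loop. WAIT_MAX_ATTEMPTS and WAIT_INTERVAL are
# hypothetical variable names with illustrative defaults.
wait_for() {
    max="${WAIT_MAX_ATTEMPTS:-30}"
    interval="${WAIT_INTERVAL:-1}"
    attempt=1
    # "$@" is the readiness probe, e.g. the psql command used by wait_for_postgres
    until "$@" >/dev/null 2>&1; do
        if [ "$attempt" -ge "$max" ]; then
            echo "Gave up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$interval"
    done
    return 0
}
```

A call such as `wait_for psql -h "${DB_HOST:-db}" -U "${DB_USER:-postgres}" -d "${DB_NAME:-postgres}" -c '\q'` would then return non-zero once the limit is reached, and with `set -e` active the entrypoint would exit instead of hanging.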
645+
646+
**Key Features and Explanations:**
647+
648+
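1. **`#!/bin/sh`**: Shebang indicating the script should be run with `sh`, the system's POSIX shell (`dash` on Debian-based images such as `python:3.11-slim-bullseye`), so the script should avoid Bash-only syntax.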
649+
2. **`set -e`**: Exits the script immediately if any command fails. This is good for catching errors early.
650+
3. **`wait_for_postgres()` Function**:
651+
* This function polls the PostgreSQL database to check if it's ready to accept connections before proceeding.
652+
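* It uses `psql -h "${DB_HOST:-db}" ... -c '\q'`. The environment variables `DB_HOST`, `DB_USER`, `DB_NAME` (and implicitly `PGPASSWORD`) should be available from the `.env` file. The `psql` client itself must be installed in the image (on Debian it ships in the `postgresql-client` package, not in `libpq-dev`); otherwise the loop never succeeds, and because stderr is redirected to `/dev/null` the "command not found" error stays hidden.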
653+
* The `:-db` syntax provides a default value (`db`) if the variable is unset or empty.
654+
4. **Command Handling (`case "$1" in ... esac`)**:
655+
* The script inspects the first argument (`$1`) passed to it. This argument is typically the first part of the `command` specified in `docker-compose.yml`.
656+
* **`"gunicorn"`)**: If the command is for Gunicorn (our `backend` service):
657+
* It optionally waits for the database using `wait_for_postgres` (controlled by `WAIT_FOR_DB` env var, defaulting to true).
658+
* Runs Django database migrations: `poetry run python manage.py migrate --noinput`.
659+
* Collects static files: `poetry run python manage.py collectstatic --noinput --clear`.
660+
* Commented out `chown` commands are placeholders for when a non-root user is implemented, to ensure correct file permissions for volumes.
661+
* **`"celery_worker"` and `"celery_beat"`)**: If the command is for Celery worker or beat:
662+
* It optionally waits for the database (controlled by `WAIT_FOR_DB_CELERY` or `WAIT_FOR_DB_CELERY_BEAT` env vars).
663+
* `shift`: This command removes the first argument (e.g., `celery_worker`) from the list of positional parameters (`$@`). This is because the actual Celery command (e.g., `celery -A seculite_api worker ...`) follows this marker. The `exec "$@"` at the end will then run the intended Celery command.
664+
* **`*)` (Default case)**: If the first argument doesn't match known services, it simply proceeds to execute the command as is. This allows running other Django management commands or a shell, e.g., `docker-compose run backend poetry run python manage.py createsuperuser`.
665+
5. **`exec "$@"`**:
666+
* This is the crucial final step. It replaces the current shell process with the command specified by `"$@"` (all arguments passed to the script, potentially modified by `shift`).
667+
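* Because the entrypoint script is the container's first process (PID 1, unless an init process such as `tini` is placed in front of it), using `exec` makes the main application (Gunicorn, Celery, etc.) take over that PID instead of running as a child of the shell. Signals sent by Docker, such as `SIGTERM` on `docker stop`, then reach the application directly, which is important for graceful shutdown.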
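
The effect of `exec` on the process ID can be observed outside Docker with a short experiment. In the sketch below (function names are illustrative only), each function starts a shell that prints its own PID and then runs a second shell that prints its PID as well.

```shell
#!/bin/sh
# With exec: the second shell replaces the first, so both lines show the same PID.
with_exec() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

# Without exec: the second shell runs as a child process with a different PID.
# The trailing ":" keeps the inner command from being the last one, since some
# shells implicitly exec the final command of `sh -c`.
without_exec() {
    sh -c 'echo "$$"; sh -c "echo \$\$"; :'
}

with_exec
without_exec
```

The first pair of printed PIDs is identical and the second pair differs. This is the same mechanism that lets Gunicorn or Celery inherit the entrypoint's PID in the container.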
668+
669+
**How it integrates with `docker-compose.yml`:**
670+
671+
- The `ENTRYPOINT ["/app/entrypoint.sh"]` in `backend/Dockerfile` sets this script to run first.
672+
- The `command:` directive in `docker-compose.yml` for services using this image will provide the arguments to this script.
673+
- For `backend` service: `command: ["gunicorn", "seculite_api.wsgi:application", ...]`
674+
* `$1` in entrypoint.sh will be `gunicorn`.
675+
- For `worker` service: `command: ["celery_worker", "celery", "-A", "seculite_api", "worker", ...]`
676+
* `$1` will be `celery_worker`. After `shift`, `"$@"` becomes `celery -A seculite_api worker ...`.
677+
- For `beat` service: `command: ["celery_beat", "celery", "-A", "seculite_api", "beat", ...]`
678+
* `$1` will be `celery_beat`. After `shift`, `"$@"` becomes `celery -A seculite_api beat ...`.
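
The argument handling described above can be tried without Docker using a dry-run sketch that prints the command `exec "$@"` would run instead of running it. The function name `resolve_command` is hypothetical, and the argument lists are shortened versions of the `command:` examples above.

```shell
#!/bin/sh
# Dry-run sketch of the entrypoint's dispatch logic (no waiting, no migrations).
resolve_command() {
    case "$1" in
        celery_worker|celery_beat)
            # Drop the marker argument; the real command follows it
            shift
            ;;
    esac
    # Print what would be handed to exec
    printf '%s\n' "$*"
}

resolve_command gunicorn seculite_api.wsgi:application
resolve_command celery_worker celery -A seculite_api worker
resolve_command celery_beat celery -A seculite_api beat
resolve_command poetry run python manage.py shell
```

The `gunicorn` and default cases pass their arguments through unchanged, while the two Celery markers are stripped so that the remaining words form the actual `celery` command.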
679+
680+
This entrypoint script provides flexibility and handles common startup routines for a Django application in Docker.
468681

469682
---
470683
