added rag_ingestion action, script and dockerfile by rootflo-hardik · Pull Request #267 · rootflo/wavefront

rootflo-hardik · 2026-04-03T12:54:34Z

Summary by CodeRabbit

Chores
- Added automated CI/CD pipeline for building and deploying RAG Ingestion service container images to AWS, GCP, and Azure registries
- Images are tagged with commit hash and timestamp for unique version identification and deployment flexibility across cloud platforms

coderabbitai · 2026-04-03T12:54:49Z

Walkthrough

This pull request introduces Docker containerization and CI/CD automation for a RAG ingestion service. It adds a GitHub Actions workflow that builds a Docker image and pushes it to GCP Artifact Registry, AWS ECR, and Azure ACR registries, along with the corresponding Dockerfile and startup script.

Changes

Cohort / File(s)	Summary
CI/CD Workflow `.github/workflows/build-rag-ingestion-develop.yaml`	New manually-triggered GitHub Actions workflow that builds a Docker image with commit hash and timestamp tags, then authenticates and pushes sequentially to GCP Artifact Registry, AWS ECR, and Azure ACR.
Docker Configuration `wavefront/server/docker/rag_ingestion.Dockerfile`	New Dockerfile building a Python 3.11 slim image with `uv` package manager; installs RAG ingestion dependencies, pre-downloads tiktoken and NLTK runtime data during build, and configures container entrypoint.
Startup Script `wavefront/server/scripts/rag_ingestion/startup-rag-ingestion.sh`	New Bash script that activates Python virtual environment and runs the RAG ingestion main module.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

added azure to github actions #263: Adds Azure ACR support to GitHub Actions workflows with matching ACR environment variables and authentication/tag/push steps for multi-registry deployments.

Suggested reviewers

vizsatiz
vishnurk6247

Poem

🐰 A container for ingestion so fine,
Built with uv, across clouds it will shine,
To three registries it hops with delight,
AWS, GCP, Azure—deployed just right! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title accurately and concisely describes the three main additions: a GitHub Actions workflow for RAG ingestion, a startup script, and a Dockerfile.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

Warning

Review ran into problems

🔥 Problems

Timed out fetching pipeline failures after 30000ms

.github/workflows/build-rag-ingestion-develop.yaml

+    runs-on: ubuntu-latest
+
+    steps:
+      - name: "Checkout"
+        uses: "actions/checkout@v3"
+
+      - name: Get commit hash
+        id: get-commit-hash
+        run: echo "::set-output name=commit-hash::$(git rev-parse --short HEAD)"
+
+      - name: Get timestamp
+        id: get-timestamp
+        run: echo "::set-output name=timestamp::$(date +'%Y-%m-%d-%H-%M')"
+
+      - name: Cache Docker layers
+        id: cache-docker-layers
+        uses: actions/cache@v3
+        with:
+          path: /tmp/.buildx-cache
+          key: ${{ runner.os }}-docker-${{ github.sha }}
+          restore-keys: |
+            ${{ runner.os }}-docker-
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Build Docker Image
+        id: build-image
+        run: |
+          docker build -f wavefront/server/docker/rag_ingestion.Dockerfile -t rootflo:${{ steps.get-commit-hash.outputs.commit-hash }}-${{ steps.get-timestamp.outputs.timestamp }} .
+          echo "IMAGE_TAG=${{ steps.get-commit-hash.outputs.commit-hash }}-${{ steps.get-timestamp.outputs.timestamp }}" >> $GITHUB_ENV
+
+      - id: "Auth-to-GCP"
+        uses: "google-github-actions/auth@v1"
+        with:
+          credentials_json: "${{ secrets.GCP_SERVICE_ACCOUNT_KEY }}"
+
+      - name: "Set up Cloud SDK"
+        uses: "google-github-actions/setup-gcloud@v1"
+
+      - name: "Docker auth for GCP"
+        run: |-
+          gcloud auth configure-docker ${{ env.GCP_REGION }}-docker.pkg.dev --quiet
+
+      - name: Tag and push image to GCP Artifact Registry
+        run: |
+          docker tag rootflo:${{ env.IMAGE_TAG }} ${{ env.GAR_LOCATION }}/${{ env.IMAGE_NAME }}:${{ env.IMAGE_TAG }}
+          docker push ${{ env.GAR_LOCATION }}/${{ env.IMAGE_NAME }}:${{ env.IMAGE_TAG }}
+
+        # Configure AWS credentials and push to ECR
+      - name: Configure AWS credentials
+        uses: aws-actions/configure-aws-credentials@v1
+        with:
+          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
+          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
+          aws-region: ${{ env.AWS_REGION }}
+
+      - name: Login to Amazon ECR
+        id: login-ecr
+        uses: aws-actions/amazon-ecr-login@v1
+
+      - name: Tag and push image to Amazon ECR
+        run: |
+          docker tag rootflo:${{ env.IMAGE_TAG }} ${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}:${{ env.IMAGE_TAG }}
+          docker push ${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}:${{ env.IMAGE_TAG }}
+
+        # Configure Azure credentials and push to ACR
+      - name: Login to Azure
+        uses: azure/login@v2
+        with:
+          creds: ${{ secrets.AZURE_CREDENTIALS }}
+
+      - name: Docker auth for Azure ACR
+        run: az acr login --name ${{ env.ACR_REGISTRY_NAME }}
+
+      - name: Tag and push image to Azure Container Registry
+        run: |
+          docker tag rootflo:${{ env.IMAGE_TAG }} ${{ env.ACR_REGISTRY }}/${{ env.ACR_REPOSITORY }}:${{ env.IMAGE_TAG }}
+          docker push ${{ env.ACR_REGISTRY }}/${{ env.ACR_REPOSITORY }}:${{ env.IMAGE_TAG }}
+
+      - name: Cleanup Docker images
+        run: |
+          docker rmi rootflo:${{ env.IMAGE_TAG }} || true
+          docker rmi ${{ env.GAR_LOCATION }}/${{ env.IMAGE_NAME }}:${{ env.IMAGE_TAG }} || true
+          docker rmi ${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}:${{ env.IMAGE_TAG }} || true
+          docker rmi ${{ env.ACR_REGISTRY }}/${{ env.ACR_REPOSITORY }}:${{ env.IMAGE_TAG }} || true


In general, the fix is to explicitly define a least‑privilege permissions block for the workflow or individual jobs instead of relying on default GITHUB_TOKEN permissions. For this workflow, no step needs write access to repository contents or other GitHub resources, so we can safely restrict to contents: read at the workflow level. This will apply to all jobs unless they override it.

The best minimal change is to add a root-level permissions section right after the name: (before on:). This keeps existing behavior for all steps (they still can read the repo via actions/checkout), while ensuring GITHUB_TOKEN cannot perform write operations. Concretely, in .github/workflows/build-rag-ingestion-develop.yaml, insert:

permissions:
  contents: read

after line 1. No additional methods, imports, or definitions are needed.
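
As a rough local check of the change described above, the sketch below tests whether a workflow file declares a root-level `permissions:` key. The helper name `has_top_level_permissions` and the throwaway sample file are hypothetical, and the check is a plain-text match rather than a real YAML parse, so treat it as an illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file has an unindented `permissions:` key.
has_top_level_permissions() {
  grep -Eq '^permissions:' "$1"
}

# Throwaway file that mirrors the start of the workflow after the proposed fix.
workflow="$(mktemp)"
printf '%s\n' \
  'name: (Develop) Build and Push RAG Ingestion to AWS, GCP and Azure' \
  'permissions:' \
  '  contents: read' \
  'on:' \
  '  workflow_dispatch:' > "$workflow"

if has_top_level_permissions "$workflow"; then
  echo "top-level permissions declared"
else
  echo "no top-level permissions block"
fi
```

A job-level `permissions:` block is indented, so this check deliberately does not count it; it only answers whether the workflow-wide default has been restricted.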

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (5)

wavefront/server/docker/rag_ingestion.Dockerfile (3)

3-4: Pin the uv version for reproducible builds.

Using :latest tag can cause unexpected build failures if uv introduces breaking changes. Since your pyproject.toml specifies required-version = ">=0.7.3", pin to a specific compatible version.

♻️ Proposed fix

 # Copy UV from official image
-COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
+COPY --from=ghcr.io/astral-sh/uv:0.7.3 /uv /uvx /bin/

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@wavefront/server/docker/rag_ingestion.Dockerfile` around lines 3 - 4, The
Dockerfile uses COPY --from=ghcr.io/astral-sh/uv:latest which is unstable;
update the image tag to a specific pinned uv version compatible with
pyproject.toml's required-version (e.g., ghcr.io/astral-sh/uv:0.7.3) so builds
are reproducible, i.e., replace ghcr.io/astral-sh/uv:latest in the COPY line
with the chosen pinned tag and ensure the pinned version satisfies
required-version >=0.7.3.

21-24: Consider consolidating RUN commands to reduce image layers.

Multiple RUN commands create separate layers. Consolidating them can slightly reduce image size and improve build efficiency.

♻️ Proposed consolidation

 # Download the tiktoken encoding file and NLTK data
-RUN mkdir -p /root/.cache/tiktoken
-RUN uv run python3 -c "import tiktoken; enc = tiktoken.encoding_for_model('gpt-4')"
-RUN uv run python3 -c "import nltk; nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')"
+RUN mkdir -p /root/.cache/tiktoken \
+    && uv run python3 -c "import tiktoken; enc = tiktoken.encoding_for_model('gpt-4')" \
+    && uv run python3 -c "import nltk; nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@wavefront/server/docker/rag_ingestion.Dockerfile` around lines 21 - 24, The
three separate RUN statements that create /root/.cache/tiktoken and call Python
(RUN mkdir -p /root/.cache/tiktoken; RUN uv run python3 -c "import tiktoken...";
RUN uv run python3 -c "import nltk...") should be consolidated into a single RUN
layer: perform the mkdir -p then run both python -c calls joined with && (and
optionally use set -e or set -eux for fail-fast) so the Dockerfile has one RUN
instruction that creates the directory and executes the tiktoken and nltk
downloads in sequence; update the Dockerfile lines containing the three RUN
commands to a single RUN that preserves the same commands and ordering.

1-1: Consider using a newer base image; Debian Buster reached end-of-life.

python:3.11-slim-buster is based on Debian 10 (Buster), which reached end-of-life in June 2024 and no longer receives security updates. Consider upgrading to slim-bookworm for continued security patches.

♻️ Proposed fix

-FROM python:3.11-slim-buster
+FROM python:3.11-slim-bookworm

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@wavefront/server/docker/rag_ingestion.Dockerfile` at line 1, The Dockerfile
uses an EOL base image "python:3.11-slim-buster"; update the FROM line to a
maintained Debian base such as "python:3.11-slim-bookworm" to restore security
updates—locate the FROM instruction (the "python:3.11-slim-buster" token) in
rag_ingestion.Dockerfile and replace it with "python:3.11-slim-bookworm", then
rebuild and run smoke tests to verify compatibility.

.github/workflows/build-rag-ingestion-develop.yaml (1)

36-52: Docker layer cache is configured but not utilized.

The workflow sets up Buildx and a cache directory but uses plain docker build which doesn't leverage the cache. Either use docker buildx build with cache flags or remove the unused cache step.

♻️ Option 1: Use buildx with caching

       - name: Build Docker Image
         id: build-image
         run: |
-          docker build -f wavefront/server/docker/rag_ingestion.Dockerfile -t rootflo:${{ steps.get-commit-hash.outputs.commit-hash }}-${{ steps.get-timestamp.outputs.timestamp }} .
+          docker buildx build \
+            --cache-from type=local,src=/tmp/.buildx-cache \
+            --cache-to type=local,dest=/tmp/.buildx-cache-new,mode=max \
+            --load \
+            -f wavefront/server/docker/rag_ingestion.Dockerfile \
+            -t rootflo:${{ steps.get-commit-hash.outputs.commit-hash }}-${{ steps.get-timestamp.outputs.timestamp }} .
+          # Rotate cache to prevent unbounded growth
+          rm -rf /tmp/.buildx-cache
+          mv /tmp/.buildx-cache-new /tmp/.buildx-cache
           echo "IMAGE_TAG=${{ steps.get-commit-hash.outputs.commit-hash }}-${{ steps.get-timestamp.outputs.timestamp }}" >> $GITHUB_ENV

♻️ Option 2: Remove unused cache step

If caching isn't needed, remove lines 36-43 to simplify the workflow.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.github/workflows/build-rag-ingestion-develop.yaml around lines 36 - 52, The
workflow declares a cache step (id: cache-docker-layers) and sets up Buildx but
the Build Docker Image step (id: build-image, name: Build Docker Image) uses
plain docker build so the cache is never used; fix by either switching the Build
Docker Image step to use docker buildx build with appropriate cache flags (e.g.,
--cache-from and --cache-to pointing at /tmp/.buildx-cache or a registry) so the
cache created by Set up Docker Buildx is consumed, or remove the Cache Docker
layers step entirely if you don’t want caching — update the build step
referenced as Build Docker Image and the cache step id cache-docker-layers
accordingly.
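
The cache rotation in Option 1 can be tried without Docker. The sketch below isolates that step in a hypothetical helper, `rotate_buildx_cache`, and runs it against throwaway directories that stand in for `/tmp/.buildx-cache` and `/tmp/.buildx-cache-new`; the helper name and the directory layout are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace the old cache directory with the freshly exported one.
rotate_buildx_cache() {
  local old_dir="$1" new_dir="$2"
  # Nothing to rotate if the build did not export a new cache.
  [ -d "$new_dir" ] || return 0
  rm -rf "$old_dir"
  mv "$new_dir" "$old_dir"
}

# Throwaway directories standing in for the two cache paths.
work="$(mktemp -d)"
mkdir -p "$work/cache" "$work/cache-new"
echo "stale" > "$work/cache/layer"
echo "fresh" > "$work/cache-new/layer"

rotate_buildx_cache "$work/cache" "$work/cache-new"
cat "$work/cache/layer"
```

Writing the new cache to a separate directory and swapping it in afterwards keeps entries from earlier builds from accumulating in the path that `actions/cache` saves.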

wavefront/server/scripts/rag_ingestion/startup-rag-ingestion.sh (1)

1-6: Add error handling and use exec for proper signal handling in containers.

The script lacks error handling (set -e) and doesn't use exec for the final command. In containers, using exec ensures the Python process becomes PID 1 and receives signals (SIGTERM, etc.) directly, enabling graceful shutdown.

♻️ Proposed improvements

 #!/bin/bash
+set -e
 
 source /app/.venv/bin/activate
 
 # Run the main application for RAG Ingestion
-python rag_ingestion/main.py
+exec python rag_ingestion/main.py

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@wavefront/server/scripts/rag_ingestion/startup-rag-ingestion.sh` around lines
1 - 6, Add strict error handling and ensure the Python process becomes PID 1 by
updating the startup script: enable "set -e" (or "set -euo pipefail") after the
shebang to fail fast on errors, keep the virtualenv activation via "source
/app/.venv/bin/activate", and replace the final "python rag_ingestion/main.py"
invocation with an "exec python rag_ingestion/main.py" so signals are forwarded
to the Python process for graceful shutdown.
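
The effect of `exec` on process IDs can be observed outside a container. The sketch below writes two throwaway launcher scripts, one that starts a child normally and one that uses `exec`, and compares the PID printed by the launcher with the PID printed by the child. The file names and the `same_pid` helper are hypothetical, and `bash -c 'echo "$$"'` is only a stand-in for `python rag_ingestion/main.py`.

```shell
#!/usr/bin/env bash
work="$(mktemp -d)"

# Launcher that runs the child normally: the shell stays alive as the parent.
cat > "$work/plain.sh" <<'EOF'
#!/bin/bash
echo "$$"
bash -c 'echo "$$"'
EOF

# Launcher that uses exec: the child replaces the shell and keeps its PID.
cat > "$work/with_exec.sh" <<'EOF'
#!/bin/bash
echo "$$"
exec bash -c 'echo "$$"'
EOF

# Succeeds when the launcher and its child print the same PID.
same_pid() {
  local pids
  pids="$(bash "$1")"
  [ "$(printf '%s\n' "$pids" | sort -u | wc -l)" -eq 1 ]
}

if same_pid "$work/plain.sh"; then echo "plain: same PID"; else echo "plain: child got a new PID"; fi
if same_pid "$work/with_exec.sh"; then echo "exec: same PID"; else echo "exec: child got a new PID"; fi
```

In the container, the same mechanism means the Python process takes over the startup script's PID rather than running as its child, which is why signals sent to the container reach it directly.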

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3821c593-8b4a-4a32-a441-accd9e742ccd

📥 Commits

Reviewing files that changed from the base of the PR and between 207a081 and 17a1bdb.

📒 Files selected for processing (3)

.github/workflows/build-rag-ingestion-develop.yaml
wavefront/server/docker/rag_ingestion.Dockerfile
wavefront/server/scripts/rag_ingestion/startup-rag-ingestion.sh

coderabbitai · 2026-04-03T13:03:23Z

.github/workflows/build-rag-ingestion-develop.yaml

+jobs:
+  build-push-artifact:
+    runs-on: ubuntu-latest
+


⚠️ Potential issue | 🟠 Major

Add explicit permissions to follow the principle of least privilege.

The workflow lacks a permissions block, which means it uses default token permissions. CodeQL flagged this as a security concern. Explicitly declaring minimal permissions reduces the blast radius if the workflow is compromised.

🛡️ Proposed fix

 jobs:
   build-push-artifact:
     runs-on: ubuntu-latest
+    permissions:
+      contents: read
     steps:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-jobs:
-  build-push-artifact:
-    runs-on: ubuntu-latest
+jobs:
+  build-push-artifact:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.github/workflows/build-rag-ingestion-develop.yaml around lines 20 - 23, add an explicit permissions block to the workflow to avoid default token privileges: insert a top-level or job-level permissions mapping (for the job named build-push-artifact) that grants only the minimal scopes required instead of leaving permissions undefined. For this workflow, contents: read is sufficient, because no step writes to repository contents or other GitHub resources and the registry pushes authenticate with cloud-provider secrets rather than GITHUB_TOKEN. Place the permissions block adjacent to the runs-on/job definition for build-push-artifact.

coderabbitai · 2026-04-03T13:03:23Z

.github/workflows/build-rag-ingestion-develop.yaml

+      - name: "Checkout"
+        uses: "actions/checkout@v3"


⚠️ Potential issue | 🟠 Major

Update GitHub Actions to current versions.

Several actions are using outdated major versions that may have compatibility issues or security vulnerabilities. The static analyzer flagged these as potentially too old to run.

🔧 Proposed version updates

       - name: "Checkout"
-        uses: "actions/checkout@v3"
+        uses: "actions/checkout@v4"

       - name: Cache Docker layers
         id: cache-docker-layers
-        uses: actions/cache@v3
+        uses: actions/cache@v4

       - id: "Auth-to-GCP"
-        uses: "google-github-actions/auth@v1"
+        uses: "google-github-actions/auth@v2"
         with:
           credentials_json: "${{ secrets.GCP_SERVICE_ACCOUNT_KEY }}"

       - name: "Set up Cloud SDK"
-        uses: "google-github-actions/setup-gcloud@v1"
+        uses: "google-github-actions/setup-gcloud@v2"

       - name: Configure AWS credentials
-        uses: aws-actions/configure-aws-credentials@v1
+        uses: aws-actions/configure-aws-credentials@v4
         with:
           aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
           aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
           aws-region: ${{ env.AWS_REGION }}

       - name: Login to Amazon ECR
         id: login-ecr
-        uses: aws-actions/amazon-ecr-login@v1
+        uses: aws-actions/amazon-ecr-login@v2

Also applies to: 36-38, 54-60, 72-73, 79-81

🧰 Tools

🪛 actionlint (1.7.11)

[error] 26-26: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.github/workflows/build-rag-ingestion-develop.yaml around lines 25 - 26, update the GitHub Actions steps to current major releases: bump "uses" from actions/checkout@v3 to actions/checkout@v4 and likewise update actions/cache, google-github-actions/auth, google-github-actions/setup-gcloud, aws-actions/configure-aws-credentials, and aws-actions/amazon-ecr-login to the major tags listed in the proposed version updates; after updating each action, run a quick workflow lint and adjust any inputs that changed between major versions to match new schemas or required fields.
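
To see at a glance which action versions a workflow pins before bumping them, something like the following sketch could be used. The helper name `list_action_refs` and the sample file are hypothetical; the extraction is a text match on `uses:` lines, not a YAML parse, so unusual formatting may slip through.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each distinct action reference (owner/repo@ref) in a workflow file.
list_action_refs() {
  grep -E '^[[:space:]]*(-[[:space:]]+)?uses:' "$1" \
    | sed -E 's/^[[:space:]]*(-[[:space:]]+)?uses:[[:space:]]*//' \
    | tr -d "\"'" \
    | sed -E 's/[[:space:]]+$//' \
    | sort -u
}

# Throwaway file with a few steps copied from the workflow under review.
workflow="$(mktemp)"
printf '%s\n' \
  '      - name: "Checkout"' \
  '        uses: "actions/checkout@v3"' \
  '      - name: Cache Docker layers' \
  '        uses: actions/cache@v3' \
  '      - name: Set up Docker Buildx' \
  '        uses: docker/setup-buildx-action@v3' > "$workflow"

list_action_refs "$workflow"
```

Running it before and after the update gives a quick diffable list of the pinned major versions.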

coderabbitai · 2026-04-03T13:03:23Z

.github/workflows/build-rag-ingestion-develop.yaml

+      - name: Get commit hash
+        id: get-commit-hash
+        run: echo "::set-output name=commit-hash::$(git rev-parse --short HEAD)"
+
+      - name: Get timestamp
+        id: get-timestamp
+        run: echo "::set-output name=timestamp::$(date +'%Y-%m-%d-%H-%M')"


⚠️ Potential issue | 🟠 Major

Replace deprecated set-output workflow commands.

The ::set-output command was deprecated in October 2022 and may stop working. Use $GITHUB_OUTPUT instead.

🔧 Proposed fix

       - name: Get commit hash
         id: get-commit-hash
-        run: echo "::set-output name=commit-hash::$(git rev-parse --short HEAD)"
+        run: echo "commit-hash=$(git rev-parse --short HEAD)" >> $GITHUB_OUTPUT

       - name: Get timestamp
         id: get-timestamp
-        run: echo "::set-output name=timestamp::$(date +'%Y-%m-%d-%H-%M')"
+        run: echo "timestamp=$(date +'%Y-%m-%d-%H-%M')" >> $GITHUB_OUTPUT

🧰 Tools

🪛 actionlint (1.7.11)

[error] 30-30: workflow command "set-output" was deprecated. use echo "{name}={value}" >> $GITHUB_OUTPUT instead: https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions

(deprecated-commands)

[error] 34-34: workflow command "set-output" was deprecated. use echo "{name}={value}" >> $GITHUB_OUTPUT instead: https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions

(deprecated-commands)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.github/workflows/build-rag-ingestion-develop.yaml around lines 28 - 34, The two workflow steps "Get commit hash" (id: get-commit-hash) and "Get timestamp" (id: get-timestamp) use the deprecated ::set-output syntax; update each step to write outputs to the $GITHUB_OUTPUT file instead (i.e., produce lines like "commit-hash=..." and "timestamp=..." appended to $GITHUB_OUTPUT) so the steps set outputs via the supported mechanism and keep the step names/ids unchanged.
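
The `$GITHUB_OUTPUT` mechanism is just an append to a file of `name=value` lines, which makes it easy to try locally. The sketch below is a minimal illustration under a few assumptions: the `set_output` helper is hypothetical, a temporary file stands in for `GITHUB_OUTPUT` when the variable is unset (as it is outside a runner), and `unknown` is used as a placeholder hash when the script is not run inside a git checkout.

```shell
#!/usr/bin/env bash
# On a runner GITHUB_OUTPUT points at a file; fall back to a temp file for local runs.
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"

# Hypothetical helper: append one name=value pair to the step-output file.
set_output() {
  printf '%s=%s\n' "$1" "$2" >> "$GITHUB_OUTPUT"
}

# Outside a git checkout, fall back to a placeholder so the sketch still runs.
commit_hash="$(git rev-parse --short HEAD 2>/dev/null || echo unknown)"
timestamp="$(date +'%Y-%m-%d-%H-%M')"

set_output commit-hash "$commit_hash"
set_output timestamp "$timestamp"

# Show what later steps would read as steps.<id>.outputs.<name>.
cat "$GITHUB_OUTPUT"
```

Quoting `"$GITHUB_OUTPUT"` is a small hardening over the unquoted form in the proposed fix; both behave the same when the path contains no spaces.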

coderabbitai · 2026-04-03T13:03:23Z

wavefront/server/docker/rag_ingestion.Dockerfile

+WORKDIR /app/background_jobs/rag_ingestion
+
+# Make startup script executable
+RUN chmod +x startup-rag-ingestion.sh
+
+# Set entrypoint to run startup script
+CMD ["./startup-rag-ingestion.sh"] 


⚠️ Potential issue | 🟠 Major

Run as non-root user for improved security.

The container runs as root, which is flagged by Trivy (DS-0002). Running containers as non-root limits the impact of a container compromise.

🛡️ Proposed fix to add a non-root user

 # Download the tiktoken encoding file and NLTK data
 RUN mkdir -p /root/.cache/tiktoken
 RUN uv run python3 -c "import tiktoken; enc = tiktoken.encoding_for_model('gpt-4')"
 RUN uv run python3 -c "import nltk; nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')"
 
+# Create non-root user
+RUN useradd --create-home --shell /bin/bash appuser \
+    && cp -r /root/.cache /home/appuser/.cache \
+    && chown -R appuser:appuser /app /home/appuser/.cache
+
 WORKDIR /app/background_jobs/rag_ingestion
 
 # Make startup script executable
 RUN chmod +x startup-rag-ingestion.sh
 
+USER appuser
+
 # Set entrypoint to run startup script
 CMD ["./startup-rag-ingestion.sh"]

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@wavefront/server/docker/rag_ingestion.Dockerfile` around lines 26 - 32, Create and switch to a non-root user in the Dockerfile: add a step that creates a system/group and a non-root user (e.g., "appuser"), chown the WORKDIR (/app/background_jobs/rag_ingestion) and the startup-rag-ingestion.sh to that user, keep the chmod +x step, and add a USER appuser instruction before the CMD so the container runs as the non-root user; reference the existing WORKDIR, startup-rag-ingestion.sh and CMD ["./startup-rag-ingestion.sh"] when making these changes.
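
One way to confirm that the `USER` change took effect is to check the effective UID inside the running container. The sketch below shows such a check as it might appear in a startup or smoke-test script; the `is_root_uid` helper is hypothetical and is not part of the PR.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed when the given numeric UID is root (0).
is_root_uid() {
  [ "$1" -eq 0 ]
}

current_uid="$(id -u)"
if is_root_uid "$current_uid"; then
  echo "warning: running as root (uid 0)" >&2
else
  echo "running as non-root uid $current_uid"
fi
```

Against a built image, running `id -u` through `docker run --rm` with the image tag should print a non-zero value once `USER appuser` is in place.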

added rag_ingestion action, script and dockerfile

17a1bdb

github-advanced-security AI found potential problems Apr 3, 2026

vizsatiz approved these changes Apr 3, 2026

coderabbitai bot reviewed Apr 3, 2026

vishnurk6247 approved these changes Apr 3, 2026

rootflo-hardik merged commit 732fc46 into develop Apr 4, 2026
9 checks passed

coderabbitai bot mentioned this pull request Apr 4, 2026

commented aws for rag_ingestion action #269

Merged

rootflo-hardik merged 1 commit into develop from fix_added_rag_ingestion_action

Copilot Autofix

@@ -1,5 +1,8 @@
             name: (Develop) Build and Push RAG Ingestion to AWS, GCP and Azure
+            permissions:
+              contents: read
             on:
               workflow_dispatch:
