Skip to content

gh-runner: keep only actions runner configuration in external volume#53

Draft
lukkrusz wants to merge 3 commits into
masterfrom
lukkrusz/lighter_gh_runner
Draft

gh-runner: keep only actions runner configuration in external volume#53
lukkrusz wants to merge 3 commits into
masterfrom
lukkrusz/lighter_gh_runner

Conversation

@lukkrusz

@lukkrusz lukkrusz commented Jun 9, 2026

Copy link
Copy Markdown

All boxes must be checked before marking the PR as ready for review.

Checklist

  • This is not a pre-release PR (releases only on the master branch)
  • CHANGELOG.md updated
  • All affected images built locally for all supported architectures
  • Changes were tested on all supported target architectures: (fill in how)

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the GitHub Actions runner Dockerfile to optimize the multi-stage build, trim run-stage dependencies, and introduce a persistent volume for runner configuration. However, several critical issues were identified: the configuration linking logic in entry.py can lead to permanent configuration loss on container restart, and PyOCD packs installed in the build stage are lost because they are not copied to the run stage. Additionally, it is recommended to combine the separate COPY instructions for udev rules to reduce image layers.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread gh-runner/entry.py
Comment on lines +79 to 96
def link_config():
os.makedirs(CONFIG_DIR, exist_ok=True)
for name in CONFIG_FILES:
link = os.path.join('/actions-runner', name)
target = os.path.join(CONFIG_DIR, name)
if os.path.islink(link) and os.readlink(link) == target:
continue
if os.path.lexists(link):
os.remove(link)
os.symlink(target, link)


API_URL = resolve_api_url()

link_config()

if not os.path.isfile('.runner'):
configure_runner()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Critical Bug: Configuration Loss on Container Restart

In the current implementation, link_config() is called at startup and immediately creates broken symlinks in /actions-runner pointing to non-existent files in /runner-config (since the volume is initially empty).

  1. First Run: link_config() creates broken symlinks. Then configure_runner() runs. If the runner configuration process or .NET runtime performs atomic writes (e.g., writing to a temporary file and renaming/moving it to .runner), the broken symlink is replaced by a regular file in /actions-runner, while /runner-config/.runner remains empty.
  2. Container Restart: On the next boot, link_config() runs again. Since /actions-runner/.runner is now a regular file (not a symlink), os.path.islink(link) is False. The code then executes if os.path.lexists(link): os.remove(link) and deletes the newly created configuration file! Since the file was never successfully written to /runner-config, the configuration is permanently lost.

Solution

Implement a robust two-way sync/linking mechanism:

  • If a regular file exists at the link path (e.g., newly created or updated during runtime), move it to the persistent volume.
  • If the file exists in the persistent volume, ensure the symlink points to it.
  • Call link_config() both at startup and immediately after configure_runner() to ensure newly created configurations are immediately persisted.
def link_config():
    os.makedirs(CONFIG_DIR, exist_ok=True)
    for name in CONFIG_FILES:
        link = os.path.join('/actions-runner', name)
        target = os.path.join(CONFIG_DIR, name)
        if os.path.exists(link) and not os.path.islink(link):
            if os.path.exists(target):
                os.remove(target)
            os.rename(link, target)
        if os.path.exists(target):
            if os.path.islink(link):
                if os.readlink(link) == target:
                    continue
                os.remove(link)
            elif os.path.lexists(link):
                os.remove(link)
            os.symlink(target, link)


API_URL = resolve_api_url()

link_config()

if not os.path.isfile('.runner'):
    configure_runner()
    link_config()

Comment thread gh-runner/Dockerfile
Comment on lines +46 to +48
# install all dependencies
RUN pip3 install --no-cache-dir pyocd requests && \
pyocd pack install mcxn947

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

High Severity: Missing PyOCD Packs in Run Stage

Moving pyocd pack install mcxn947 to the build stage means the pack is downloaded to the build stage's user home directory (typically /root/.cache/pyocd or /root/.local/share/cmsis-pack-manager). Since only /opt/venv is copied to the run stage, the downloaded CMSIS pack is lost, and PyOCD will not have the mcxn947 pack installed in the final image.

To fix this cleanly and robustly, define a system-wide CMSIS_PACK_ROOT environment variable in both stages and copy the downloaded packs to the run stage.

# install all dependencies
ENV CMSIS_PACK_ROOT=/opt/cmsis-packs
RUN pip3 install --no-cache-dir pyocd requests && \
    pyocd pack install mcxn947

Comment thread gh-runner/Dockerfile Outdated
Comment on lines 148 to 149
COPY --from=build /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:${PATH}"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Copy the downloaded CMSIS packs from the build stage and set the CMSIS_PACK_ROOT environment variable so PyOCD can locate them.

COPY --from=build /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:${PATH}"
ENV CMSIS_PACK_ROOT=/opt/cmsis-packs
COPY --from=build /opt/cmsis-packs /opt/cmsis-packs

Comment thread gh-runner/Dockerfile Outdated
Comment on lines +132 to +134
# get udev rules for hardware debuggers (JLink and OpenOCD)
COPY --from=build /etc/udev/rules.d/99-jlink.rules /etc/udev/rules.d/
COPY --from=build /etc/udev/rules.d/60-openocd.rules /etc/udev/rules.d/

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Medium Severity: Combine COPY Instructions

Instead of copying the udev rules individually, you can copy the entire /etc/udev/rules.d/ directory in a single COPY instruction. This is cleaner, more idiomatic, and reduces the number of layers in the final image.

# get udev rules for hardware debuggers (JLink and OpenOCD)
COPY --from=build /etc/udev/rules.d/ /etc/udev/rules.d/

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/gemini This copy targets seem specific, as per the comment, so leaving as is.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. If the directory /etc/udev/rules.d/ contains other files that should not be included in the final image, then keeping the COPY instructions explicit is indeed the correct approach to avoid unintended side effects. Thank you for the clarification.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could still use 2 source files and 1 target dir. We need to optimize heavily the docker commands, as each RUN/COPY/ADD creates a new layer in the underlying FS.

Why did you move jlink installation from the run stage to the build stage? Is it needed for building something?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I previously attempted a removal of wget from run stage, which I later reverted but kept this. I see now that it should be moved back to run stage as well. This Dockerfile is very much a draft still. Thanks for taking a look

@lukkrusz lukkrusz force-pushed the lukkrusz/lighter_gh_runner branch from 8720401 to 7aeecb2 Compare June 9, 2026 09:50

@nalajcie nalajcie left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is a draft, just pointing out some issues I've seen in the email notification

Comment thread gh-runner/Dockerfile Outdated
Comment on lines +132 to +134
# get udev rules for hardware debuggers (JLink and OpenOCD)
COPY --from=build /etc/udev/rules.d/99-jlink.rules /etc/udev/rules.d/
COPY --from=build /etc/udev/rules.d/60-openocd.rules /etc/udev/rules.d/

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could still use 2 source files and 1 target dir. We need to optimize heavily the docker commands, as each RUN/COPY/ADD creates a new layer in the underlying FS.

Why did you move jlink installation from the run stage to the build stage? Is it needed for building something?

Comment thread CHANGELOG.md Outdated
## NEXT VERSION

## 20260414 (latest)
## 20260609 (latest)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're not making a release, put the changes under "NEXT VERSION"

@lukkrusz lukkrusz force-pushed the lukkrusz/lighter_gh_runner branch from 7aeecb2 to 4ce0600 Compare June 9, 2026 13:08
Improvements made to optimize the Dockerfile and
reduce the number of layers to minimum.

Assisted-by: Claude Sonnet 4.6
YT: CI-346
@lukkrusz lukkrusz force-pushed the lukkrusz/lighter_gh_runner branch from 4ce0600 to 772d033 Compare June 9, 2026 13:25
@lukkrusz

lukkrusz commented Jun 9, 2026

Copy link
Copy Markdown
Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves the gh-runner Dockerfile by copying build artifacts in a single layer, trimming run-stage dependencies, and fixing package names. It also updates entry.py to persist only the runner configuration inside an external volume and improves error handling. The review feedback suggests stripping trailing slashes from the repository URL to avoid duplicate slashes in the API endpoint, and safely masking the registration token by matching exact arguments in the command list instead of performing a substring replacement on the joined command string.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread gh-runner/entry.py Outdated
Comment thread gh-runner/entry.py
Comment on lines +80 to +81
command = " ".join(exe).replace(token, "<TOKEN>")
print(f"{command} failed!")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using .replace(token, "<TOKEN>") on the joined command string can accidentally replace substrings in other arguments (such as the runner name, labels, or paths) if they happen to contain the token string. A safer approach is to mask the token by matching exact arguments in the exe list before joining them.

Suggested change
command = " ".join(exe).replace(token, "<TOKEN>")
print(f"{command} failed!")
command = " ".join("<TOKEN>" if arg == token else arg for arg in exe)
print(f"{command} failed!")

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that this would be possible, anything containing the token should be somewhat redacted

lukkrusz added 2 commits June 10, 2026 09:57
Create symlinks pointing to the external volume with config,
to ensure that, when updated, the config persists, while the
old binaries are removed.

Assisted-by: Claude Opus 4.8
YT: CI-346
YT: CI-346
@lukkrusz lukkrusz force-pushed the lukkrusz/lighter_gh_runner branch from 772d033 to 685a6c0 Compare June 10, 2026 07:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants