feat: add CUDA/gsplat environment check script #11

Open
cicorias wants to merge 2 commits into Azure-Samples:main from cicorias:simple-gsplat-train

Conversation

@cicorias
Member

Add scripts/gsplat_check — a lightweight Python tool (managed by uv) that verifies whether the current device can run the gsplat 3DGS training backend.

Checks performed:

  • CUDA GPU detection via nvidia-smi + PyTorch tensor smoke-test
  • gsplat library import and rasterization kernel validation (8 Gaussians)
  • External tool availability (nvidia-smi, python3, ffmpeg, colmap)

Reports a structured pass/fail verdict similar to the Rust preflight binary.

Usage: cd scripts/gsplat_check && uv run main.py

Also adds a reference to the new tool in the root README documentation section.
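
As a rough illustration of the first check, GPU detection through nvidia-smi could look like the sketch below. This is not the code from `main.py`: the function names `parse_gpu_csv` and `detect_gpus` are hypothetical, and the choice of queried fields is an assumption. It relies only on the `--query-gpu` and `--format=csv,noheader` options of nvidia-smi.

```python
import shutil
import subprocess


def parse_gpu_csv(text: str) -> list[dict[str, str]]:
    """Parse `name, driver_version, memory.total` CSV rows into dicts."""
    gpus = []
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue  # skip rows that do not match the expected shape
        name, driver, memory = parts
        gpus.append({"name": name, "driver": driver, "memory": memory})
    return gpus


def detect_gpus(timeout: float = 10.0) -> list[dict[str, str]]:
    """Return GPUs reported by nvidia-smi, or [] if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        proc = subprocess.run(
            [
                "nvidia-smi",
                "--query-gpu=name,driver_version,memory.total",
                "--format=csv,noheader",
            ],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return []
    return parse_gpu_csv(proc.stdout)
```

An empty list here only means that nvidia-smi reported nothing; the PyTorch tensor smoke-test is still what tells whether the device is actually usable.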

Copilot AI review requested due to automatic review settings March 20, 2026 15:03
Contributor

Copilot AI left a comment

Pull request overview

Adds a new lightweight Python (uv-managed) preflight script to validate CUDA + gsplat functionality on a machine, plus docs linking to the tool.

Changes:

  • Add scripts/gsplat_check with a main.py verifier and uv pyproject.toml
  • Add tool-specific README with usage/examples
  • Link the new environment check tool from the root README.md

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `scripts/gsplat_check/pyproject.toml` | Defines the uv-managed Python project and dependencies for the checker script |
| `scripts/gsplat_check/main.py` | Implements CUDA detection, PyTorch smoke test, gsplat rasterization smoke test, and tool probing |
| `scripts/gsplat_check/README.md` | Documents what is checked, prerequisites, and example outputs |
| `scripts/gsplat_check/.python-version` | Pins the local Python version for the tool directory |
| `README.md` | Adds a link to the new gsplat environment check tool |

Comment on lines +237 to +257
    _heading("External Tools")
    for cmd, args in [
        ("nvidia-smi", ["--version"]),
        ("python3", ["--version"]),
        ("ffmpeg", ["-version"]),
        ("colmap", ["--version"]),
    ]:
        ver = _cmd_version(cmd, args)
        if ver:
            _ok(cmd, ver)
        else:
            _fail(cmd, "not found")

    # ── verdict ───────────────────────────────────────────────────────────
    _heading("Environment Verdict")
    if failures:
        print()
        print(" ❌ ENVIRONMENT CHECK FAILED")
        for f in failures:
            print(f" • {f}")
        return 1

Copilot AI Mar 20, 2026

The “External Tools” checks currently never affect failures, so the final exit code can be 0 even when ffmpeg/colmap/nvidia-smi are missing. If tool availability is intended to be part of the pass/fail verdict (per the PR description and the README’s “Exits 0 on success, 1 on failure”), append a failure reason when a required tool is not found; alternatively, explicitly label these checks as informational-only in output/docs.
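
One way to make both options explicit is to split the tools into required and informational groups, as in the sketch below. The split shown is hypothetical (the PR does not state which tools are mandatory), and `tool_version` and `probe_tools` are illustrative names, not the script's `_cmd_version`, `_ok`, or `_fail` helpers. The version flags are copied from the hunk above.

```python
import shutil
import subprocess

# Hypothetical grouping: the PR does not say which tools are mandatory.
REQUIRED = {"nvidia-smi": ["--version"], "python3": ["--version"]}
OPTIONAL = {"ffmpeg": ["-version"], "colmap": ["--version"]}


def tool_version(cmd, args):
    """Return the first line of the tool's version output, or None."""
    if shutil.which(cmd) is None:
        return None
    try:
        proc = subprocess.run(
            [cmd, *args], capture_output=True, text=True, timeout=10
        )
    except (subprocess.SubprocessError, OSError):
        return None
    lines = (proc.stdout or proc.stderr).strip().splitlines()
    # Fall back to the command name if the tool printed nothing.
    return lines[0] if lines else cmd


def probe_tools(required, optional, failures):
    """Print tool status; only missing required tools are recorded as failures."""
    for cmd, args in required.items():
        ver = tool_version(cmd, args)
        if ver is None:
            failures.append(f"Required external tool '{cmd}' not found")
        print(f"  {'ok  ' if ver else 'FAIL'} {cmd}: {ver or 'not found'}")
    for cmd, args in optional.items():
        ver = tool_version(cmd, args)
        status = ver or "not found (informational)"
        print(f"  {'ok  ' if ver else 'info'} {cmd}: {status}")
    return failures
```

Calling `probe_tools(REQUIRED, OPTIONAL, failures)` before the verdict section would then let a missing required tool change the exit code, while the informational group stays visible in the output without affecting it.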

Copilot uses AI. Check for mistakes.
Member Author

@copilot open a new pull request to apply changes based on this feedback

Comment thread scripts/gsplat_check/main.py Outdated
Comment thread scripts/gsplat_check/main.py
cicorias and others added 2 commits March 25, 2026 19:54
Add scripts/gsplat_check — a lightweight Python tool (managed by uv) that
verifies whether the current device can run the gsplat 3DGS training backend.

Checks performed:
- CUDA GPU detection via nvidia-smi + PyTorch tensor smoke-test
- gsplat library import and rasterization kernel validation (8 Gaussians)
- External tool availability (nvidia-smi, python3, ffmpeg, colmap)

Reports a structured pass/fail verdict similar to the Rust preflight binary.

Usage: cd scripts/gsplat_check && uv run main.py

Also adds a reference to the new tool in the root README documentation section.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.

        if ver:
            _ok(cmd, ver)
        else:
            _fail(cmd, "not found")

Copilot AI Mar 30, 2026

The “External Tools” probe prints missing tools as ✗ but never adds them to failures, so the script can still exit 0 even when ffmpeg/colmap/nvidia-smi are absent. If these tools are required for a passing environment verdict (per the PR description), append a failure reason when ver is None; otherwise, explicitly label these checks as informational-only and ensure the verdict messaging/docs match.

Suggested change
-            _fail(cmd, "not found")
+            _fail(cmd, "not found")
+            failures.append(f"Required external tool '{cmd}' not found")

            "skipped: CUDA unavailable or unusable "
            "(torch.cuda.is_available() is False)"
        )
        return info

Copilot AI Mar 30, 2026

The comment says “If CUDA is not available/usable, skip kernel probing”, but the code only checks torch.cuda.is_available(). In cases where CUDA is “available” but actually unusable (e.g., unsupported SM arch, driver/runtime mismatch), this will still attempt rasterization() and produce low-level errors. Consider gating this on the earlier usability smoke-test result (e.g., pass gpu['usable'] into check_gsplat) or performing a small CUDA tensor op here before running the gsplat kernel probe.

Suggested change
-        return info
+        return info
+    # Basic CUDA usability smoke test: run a tiny tensor op on the GPU.
+    # This catches cases where CUDA is "available" but the runtime/driver
+    # is not actually usable before we attempt to load gsplat kernels.
+    try:
+        _ = torch.tensor([0.0], device="cuda") + 1.0
+    except Exception as exc:
+        info["kernel_error"] = (
+            "skipped: CUDA reported available but failed a basic tensor "
+            f"operation ({exc})"
+        )
+        return info
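
The other option mentioned in the comment, passing the earlier smoke-test result into `check_gsplat`, might look roughly like the sketch below. The signature is an assumption: `gpu_usable` stands in for `gpu['usable']`, `run_kernel_probe` is a hypothetical callable wrapping the gsplat rasterization call, and the `kernel_ok` key is illustrative. Injecting the probe keeps the sketch free of a torch or gsplat dependency.

```python
from typing import Any, Callable


def check_gsplat(
    gpu_usable: bool, run_kernel_probe: Callable[[], None]
) -> dict[str, Any]:
    """Run the kernel probe only if the earlier CUDA smoke test passed."""
    info: dict[str, Any] = {"kernel_ok": False, "kernel_error": None}
    if not gpu_usable:
        # Reuse the earlier verdict instead of re-checking availability here.
        info["kernel_error"] = "skipped: CUDA smoke test did not pass"
        return info
    try:
        run_kernel_probe()
    except Exception as exc:  # report low-level kernel errors, do not crash
        info["kernel_error"] = f"{type(exc).__name__}: {exc}"
        return info
    info["kernel_ok"] = True
    return info
```

With this shape the caller decides once whether the GPU is usable, and the gsplat check never launches a kernel on a device that already failed the tensor smoke test.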

import subprocess
import shutil
import sys
import textwrap

Copilot AI Mar 30, 2026

textwrap is imported but never used in this script. Please remove the unused import to avoid lint warnings and keep the script minimal.

Suggested change
- import textwrap

Comment on lines +13 to +15
| **External tools** | `nvidia-smi`, `python3`, `ffmpeg`, `colmap` |

Exits **0** on success, **1** on failure.

Copilot AI Mar 30, 2026

This README states “Exits 0 on success, 1 on failure” and lists external tool availability as a check, but the current implementation only fails the verdict based on GPU/gsplat and does not treat missing external tools as failures. Either update the script so missing required tools affect the exit code, or clarify here that the external tool section is informational-only.

Suggested change
- | **External tools** | `nvidia-smi`, `python3`, `ffmpeg`, `colmap` |
- Exits **0** on success, **1** on failure.
+ | **External tools** | Reports availability of `nvidia-smi`, `python3`, `ffmpeg`, `colmap` (informational only; does **not** affect exit code) |
+ Exits **0** if the CUDA GPU and **gsplat** checks pass, **1** otherwise. External tool availability does **not** currently influence the exit status.
