feat: fix rebuild #25

Merged: chraac merged 28 commits into main from dev-fix-rebuild on Feb 7, 2026

chraac (Owner) commented on Dec 26, 2025

This pull request introduces significant improvements to the build system, documentation, and backend configurability for the llama.cpp QNN builder project. The main changes include enhanced documentation, expanded build options (including OpenCL/Adreno GPU and ggml-hexagon toggling), improved Docker and CI workflows, and a more robust clean rebuild process.

Build System and Backend Configuration:

  • Added new build options to enable or disable specific backends:
    • --disable-ggml-hexagon to disable the ggml-hexagon backend.
    • --enable-ocl to enable OpenCL support for Adreno GPU kernels.
    • Improved handling of build flags for quantized tensors, performance tracking, and backend selection in docker/docker_compose_compile.sh. [1] [2]
  • Updated Docker compose files and build scripts to pass new environment variables and support clean rebuilds via the SHOULD_REBUILD flag. [1] [2] [3] [4]
  • Improved output and logging in build scripts to reflect new options and their states.
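As a rough illustration of the options listed above, the sketch below parses `--disable-ggml-hexagon` and `--enable-ocl` and reports the resulting state. It is not the actual `docker/docker_compose_compile.sh`. The variable and function names (`use_ggml_hexagon`, `enable_ocl`, `parse_backend_flags`, `print_backend_state`) are assumptions made for the example, as is the default of ggml-hexagon on and OpenCL off.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names and defaults are assumptions, not the project's code.

# Assumed defaults: ggml-hexagon is on unless disabled, OpenCL is off unless enabled.
use_ggml_hexagon=1
enable_ocl=0

# Parse long options and update the backend switches.
parse_backend_flags() {
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --disable-ggml-hexagon) use_ggml_hexagon=0 ;;
            --enable-ocl)           enable_ocl=1 ;;
            *) echo "Unknown option: $1" >&2; return 1 ;;
        esac
        shift
    done
}

# Log the selected options so the build output reflects their state.
print_backend_state() {
    echo "ggml-hexagon: ${use_ggml_hexagon}"
    echo "opencl: ${enable_ocl}"
}

parse_backend_flags --disable-ggml-hexagon --enable-ocl
print_backend_state
```

Keeping the switches as plain `0`/`1` variables makes them easy to forward as environment variables to a Docker compose file, which is how the description says the new options are passed along.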

Documentation and User Guidance:

  • Substantially expanded the README.md with detailed feature lists, backend descriptions, platform support, build options, quick start instructions, and project structure.
  • Updated the build instructions in docs/how-to-build.md to document new build flags and provide more comprehensive example usage.

Continuous Integration and Testing:

  • Modified GitHub Actions workflows to use the new build flags, particularly to disable ggml-hexagon where appropriate, and to optimize disk usage on CI runners. [1] [2] [3]
  • Added steps to free up disk space before builds and ensured the correct backend configurations are used in CI jobs.

Miscellaneous Improvements:

  • Updated submodule reference for llama.cpp to a newer commit.
  • Minor script improvements, such as exporting LD_LIBRARY_PATH for test execution and cleaning up command-line argument handling. [1] [2] [3]
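Exporting `LD_LIBRARY_PATH` before running tests might look like the sketch below. The helper name `prepend_lib_path` and the directory in the usage note are hypothetical; the project's scripts may set the variable differently.

```shell
# Hypothetical sketch; the helper name is an assumption.
prepend_lib_path() {
    local lib_dir="$1"
    # Prepend lib_dir and keep any existing value. The ${VAR:+...} form adds
    # the ":" separator only when LD_LIBRARY_PATH is already set and non-empty.
    export LD_LIBRARY_PATH="${lib_dir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
}
```

For example, `prepend_lib_path /path/to/libs` before launching a test binary. Avoiding a trailing colon matters because the dynamic linker treats an empty entry in `LD_LIBRARY_PATH` as the current directory.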

These changes collectively make the project more flexible, easier to use, and better documented for both development and deployment across different hardware and platforms.

chraac requested a review from Copilot on December 26, 2025 06:31
chraac self-assigned this on Dec 26, 2025
chraac added the enhancement (New feature or request) label on Dec 26, 2025
Copilot AI (Contributor) left a comment

Pull request overview

This PR updates the build and test infrastructure to transition from llama-cli to llama-completion as the primary executable, and changes the default backend configuration from requiring explicit opt-in to GGML Hexagon to requiring explicit opt-out. Additionally, it introduces a rebuild flag and updates the llama.cpp subproject commit.

Key changes:

  • Replaces llama-cli references with llama-completion across test scripts
  • Inverts GGML Hexagon logic from opt-in (--use-ggml-hexagon) to opt-out (--disable-ggml-hexagon)
  • Adds support for rebuild functionality with SHOULD_REBUILD flag
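A clean rebuild driven by `SHOULD_REBUILD` could be sketched roughly as follows. This is not the code in `docker/build_in_container.sh`. The function name `prepare_build_dir`, the build directory argument, and the convention that `1` means "rebuild" are assumptions for illustration.

```shell
# Hypothetical sketch; function name and the "1 means rebuild" convention are assumptions.
prepare_build_dir() {
    local build_dir="$1"
    # SHOULD_REBUILD comes from the environment; an unset value is treated as "0".
    if [ "${SHOULD_REBUILD:-0}" = "1" ] && [ -d "${build_dir}" ]; then
        echo "Clean rebuild requested, removing ${build_dir}"
        rm -rf -- "${build_dir}"
    fi
    # Always make sure the directory exists before configuring the build.
    mkdir -p -- "${build_dir}"
}
```

Calling `SHOULD_REBUILD=1 prepare_build_dir ./build` would discard stale build artifacts, while leaving the variable unset keeps the incremental build state.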

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| scripts/run_device_model_test.sh | Updates executable name and command arguments for device model tests |
| scripts/run_device_model_test.ps1 | Updates executable name and log file naming in PowerShell test script |
| scripts/run_all_device_tests.sh | Adds backend selection logic, new model test (Qwen), and updates log naming |
| scripts/push_and_run_test.sh | Adds llama-completion to list of pushed executables |
| scripts/push_and_run_test.ps1 | Adds llama-completion to list of pushed executables |
| scripts/looping.sh | Updates command line flag from --hexagon-npu-only to --use-ggml-hexagon |
| llama.cpp | Updates subproject commit reference |
| docker/docker_compose_compile.sh | Inverts GGML Hexagon flag logic and adds rebuild support |
| docker/docker-compose-compile.yml | Adds SHOULD_REBUILD environment variable |
| docker/docker-compose-compile-qnn.yml | Adds SHOULD_REBUILD environment variable |
| docker/build_in_container.sh | Refactors build process to support clean rebuilds and removes rsync operation |

t) test_only=1 ;;
r) revision="$OPTARG" ;;
s) subdirectory="$OPTARG" ;;
q) use_ggml_hexagon=0 ;;

Copilot AI commented on Dec 26, 2025

The assignment use_ggml_hexagon=0 for the -q flag is confusing, because the help text reads '-q Use hexagon-npu backend for tests and benchmarks'. Setting the variable to 0 suggests disabling a backend, whereas the intent appears to be selecting hexagon-npu. Consider renaming the variable to use_hexagon_npu, or inverting the logic so that it matches the flag's documented purpose.
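One way the rename suggested in this comment could look is sketched below. It is an illustration only, not the project's script. The function name `parse_test_flags` is hypothetical, and the sketch handles only `-q`, leaving out the other options shown in the diff (`-t`, `-r`, `-s`).

```shell
# Hypothetical sketch of the suggested rename; not the project's script.
use_hexagon_npu=0   # 0 = default backend, 1 = hexagon-npu (assumed meaning)

parse_test_flags() {
    local opt
    OPTIND=1  # reset so the function can be called more than once
    while getopts "q" opt; do
        case "${opt}" in
            q) use_hexagon_npu=1 ;;  # -q: use the hexagon-npu backend
            *) return 1 ;;           # unknown option
        esac
    done
}
```

With this naming, the value set by `-q` reads the same way as the help text: the flag turns the hexagon-npu selection on rather than turning something else off.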

Comment on lines +93 to 99
-t 512

echo "Running device model test and saving log to ${qwen_test_log_name}"
"${_script_path}/run_device_model_test.sh" \
-m "qwen3-1.7b-bf16-Q4_0.gguf" \
-l "${qwen_test_log_name}" \
-t 512

Copilot AI commented on Dec 26, 2025

The -f flag was removed from this invocation but remains in the original version. If the -f flag controlled flash attention or another feature, verify this removal is intentional and won't cause unexpected behavior in the model test.

Suggested change

Before:
-t 512
echo "Running device model test and saving log to ${qwen_test_log_name}"
"${_script_path}/run_device_model_test.sh" \
-m "qwen3-1.7b-bf16-Q4_0.gguf" \
-l "${qwen_test_log_name}" \
-t 512

After:
-t 512 \
-f
echo "Running device model test and saving log to ${qwen_test_log_name}"
"${_script_path}/run_device_model_test.sh" \
-m "qwen3-1.7b-bf16-Q4_0.gguf" \
-l "${qwen_test_log_name}" \
-t 512 \
-f

chraac merged commit 826a383 into main on Feb 7, 2026
6 checks passed
chraac deleted the dev-fix-rebuild branch on February 7, 2026 03:07