Add prepared quantized matmul context APIs #81

Open
k8ika0s wants to merge 3 commits into IBM:main from k8ika0s:qe/issue-12-prepared-quantized-matmul-context

k8ika0s commented Apr 15, 2026

Fixes #51

Summary

  • Add prepared quantized matmul context APIs.

Why

Changes

  • Implemented on branch qe/issue-12-prepared-quantized-matmul-context.
  • Includes code and tests scoped to this issue.

Validation

  • s390x integrated battery pass recorded in artifacts/final-validation-20260212T173158Z.
  • Targeted regressions for this scope were validated during branch prep.

Notes

  • DCO signoffs are present on branch commits.

What: Add optional prepare/execute/release APIs for repeated quantized matmul with stable descriptors.

Why: Repeated-shape inference loops otherwise redo setup and scratch handling each call.

Expected impact: Lower per-call CPU overhead for repeated quantized matmul workloads; additive opt-in API.

Tests: Add dedicated coverage for the prepared lifecycle and its invariants (tests/testDriver_quantized_matmul_prepared_context.c).

Signed-off-by: Kaitlyn Davis <k8ika0s@gmail.com>
Signed-off-by: Kaitlyn Davis <kaitlyn.davis@ibm.com>
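
For readers unfamiliar with the pattern, the prepare/execute/release lifecycle described in this commit might look roughly like the sketch below. It is not the API added by this PR: the names `pq_matmul_ctx`, `pq_matmul_prepare`, `pq_matmul_execute`, and `pq_matmul_release` are hypothetical, and the arithmetic is a plain int8 matmul with int32 accumulation and a single assumed dequantization scale. The only point it illustrates is that shape validation and scratch allocation happen once in prepare, so each execute call reuses them.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context: shapes and scale are fixed at prepare time. */
typedef struct {
    size_t m, k, n;   /* A is m x k, B is k x n, C is m x n (row-major) */
    float scale;      /* assumed combined dequantization scale */
    int32_t *acc;     /* scratch accumulator with m * n entries */
    int prepared;     /* nonzero between prepare and release */
} pq_matmul_ctx;

/* One-time setup: validate the shape and allocate scratch. */
int pq_matmul_prepare(pq_matmul_ctx *ctx, size_t m, size_t k, size_t n,
                      float scale) {
    if (ctx == NULL) return -1;
    memset(ctx, 0, sizeof *ctx);
    if (m == 0 || k == 0 || n == 0) return -1;
    if (m > SIZE_MAX / sizeof(int32_t) / n) return -1; /* overflow guard */
    ctx->acc = malloc(m * n * sizeof(int32_t));
    if (ctx->acc == NULL) return -1;
    ctx->m = m;
    ctx->k = k;
    ctx->n = n;
    ctx->scale = scale;
    ctx->prepared = 1;
    return 0;
}

/* Per-call work: no allocation, only the multiply itself. */
int pq_matmul_execute(pq_matmul_ctx *ctx, const int8_t *a, const int8_t *b,
                      float *c) {
    if (ctx == NULL || !ctx->prepared || a == NULL || b == NULL || c == NULL)
        return -1;
    const size_t m = ctx->m, k = ctx->k, n = ctx->n;
    memset(ctx->acc, 0, m * n * sizeof(int32_t));
    for (size_t i = 0; i < m; i++)
        for (size_t p = 0; p < k; p++)
            for (size_t j = 0; j < n; j++)
                ctx->acc[i * n + j] +=
                    (int32_t)a[i * k + p] * (int32_t)b[p * n + j];
    for (size_t idx = 0; idx < m * n; idx++)
        c[idx] = ctx->scale * (float)ctx->acc[idx];
    return 0;
}

/* Teardown: free scratch and mark the context unusable. */
void pq_matmul_release(pq_matmul_ctx *ctx) {
    if (ctx == NULL) return;
    free(ctx->acc);
    memset(ctx, 0, sizeof *ctx);
}
```

A caller would invoke the hypothetical `pq_matmul_prepare` once per shape, call `pq_matmul_execute` inside the inference loop, and call `pq_matmul_release` when the shape is no longer needed. Calling execute after release returns an error in this sketch rather than touching freed memory.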
The prepared quantized matmul context previously depended on the separate
work-area sizing helper API. Compute qc_tilde sizing locally instead so this
PR remains self-contained (only requires scratch-buffer support) and does not
introduce a hidden dependency on work_area.c helper changes.

Signed-off-by: Kaitlyn Davis <k8ika0s@gmail.com>
Signed-off-by: Kaitlyn Davis <kaitlyn.davis@ibm.com>
What: Replace prepared-context scratch API usage with internal aligned buffer management so the branch can stand alone on main.

Why: Issue 12 previously depended on Issue 11 types/functions; this change removes that branch dependency while preserving behavior.

Expected impact: Prepared quantized matmul remains functionally equivalent and independently PR-able to main.

Tests: update prepared-context tests for internal buffer fields.

Signed-off-by: Kaitlyn Davis <k8ika0s@gmail.com>
Signed-off-by: Kaitlyn Davis <kaitlyn.davis@ibm.com>
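
As an illustration of what "internal aligned buffer management" in the last commit could involve, here is a minimal sketch of a grow-only aligned scratch buffer. It is an assumption-laden example, not the code on this branch: the type `aligned_buf`, the functions `aligned_buf_reserve` and `aligned_buf_free`, and the 4096-byte alignment constant are all hypothetical. It uses C11 `aligned_alloc`, which expects the requested size to be a multiple of the alignment, so the size is rounded up first.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BUF_ALIGN ((size_t)4096) /* assumed alignment; must be a power of two */

/* Hypothetical internal buffer owned by a prepared context. */
typedef struct {
    void *ptr;    /* aligned allocation, or NULL when empty */
    size_t size;  /* usable bytes, always a multiple of BUF_ALIGN */
} aligned_buf;

/* Round n up to the next multiple of align (align is a power of two). */
static size_t round_up(size_t n, size_t align) {
    return (n + align - 1) & ~(align - 1);
}

/* Ensure the buffer holds at least `needed` bytes.
 * An existing allocation is reused when it is already large enough. */
int aligned_buf_reserve(aligned_buf *buf, size_t needed) {
    if (buf == NULL || needed == 0 || needed > SIZE_MAX - BUF_ALIGN)
        return -1;
    if (buf->ptr != NULL && buf->size >= needed)
        return 0; /* reuse: no allocation on the hot path */
    size_t size = round_up(needed, BUF_ALIGN);
    void *p = aligned_alloc(BUF_ALIGN, size);
    if (p == NULL)
        return -1; /* old buffer is left intact on failure */
    free(buf->ptr);
    buf->ptr = p;
    buf->size = size;
    return 0;
}

/* Release the allocation and reset the fields. */
void aligned_buf_free(aligned_buf *buf) {
    if (buf == NULL) return;
    free(buf->ptr);
    buf->ptr = NULL;
    buf->size = 0;
}
```

Keeping the buffer inside the context, as this sketch does, is one way to avoid depending on a shared scratch API: the context allocates on first use, reuses the allocation for later calls of the same or smaller size, and frees it on release. The previous contents are not preserved when the buffer grows, which is acceptable for scratch space.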

Development

Successfully merging this pull request may close these issues.

[Feature Request] Add Prepared Quantized Matmul Context APIs (Prepare/Execute/Release)