Add quantized matmul work-area size helper API #79

Open
k8ika0s wants to merge 1 commit into IBM:main from k8ika0s:qe/issue-10-quantized-matmul-work-area-size

@k8ika0s k8ika0s commented Apr 15, 2026

Fixes #49

Summary

  • Add quantized matmul work-area size helper API.

Why

Changes

  • Implemented on branch qe/issue-10-quantized-matmul-work-area-size.
  • Includes code and tests scoped to this issue.

Validation

  • The integrated test battery passed on s390x; the run is recorded in artifacts/final-validation-20260212T173158Z.
  • Targeted regression tests for this scope were run during branch preparation.

Notes

  • DCO signoffs are present on branch commits.

What: Add a helper API to compute quantized matmul work_area requirements.

Why: Avoid per-invocation allocator overhead by enabling deterministic caller preallocation in hot loops.

Expected impact: Lower tail latency and fewer malloc/free calls; additive opt-in API.

Tests: Add sizing helper coverage (tests/testDriver_quantized_matmul_work_area_size_apis.c).

Signed-off-by: Kaitlyn Davis <k8ika0s@gmail.com>
Signed-off-by: Kaitlyn Davis <kaitlyn.davis@ibm.com>
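
For reviewers, here is a minimal sketch of the caller-side pattern this helper is meant to enable: query the work-area size once, allocate once, and reuse the buffer across a hot loop. Every name below (`example_shape`, `example_work_area_size`, `example_quantized_matmul`, `example_run_hot_loop`) is hypothetical and is not the API added in this PR. The sizing formula (one int32 accumulator per output element) is an assumption chosen only to keep the example small.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shape descriptor: A is m x k, B is k x n, output is m x n. */
typedef struct {
    size_t m, n, k;
} example_shape;

/* Hypothetical sizing helper. Assumption for illustration: the op needs
 * one int32 accumulator per output element as scratch space. */
size_t example_work_area_size(const example_shape *s) {
    return s->m * s->n * sizeof(int32_t);
}

/* Stand-in quantized matmul that uses a caller-provided work area instead
 * of allocating internally. Returns 0 on success, -1 if the work area is
 * missing or too small. */
int example_quantized_matmul(const int8_t *a, const int8_t *b,
                             const example_shape *s,
                             void *work_area, size_t work_area_bytes,
                             int32_t *out) {
    if (work_area == NULL || work_area_bytes < example_work_area_size(s)) {
        return -1;
    }
    int32_t *acc = (int32_t *)work_area;
    for (size_t i = 0; i < s->m; ++i) {
        for (size_t j = 0; j < s->n; ++j) {
            int32_t sum = 0;
            for (size_t p = 0; p < s->k; ++p) {
                sum += (int32_t)a[i * s->k + p] * (int32_t)b[p * s->n + j];
            }
            acc[i * s->n + j] = sum;
        }
    }
    memcpy(out, acc, s->m * s->n * sizeof(int32_t));
    return 0;
}

/* Hot loop: one malloc/free pair in total, regardless of iteration count. */
int example_run_hot_loop(const int8_t *a, const int8_t *b,
                         const example_shape *s, int32_t *out,
                         size_t iterations) {
    size_t bytes = example_work_area_size(s);
    void *work_area = malloc(bytes);
    if (work_area == NULL) {
        return -1;
    }
    int rc = 0;
    for (size_t it = 0; it < iterations && rc == 0; ++it) {
        rc = example_quantized_matmul(a, b, s, work_area, bytes, out);
    }
    free(work_area);
    return rc;
}
```

In this sketch the allocator is touched once per batch rather than once per call, which is the behavior the "Why" and "Expected impact" notes describe. A real work area may carry alignment or layout requirements that this example does not model.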

Development

Successfully merging this pull request may close these issues.

[Feature Request] Add Quantized Matmul Work-Area Size Helper API
