[Feature Request] Add Quantized Matmul Work-Area Size Helper API #49

@k8ika0s

Proposed PR branch: qe/issue-10-quantized-matmul-work-area-size

Category: Feature Request (performance + usability)

Problem

  • Quantized matmul can allocate internal scratch memory when work_area == NULL.
  • Callers need a sizing helper so they can preallocate the work area deterministically.

Justification / why this is needed

  • Quantized matmul typically runs in hot paths, where per-call allocator
    overhead is especially costly.
  • A sizing helper lets frameworks allocate the work area once, reuse it across
    calls, and avoid allocation spikes.
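
To illustrate the allocate-once, reuse pattern described above, here is a minimal caller-side sketch. It does not use the proposed API: example_work_area_size() is a hypothetical stand-in, since the real parameter list and sizing rule of zdnn_get_quantized_matmul_work_area_size(...) are not specified in this issue. The work_area_cache type, its functions, and the 4 KiB alignment are likewise assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the work area wants 4 KiB alignment. Verify against the zDNN
 * documentation before relying on this value. */
#define WORK_AREA_ALIGNMENT 4096u

/* Hypothetical stand-in for the proposed helper. It only mirrors the
 * "return 0 for invalid inputs" contract; the formula is NOT zDNN's. */
static uint64_t example_work_area_size(uint64_t rows, uint64_t cols,
                                       uint64_t elem_bytes) {
  if (rows == 0 || cols == 0 || elem_bytes == 0) return 0;
  if (rows > UINT64_MAX / cols) return 0;          /* rows * cols overflows */
  uint64_t elems = rows * cols;
  if (elems > UINT64_MAX / elem_bytes) return 0;   /* byte count overflows */
  return elems * elem_bytes;
}

/* Caller-owned scratch buffer that is reused across calls. */
typedef struct {
  void *buf;
  size_t capacity;
} work_area_cache;

/* Returns a buffer of at least `needed` bytes, reallocating only when the
 * cached buffer is too small. Returns NULL if `needed` is 0 or on failure. */
static void *work_area_reserve(work_area_cache *cache, uint64_t needed) {
  assert(cache != NULL);
  if (needed == 0 || needed > SIZE_MAX - (WORK_AREA_ALIGNMENT - 1)) return NULL;
  if ((size_t)needed <= cache->capacity) return cache->buf; /* reuse */

  /* aligned_alloc (C11) expects a size that is a multiple of the alignment. */
  size_t mask = (size_t)WORK_AREA_ALIGNMENT - 1;
  size_t rounded = ((size_t)needed + mask) & ~mask;
  void *fresh = aligned_alloc(WORK_AREA_ALIGNMENT, rounded);
  if (fresh == NULL) return NULL;                  /* keep the old buffer */

  free(cache->buf);
  cache->buf = fresh;
  cache->capacity = rounded;
  return fresh;
}

/* Frees the cached buffer and resets the cache to its empty state. */
static void work_area_release(work_area_cache *cache) {
  assert(cache != NULL);
  free(cache->buf);
  cache->buf = NULL;
  cache->capacity = 0;
}
```

A framework would call work_area_reserve() with the helper's result before each quantized matmul and pass the returned pointer as work_area, so the allocator is hit only when a larger shape appears. A 0 from the helper maps to a NULL buffer here, which the caller should treat as an error and not pass through silently.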

Proposed change

  • Add zdnn_get_quantized_matmul_work_area_size(...).
  • Add tests.
  • (Optional follow-up) Add README guidance for caller preallocation in quantized matmul hot paths.

Acceptance criteria

  • Helper returns the required work-area size in bytes, or 0 for invalid inputs.
  • Tests cover edge cases and representative shapes.

Test plan

  • make test (expects tests/testDriver_quantized_matmul_work_area_size_apis.c to pass).

References

  • Key files: zdnn/quantized_matmul_work_area.c, zdnn/zdnn.h, zdnn/zdnn.map,
    tests/testDriver_quantized_matmul_work_area_size_apis.c.
