[Feature Request] Add LSTM/GRU Work-Area Size Helper APIs (Caller Preallocation) #47

@k8ika0s

Proposed PR branch: qe/issue-08-work-area-size-apis

Category: Feature Request (performance + usability)

Problem

  • Callers need deterministic work-area sizing for LSTM/GRU to preallocate and
    avoid runtime allocator overhead in hot loops.

Justification / why this is needed

  • Repeated allocation/free in inference loops increases tail latency and
    wastes CPU time under load.
  • A sizing helper gives frameworks a clean, deterministic preallocation flow
    (llama.cpp-style loops, ONNX Runtime execution providers, etc.).
  • This is additive/opt-in: callers can keep current behavior.
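
The preallocation flow described above can be sketched roughly as follows. This is an illustration only: example_work_area_size and example_rnn_step are hypothetical stand-ins, not zDNN functions, the size arithmetic is a placeholder, and the 4 KiB alignment is an assumption about what the work area requires.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define WORK_AREA_ALIGNMENT 4096u /* assumption: work area must be 4 KiB aligned */

/* Hypothetical stand-in for the proposed sizing helper: returns the bytes
 * needed, or 0 for invalid input. The arithmetic is a placeholder only. */
static uint64_t example_work_area_size(uint32_t timesteps, uint32_t batch,
                                       uint32_t hidden) {
  if (timesteps == 0 || batch == 0 || hidden == 0) return 0;
  return (uint64_t)timesteps * batch * hidden * 2u;
}

/* Hypothetical stand-in for an RNN call that takes a caller-owned buffer. */
static int example_rnn_step(void *work_area, uint64_t work_area_size) {
  if (work_area == NULL || work_area_size == 0) return -1;
  memset(work_area, 0, (size_t)work_area_size); /* pretend to use the scratch */
  return 0;
}

/* Allocate once, reuse across all iterations, free once.
 * Returns the number of completed iterations, or -1 on setup failure. */
static int run_inference_loop(uint32_t timesteps, uint32_t batch,
                              uint32_t hidden, int iterations) {
  uint64_t size = example_work_area_size(timesteps, batch, hidden);
  if (size == 0) return -1;

  /* aligned_alloc requires the size to be a multiple of the alignment. */
  uint64_t padded = (size + WORK_AREA_ALIGNMENT - 1) / WORK_AREA_ALIGNMENT *
                    WORK_AREA_ALIGNMENT;
  assert(padded % WORK_AREA_ALIGNMENT == 0);

  void *work_area = aligned_alloc(WORK_AREA_ALIGNMENT, (size_t)padded);
  if (work_area == NULL) return -1;

  int completed = 0;
  for (int i = 0; i < iterations; i++) {
    /* No malloc/free inside the hot loop: the same buffer is reused. */
    if (example_rnn_step(work_area, size) != 0) break;
    completed++;
  }

  free(work_area);
  return completed;
}
```

The point of the sketch is the shape of the loop: one size query and one allocation up front, then no allocator traffic per iteration. Callers that do not opt in would keep passing no buffer and get the current behavior.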

Proposed change

  • Add zdnn_get_lstm_work_area_size(...) and zdnn_get_gru_work_area_size(...).
  • Add tests.
  • (Optional follow-up) Add README guidance/examples for recommended caller preallocation.

Acceptance criteria

  • Each helper API returns the required work-area size in bytes, or 0 on invalid inputs.
  • Tests cover representative shapes and invalid cases.
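
A minimal sketch of what helpers meeting these criteria might look like, with a few representative and invalid cases checked. Everything here is an assumption for illustration: the sketch_* names and their parameters are hypothetical (the issue leaves the real signatures unspecified), and the sizing constants (4 slots per timestep plus 6 for LSTM, 3 per timestep plus 10 for GRU, 32x64-element 4 KiB pages, doubled for bidirectional) are recalled from the work_area guidance in the zDNN README and should be verified against the library before use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BYTES 4096u   /* assumed: one 4 KiB page */
#define ROWS_PER_PAGE 32u  /* assumed: batch rows held by one page */
#define ELEMS_PER_ROW 64u  /* assumed: hidden elements held by one row */

/* Multiply with overflow detection; returns 0 on overflow. */
static uint64_t mul_or_zero(uint64_t a, uint64_t b) {
  if (a != 0 && b > UINT64_MAX / a) return 0;
  return a * b;
}

static uint64_t ceil_div(uint64_t n, uint64_t d) { return (n + d - 1) / d; }

/* Shared sizing logic. Returns bytes required, or 0 on invalid input
 * (any zero dimension) or arithmetic overflow. */
static uint64_t rnn_work_area_size(uint32_t timesteps, uint32_t batch,
                                   uint32_t hidden, bool bidirectional,
                                   uint64_t slots_per_timestep,
                                   uint64_t fixed_slots) {
  if (timesteps == 0 || batch == 0 || hidden == 0) return 0;
  uint64_t slots = slots_per_timestep * timesteps + fixed_slots;
  uint64_t pages_per_slot = mul_or_zero(ceil_div(batch, ROWS_PER_PAGE),
                                        ceil_div(hidden, ELEMS_PER_ROW));
  uint64_t bytes = mul_or_zero(mul_or_zero(slots, pages_per_slot), PAGE_BYTES);
  return bidirectional ? mul_or_zero(bytes, 2) : bytes;
}

/* Hypothetical LSTM helper (assumed constants: 4 per timestep + 6). */
static uint64_t sketch_lstm_work_area_size(uint32_t timesteps, uint32_t batch,
                                           uint32_t hidden, bool bidirectional) {
  return rnn_work_area_size(timesteps, batch, hidden, bidirectional, 4, 6);
}

/* Hypothetical GRU helper (assumed constants: 3 per timestep + 10). */
static uint64_t sketch_gru_work_area_size(uint32_t timesteps, uint32_t batch,
                                          uint32_t hidden, bool bidirectional) {
  return rnn_work_area_size(timesteps, batch, hidden, bidirectional, 3, 10);
}

/* Representative shapes and invalid cases, in the spirit of the test plan. */
static void check_work_area_size_examples(void) {
  assert(sketch_lstm_work_area_size(5, 2, 3, false) == 26u * 4096u);
  assert(sketch_lstm_work_area_size(5, 2, 3, true) == 2u * 26u * 4096u);
  assert(sketch_gru_work_area_size(5, 2, 3, false) == 25u * 4096u);
  /* Dimensions just past a page boundary round up to two pages each way. */
  assert(sketch_lstm_work_area_size(1, 33, 65, false) == 10u * 2u * 2u * 4096u);
  /* Invalid inputs report 0 rather than a size. */
  assert(sketch_lstm_work_area_size(0, 2, 3, false) == 0);
  assert(sketch_gru_work_area_size(5, 0, 3, true) == 0);
}
```

Returning 0 for both invalid input and overflow keeps the contract to a single sentinel, so a caller can fall back to the existing automatic allocation path whenever the helper reports 0.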

Test plan

  • make test (expects tests/testDriver_work_area_size_apis.c to pass).

References

  • Key files: zdnn/work_area.c, zdnn/zdnn.h, zdnn/zdnn.map,
    tests/testDriver_work_area_size_apis.c.
