feat(inference): Implement Q3_K dequantization in aprender-serve #1892

### Description
The `aprender-serve` inference engine currently lacks support for `Q3_K` (GGML type 11) dequantization. When attempting to load a `Q3_K` model (e.g., `qwen2.5-7b-instruct-q3_k_m.gguf`), inference crashes with:

`Inference failed: Operation 'get_tensor_f32' not supported: Unsupported quantization type: 11`

### Acceptance Criteria
- Implement `dequantize_q3_k` in `crates/aprender-serve/src/quantize/` (both standard and SIMD/parallel variants if applicable).
- Hook the implementation into the `get_tensor_f32` match arm in `crates/aprender-serve/src/gguf/metadata.rs`.
- Add coverage tests for `Q3_K` tensors.
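
As a starting point for the first criterion, here is a minimal scalar sketch of `Q3_K` dequantization. It assumes the 110-byte super-block layout used by ggml's `block_q3_K` (32 bytes of high-bit mask, 64 bytes of 2-bit quants, 12 bytes of packed 6-bit scales, then an f16 super-block scale, covering 256 weights). The helper `f16_to_f32`, the constant names, and the `Result<Vec<f32>, String>` signature are illustrative assumptions, not the existing `aprender-serve` API; the real implementation should follow the conventions of the other dequantizers in `crates/aprender-serve/src/quantize/`.

```rust
/// Weights per Q3_K super-block (assumed to match ggml's QK_K).
pub const QK_K: usize = 256;
/// Bytes per super-block: hmask[32] + qs[64] + scales[12] + d (f16).
pub const Q3_K_BLOCK_BYTES: usize = 110;

/// Hypothetical helper: convert IEEE 754 half-precision bits to f32.
fn f16_to_f32(bits: u16) -> f32 {
    let sign = ((bits >> 15) & 1) as u32;
    let exp = ((bits >> 10) & 0x1f) as u32;
    let frac = (bits & 0x3ff) as u32;
    let out = match (exp, frac) {
        (0, 0) => sign << 31, // signed zero
        (0, _) => {
            // Subnormal half: value is frac * 2^-24.
            let v = frac as f32 / 16_777_216.0;
            return if sign == 1 { -v } else { v };
        }
        (0x1f, _) => (sign << 31) | 0x7f80_0000 | (frac << 13), // inf / NaN
        _ => (sign << 31) | ((exp + 112) << 23) | (frac << 13), // rebias 15 -> 127
    };
    f32::from_bits(out)
}

/// Scalar sketch: dequantize a byte slice of Q3_K super-blocks to f32.
pub fn dequantize_q3_k(data: &[u8]) -> Result<Vec<f32>, String> {
    if data.len() % Q3_K_BLOCK_BYTES != 0 {
        return Err(format!(
            "Q3_K data length {} is not a multiple of {}",
            data.len(),
            Q3_K_BLOCK_BYTES
        ));
    }
    let mut out = Vec::with_capacity(data.len() / Q3_K_BLOCK_BYTES * QK_K);
    for block in data.chunks_exact(Q3_K_BLOCK_BYTES) {
        let hmask = &block[0..32];
        let qs = &block[32..96];
        let packed = &block[96..108];
        let d = f16_to_f32(u16::from_le_bytes([block[108], block[109]]));

        // Unpack 16 six-bit scales: low 4 bits live in bytes 0..8,
        // high 2 bits in bytes 8..12. Stored with an offset of 32.
        let mut scales = [0i32; 16];
        for (j, s) in scales.iter_mut().enumerate() {
            let low = if j < 8 { packed[j] & 0x0f } else { packed[j - 8] >> 4 };
            let high = (packed[8 + j % 4] >> (2 * (j / 4))) & 0x03;
            *s = i32::from(low | (high << 4)) - 32;
        }

        let mut is = 0;
        for half in 0..2 {
            // Each 32-byte half of qs holds 128 weights, 2 bits each.
            let q = &qs[half * 32..half * 32 + 32];
            for j in 0..4 {
                let shift = 2 * j;
                let m = 1u8 << (half * 4 + j); // bit of hmask for this pass
                for group in 0..2 {
                    let dl = d * scales[is] as f32;
                    is += 1;
                    for l in 0..16 {
                        let idx = group * 16 + l;
                        let low = i32::from((q[idx] >> shift) & 3);
                        // A cleared high bit means "subtract 4".
                        let offset = if hmask[idx] & m != 0 { 0 } else { 4 };
                        out.push(dl * (low - offset) as f32);
                    }
                }
            }
        }
    }
    Ok(out)
}
```

A coverage test can build a synthetic super-block by hand: with every byte zero except `d = 1.0` (f16 bits `0x3C00`, little-endian), each scale decodes to -32 and each quant to -4, so all 256 outputs should equal 128.0. A SIMD or parallel variant could then be checked against this scalar path on the same inputs.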
