Codecov Report (❌ patch coverage)
Coverage diff, main vs. #1217:
            main     #1217    +/-
Coverage    76.91%   76.68%   -0.24%
Files       350      355      +5
Lines       40481    41414    +933
Hits        31137    31758    +621
Misses      9344     9656     +312
Signed-off-by: Kai Xu <kaix@nvidia.com>
What does this PR do?
Type of change: New feature
Integrates WaterSIC for KV-cache quantization.
WaterSIC is an information-theoretically near-optimal quantization algorithm (Lifar et al., 2026) that uses the waterfilling principle for per-column rate allocation combined with Successive Interference Cancellation (ZSIC) and Huffman entropy coding. This PR adds KV-cache quantization for HF models.
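To make the waterfilling idea concrete, here is a minimal sketch of the classical reverse water-filling rate allocation for independent Gaussian components. It is a textbook illustration, not the code in this PR: the function name `reverse_waterfilling`, the bit-budget interface, and the example variances are all hypothetical, and WaterSIC's actual per-column allocation may differ in detail.

```python
import math


def reverse_waterfilling(variances, total_bits, iters=100):
    """Textbook reverse water-filling (illustrative, not the PR's code).

    Finds a water level theta such that the per-component rates
    R_i = max(0, 0.5 * log2(var_i / theta)) sum to total_bits.
    """
    def rates(theta):
        # Components whose variance is below the water level get zero bits.
        return [0.5 * math.log2(v / theta) if v > theta else 0.0
                for v in variances]

    vmax = max(variances)
    lo, hi = 1e-12 * vmax, vmax  # at theta = vmax every rate is zero
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisect on a log scale
        if sum(rates(mid)) > total_bits:
            lo = mid  # spending too many bits: raise the water level
        else:
            hi = mid
    return hi, rates(hi)


# Hypothetical per-column variances and a 2-bit total budget.
theta, bits = reverse_waterfilling([4.0, 1.0, 0.25], total_bits=2.0)
print(theta, bits)
```

High-variance columns receive more bits, and columns whose variance falls below the water level receive none, which is the behaviour the "per-column rate allocation" wording refers to.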
Usage
Testing
python examples/watersic_kv_cache/kv_cache_real_model_plots.py
Captures real post-RoPE Q and K tensors from Qwen3-8B (layers 1, 12, 24, 35) and plots rate vs. KL divergence for all 5 methods.
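The description does not say how the KL divergence is computed. One plausible reading, assumed here, is the divergence between the attention distribution obtained with the original keys and the one obtained with quantized keys. The sketch below illustrates that metric on toy data; the helper names, the toy tensors, and the uniform rounding quantizer (a stand-in, not WaterSIC) are all hypothetical.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attention_probs(q, keys):
    """Scaled dot-product attention weights of one query over all keys."""
    scale = 1.0 / math.sqrt(len(q))
    return softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def round_to_step(keys, step):
    """Stand-in quantizer: uniform rounding to a grid (not WaterSIC)."""
    return [[step * round(x / step) for x in k] for k in keys]


# Toy query and keys; the real script captures these from the model.
q = [0.3, -1.2, 0.8, 0.5]
keys = [[0.9, -0.4, 0.1, 0.7], [-0.2, 1.1, -0.6, 0.3], [0.5, 0.5, -1.0, 0.2]]

ref = attention_probs(q, keys)
for step in (0.05, 0.5):
    kl = kl_divergence(ref, attention_probs(q, round_to_step(keys, step)))
    print(f"step={step}: KL={kl:.6f} nats")
```

Sweeping the quantizer's rate and recording this divergence at each setting would give one point per setting on a rate vs. KL curve.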
Before your PR is "Ready for review"
Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).
Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).
CONTRIBUTING.md: ✅ / ❌ / N/A
Additional Information