@jambayk jambayk commented Feb 6, 2026

Describe your changes

  • Add an AutoClip pass that automatically searches for optimal clipping values for linear layers that will be quantized later.
  • Based on the BitDistiller paper.
  • Model quality metrics will be provided in a follow-up PR, which adds the self-distillation part of the paper.
  • Refactor the GPTQ pass and quant_utils to share common code. The shared code will also be useful later when adding an AWQ pass, which is similar to AutoClip but does symmetric clipping and also optimizes scales.
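
For readers unfamiliar with clipping search, here is a minimal sketch of the general idea, not the code in this PR. It assumes a single weight group that straddles zero, an asymmetric min/max quantizer, and plain weight reconstruction error as the objective; the actual pass may measure error differently (for example on layer outputs with calibration data). The names `fake_quantize` and `search_clip` and the parameters `steps` and `min_ratio` are hypothetical.

```python
from itertools import product


def fake_quantize(weights, lo, hi, bits=4):
    """Clip weights to [lo, hi], then quantize and dequantize on a uniform grid."""
    levels = 2**bits - 1
    scale = (hi - lo) / levels
    if scale <= 0:
        # Degenerate range: every weight collapses to the single value lo.
        return [lo for _ in weights]
    out = []
    for w in weights:
        clipped = min(max(w, lo), hi)
        q = round((clipped - lo) / scale)  # integer code in [0, levels]
        out.append(q * scale + lo)
    return out


def search_clip(weights, bits=4, steps=20, min_ratio=0.5):
    """Grid-search independent shrink ratios for the lower and upper bounds.

    Assumes min(weights) <= 0 <= max(weights), so multiplying a bound by a
    ratio in [min_ratio, 1] moves it toward zero (i.e. clips harder).
    Returns (squared_error, lo, hi) for the best candidate found.
    """
    w_min, w_max = min(weights), max(weights)
    # The first ratio is exactly 1.0, so "no clipping" is always a candidate.
    ratios = [1.0 - i * (1.0 - min_ratio) / steps for i in range(steps + 1)]
    best = (float("inf"), w_min, w_max)
    for r_lo, r_hi in product(ratios, ratios):
        lo, hi = r_lo * w_min, r_hi * w_max
        deq = fake_quantize(weights, lo, hi, bits)
        err = sum((w - d) ** 2 for w, d in zip(weights, deq))
        if err < best[0]:
            best = (err, lo, hi)
    return best
```

Because the lower and upper bounds are searched independently, the result is an asymmetric clip. A symmetric variant, as described for the planned AWQ pass, would tie the two ratios together.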

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link
