[Fix] Fixed a bug related to sparse logic. #4
Motivation
Marlin-Sparse is a nice repository. While running the sparsification and packing code, I encountered some bugs. These bugs cause negative values to be pruned, which leads to extremely large outputs.
Modifications
1. Incorrect Shape of self.B
In `Layer_2_4`, the shape of `self.B` should be `(self.k // 16 // 2, self.n * 16 // 8)`.
2. Issue with Sparsity Logic
The quantized weight in `Layer_2_4` has a +8 offset and is clamped to the range [0, 15]. However, the function `mask_creator` does not account for this offset: it prunes the N smallest of every M elements using the offset values directly. As a result, it unintentionally prunes negative values in the weight matrix.
3. Handling of Pruned Elements
After applying sparsity, `mask * w.T` is used to zero out the pruned elements. The function `sparse_semi_structured_from_dense_cutlass` then generates indices for these pruned elements. However, after the +8 offset the quantized weights lie in [0, 15], so a legitimate weight value of 0 can be confused with a pruned element, resulting in incorrect sparsity.
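
The two sparsity issues above can be reproduced on a single group of four values. The following is a minimal pure-Python sketch, not the repository's code: `mask_2_4` is a hypothetical stand-in for `mask_creator`, and it assumes the mask is chosen by ranking absolute values with N = 2 pruned out of M = 4. `OFFSET = 8` mirrors the +8 offset described above.

```python
OFFSET = 8  # assumed: signed 4-bit weights in [-8, 7] are stored as codes in [0, 15]


def mask_2_4(group, key):
    """Hypothetical helper: 0/1 mask keeping the 2 entries with the largest key."""
    order = sorted(range(4), key=lambda i: key(group[i]), reverse=True)
    keep = set(order[:2])
    return [1 if i in keep else 0 for i in range(4)]


# Issue 2: ranking the offset codes drops the large negative weight.
signed = [-7, 1, 0, 3]
codes = [w + OFFSET for w in signed]  # [1, 9, 8, 11]

naive_mask = mask_2_4(codes, key=abs)  # ranks the offset codes as-is
fixed_mask = mask_2_4(codes, key=lambda c: abs(c - OFFSET))  # undoes the offset first

print("naive mask:", naive_mask)  # -7 is pruned although it has the largest magnitude
print("fixed mask:", fixed_mask)

# Issue 3: a kept weight whose code is 0 looks exactly like a pruned entry.
signed2 = [-8, 7, 1, -5]
codes2 = [w + OFFSET for w in signed2]  # [0, 15, 9, 3]
mask2 = mask_2_4(codes2, key=lambda c: abs(c - OFFSET))
masked = [m * c for m, c in zip(mask2, codes2)]

kept_by_mask = [i for i, m in enumerate(mask2) if m]
kept_by_nonzero = [i for i, v in enumerate(masked) if v != 0]

print("kept according to mask:    ", kept_by_mask)
print("kept according to nonzeros:", kept_by_nonzero)
```

In the second half, position 0 is kept by the mask but holds the code 0, so any routine that infers the kept positions from nonzero entries sees only one survivor in the group. Passing the mask explicitly, or applying the offset only after the sparse metadata is built, would avoid the ambiguity; which of these the fix in this PR uses is not shown here.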