Fix parameter name for values tensor in run_scaled_dot_product_attention#35
Closed
NiccoloSacchi wants to merge 1 commit intostanford-cs336:mainfrom
Closed
Fix parameter name for values tensor in run_scaled_dot_product_attention#35NiccoloSacchi wants to merge 1 commit intostanford-cs336:mainfrom
NiccoloSacchi wants to merge 1 commit intostanford-cs336:mainfrom
Conversation
I believe this dimension should be the same in V and K: the V matrix must contain the value embedding for each key.
Member
|
Yep, good catch, not sure how that slipped through. Will fix this on our end and update this repo before I close the issue. |
NiccoloSacchi
commented
Sep 17, 2025
| K: Float[Tensor, " ... keys d_k"], | ||
| V: Float[Tensor, " ... values d_v"], | ||
| V: Float[Tensor, " ... keys d_v"], | ||
| mask: Bool[Tensor, " ... queries keys"] | None = None, |
Author
There was a problem hiding this comment.
Also, shouldn't mask be of size "queries keys" instead of "... queries keys"?
Member
There was a problem hiding this comment.
The masks can differ for each batch element, so ... is correct.
Author
There was a problem hiding this comment.
Ah, I considered that but thought not really necessary (thought either all batch samples attend to everything or all batch samples only attend to past tokens). Anyway, thanks for confirming! :)
Member
|
Fixed in 2026 version (releases on Monday) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
I believe this dimension should be the same in V and K: shouldn't the V matrix contain the value embedding for each key?