hooks/pyramid_attention_broadcast: fix redundant recompute at iteration 0 and free stale cache when outside timestep range #13467

Open
GitGlimpse895 wants to merge 1 commit into huggingface:main from GitGlimpse895:fix/pab-cache-logic

GitGlimpse895 commented Apr 14, 2026

What does this PR do?

Fixes two bugs in PyramidAttentionBroadcastHook.new_forward:

  1. Redundant iteration == 0 condition: self.state.cache is None already
    covers the first call after every reset_state, so the extra guard is
    dead code that gives the misleading impression of two independent invariants.

  2. Stale cache leaking GPU VRAM: when outside the active timestep range,
    the hook was still writing self.state.cache = output, holding a full
    hidden-state activation tensor on the GPU until the next generation's
    reset_state call. For video transformers with dozens of PAB-hooked layers
    this adds up to hundreds of MB of unreleased VRAM. The fix sets
    self.state.cache = None immediately when outside the range.
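
For reviewers, the two changes can be illustrated with a minimal pure-Python sketch of the caching decision. This is not the diffusers source: ToyState, ToyPABHook, compute, skip_every and current_timestep are hypothetical names chosen for illustration, and the strict range comparison is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass
class ToyState:
    """Hypothetical stand-in for the per-layer hook state."""

    iteration: int = 0
    cache: Optional[Any] = None

    def reset(self) -> None:
        self.iteration = 0
        self.cache = None


class ToyPABHook:
    """Simplified model of the caching logic described in this PR."""

    def __init__(
        self,
        compute: Callable[[Any], Any],
        timestep_range: Tuple[int, int],
        skip_every: int,
        current_timestep: Callable[[], int],
    ) -> None:
        self.compute = compute
        self.timestep_range = timestep_range
        self.skip_every = skip_every
        self.current_timestep = current_timestep
        self.state = ToyState()

    def new_forward(self, x: Any) -> Any:
        low, high = self.timestep_range
        in_range = low < self.current_timestep() < high  # assumed strict bounds

        # No separate `iteration == 0` guard: after reset() the cache is None,
        # so the first call always recomputes.
        should_compute = (
            self.state.cache is None
            or not in_range
            or self.state.iteration % self.skip_every == 0
        )
        output = self.compute(x) if should_compute else self.state.cache

        # Keep the activation only while it can be reused; drop the reference
        # outside the range so the memory can be reclaimed.
        self.state.cache = output if in_range else None
        self.state.iteration += 1
        return output
```

In this sketch, a call made outside the range always recomputes and leaves state.cache as None, and the first call inside the range recomputes because the cache is empty, with no dependence on the iteration counter.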

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you read our philosophy doc (important for complex PRs)?
  • Was this discussed/approved via a GitHub issue or the forum?
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

@yiyixuxu @sayakpaul @DN6

github-actions bot added the hooks and size/S (PR with diff < 50 LOC) labels on Apr 14, 2026
sayakpaul requested a review from DN6 on April 14, 2026 04:53