AttributeError when trying to switch to CUDA. #3

@Erquint

CUDA doesn't actually work due to a bug in the upstream PyPI package, resulting in:

AttributeError: 'FileLikeQueueWriter' object has no attribute 'tell'
[wav @ 000002052f916200] invalid start code [0][0][0][0] in RIFF header
[in#0 @ 000002052f915f80] Error opening input: Invalid data found when processing input
Error opening input file pipe:0.
Error opening input files: Invalid data found when processing input

The cache is created on self.in_proj.weight.device (likely CPU at init), but forward uses q.device (CUDA at runtime). The mask is created on q.device, while k and v come from _complete_kv, which returns them on the cache's device. The result is a device mismatch between the mask and k/v.

pocket_tts/modules/transformer.py

     def forward(self, query: torch.Tensor, model_state: dict | None):
         state = self.check_model_state(model_state)
 
         projected = self.in_proj(query)
         # Reshape from (b, t, p*h*d) to (b, t, p, h, d) where p=3, h=num_heads
         b, t, _ = projected.shape
         d = self.embed_dim // self.num_heads
         packed = projected.view(b, t, 3, self.num_heads, d)
         q, k, v = torch.unbind(packed, dim=2)
         q, k = self._apply_rope(q, k, state)
         k, v = self._complete_kv(k, v, state)
+        k = k.to(q.device)
+        v = v.to(q.device)

This fixes the bug and lets you switch to CUDA. But since you pull upstream in dynamically as a module, you'd have to hack it in somehow. Have fun.
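One way to hack it in without editing the installed package might be a runtime monkey-patch. The sketch below is untested against the real package and rests on assumptions: patch_complete_kv is a hypothetical helper name, and it assumes the k passed into _complete_kv already lives on q.device (both come from the same projection), so moving the returned tensors to that device should be equivalent to the two added lines in the diff. The name of the attention class is not shown above, so it is left as a placeholder.

```python
import functools


def patch_complete_kv(attention_cls):
    """Wrap attention_cls._complete_kv so its outputs follow the input device.

    Hypothetical helper: attention_cls is whatever class in
    pocket_tts/modules/transformer.py defines forward() and _complete_kv().
    """
    original = attention_cls._complete_kv
    if getattr(original, "_device_patched", False):
        return attention_cls  # already patched, nothing to do

    @functools.wraps(original)
    def patched(self, k, v, state):
        # Assumption: the incoming k is on the same device as q.
        target = k.device
        full_k, full_v = original(self, k, v, state)
        # Same effect as the two added lines in the diff above.
        return full_k.to(target), full_v.to(target)

    patched._device_patched = True
    attention_cls._complete_kv = patched
    return attention_cls


# Usage sketch (the class name is a placeholder, not taken from the package):
# from pocket_tts.modules import transformer
# patch_complete_kv(transformer.SomeAttentionClass)
```

The patch has to be applied once, after the upstream module is imported and before the first forward pass. Calling it again is a no-op.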

The upstream GitHub repo is ahead of the PyPI package: they refactored it, and I haven't checked whether that breaks things even worse.
