Skip to content

Add UpliftAI TTS engine#63

Merged
willwade merged 2 commits intomainfrom
codex/add-new-voice-engines-and-streaming-support-8tggfy
Aug 24, 2025
Merged

Add UpliftAI TTS engine#63
willwade merged 2 commits intomainfrom
codex/add-new-voice-engines-and-streaming-support-8tggfy

Conversation

@willwade
Copy link
Owner

Summary

  • add UpliftAI engine with streaming HTTP support and preset Urdu, Sindhi, and Balochi voices
  • wire UpliftAI into test scaffolding and add example usage
  • strip WAV headers from UpliftAI responses and add regression tests

Testing

  • uv run pytest -q -m "not synthetic" (fails: Missing required environment variables: POLLY_REGION, POLLY_AWS_KEY_ID, POLLY_AWS_ACCESS_KEY, GOOGLE_SA_PATH, GOOGLE_SA_FILE_B64, MICROSOFT_TOKEN, MICROSOFT_REGION, WATSON_API_KEY, WATSON_REGION, WATSON_INSTANCE_ID, ELEVENLABS_API_KEY, WITAI_TOKEN, PLAYHT_API_KEY, PLAYHT_USER_ID)
  • uv run ruff check tests/test_tts_engines.py (fails: C901 check_credentials is too complex (24 > 10), PLR0911, PLR0912, PLR0915, PTH110, PTH100, PTH118, PTH109, E501, C901, PLR0911, PLR0912, LOG015, G004, C901, PLR0912, PLR0915, LOG015, G004)
  • uv run ruff check tts_wrapper/engines/upliftai/client.py

https://chatgpt.com/codex/tasks/task_e_68ab6808c4bc83278cd968e4f809eef5

@willwade willwade merged commit 758cb57 into main Aug 24, 2025
1 of 10 checks passed
Copy link

@llamapreview llamapreview bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto Pull Request Review from LlamaPReview

🎯 TL;DR & Recommendation

Recommendation: Request Changes

This PR adds the UpliftAI TTS engine but introduces critical risks with fragile WAV header handling that could break audio decoding and incomplete test validation that might miss audio corruption.

🌟 Strengths

  • Adds UpliftAI engine with streaming support and preset voices for Urdu, Sindhi, and Balochi languages.
  • Well-integrated into existing test scaffolding with example usage.

🚨 Critical Issues (P1)

  • tts_wrapper/engines/upliftai/client.py: Fragile WAV header stripping logic risks breaking audio decoding silently if the API changes header size.
  • tests/test_tts_engines.py: Test for WAV header stripping doesn't validate audio integrity, potentially allowing corrupted audio to pass undetected.

💡 Suggestions (P2)

  • tests/test_tts_engines.py: Adding a branch to check_credentials exacerbates its complexity, making it harder to maintain and scale for new engines.
  • tts_wrapper/engines/upliftai/client.py: Hardcoded voice metadata creates a maintenance burden and synchronization risk with the external API.
  • tts_wrapper/engines/upliftai/client.py: Credential check performs expensive synthesis, which could slow down frequent calls in test setups.

💡 LlamaPReview Community
Have feedback on this AI Code review tool? Join our GitHub Discussions to share your thoughts and help shape the future of LlamaPReview.

audio_bytes = self._strip_wav_header(audio_bytes)
return audio_bytes

def synth_to_bytestream(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 | Confidence: High

The WAV header stripping logic in the streaming method assumes a fixed 44-byte header. This is fragile because WAV headers can vary in size depending on format details. The related test (test_upliftai_streaming_strips_wav_header) only checks for the absence of "RIFF" but doesn't validate the actual audio format. If the API changes its header structure, this could break audio decoding or cause silent corruption. The non-streaming method uses a more robust RIFF signature check.

Suggested change
def synth_to_bytestream(
def synth_to_bytestream(...):
# ... same setup ...
with requests.post(...) as response:
response.raise_for_status()
# Find RIFF header dynamically
header_buffer = b""
for chunk in response.iter_content(chunk_size=4096):
if not chunk:
continue
if not header_skipped:
header_buffer += chunk
riff_pos = header_buffer.find(RIFF_HEADER)
if riff_pos == -1:
# Haven't found header yet, keep buffering
continue
# Calculate header end (RIFF + 4 bytes size + 4 bytes WAVE = 8 bytes minimum)
# Actual header size is at riff_pos+4 (little-endian uint32)
if len(header_buffer) < riff_pos + 8:
continue
header_size = int.from_bytes(header_buffer[riff_pos+4:riff_pos+8], 'little') + 8
if len(header_buffer) < riff_pos + header_size:
continue
# Yield data after header
data = header_buffer[riff_pos + header_size:]
header_skipped = True
if data:
yield data
continue
yield chunk

pytest.fail(f"SSML synthesis failed with error: {e}")


@pytest.mark.synthetic
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 | Confidence: High

This test validates the WAV header stripping but doesn't verify the audio content remains valid after stripping. The related context shows similar tests for other engines focus on basic audio properties (byte length, format), but don't validate audio integrity. This could allow broken audio to pass tests if the header stripping corrupts the data.

Code Suggestion:

# Add to test after header check
import wave
from io import BytesIO
# ... after first_chunk assertion ...
full_audio = b''.join(stream)
try:
    with BytesIO(full_audio) as bio:
        with wave.open(bio) as wav:
        assert wav.getnchannels() == 1  # Validate basic audio properties
        assert wav.getframerate() == 22050
except Exception as e:
    pytest.fail(f"Stripped audio is invalid: {e}")

f"ElevenLabs API key: {elevenlabs_api_key[:5]}...{elevenlabs_api_key[-5:] if elevenlabs_api_key else ''}"
)
client = ElevenLabsClient(credentials=elevenlabs_api_key)
elif service == "upliftai":
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 | Confidence: High

The check_credentials function is already flagged by Ruff as overly complex (C901). Adding another conditional branch for UpliftAI exacerbates this maintainability issue. The function now has 25+ branches, making it difficult to modify and test. This pattern also doesn't scale well for new engines. The related context shows this function is critical for test setup across all engines.

Code Suggestion:

_CREDENTIAL_CHECKERS = {
      "upliftai": lambda: UpliftAIClient(api_key=os.getenv("UPLIFTAI_KEY")),
      # ... other services ...
  }

  def check_credentials(service):
      if service not in _CREDENTIAL_CHECKERS:
          return False
      try:
          client = _CREDENTIAL_CHECKERS[service]()
          return client.check_credentials()
      except Exception:
          return False

with Path(output_file).open("wb") as f:
f.write(audio_bytes)

def _get_voices(self) -> list[dict[str, Any]]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 | Confidence: Medium

Hardcoding voice metadata creates a maintenance burden and synchronization risk with the external API. If UpliftAI adds/removes voices, this will require code changes. The related context doesn't show how other engines handle voice metadata, but this static approach is less flexible than dynamic discovery.

Code Suggestion:

def _get_voices(self) -> list[dict[str, Any]]:
        try:
            # Attempt to fetch voices dynamically if API adds endpoint
            response = requests.get(f"{self.BASE_URL}/voices", headers=self.headers)
            return response.json()
        except Exception:
            logger.warning("Falling back to hardcoded voice list")
            return [ ... ]  # Hardcoded fallback

},
]

def check_credentials(self) -> bool: # pragma: no cover - network call
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 | Confidence: Medium

The credential check performs a full synthesis request ("ping"), which is expensive and slow compared to a dedicated auth endpoint or lighter-weight request. The related context doesn't show how other engines implement credential checks, but this could become problematic if called frequently (e.g., in test setups).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant