Description
What does this message mean?
"[WARNING] chatterbox.models.s3gen.utils.mel: Audio values outside normalized range: min=-1.0005, max=1.0154"
Console output:
2026-03-21 22:48:26 [INFO] server: Received /tts request: mode='predefined', format='mp3'
2026-03-21 22:48:26 [INFO] server: Using predefined voice: de_Bruce_Willis_Manfred_Lehmann_04.wav
2026-03-21 22:48:26 [INFO] server: Splitting text into chunks of size ~240.
2026-03-21 22:48:26 [INFO] utils: A single segment (length 658) exceeds chunk_size 240. It will form its own chunk.
2026-03-21 22:48:26 [INFO] utils: A single segment (length 289) exceeds chunk_size 240. It will form its own chunk.
2026-03-21 22:48:26 [INFO] utils: Text chunking complete. Generated 6 chunk(s).
2026-03-21 22:48:26 [INFO] server: Synthesizing chunk 1/6...
2026-03-21 22:48:26 [INFO] engine: Applying user-provided seed for generation: 1775
2026-03-21 22:48:26 [INFO] engine: Global seed set to: 1775
2026-03-21 22:48:26 [WARNING] chatterbox.models.s3gen.utils.mel: Audio values outside normalized range: min=-1.0005, max=1.0154
Sampling: 93%|███████████████████████████████████████████████▍ | 929/1000 [00:36<00:02, 25.72it/s]2026-03-21 22:49:03 [INFO] chatterbox.models.t3.t3: ✅ EOS token detected! Stopping generation at step 931
Sampling: 93%|███████████████████████████████████████████████▍ | 930/1000 [00:36<00:02, 25.38it/s]
2026-03-21 22:49:08 [INFO] server: Synthesizing chunk 2/6...
2026-03-21 22:49:08 [INFO] engine: Applying user-provided seed for generation: 1775
2026-03-21 22:49:08 [INFO] engine: Global seed set to: 1775
2026-03-21 22:49:08 [WARNING] chatterbox.models.s3gen.utils.mel: Audio values outside normalized range: min=-1.0005, max=1.0154
Sampling: 34%|█████████████████▎ | 340/1000 [00:12<00:25, 26.03it/s]2026-03-21 22:49:21 [WARNING] chatterbox.models.t3.inference.alignment_stream_analyzer: 🚨 Detected 2x repetition of token 4218
2026-03-21 22:49:21 [WARNING] chatterbox.models.t3.inference.alignment_stream_analyzer: forcing EOS token, long_tail=tensor(False), alignment_repetition=tensor(False), token_repetition=True
2026-03-21 22:49:21 [INFO] chatterbox.models.t3.t3: ✅ EOS token detected! Stopping generation at step 343
Sampling: 34%|█████████████████▍ | 342/1000 [00:12<00:23, 27.81it/s]
2026-03-21 22:49:23 [INFO] server: Synthesizing chunk 3/6...
2026-03-21 22:49:23 [INFO] engine: Applying user-provided seed for generation: 1775
2026-03-21 22:49:23 [INFO] engine: Global seed set to: 1775
2026-03-21 22:49:23 [WARNING] chatterbox.models.s3gen.utils.mel: Audio values outside normalized range: min=-1.0005, max=1.0154
Sampling: 28%|██████████████▎ | 280/1000 [00:09<00:27, 25.84it/s]2026-03-21 22:49:33 [WARNING] chatterbox.models.t3.inference.alignment_stream_analyzer: forcing EOS token, long_tail=tensor(True), alignment_repetition=tensor(False), token_repetition=False
2026-03-21 22:49:33 [INFO] chatterbox.models.t3.t3: ✅ EOS token detected! Stopping generation at step 283
Sampling: 28%|██████████████▍ | 282/1000 [00:09<00:25, 28.46it/s]
2026-03-21 22:49:34 [INFO] server: Synthesizing chunk 4/6...
2026-03-21 22:49:34 [INFO] engine: Applying user-provided seed for generation: 1775
2026-03-21 22:49:34 [INFO] engine: Global seed set to: 1775
2026-03-21 22:49:34 [WARNING] chatterbox.models.s3gen.utils.mel: Audio values outside normalized range: min=-1.0005, max=1.0154
Sampling: 15%|███████▊ | 153/1000 [00:05<00:31, 26.54it/s]2026-03-21 22:49:40 [WARNING] chatterbox.models.t3.inference.alignment_stream_analyzer: forcing EOS token, long_tail=tensor(True), alignment_repetition=tensor(False), token_repetition=False
2026-03-21 22:49:40 [INFO] chatterbox.models.t3.t3: ✅ EOS token detected! Stopping generation at step 156
Sampling: 16%|███████▉ | 155/1000 [00:05<00:29, 28.47it/s]
2026-03-21 22:49:41 [INFO] server: Synthesizing chunk 5/6...
2026-03-21 22:49:41 [INFO] engine: Applying user-provided seed for generation: 1775
2026-03-21 22:49:41 [INFO] engine: Global seed set to: 1775
2026-03-21 22:49:41 [WARNING] chatterbox.models.s3gen.utils.mel: Audio values outside normalized range: min=-1.0005, max=1.0154
Sampling: 40%|████████████████████▏ | 395/1000 [00:15<00:27, 22.20it/s]2026-03-21 22:49:57 [WARNING] chatterbox.models.t3.inference.alignment_stream_analyzer: forcing EOS token, long_tail=tensor(True), alignment_repetition=tensor(False), token_repetition=False
2026-03-21 22:49:57 [INFO] chatterbox.models.t3.t3: ✅ EOS token detected! Stopping generation at step 397
Sampling: 40%|████████████████████▏ | 396/1000 [00:15<00:23, 25.76it/s]
2026-03-21 22:49:58 [INFO] server: Synthesizing chunk 6/6...
2026-03-21 22:49:58 [INFO] engine: Applying user-provided seed for generation: 1775
2026-03-21 22:49:58 [INFO] engine: Global seed set to: 1775
2026-03-21 22:49:58 [WARNING] chatterbox.models.s3gen.utils.mel: Audio values outside normalized range: min=-1.0005, max=1.0154
Sampling: 34%|█████████████████▌ | 344/1000 [00:17<00:28, 23.25it/s]2026-03-21 22:50:16 [WARNING] chatterbox.models.t3.inference.alignment_stream_analyzer: forcing EOS token, long_tail=tensor(True), alignment_repetition=tensor(False), token_repetition=False
2026-03-21 22:50:16 [INFO] chatterbox.models.t3.t3: ✅ EOS token detected! Stopping generation at step 345
Sampling: 34%|█████████████████▌ | 344/1000 [00:17<00:32, 20.02it/s]
2026-03-21 22:50:18 [INFO] server: Smart stitching applied: 6 chunks, 20ms crossfades, 200ms pauses
2026-03-21 22:50:18 [WARNING] server: Audio normalized to prevent clipping (peak was 0.990)
2026-03-21 22:50:18 [INFO] utils: Encoded 396333 bytes to 'mp3' at 24000Hz in 0.338 seconds.
2026-03-21 22:50:18 [INFO] server: Successfully generated audio: tts_output_20260321_225018.mp3, 396333 bytes, type audio/mp3.
2026-03-21 23:46:46 [INFO] server: Request received for /save_settings.
2026-03-21 23:46:46 [INFO] config: TTS processing device resolved to: cuda
2026-03-21 23:46:46 [INFO] config: Configuration successfully saved to config.yaml
2026-03-21 23:46:46 [INFO] config: Configuration updated, saved, and re-resolved successfully.
2026-03-21 23:46:50 [INFO] server: Request received for /save_settings.
2026-03-21 23:46:50 [INFO] config: TTS processing device resolved to: cuda
2026-03-21 23:46:50 [INFO] config: Configuration successfully saved to config.yaml
2026-03-21 23:46:50 [INFO] config: Configuration updated, saved, and re-resolved successfully.
2026-03-21 23:46:55 [INFO] server: Request received for /save_settings.
2026-03-21 23:46:55 [INFO] config: TTS processing device resolved to: cuda
2026-03-21 23:46:55 [INFO] config: Configuration successfully saved to config.yaml
2026-03-21 23:46:55 [INFO] config: Configuration updated, saved, and re-resolved successfully.
2026-03-21 23:47:01 [INFO] server: Received /tts request: mode='predefined', format='mp3'
2026-03-21 23:47:01 [INFO] server: Using predefined voice: de_Bruce_Willis_Manfred_Lehmann_04.wav
2026-03-21 23:47:01 [INFO] server: Splitting text into chunks of size ~240.
2026-03-21 23:47:01 [INFO] utils: Text chunking complete. Generated 2 chunk(s).
2026-03-21 23:47:01 [INFO] server: Synthesizing chunk 1/2...
2026-03-21 23:47:01 [INFO] engine: Applying user-provided seed for generation: 2025
2026-03-21 23:47:01 [INFO] engine: Global seed set to: 2025
2026-03-21 23:47:01 [WARNING] chatterbox.models.s3gen.utils.mel: Audio values outside normalized range: min=-1.0005, max=1.0154
Sampling: 31%|███████████████▊ | 311/1000 [00:13<00:26, 25.67it/s]2026-03-21 23:47:15 [WARNING] chatterbox.models.t3.inference.alignment_stream_analyzer: forcing EOS token, long_tail=tensor(True), alignment_repetition=tensor(False), token_repetition=False
2026-03-21 23:47:15 [INFO] chatterbox.models.t3.t3: ✅ EOS token detected! Stopping generation at step 314
Sampling: 31%|███████████████▉ | 313/1000 [00:13<00:29, 23.49it/s]
2026-03-21 23:47:16 [INFO] server: Synthesizing chunk 2/2...
2026-03-21 23:47:16 [INFO] engine: Applying user-provided seed for generation: 2025
2026-03-21 23:47:16 [INFO] engine: Global seed set to: 2025
2026-03-21 23:47:16 [WARNING] chatterbox.models.s3gen.utils.mel: Audio values outside normalized range: min=-1.0005, max=1.0154
Sampling: 36%|██████████████████▍ | 361/1000 [00:17<00:28, 22.36it/s]2026-03-21 23:47:34 [INFO] chatterbox.models.t3.t3: ✅ EOS token detected! Stopping generation at step 362
Sampling: 36%|██████████████████▍ | 361/1000 [00:17<00:30, 21.14it/s]
2026-03-21 23:47:36 [INFO] server: Smart stitching applied: 2 chunks, 20ms crossfades, 200ms pauses
2026-03-21 23:47:36 [WARNING] server: Audio normalized to prevent clipping (peak was 0.990)
2026-03-21 23:47:36 [INFO] utils: Encoded 109101 bytes to 'mp3' at 24000Hz in 0.764 seconds.
2026-03-21 23:47:36 [INFO] server: Successfully generated audio: tts_output_20260321_234736.mp3, 109101 bytes, type audio/mp3.
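
For context on the warning asked about above: in the log it reports the same `min=-1.0005, max=1.0154` for every chunk and for both seeds, which suggests (an assumption, not confirmed from the Chatterbox source) that the values come from the fixed reference voice file rather than from the generated audio. The sketch below is not the library's code. It only illustrates what a range check of this kind could look like and how peak-normalizing the reference samples would bring them back inside [-1, 1]. The function names and the `0.99` target are hypothetical.

```python
from typing import List, Optional, Sequence


def out_of_range_message(samples: Sequence[float], limit: float = 1.0) -> Optional[str]:
    """Return a warning string if any sample lies outside [-limit, limit], else None.

    Hypothetical helper: it mimics the wording of the logged warning only.
    """
    lo, hi = min(samples), max(samples)
    if lo < -limit or hi > limit:
        return f"Audio values outside normalized range: min={lo:.4f}, max={hi:.4f}"
    return None


def peak_normalize(samples: Sequence[float], target_peak: float = 0.99) -> List[float]:
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        # Pure silence: nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


# Toy stand-in for a reference clip whose extremes match the logged values.
reference = [0.25, -1.0005, 1.0154, 0.5]
print(out_of_range_message(reference))

# After peak normalization every sample is inside [-1, 1] again.
fixed = peak_normalize(reference)
print(out_of_range_message(fixed))
```

If the reference file is indeed the source, applying a step like `peak_normalize` to it once (in an audio editor or a script) and saving the result would be one way to test whether the warning goes away.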