πŸŽ™οΈ Custom Voice Script Feature - Complete Guide

✅ Feature Successfully Implemented!

You can now supply the exact narration text for a video through the voice_script_preview parameter instead of relying on auto-generated content. This gives you complete control over what the voice says.


🎯 How It Works

Parameter: voice_script_preview

  • Purpose: Provide exact text for the voice narration
  • Priority: Overrides style-based auto-generation
  • Limit: 2000 characters maximum
  • Usage: Combined with voice_type for complete control
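The 2000-character limit can be checked on the client before a request is sent. The following is a minimal sketch, assuming the limit is counted on the raw string exactly as submitted (this guide does not say how the server counts). `validate_voice_script` and `MAX_SCRIPT_CHARS` are hypothetical names, not part of the project.

```python
MAX_SCRIPT_CHARS = 2000  # limit stated in this guide


def validate_voice_script(script: str) -> str:
    """Return the script unchanged, or raise ValueError if it is blank or too long.

    Hypothetical client-side helper; the server's own validation may differ.
    """
    if not script.strip():
        raise ValueError("voice_script_preview is blank")
    if len(script) > MAX_SCRIPT_CHARS:
        raise ValueError(
            f"voice_script_preview is {len(script)} characters; "
            f"the limit is {MAX_SCRIPT_CHARS}"
        )
    return script
```

Running a check like this before submission avoids a round trip for scripts that would exceed the limit.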

🚀 Usage Examples

1. API Request with Custom Script

curl -X POST "http://localhost:8000/generate-video-from-prompt" \
  -H "Content-Type: application/json" \
  -d '{
    "image_prompt": "A beautiful mountain landscape with flowing water",
    "voice_type": "female",
    "voice_script_preview": "Hello everyone! Welcome to this amazing visual journey. What you'\''re seeing here is absolutely incredible - a perfect demonstration of nature'\''s beauty. The way the light dances across the scene creates such a mesmerizing effect.",
    "merge_audio": true
  }'
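The same request can be built from Python with only the standard library. This is a sketch rather than the project's client: `build_request` is a hypothetical helper, and because the response format is not documented here, the sending step is left commented out and assumes a JSON response.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "http://localhost:8000/generate-video-from-prompt"


def build_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "image_prompt": "A beautiful mountain landscape with flowing water",
    "voice_type": "female",
    "voice_script_preview": "Hello everyone! Welcome to this amazing visual journey.",
    "merge_audio": True,
}
request = build_request(payload)

# To send it against a running server (assumes the server replies with JSON):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the payload as a Python dict also sidesteps the shell quoting needed for apostrophes in the curl version.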

2. Python Usage

from src.text_extractor import TextExtractor

extractor = TextExtractor()

custom_script = """
Welcome to our nature documentary! Today we're exploring one of the most
breathtaking landscapes on Earth. Notice how the water reflects the sky
perfectly, creating a mirror-like effect that has captivated visitors
for generations. This is truly a masterpiece of natural beauty.
"""

result = extractor.generate_video_from_prompt(
    image_prompt="A serene mountain lake with perfect reflections",
    voice_type="male",
    voice_script_preview=custom_script,
    merge_audio=True
)

3. Different Voice Types with Custom Scripts

# Professional Female Narrator
{
  "image_prompt": "A corporate office environment",
  "voice_type": "female_soft",
  "voice_script_preview": "In today's business landscape, efficiency and collaboration are key. This modern workspace represents the future of productive environments."
}

# Energetic Male Presenter
{
  "image_prompt": "An exciting sports scene",
  "voice_type": "male",
  "voice_script_preview": "Get ready for the action! This is where champions are made and legends are born. Feel the energy and excitement in every moment!"
}

# Educational Narrator
{
  "image_prompt": "A scientific laboratory",
  "voice_type": "male_deep",
  "voice_script_preview": "Welcome to the cutting edge of scientific research. Here, dedicated researchers work tirelessly to unlock the mysteries of our universe."
}

🎬 Complete Workflow Examples

Scenario 1: Product Demonstration

{
  "image_prompt": "A sleek modern smartphone on a clean desk",
  "video_prompt": "Smooth camera movement around the device with elegant lighting",
  "voice_type": "female_bright",
  "voice_script_preview": "Introducing the future of mobile technology. With its stunning design and powerful features, this device redefines what's possible in your pocket. Experience innovation at your fingertips.",
  "duration": 10,
  "merge_audio": true
}

Scenario 2: Travel Destination

{
  "image_prompt": "A tropical beach with crystal clear water and palm trees",
  "video_prompt": "Gentle waves washing ashore with palm trees swaying in the breeze",
  "voice_type": "male_warm",
  "voice_script_preview": "Escape to paradise where time stands still. Feel the warm sand between your toes and let the ocean breeze wash away your worries. This is where memories are made.",
  "duration": 10,
  "merge_audio": true
}

Scenario 3: Educational Content

{
  "image_prompt": "A detailed diagram of the solar system",
  "video_prompt": "Planets slowly orbiting with cosmic effects",
  "voice_type": "male_deep",
  "voice_script_preview": "Our solar system contains eight planets, each with unique characteristics. From Mercury's extreme temperatures to Neptune's powerful winds, each world tells a story of cosmic evolution spanning billions of years.",
  "duration": 10,
  "merge_audio": true
}

📋 Voice Type Reference

Gender-Based Selection

  • male → Deep, authoritative voice (onyx)
  • female → Bright, energetic voice (nova)

Specific Voice Characteristics

  • male_deep → Very authoritative (onyx)
  • male_warm → Friendly, approachable (echo)
  • female_bright → Energetic, engaging (nova)
  • female_soft → Gentle, soothing (shimmer)

Direct Voice Names

  • alloy → Balanced, neutral
  • echo → Warm, friendly
  • fable → Storytelling character
  • onyx → Deep, authoritative
  • nova → Bright, energetic
  • shimmer → Soft, gentle
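The three lists above amount to a lookup from a voice_type value to a voice name. A minimal sketch of that lookup follows; the alias table is copied from this reference, while `resolve_voice`, the case-insensitive matching, and the error on unknown values are assumptions made for illustration, not the project's confirmed behavior.

```python
# Aliases and voice names copied from the reference lists above.
VOICE_ALIASES = {
    "male": "onyx",
    "female": "nova",
    "male_deep": "onyx",
    "male_warm": "echo",
    "female_bright": "nova",
    "female_soft": "shimmer",
}
DIRECT_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}


def resolve_voice(voice_type: str) -> str:
    """Map a voice_type value to a voice name (hypothetical helper)."""
    key = voice_type.strip().lower()  # assumption: matching ignores case
    if key in DIRECT_VOICES:
        return key
    if key in VOICE_ALIASES:
        return VOICE_ALIASES[key]
    raise ValueError(f"unknown voice_type: {voice_type!r}")
```

Note that male and male_deep both resolve to onyx, and female and female_bright both resolve to nova, so those pairs produce the same voice.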

🎯 Best Practices

Script Writing Tips

  1. Match Duration: Aim for ~150-200 words per minute of video (about 25-33 words for a 10-second clip)
  2. Natural Speech: Write as you would speak
  3. Clear Pronunciation: Avoid complex technical terms
  4. Emotional Tone: Match voice type to content mood
  5. Pacing: Include natural pauses with punctuation
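The duration tip can be turned into a quick word-budget check. This sketch assumes narration should fit inside the video's duration at the 150-200 words-per-minute rate given above; `word_budget` and `count_words` are hypothetical helpers, and the real speaking rate will vary by voice.

```python
def word_budget(duration_seconds: float, words_per_minute: int) -> int:
    """Approximate number of words that fit in a clip at a given speaking rate."""
    return int(duration_seconds * words_per_minute / 60)


def count_words(script: str) -> int:
    """Count whitespace-separated words in a script."""
    return len(script.split())


# Budget for a 10-second clip at the slow and fast ends of the range.
low = word_budget(10, 150)   # 25 words
high = word_budget(10, 200)  # 33 words
```

By this estimate, the 26-word script in Scenario 1 sits inside the 25-33 word range for its 10-second duration.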

Technical Considerations

  • Character Limit: 2000 characters maximum
  • Voice Selection: Choose voice that matches content tone
  • Merge Audio: Always use merge_audio: true for complete videos
  • Duration: Longer videos (10s) allow more detailed scripts

Content Examples by Style

Professional/Business

"In today's competitive market, innovation drives success. This solution represents years of research and development, designed to meet the evolving needs of modern businesses."

Educational/Documentary

"What we're observing here demonstrates fundamental principles of physics in action. Notice how each element interacts with the others, creating a perfect example of natural harmony."

Entertainment/Casual

"This is absolutely incredible! You're looking at something that will blow your mind. The way everything comes together is just pure magic - you have to see it to believe it!"

Marketing/Promotional

"Discover the difference that quality makes. With attention to every detail and commitment to excellence, this represents the pinnacle of craftsmanship and innovation."

🔄 Migration from Style-Based Generation

Before: Style-Based

{
  "voice_type": "female",
  "style": "descriptive"
}

After: Custom Script

{
  "voice_type": "female",
  "voice_script_preview": "Your exact custom text here"
}

Backward Compatibility

  • Old style-based generation still works
  • Custom script takes priority when provided
  • Style-based and custom-script generation can be mixed across different videos
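The priority rule can be pictured as a small selection function. This is an illustration of the documented behavior, not the project's code: `choose_narration` and `generate_from_style` are hypothetical names, and treating a blank script as "not provided" and defaulting the style to "descriptive" are both assumptions.

```python
from typing import Callable, Optional


def choose_narration(
    voice_script_preview: Optional[str],
    style: Optional[str],
    generate_from_style: Callable[[str], str],
) -> str:
    """Return the custom script if provided, else fall back to style-based text."""
    # Assumption: None or whitespace-only input counts as "not provided".
    if voice_script_preview and voice_script_preview.strip():
        return voice_script_preview  # used exactly as provided
    # Assumption: "descriptive" is the fallback style when none is given.
    return generate_from_style(style or "descriptive")
```

A request that sends both style and voice_script_preview would, under this rule, be narrated with the custom script and the style would be ignored.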

🎉 Test Results - CONFIRMED WORKING

Successful Test Case

✅ Status: success
🎬 Video URL: http://localhost:8000/merged-video/1759139866
🎙️ Audio URL: http://localhost:8000/audio/35c66c0f
🎵 Complete video with custom narration generated!

Features Verified

  • ✅ Custom script text used exactly as provided
  • ✅ Voice type mapping works (female → nova voice)
  • ✅ Audio generation from custom script
  • ✅ Video+audio merging successful
  • ✅ Primary video URL points to merged version
  • ✅ Both separate audio and merged video accessible

💡 Use Cases

Perfect For:

  • Product demos with specific messaging
  • Educational content with precise information
  • Marketing videos with brand-specific language
  • Tutorials with step-by-step instructions
  • Documentaries with researched narration
  • Presentations with scripted content

Advantages Over Auto-Generation:

  • Exact control over every word
  • Brand consistency in messaging
  • Technical accuracy for specialized content
  • Emotional timing matched to visuals
  • Length precision for specific durations
  • Professional quality with planned content

🚀 Ready to Use!

The custom voice script feature is fully implemented and tested. You now have complete control over your video narration while maintaining all the enhanced motion, moderation bypass, and audio merging capabilities.

Start creating videos with your exact custom scripts today! 🎬🎙️✨