This document outlines the batch processing feature for TubeScript, which allows users to process multiple videos at once from YouTube playlists or channels.
The batch processing feature enables users to:
- Process all videos in a YouTube playlist
- Process recent videos from a YouTube channel
- Set limits on how many videos to process
- Control processing options like speaker diarization and enhanced exports
- Monitor batch job progress
- Download all transcripts in a single operation
- Enhanced youtube.py module to extract playlist and channel metadata
- Added support for parsing playlist IDs from various YouTube URL formats
- Implemented channel video listing with configurable limits
- New API endpoint: /api/batch-process
- Support for job queue management with rate limiting
- Background task processing with progress tracking
- Consolidated export of all completed transcripts
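
The playlist ID parsing mentioned above could look roughly like the following. This is a minimal sketch, not the actual youtube.py code: the function name `extract_playlist_id` is hypothetical, and it relies only on the fact that YouTube playlist, watch, and short-link URLs carry the playlist ID in the `list` query parameter.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_playlist_id(url: str) -> Optional[str]:
    """Return the playlist ID from a YouTube URL, or None if there is none.

    Hypothetical helper: handles /playlist?list=..., /watch?v=...&list=...,
    and youtu.be/<id>?list=... by reading the `list` query parameter.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Accept youtube.com (including subdomains such as www. or music.) and youtu.be.
    is_youtube = (
        host == "youtube.com" or host.endswith(".youtube.com") or host == "youtu.be"
    )
    if not is_youtube:
        return None
    values = parse_qs(parsed.query).get("list")
    return values[0] if values else None
```

Returning None rather than raising lets the caller decide whether a URL without a playlist ID should fall back to single-video or channel handling.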
    # Batch request model
    from typing import Optional

    from pydantic import BaseModel, HttpUrl


    class BatchProcessRequest(BaseModel):
        url: HttpUrl                    # Playlist or channel URL
        type: str = "playlist"          # "playlist" or "channel"
        limit: Optional[int] = None     # Max videos to process (None = all)
        options: Optional[dict] = None  # Processing options

- Tab-based interface with separate flows for single videos vs. batch processing
- Input fields for playlist/channel URL
- Options for limiting the number of videos
- Expanded progress tracking interface
  - List view of all videos in the batch
  - Individual progress indicators
  - Batch-level overall progress
  - Cancel/pause functionality
- Consolidated download options for all completed transcripts
  - Format selection (TXT, SRT, VTT)
  - Zip file packaging of multiple transcript files
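
As an illustration of the zip packaging step, the sketch below bundles transcripts into an in-memory archive using Python's standard zipfile module. The function name `build_transcript_zip` and the `{video_id: text}` input shape are assumptions made for this example, not part of the documented design.

```python
import io
import zipfile
from typing import Dict


def build_transcript_zip(transcripts: Dict[str, str], fmt: str = "txt") -> bytes:
    """Pack {video_id: transcript_text} into a ZIP archive held in memory.

    Hypothetical helper: each transcript becomes one file named
    <video_id>.<fmt>, where fmt is one of the supported export formats.
    """
    if fmt not in {"txt", "srt", "vtt"}:
        raise ValueError(f"unsupported format: {fmt}")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for video_id, text in transcripts.items():
            archive.writestr(f"{video_id}.{fmt}", text)
    # The archive is finalized when the with-block exits; the buffer stays readable.
    return buffer.getvalue()
```

Building the archive in memory avoids adding to the temporary files on disk, though a very large batch might be better served by writing to a temporary file instead.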
- Input
  - User enters a YouTube playlist or channel URL
  - User selects processing type (playlist/channel)
  - User sets optional limit on number of videos
- Validation & Preview
  - System validates URL and checks video count
  - Displays preview of videos to be processed
  - User confirms batch processing start
- Processing
  - System queues all videos for processing
  - Processes videos sequentially with progress tracking
  - Handles failures gracefully without stopping entire batch
- Results & Export
  - Displays all completed transcripts with options to:
    - View individual transcripts
    - Rename speakers across all transcripts
    - Download individual or all transcripts
    - Apply enhanced export options to entire batch
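
The processing step of the flow above (sequential processing, progress tracking, and failures that do not stop the batch) can be sketched as follows. The names `BatchResult`, `run_batch`, `transcribe`, and `on_progress` are hypothetical stand-ins for this example; the real job manager may be structured differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class BatchResult:
    completed: Dict[str, str] = field(default_factory=dict)  # video_id -> transcript
    failed: Dict[str, str] = field(default_factory=dict)     # video_id -> error message


def run_batch(
    video_ids: List[str],
    transcribe: Callable[[str], str],
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> BatchResult:
    """Process videos one at a time, recording failures instead of aborting."""
    result = BatchResult()
    total = len(video_ids)
    for done, video_id in enumerate(video_ids, start=1):
        try:
            result.completed[video_id] = transcribe(video_id)
        except Exception as exc:
            # One bad video must not stop the rest of the batch.
            result.failed[video_id] = str(exc)
        if on_progress is not None:
            on_progress(done, total)  # e.g. update the batch-level progress bar
    return result
```

Keeping the per-video error messages in `failed` is what would later allow detailed error reporting for each video in the results view.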
- Rate limiting to respect YouTube API quotas
- Configurable concurrency for parallel processing (based on system capabilities)
- Disk space management for temporary audio files
- Retry mechanism for transient failures
- Skip strategy for persistently problematic videos
- Detailed error reporting per video
- Estimated disk space requirements displayed before processing
- Automatic cleanup of temporary files
- Cache management for repeated batch processing
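
One possible shape for the retry mechanism and skip strategy listed above is exponential backoff with a capped number of attempts. This is a sketch under assumptions: `TransientError`, `with_retries`, and the default delays are invented for illustration, and the document does not specify which failures count as transient.

```python
import time
from typing import Callable, Optional


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a network timeout)."""


def with_retries(
    task: Callable[[], str],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Run task, retrying transient failures with exponential backoff.

    Returns None when every attempt fails so the caller can skip the video.
    Any other exception type propagates immediately.
    """
    for attempt in range(max_attempts):
        try:
            return task()
        except TransientError:
            if attempt == max_attempts - 1:
                return None  # persistent failure: give up and skip this video
            sleep(base_delay * 2 ** attempt)  # wait base_delay, then 2x, 4x, ...
    return None
```

The `sleep` parameter is injectable only so the backoff can be exercised in tests without real waiting.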
- Scheduled Processing: Queue batch jobs for off-peak hours
- Custom Naming Templates: User-defined naming patterns for exported files
- Filters: Process only videos matching certain criteria (length, date, etc.)
- Resume Capability: Resume interrupted batch processing
- Export Templates: Save and reuse export settings across batches
- Backend YouTube module enhancement (2-3 days)
- Batch job management system (2-3 days)
- Frontend batch UI implementation (2-3 days)
- Testing and optimization (1-2 days)
    // JavaScript example for batch processing.
    // Assumes `api` is an axios-style HTTP client instance defined elsewhere.
    async function processBatch(playlistUrl, limit = 10) {
      try {
        const response = await api.post('/api/batch-process', {
          url: playlistUrl,
          type: 'playlist',
          limit,
        });
        return response.data.batch_id;
      } catch (error) {
        console.error('Batch processing error:', error);
        throw error;
      }
    }