feat: add chunk splitting, S3 storage, and session transcript copy #288
Open
animalnots wants to merge 2 commits into OpenWhispr:main from
## Features

- Add size-based recording chunk splitting
  - Previously, recordings were kept entirely in memory (data loss risk for long recordings)
  - New: automatically split recordings at configurable threshold (default 24.5MB)
  - Save each chunk to disk immediately via new recordingStorage module
  - Transcribe chunks as they complete (incremental results during long recordings)
  - Added timeslice to MediaRecorder for real-time size tracking
- Refactor R2-specific storage to generic S3-compatible implementation
  - Support any S3-compatible provider (AWS S3, MinIO, Backblaze B2, etc.)
  - Add custom endpoint URL and region configuration
  - Enhanced connection test: write → read via presigned URL → delete
  - Presigned URL passthrough for large files (>25MB) to avoid 413 errors
- Add "Copy Full Session" for multi-part recordings
  - Track and combine all partial transcripts from a session
  - Show dismissible banner with timing info (duration, start/end times)
  - Paste combined text at session end (not just last part)
  - Copy button for full session transcript

## Breaking Changes

- none

## Testing

- Tested on Windows only
- Cross-platform testing recommended before merge
- Added a new setting `passAgentNameToWhisper` (default false) to control whether the agent name is added to the custom dictionary.
- Updated `syncAgentNameToDictionary` to respect the new setting and dynamically add or remove the agent name from the dictionary when the setting is toggled.
- Added a UI toggle in the Settings page under Voice Agent configuration.
- Added localization strings for the new setting across all supported languages (en, es, fr, de, it, ja, pt, ru, zh-CN).
- Added debug logging in `audioManager.js` to track when the custom dictionary is appended to the transcription prompt for local Whisper, OpenWhispr Cloud, and Cloud API providers (like Groq/OpenAI).
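The add-or-remove behavior described for `syncAgentNameToDictionary` could look roughly like the sketch below. This is an illustration only: the real function's signature and storage are not shown in this PR, so the parameter list, the array-based dictionary, the case-insensitive matching, and the sample agent name `"Jarvis"` are all assumptions.

```javascript
// Hypothetical sketch: the actual syncAgentNameToDictionary in the PR may
// take different arguments and persist the dictionary elsewhere.
function syncAgentNameToDictionary(dictionary, agentName, passAgentNameToWhisper = false) {
  const name = (agentName || "").trim();
  if (!name) return [...dictionary]; // nothing to sync without an agent name

  // Drop any existing copy of the name (case-insensitive, an assumption)
  // so repeated calls never create duplicates.
  const withoutName = dictionary.filter(
    (word) => word.toLowerCase() !== name.toLowerCase()
  );

  // Setting on: append the name. Setting off: leave it removed.
  return passAgentNameToWhisper ? [...withoutName, name] : withoutName;
}

// "Jarvis" is a made-up agent name used only for this example.
const words = ["OpenWhispr", "Groq"];
console.log(syncAgentNameToDictionary(words, "Jarvis", true));
console.log(syncAgentNameToDictionary(words, "Jarvis", false));
```

Returning a new array instead of mutating the input keeps the toggle idempotent, which matters if the setting is flipped several times in one session.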
PR: Add Chunk Splitting, S3 Storage & Session Transcript Copy
Summary
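This PR adds three features: size-based recording chunk splitting, S3-compatible cloud storage for large files, and a copy option for the full session transcript.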
Tested on Windows only.
What Changed
1. Feature: Size-Based Recording Chunk Splitting
Problem: Previously, recordings were kept entirely in memory until the user stopped recording. For long recordings (e.g., 2+ hours), this created a risk of data loss.
Solution: New chunk splitting feature that automatically splits recordings at a configurable size threshold (default: 24.5 MB):
- Saves each chunk to disk immediately via the new `recordingStorage.js` module
- Adds `timeslice: 1000` to `MediaRecorder.start()` calls so `ondataavailable` fires every ~1s, enabling real-time cumulative size tracking

New files:
- `src/helpers/recordingStorage.js`

Modified:
- `src/helpers/audioManager.js` (new `_startChunkSplitTimer`, `_splitRecordingChunk`, `_saveRecordingBackup` methods)

2. Feature: S3-Compatible Cloud Storage for Large Files
Why: For large audio files (>25 MB), direct upload to transcription providers can fail with HTTP 413 errors. S3-compatible storage provides temporary hosting with presigned URLs that services like Groq can fetch directly.
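The size-based routing this implies can be sketched as below. The helper name, the return shape, and the exact byte value used for "25 MB" are assumptions made for illustration; the PR does not show how `audioManager.js` makes this decision.

```javascript
// Assumption: "25 MB" is treated as 25 * 1024 * 1024 bytes. The PR only
// states the limit as 25 MB, so the exact byte count is a guess.
const DIRECT_UPLOAD_LIMIT_BYTES = 25 * 1024 * 1024;

// Hypothetical helper, not part of the PR: decide how an audio file
// should reach the transcription provider.
function planTranscriptionRequest(sizeBytes, presignedUrl) {
  if (sizeBytes <= DIRECT_UPLOAD_LIMIT_BYTES) {
    // Small enough to send the audio blob in the request body.
    return { mode: "direct-upload" };
  }
  if (!presignedUrl) {
    // Without S3 storage there is no URL for the provider to fetch.
    throw new Error("Large file needs a presigned URL; is S3 storage configured?");
  }
  // Too large for a direct upload: hand the provider a URL to fetch instead.
  return { mode: "presigned-url", url: presignedUrl };
}

console.log(planTranscriptionRequest(5 * 1024 * 1024, null));
console.log(planTranscriptionRequest(40 * 1024 * 1024, "https://example.com/audio.webm"));
```

Failing loudly when no presigned URL is available is one possible design choice; falling back to a direct upload and surfacing the 413 error would be another.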
Changes:
- `src/helpers/s3Storage.js`: New S3StorageManager with support for any S3-compatible provider (AWS S3, Cloudflare R2, MinIO, Backblaze B2, etc.). Supports custom endpoint URL, region, `forcePathStyle`, presigned URLs, and a full connection test (write → read via presigned URL → delete).
- `main.js`: Instantiates S3StorageManager and passes it to IPC handlers.
- `src/helpers/ipcHandlers.js`: S3 IPC handlers for config, test connection, upload, presigned URLs, and cleanup.
- `preload.js`: S3 IPC channel exposures.
- `src/types/electron.ts`: Type definitions for S3 config including `endpointUrl`, `region`, `forcePathStyle`, presigned URL on upload.
- `src/helpers/audioManager.js`: S3 upload/cleanup methods, presigned URL passthrough for Groq large-file transcription.
- `src/components/SettingsPage.tsx`: S3 settings UI with endpoint URL and region fields, plus a quick-start guide.
- `src/locales/en/translation.json`: S3 storage i18n strings.

Reliability improvement: The connection test now verifies write + public read (via presigned URL) + delete, not just bucket access. This catches misconfigured CORS/permissions before the user starts recording.

Presigned URL passthrough: For files exceeding 25 MB, the app now passes a presigned S3 URL to Groq's API (`url` parameter) instead of uploading the blob directly, avoiding HTTP 413 errors.

3. Feature: Copy Full Session Transcript
Why: When chunk splitting is enabled, a long recording (e.g. 2 hours) produces many partial transcriptions. Users had no way to get the combined text for the entire session.
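A minimal sketch of combining the partial transcripts, assuming entries shaped like the `{partIndex, text}` objects this PR describes. The helper name, the whitespace trimming, and joining with a single space are assumptions; the PR does not specify how the combined text is assembled.

```javascript
// Hypothetical helper, not taken from the PR: merge per-chunk transcripts
// into one session transcript, ordered by part index.
function combineSessionTranscripts(sessionTranscripts) {
  return [...sessionTranscripts]                 // copy so the caller's array is untouched
    .sort((a, b) => a.partIndex - b.partIndex)   // chunks may finish out of order
    .map((entry) => entry.text.trim())
    .filter((text) => text.length > 0)           // skip silent or empty chunks
    .join(" ");                                  // separator is an assumption
}

const parts = [
  { partIndex: 2, text: "and this is part two." },
  { partIndex: 1, text: "This is part one " },
];
console.log(combineSessionTranscripts(parts));
```

Sorting by `partIndex` rather than relying on arrival order guards against a later chunk finishing transcription before an earlier one.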
How it works:
- `AudioManager` tracks `_sessionTranscripts[]`, an ordered array of `{partIndex, text}` entries.
- The combined session transcript is broadcast over IPC (`session-transcript-ready`) to the ControlPanel window.

Files changed:
- `src/helpers/audioManager.js`: Session tracking, combined text assembly, IPC broadcast with timing data.
- `src/hooks/useAudioRecording.js`: Uses `result.sessionText` for paste at session end instead of just the final part's text.
- `src/helpers/ipcHandlers.js`: New `broadcast-session-transcript` IPC handler.
- `preload.js`: `broadcastSessionTranscript` and `onSessionTranscriptReady` channels.
- `src/types/electron.ts`: Type definitions for session transcript IPC (includes timing fields).
- `src/components/ControlPanel.tsx`: Session transcript banner with Layers icon, timing info, Copy button, and dismiss (X) button.
- `src/locales/en/translation.json`: i18n strings for session banner (`controlPanel.session.*`).

Files Modified
- `src/helpers/audioManager.js`
- `src/helpers/s3Storage.js`
- `src/helpers/recordingStorage.js`
- `src/helpers/ipcHandlers.js`
- `preload.js`
- `src/types/electron.ts`
- `main.js`
- `src/components/SettingsPage.tsx`
- `src/components/ControlPanel.tsx`
- `src/hooks/useAudioRecording.js`
- `src/locales/en/translation.json`

Testing Notes
Breaking Changes
None.