Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -96,3 +96,6 @@ iOSInjectionProject/

# Local CLI scratch transcripts / terminal captures
Tools/TranscriptedCLI/*.txt

# CodeGraph local index (MCP code-intelligence cache)
.codegraph/
9 changes: 6 additions & 3 deletions Sources/Meeting/CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,8 +24,9 @@
- `LocalMeetingSummarizer.swift` — opt-in local meeting-summary runners (Gemma MLX and Apple Foundation Models), transcript chunking, provider metadata, runtime env sanitizing, and stale-transcript write protection; blocking model runs execute on a dedicated queue so they never occupy Swift-concurrency cooperative threads
- `MeetingMicBoostPromptPolicy.swift` — dependency-free gate for the in-meeting Boost Mic consent prompt and stale prompt actions
- `MeetingModelDownloader.swift` — loads the selected STT and diarization models together
- `MeetingPromptDetector.swift` — polls upcoming Calendar events, watches supported meeting apps, and asks the overlay to offer recording prompts with provider-aware remind/dismiss backoff
- `MeetingPromptHeuristics.swift` — shared scoring, prompt reasons, and provider-aware remind/dismiss backoff rules for calendar- and runtime-based prompt candidates
- `MeetingPromptDetector.swift` — polls upcoming Calendar events, watches supported meeting apps, ingests mic-activity from `MicActivityMonitor`, and asks the overlay to offer recording prompts with provider-aware remind/dismiss backoff
- `MeetingPromptHeuristics.swift` — shared scoring, prompt reasons, browser-family + mic-input provider mapping, and provider-aware remind/dismiss backoff rules for calendar-, runtime-, and mic-activity-based prompt candidates
- `MicActivityMonitor.swift` — Core Audio process-object watcher for ad-hoc call detection. Emits the set of non-self bundle IDs currently holding the mic input so the detector can prompt when a call *starts* (including a spontaneous Google Meet with no calendar invite). Metadata-only, no TCC permission; CoreAudio confined to one serial queue. See `docs/auto-call-detection-spec.md`
- `MeetingRecordingCleanup.swift` — removes scratch audio when a live meeting recording is explicitly discarded instead of saved
- `MeetingRecordingStartGate.swift` — permission preflight for meeting recording, including missing-permission reasons and user-facing error messages
- `MeetingSTTAdapter.swift` — adapts the app's shared `STTRouter` to `TranscriptedCore.SpeechToTextEngine`
Expand Down Expand Up @@ -62,7 +63,8 @@
- Imported meeting audio should be copied into app-controlled scratch space before transcription so later cleanup and metadata writes stay consistent with live captures.
- Retained-audio maintenance must only manage Transcripted meeting transcripts and app-owned retained audio filenames. A transcript is only storage-owned when its frontmatter has `capture_type: meeting` and a valid `capture_id` or `transcript_id`. Be very conservative with deletion: Markdown transcripts stay, unrelated files in capture folders stay, symlinked audio folders are ignored, and converted or pre-existing M4A files should be owner-only.
- Meeting recording cancellation must be explicit, visible, and confirmed because discard deletes the captured audio. Do not wire Escape to meeting cancellation.
- `MeetingPromptDetector` can prompt from upcoming calendar events or from recently active supported runtime apps. Zoom and Teams should rely on stronger calendar evidence because app-open/frontmost state is not enough to prove a call is active.
- `MeetingPromptDetector` can prompt from upcoming calendar events, from recently active supported runtime apps, or from a process actively holding the mic input (ad-hoc call detection). Zoom and Teams should rely on stronger calendar evidence because app-open/frontmost state is not enough to prove a call is active.
- Ad-hoc call detection (`MicActivityMonitor` → `MeetingPromptDetector.updateMicInputUsers`) must never prompt while Transcripted itself holds the mic: it is gated by `isOwnCaptureActive` (meeting recording or dictation) and the monitor drops our own bundle ID by prefix. Browser calls map to `.googleMeet` via family-prefix matching because the mic is held by helper/service processes (`com.google.Chrome.helper`, `com.apple.WebKit.GPU`), and reuse the `.runtimeApp` source so existing snooze/dismiss/backoff is unchanged. The feature is behind `AutoCallDetectionPreferences` (default on).
- Prompt dismissals are provider- and source-aware: runtime-only prompts can remind sooner, calendar-linked prompts can stay suppressed until the next relevant window, and Teams gets a longer minimum dismiss interval.
- Local mic diarization is opt-in and controlled by `Sources/Support/LocalSpeakerPreferences.swift`, so default meeting behavior still keeps the mic side as a single "You" speaker unless the user enables review for people in the room.
- Live meeting sidecar mode is opt-in and sidecar-only. It can write provisional live files under app support during recording, but the durable meeting Markdown still comes from the existing `TranscriptionTaskManager` save pipeline. Keep live ASR isolated from final transcription work; if another transcript is already processing, prefer deferring live ASR over contending with the final pipeline.
Expand Down Expand Up @@ -121,6 +123,7 @@ Relevant direct coverage:
- `Tests/MeetingFailureExplanationTests.swift`
- `Tests/MeetingFailureKindTests.swift`
- `Tests/MeetingPromptHeuristicsTests.swift`
- `Tests/MicActivityMonitorTests.swift`
- `Tests/MeetingRecordingStartGateTests.swift`
- `Tests/MeetingMicBoostPromptPolicyTests.swift`
- `Tests/MeetingRecordingCleanupTests.swift`
Expand Down
90 changes: 88 additions & 2 deletions Sources/Meeting/MeetingPromptDetector.swift
Original file line number Diff line number Diff line change
Expand Up @@ -26,8 +26,18 @@ final class MeetingPromptDetector {
var onPromptRequest: ((Candidate) -> Bool)?
var shouldSkipPromptEvaluation: (() -> Bool)?

/// Returns true while Transcripted itself holds the mic (meeting recording or
/// dictation). Gates the mic-activity path so we never prompt to record our
/// own capture — belt-and-suspenders with `MicActivityMonitor`'s own-bundle
/// filter. Wired in `TranscriptedApp`.
var isOwnCaptureActive: (() -> Bool)?
/// Returns false when the Settings toggle is off. This keeps late monitor
/// callbacks quiet after the user disables auto call detection.
var isMicInputPromptEnabled: (() -> Bool)?

private let calendarReader = MeetingPromptCalendarReader()
private let calendarAccessGranted: () -> Bool
private let refreshesCalendarEventSnapshots: Bool
// Cache of upcoming meeting-link events refreshed off-main by each poll cycle.
// Dismiss/markAccepted/title paths must stay synchronous (overlay callbacks and
// the recording-start title closure), so they read this cache instead of querying
Expand All @@ -39,6 +49,8 @@ final class MeetingPromptDetector {
private var pendingUntil: [String: Date] = [:]
private var recentNativeActivity: [MeetingPromptProvider: Date] = [:]
private var runtimeSuppressedUntil: [MeetingPromptProvider: Date] = [:]
// Bundle IDs currently holding the mic input, pushed by MicActivityMonitor.
private var micActiveBundleIDs: Set<String> = []

private let defaultSnoozeInterval: TimeInterval = 30 * 60
private let pendingCooldown: TimeInterval = 90
Expand All @@ -49,10 +61,12 @@ final class MeetingPromptDetector {

init(
calendarAccessGranted: @escaping () -> Bool = { TranscriptedPermissionAccess.calendarAccessGranted() },
calendarEventSnapshots: [MeetingPromptCalendarEventSnapshot] = []
calendarEventSnapshots: [MeetingPromptCalendarEventSnapshot] = [],
refreshesCalendarEventSnapshots: Bool = true
) {
self.calendarAccessGranted = calendarAccessGranted
self.calendarEventSnapshots = calendarEventSnapshots
self.refreshesCalendarEventSnapshots = refreshesCalendarEventSnapshots
}

func start() {
Expand Down Expand Up @@ -211,8 +225,10 @@ final class MeetingPromptDetector {
runningBundleIDs: runningBundleIDs,
frontmostBundleID: frontmostBundleID
))
candidates.append(contentsOf: micInputCandidates(now: now))

guard let match = candidates.sorted(by: sortCandidates).first else { return }
let sortedCandidates = candidates.sorted(by: sortCandidates)
guard let match = preferredCandidate(from: sortedCandidates) else { return }

guard snoozedUntil[match.candidate.id] == nil, pendingUntil[match.candidate.id] == nil else { return }

Expand All @@ -222,6 +238,7 @@ final class MeetingPromptDetector {
}

private func refreshCalendarEventSnapshots() async {
guard refreshesCalendarEventSnapshots else { return }
guard calendarAccessGranted() else {
calendarEventSnapshots = []
return
Expand Down Expand Up @@ -315,6 +332,75 @@ final class MeetingPromptDetector {
}
}

// MARK: - Mic-activity candidates (ad-hoc call detection)

/// Pushed by `MicActivityMonitor` with the set of non-self bundle IDs holding
/// the mic input. Stores it and re-evaluates; existing pending/dismiss
/// cooldowns survive inactive edges so mute/unmute cannot re-prompt early.
func updateMicInputUsers(_ bundleIDs: Set<String>) {
guard bundleIDs != micActiveBundleIDs else { return }
micActiveBundleIDs = bundleIDs
Task { @MainActor [weak self] in
await self?.evaluate()
}
}

private func micInputCandidates(now: Date) -> [ScoredCandidate] {
guard isMicInputPromptEnabled?() != false else { return [] }
guard !micActiveBundleIDs.isEmpty else { return [] }
// Never prompt to record a call while we already hold the mic ourselves.
guard isOwnCaptureActive?() != true else { return [] }

var seenProviders: Set<MeetingPromptProvider> = []
var candidates: [ScoredCandidate] = []
for bundleID in micActiveBundleIDs.sorted() {
guard let provider = MeetingPromptProvider.micInputProvider(forBundleID: bundleID) else { continue }
guard seenProviders.insert(provider).inserted else { continue }
if let suppressedUntil = runtimeSuppressedUntil[provider], suppressedUntil > now { continue }

// Browser calls map to .googleMeet generically (could be Meet/Zoom-web/
// Teams-web), so keep their title neutral instead of mislabeling them.
let isBrowserCall = provider == .googleMeet
let title = isBrowserCall
? "Call detected in your browser"
: "\(provider.displayName) call detected"
let presentation = MeetingPromptHeuristics.micInputPresentation(title: title)

candidates.append(
ScoredCandidate(
candidate: Candidate(
id: micCandidateID(for: provider),
title: presentation.title,
detail: presentation.detail,
provider: provider,
reason: .micInput,
source: .runtimeApp,
startDate: now,
endDate: now.addingTimeInterval(MeetingPromptHeuristics.runtimeReminderSnoozeInterval),
meetingURL: nil,
suggestedTranscriptTitle: nil
),
score: presentation.score
)
)
}
return candidates
}

private func micCandidateID(for provider: MeetingPromptProvider) -> String {
"mic:\(provider.rawValue)"
}

private func preferredCandidate(from sortedCandidates: [ScoredCandidate]) -> ScoredCandidate? {
guard let first = sortedCandidates.first else { return nil }
guard first.candidate.reason == .micInput else { return first }
guard first.candidate.provider != .googleMeet else { return first }
return sortedCandidates.first {
$0.candidate.source == .calendarEvent &&
$0.candidate.provider == first.candidate.provider
} ?? first
}

private func sortCandidates(_ lhs: ScoredCandidate, _ rhs: ScoredCandidate) -> Bool {
if lhs.score != rhs.score {
return lhs.score > rhs.score
Expand Down
82 changes: 82 additions & 0 deletions Sources/Meeting/MeetingPromptHeuristics.swift
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,68 @@ enum MeetingPromptProvider: String, CaseIterable, Hashable {
private static func hostMatches(_ host: String, domain: String) -> Bool {
host == domain || host.hasSuffix(".\(domain)")
}

// MARK: - Mic-activity attribution (ad-hoc call detection)

/// Bundle-ID prefixes for browser families that can host a web call.
///
/// Matching is by *family prefix*, not exact bundle ID, because browsers
/// spread audio across helper/service processes. This was confirmed against a
/// live Google Meet call: the only process holding the mic input was
/// `com.google.Chrome.helper` (Chrome's Audio Service), not the main
/// `com.google.Chrome`. Safari/WKWebView audio likewise runs in
/// `com.apple.WebKit.GPU`, which is not prefixed by `com.apple.Safari`.
static let browserBundleIDPrefixes: [String] = [
"com.google.Chrome", // Chrome, Chrome.canary, and *.helper children
"com.microsoft.edgemac", // Edge + helpers
"com.brave.Browser", // Brave + helpers
"company.thebrowser.Browser", // Arc + helpers
"org.mozilla.firefox", // Firefox + plugin-container children
"com.apple.Safari", // Safari main process
"com.apple.WebKit", // Safari / WKWebView service processes (GPU, Networking, WebContent)
]

/// Whether `bundleID` belongs to a known browser family (main app or any of
/// its helper/service processes). Prefix-aware: a prefix matches the exact id
/// or any dotted child (`com.google.Chrome` matches `com.google.Chrome.helper`).
static func isBrowserBundleID(_ bundleID: String) -> Bool {
browserBundleIDPrefixes.contains { bundleID.matchesBundleFamily($0) }
}

/// Maps a process bundle ID that is *currently holding the mic input* to a
/// meeting provider, or `nil` if it is not a recognized call source.
///
/// - Native conferencing apps (Zoom, Teams, Webex, FaceTime) map to their own
/// provider, matched by family prefix so a helper process still attributes
/// to the parent app.
/// - Any browser-family process maps to `.googleMeet` — the representative
/// "browser call" (Meet / Zoom-web / Teams-web). This is the branch that
/// closes the spontaneous-Google-Meet gap.
/// - Everything else (QuickTime, Voice Memos, Photo Booth, …) returns `nil`,
/// so it never produces a prompt.
///
/// Because `.googleMeet` carries no `activeBundleIdentifiers`, it is never a
/// native match — so a `.googleMeet` result unambiguously means "browser call".
static func micInputProvider(forBundleID bundleID: String) -> MeetingPromptProvider? {
if let native = allCases.first(where: { provider in
provider.activeBundleIdentifiers.contains { bundleID.matchesBundleFamily($0) }
}) {
return native
}
if isBrowserBundleID(bundleID) {
return .googleMeet
}
return nil
}
}

extension String {
/// True when the receiver is `family` exactly or a dotted child of it
/// (`com.google.Chrome.helper`.matchesBundleFamily(`com.google.Chrome`)).
/// Guards against false matches like `com.google.ChromeEvil`.
func matchesBundleFamily(_ family: String) -> Bool {
self == family || hasPrefix(family + ".")
}
}

enum MeetingPromptSource: Equatable {
Expand All @@ -97,6 +159,10 @@ enum MeetingPromptReason: String, Equatable {
case calendarNearby = "calendar_nearby"
case calendarPlusRuntimeMatch = "calendar_plus_runtime_match"
case runtimeOnly = "runtime_only"
// A process is actively holding the mic input (ad-hoc call detection).
// Distinct from `runtimeOnly` so analytics can tell the stronger signal
// apart, even though both reuse the `.runtimeApp` source for backoff.
case micInput = "mic_input"
}

enum MeetingPromptBackoffKind: String, Equatable {
Expand Down Expand Up @@ -217,6 +283,22 @@ enum MeetingPromptHeuristics {
score: 3
)
}

/// Score for a mic-in-use candidate. Deliberately above the frontmost-browser
/// runtime score (4): a process actively holding the mic is stronger evidence
/// of a live call than "a browser is frontmost".
static let micInputPromptScore = 5

/// Presentation for an ad-hoc call detected from mic activity. The caller
/// builds the user-facing `title` (provider-specific for native apps, generic
/// for browser calls so a Zoom-web/Teams-web call is not mislabeled "Meet").
static func micInputPresentation(title: String) -> RuntimeMeetingPromptPresentation {
RuntimeMeetingPromptPresentation(
title: title,
detail: "Start recording now or press Option-M anytime.",
score: micInputPromptScore
)
}
}

@available(macOS 14.0, *)
Expand Down
Loading
Loading