draft: add ARKit52 NPC voice lip sync lane by JOY · Pull Request #319 · DOS/Second-Spawn

JOY (JOY) · 2026-05-29T19:17:41Z

Summary

Adds ARKit52/provider lip-sync plumbing for scoped NPC voice sessions.
Adds local ARKit52 smoke/helper scripts, a sidecar launcher, and blendshape reporting utilities.
Adds a synthetic Unity editor ARKit52 driver smoke hook that feeds provider-style frames into PrototypeFacialAnimationDriver and checks actual SkinnedMeshRenderer blendshape weights.
Adds visual prefab catalog safety checks and documents the current missing paid-source prefab state.
Keeps the lane presentation-only: voice and lip-sync payloads do not mutate authoritative gameplay state.
Shows focused NPC text before voice buffering, and stops stale NPC voice presentation when the player sends the next line so chat stays responsive.
Extends the ARKit52 smoke helper to verify secondspawn_voice_audio_chunk_get, so chunked Gemini audio retrieval is tested instead of only voice-session metadata.
Adds non-mutating lip-sync provider readiness and Unity lip-sync contract checkers so agents can verify local provider/testability state without touching production services.
Adds Nakama runtime coverage for Gemini voice gender pool selection and stable actor voice assignment.
Merged current origin/dev into this draft branch after resolving the voice lane doc conflict.
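
Illustrative note on stable actor voice assignment: one way to get a stable, gender-hinted voice per actor is to hash the actor ID into a fixed pool. This is a minimal sketch, not the Nakama runtime code in this PR. The function name pick_voice, the pool contents, and the SHA-256 hashing scheme are all assumptions; only the voice names Algenib and Sulafat and the two actor IDs appear in this PR.

```python
import hashlib

# Hypothetical pools. Only Algenib and Sulafat are named in this PR;
# the other entries are placeholders for illustration.
VOICE_POOLS = {
    "male": ["Algenib", "ExampleMaleVoiceB", "ExampleMaleVoiceC"],
    "female": ["Sulafat", "ExampleFemaleVoiceB", "ExampleFemaleVoiceC"],
}


def pick_voice(actor_id, gender_hint, pools=VOICE_POOLS):
    """Return the same voice for the same actor ID and gender hint every time.

    The actor ID is hashed with SHA-256 so the choice does not depend on
    process state, request order, or Python's per-process hash seed.
    """
    pool = pools[gender_hint]
    digest = hashlib.sha256(actor_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


if __name__ == "__main__":
    for actor, hint in [("npc-scrap-warden-0441", "male"),
                        ("npc-clinic-operator-0320", "female")]:
        print(actor, hint, pick_voice(actor, hint))
```

Under this scheme, repeated requests for the same actor return the same voice as long as the pool itself is unchanged. Adding or removing pool entries can reassign actors.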

Verification

dotnet build Unity\SecondSpawn.AI.csproj --nologo: passed, 0 warnings.
dotnet build Unity\SecondSpawn.UI.csproj --nologo: passed, 0 warnings.
dotnet build Unity\Assembly-CSharp-Editor.csproj --nologo: passed. Warnings are existing Photon/Fusion package warnings, not new SECOND SPAWN code.
npm.cmd run build: passed.
npm.cmd test: passed after build, including Gemini voice pool selection coverage.
git diff --check: passed.
python -m py_compile tools\lipsync\check_unity_lipsync_contract.py tools\lipsync\check_lipsync_provider_readiness.py tools\lipsync\run_arkit52_smoke.py tools\lipsync\wav2arkit_http_server.py: passed.
python tools\lipsync\check_unity_lipsync_contract.py: passed 13/13 checks for ARKit52 sidecar channel shape, Nakama chunk smoke assertions, Unity DTO shape, presenter lip-sync forwarding, facial driver blendshape mapping, synthetic editor driver smoke hook, focused/free-mode voice gating, and portrait ARKit status reporting.
python tools\lipsync\run_arkit52_smoke.py: passed, 99 frames, 52 channels, 72.11 ms wall in the isolated worktree.
python tools\lipsync\run_arkit52_smoke.py --include-nakama --actor-id npc-scrap-warden-0441 --text "Xin chao. Day la spike test ARKit nam muoi hai voi giong nhan vat nam.": passed, provider gemini_tts, transport nakama_audio_chunks, voice Algenib, male hint, 158 ARKit52 frames, 52 channels, 104 ms lip-sync latency, 6580.26 ms total RPC wall time, 3 audio chunks, 335360 base64 chars fetched.
Attempted Unity.exe -batchmode -nographics -quit -projectPath .claude/worktrees/arkit52-lipsync-root/Unity -executeMethod SecondSpawn.EditorTools.SecondSpawnFacialBlendshapeReportUtility.RunArkit52DriverSmoke: Unity never reached the smoke method because Package Manager failed at startup with the error: The "path" argument must be of type string. Received undefined. This is recorded as a worktree batch runner blocker, not as a failed driver smoke.
Earlier male/female voice smoke: npc-scrap-warden-0441 returned stable voice Algenib twice with male hint; npc-clinic-operator-0320 returned Sulafat with female hint. Both returned Gemini audio chunks plus ARKit52 frames.
python tools\lipsync\check_lipsync_provider_readiness.py --scratch-root D:\Projects\Second-Spawn\.tmp-import\lipsync-spike: passed. Only wav2arkit_cpu is testable now; SALSA/uLipSync/Convai/FaceSync are not installed in Unity, NeuroSync is blocked by broken torchaudio, and NVIDIA A2F SDK is blocked by CUDA 13.3 plus missing TENSORRT_ROOT_DIR.
python tools\unity\check_visual_prefab_catalog.py: passed and reported 50 entries, 29 missing generated prefabs, 50 missing source assets, 3 unresolved generated prefab source GUIDs.
Local review fallback: no provider keys are stored in Unity, Unity requests presentation tiers only through the Nakama/Fusion boundary, and the branch stays presentation-only.
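
Illustrative note on the chunk and frame assertions: the smoke results above report audio chunk counts, base64 character totals, and 52-channel frames. Checks of that kind could look roughly like the sketch below. This is not the code in run_arkit52_smoke.py. The helper names are hypothetical, and the sketch assumes that the chunks are consecutive slices of one base64 string and that frame weights are normalized to the 0 to 1 range.

```python
import base64

ARKIT52_CHANNEL_COUNT = 52


def join_audio_chunks(chunks_b64):
    """Join base64 slices in order, then decode the whole string once.

    Assumes the server split a single base64 string into chunks, so the
    slices only decode correctly after concatenation.
    """
    return base64.b64decode("".join(chunks_b64), validate=True)


def check_frames(frames, channel_count=ARKIT52_CHANNEL_COUNT):
    """Return a list of problems; an empty list means every frame is well formed."""
    problems = []
    for i, frame in enumerate(frames):
        if len(frame) != channel_count:
            problems.append(f"frame {i}: {len(frame)} channels")
        elif any(not 0.0 <= weight <= 1.0 for weight in frame):
            problems.append(f"frame {i}: weight outside [0, 1]")
    return problems


if __name__ == "__main__":
    encoded = base64.b64encode(b"example pcm bytes").decode("ascii")
    chunks = [encoded[:5], encoded[5:11], encoded[11:]]
    print(len(join_audio_chunks(chunks)), "bytes decoded")
    print(check_frames([[0.0] * 52, [0.25] * 52]))
```

Returning a list of problems instead of raising on the first one lets a smoke report show every malformed frame in a single run.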

Still Missing Before Ready

Unity MCP console and Play Mode smoke could not run. Coplay-style tools returned telemetry but reported no_unity_session. The official Unity MCP tools were visible and the Editor log showed 20 of them discovered, but Unity_ManageEditor and Unity_ReadConsole both timed out after 120 seconds from this agent session.
Current root Editor is still on dev; this branch is not merged into dev, so the feature is not yet applied to the root game workspace.
Final visual proof still requires one focused Ida Faber NPC in Play Mode with audible Gemini voice, honest portrait status, and visible ARKit-mapped mouth motion.

Related: #139, #288

JOY (JOY) force-pushed the codex/arkit52-lipsync-root branch 2 times, most recently from dd500da to 7777bd2 on May 29, 2026 19:42

JOY (JOY) added 15 commits May 30, 2026 02:50

feat: add ARKit52 voice lip sync path

9cc76e0

docs: record ARKit52 lip sync spike matrix

01ec70d

chore: add ARKit52 blendshape coverage report

b1faeaf

fix: stabilize Gemini NPC voice selection

c1dd53d

fix: fail fast on delayed NPC voice

4de9abd

docs: add ARKit52 lip sync rerun evidence

9e522df

chore: add ARKit52 smoke helper

6f21a74

fix: report ARKit facial mapping honestly

38193ab

chore: add visual prefab catalog check

0171da2

fix: keep prototype random visuals spawn-safe

4f2f6c1

fix: show NPC text before voice buffering

9b8787e

docs: record text-first voice spike evidence

d5fd65d

chore: add ARKit52 sidecar launcher

d418a81

docs: record repeated voice route evidence

b53db51

fix: stop stale NPC voice presentation on reply

bb2ab0d

JOY (JOY) force-pushed the codex/arkit52-lipsync-root branch from 3047fd0 to bb2ab0d on May 29, 2026 19:51

JOY (JOY) added 3 commits May 30, 2026 03:08

chore: verify chunked NPC voice audio smoke

887d147

chore: add lip sync provider readiness check

29433d3

test: cover Gemini voice pool selection

821adc6

This was referenced May 29, 2026

docs: sync alpha order open PR list #339 (Merged)

Track alpha demo implementation umbrella #260 (Open)

JOY (JOY) added 3 commits May 30, 2026 03:52

test: add Unity lip sync contract verifier

3aab5d8

Merge remote-tracking branch 'origin/dev' into codex/arkit52-lipsync-…

64977ca

…root # Conflicts: # docs/design/82-alpha-npc-voice-and-facial-animation-lane.md

test: add ARKit52 driver smoke hook

490a03b

JOY (JOY) merged commit e8cdc56 into dev May 30, 2026
2 checks passed

JOY (JOY) merged 21 commits into dev from codex/arkit52-lipsync-root
