Using speech config when Runner.run_live gives error 1007 invalid frame payload data #2934

@ParohyGr

Description

Project setup

  • Running a WebSocket server on Google Cloud Run
  • The agent session is created on demand by the client

Session creation

  • When the client opens an audio connection, I start a Runner with a RunConfig using the AUDIO modality:

if not params.is_audio:
    modality = [types.Modality.TEXT]
    streaming_mode = StreamingMode.NONE
    speech_config = None
else:
    modality = [types.Modality.AUDIO]
    streaming_mode = StreamingMode.BIDI
    voice_config = types.PrebuiltVoiceConfig(
        voice_name="Kore"
    )
    speech_config = types.SpeechConfig(
        voice_config=voice_config,
        language_code="en-US"
    )

run_config = RunConfig(
    streaming_mode=streaming_mode,
    response_modalities=modality,
    session_resumption=types.SessionResumptionConfig(),
    speech_config=speech_config
)

live_request_queue = LiveRequestQueue()

live_events = runner.run_live(
    session=session,
    live_request_queue=live_request_queue,
    run_config=run_config
)

Observing events and place of crash

async def agent_to_client(websocket: ServerConnection, live_events: AsyncGenerator[Event, None]):
    """Listens to events from the agent and forward them to the client."""
    try:
        async for event in live_events:
            logging.info(f"AGENT EVENT: {event}")
            try:
                if event.turn_complete or event.interrupted:
                    await websocket.send(
                        system_message({"turn_complete": event.turn_complete, "interrupted": event.interrupted}))
                    continue

                part: Optional[types.Part] = event.content and event.content.parts and event.content.parts[0]
                if not part:
                    continue

                if part.inline_data and part.inline_data.mime_type.startswith("audio/pcm"):
                    audio_data = part.inline_data.data
                    if audio_data:
                        await websocket.send(AUDIO_HEADER + audio_data)
                        logging.info(f"[AGENT->CLIENT]: audio/pcm: {len(audio_data)} bytes.")
                elif part.text and event.partial:
                    await websocket.send(agent_message("text/plain", part.text))
            except Exception as e:
                await websocket.send(exception_json(e))
                logging.error(f"Error processing single agent event: {e}")
    except Exception as e:
        await websocket.send(exception_json(e))
        logging.error(f"Agent-to-client loop failed: {e}")
    finally:
        logging.info("Exiting agent-to-client loop.")
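
The handler above relies on `AUDIO_HEADER`, `system_message`, `agent_message`, and `exception_json`, none of which are shown in this report. For anyone trying to reproduce the issue, here is a minimal sketch of what they might look like. The header byte, the JSON field names, and the `"type"` values are all assumptions, not the original definitions.

```python
import json

# Hypothetical one-byte prefix that marks a binary frame as audio.
# The real value used in the report is not shown.
AUDIO_HEADER = b"\x01"


def system_message(payload: dict) -> str:
    """Serialize a control message (e.g. turn_complete / interrupted) as JSON."""
    return json.dumps({"type": "system", **payload})


def agent_message(mime_type: str, data: str) -> str:
    """Serialize a text chunk produced by the agent as JSON."""
    return json.dumps({"type": "agent", "mime_type": mime_type, "data": data})


def exception_json(exc: Exception) -> str:
    """Serialize an exception so the client can display it."""
    return json.dumps(
        {"type": "error", "error": type(exc).__name__, "message": str(exc)}
    )
```

With stand-ins like these, the loop can be exercised against a fake event stream without touching the live API.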

Error in "Agent-to-client" log
When I run this code:

  • It successfully creates a session
  • After 1-2 s the connection is closed with the following error:
google_adk.google.adk.flows.llm_flows.base_llm_flow:Connection closed: received 1007 (invalid frame payload data) Request contains an invalid argument.; then sent 1007 (invalid frame payload data) Request contains an invalid argument.

When I run with the AUDIO modality but without a speech config, it works as expected.
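
Since only the speech config differs between the working and failing runs, one way to narrow this down is to try each of its settings on its own. The sketch below uses only the standard library and lists the combinations to test, smallest first. `speech_config_variants` is a hypothetical helper, and each resulting dict would still have to be turned into a `types.SpeechConfig` by hand.

```python
from itertools import combinations


def speech_config_variants(fields: dict) -> list[dict]:
    """Return every non-empty subset of the given settings, smallest first."""
    names = list(fields)
    variants = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            variants.append({name: fields[name] for name in combo})
    return variants


# The two settings used in the failing configuration above.
fields = {"voice_name": "Kore", "language_code": "en-US"}
for variant in speech_config_variants(fields):
    print(variant)
```

If one single-setting variant already triggers the 1007 close, that setting is the likely invalid argument.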

Labels

live: [Component] This issue is related to live, voice and video chat
