Event streaming in run_async triggers repeated Pydantic serialization #4233

@ohs30359-nobuhara

Description

Describe the bug
When running an agent via async for event in runner.run_async(...),
pydantic.BaseModel.model_dump_json() is invoked an unexpectedly large number of times within a single user request.

This results in a noticeable performance impact (CPU time and latency), especially when large objects such as LlmRequest, GenerateContentConfig, and Schema are repeatedly serialized even though they appear to be immutable within the same run.

I am not certain whether this behavior is expected by design or if it is an unintended side effect of the event-driven architecture, so I am reporting this as a potential performance issue / design concern rather than a definitive bug.

To Reproduce
Below is a minimal reproduction that triggers the behavior.

import asyncio

from google.adk.runners import Runner
from src.agents.agent import sample_agent

runner = Runner(agent=sample_agent)


async def main():
    # Consume the event stream without doing anything with the events.
    async for event in runner.run_async(
        input="hello",
    ):
        pass


if __name__ == "__main__":
    asyncio.run(main())

To measure the issue, I monkey-patched BaseModel.model_dump_json to record:

  • call count
  • latency
  • serialized size
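
A patch of this kind could look roughly like the following. This is a minimal sketch, not the exact code used to collect the numbers in this issue: the names instrument, stats, and report are hypothetical, and FakeModel / FakeSchema are stand-ins so that the snippet runs without pydantic or ADK installed.

```python
import functools
import json
import time
from collections import defaultdict

# Per-class aggregates: call count, total seconds, total serialized characters.
stats = defaultdict(lambda: {"n": 0, "seconds": 0.0, "chars": 0})


def instrument(cls, method_name="model_dump_json"):
    """Wrap cls.<method_name> so that every call is recorded in `stats`."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        result = original(self, *args, **kwargs)
        elapsed = time.perf_counter() - start
        # Aggregate by the concrete class of the instance being serialized.
        key = f"{type(self).__module__}.{type(self).__qualname__}"
        entry = stats[key]
        entry["n"] += 1
        entry["seconds"] += elapsed
        entry["chars"] += len(result)
        return result

    setattr(cls, method_name, wrapper)
    return original  # returned so the caller can restore it later


def report():
    """Format one line per class with call count, total time, and size."""
    lines = []
    for name, entry in sorted(stats.items()):
        lines.append(
            f"{name:<52} n={entry['n']:>7}"
            f"  total_ms={entry['seconds'] * 1e3:10.3f}"
            f"  chars={entry['chars']:>10}"
        )
    return "\n".join(lines)


class FakeModel:
    """Hypothetical stand-in for a pydantic model."""

    def __init__(self, **fields):
        self.fields = fields

    def model_dump_json(self):
        return json.dumps(self.fields)


class FakeSchema(FakeModel):
    """Hypothetical subclass; it inherits the patched method."""


instrument(FakeModel)
schema = FakeSchema(type="string")
for _ in range(3):
    schema.model_dump_json()
print(report())
```

With pydantic installed, the same helper would presumably be applied as instrument(pydantic.BaseModel) before starting the run. Subclasses that override model_dump_json would not be covered by that single patch and would need to be instrumented separately.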

Observed behavior
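For a single user request, model_dump_json() is called tens of thousands of times in total: 64,279 calls across the classes listed below, of which 49,380 are on Schema alone.
Example aggregated metrics (per single request):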

google.adk.events.event.Event                        n=    177
google.adk.models.llm_request.LlmRequest             n=   1886
google.adk.models.llm_response.LlmResponse           n=   3772
google.adk.sessions.session.Session                  n=     69
google.genai.types.Content                           n=     61
google.genai.types.FunctionDeclaration               n=   7048
google.genai.types.Schema                            n=  49380
google.genai.types.GenerateContentConfig             n=   1886

Large immutable objects (e.g. schema and config) appear to be repeatedly serialized during event emission.

Expected behavior
One of the following (or clarification if current behavior is intentional):

  • Fewer repeated model_dump_json() calls for immutable objects within a single run
  • Caching or memoization of serialized representations
  • Documentation clarifying that this level of serialization is expected when using run_async with event streaming
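
To illustrate the caching option, here is a minimal sketch of a run-scoped memoization layer. It is not an existing ADK or pydantic feature: RunScopedJsonCache and FakeConfig are hypothetical names, and the approach is only valid under the assumption that the cached objects are not mutated while the cache is alive.

```python
import json


class RunScopedJsonCache:
    """Memoize model_dump_json() per object for the duration of one run.

    Assumption: cached objects are not mutated while the cache is alive.
    """

    def __init__(self):
        # id(obj) -> (obj, json_text). Holding a reference to obj keeps it
        # alive, so its id cannot be reused by a different object.
        self._entries = {}
        self.hits = 0
        self.misses = 0

    def dumps(self, obj):
        entry = self._entries.get(id(obj))
        if entry is not None:
            self.hits += 1
            return entry[1]
        self.misses += 1
        text = obj.model_dump_json()
        self._entries[id(obj)] = (obj, text)
        return text

    def clear(self):
        """Drop all entries, e.g. at the end of a run."""
        self._entries.clear()


class FakeConfig:
    """Hypothetical stand-in for a large config model; counts real dumps."""

    def __init__(self, **fields):
        self.fields = fields
        self.calls = 0

    def model_dump_json(self):
        self.calls += 1
        return json.dumps(self.fields)


cache = RunScopedJsonCache()
config = FakeConfig(temperature=0.2)
first = cache.dumps(config)      # serializes once
for _ in range(2):
    cache.dumps(config)          # served from the cache
print(config.calls, cache.hits, cache.misses)
```

Keying on object identity rather than on content avoids hashing large objects, at the cost of missing equal-but-distinct instances. Whether the objects in question are actually immutable within a run is exactly the open question of this issue.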

Screenshots
N/A

Desktop:

  • OS: Linux
  • Python version (python -V): 3.13
  • ADK version (pip show google-adk): 1.21.0

Model Information:

  • Are you using LiteLLM: Yes
  • Which model is being used (e.g. gemini-2.5-pro): gpt-4.1-mini

Additional context
I suspect this may be related to the event-driven architecture where the same state objects are serialized for each emitted event.

I would appreciate guidance on:

  • whether this behavior is expected
  • whether there are recommended patterns to avoid excessive serialization
  • or whether this could be optimized within ADK itself

If this is primarily caused by incorrect usage on my side, I would be happy to adjust my implementation accordingly.

Labels: live [Component] (This issue is related to live, voice and video chat)