Python: Emit AG-UI events for MCP tool calls, results, and text reasoning by moonbox3 · Pull Request #4760 · microsoft/agent-framework

moonbox3 · 2026-03-18T10:32:29Z

Summary

Fixes #4213 — Supersedes #4233

_emit_content() in the AG-UI layer only handled text, function_call, function_result, function_approval_request, usage, and oauth_consent_request content types. Foundry MCP content types (mcp_server_tool_call, mcp_server_tool_result) and text_reasoning fell through unhandled, producing no SSE events for AG-UI consumers.

Changes

Added three new handler functions and wired them into _emit_content():

Content Type	Handler	AG-UI Events
`mcp_server_tool_call`	`_emit_mcp_tool_call`	`TOOL_CALL_START` + `TOOL_CALL_ARGS`
`mcp_server_tool_result`	`_emit_mcp_tool_result`	`TOOL_CALL_END` + `TOOL_CALL_RESULT`
`text_reasoning`	`_emit_text_reasoning`	`REASONING_START` → `REASONING_MESSAGE_START` → `REASONING_MESSAGE_CONTENT` → `REASONING_MESSAGE_END` → `REASONING_ENCRYPTED_VALUE` (when protected_data present) → `REASONING_END`

Improvements over #4233

This PR incorporates the work from #4233 by @LEDazzio01 and addresses all reviewer feedback:

Uses protocol-defined reasoning events instead of CustomEvent fallback (addresses @robinsk's feedback) — CopilotKit and other AG-UI consumers can render reasoning natively
Emits ReasoningEncryptedValueEvent for protected_data, keeping encrypted reasoning separate from display text (addresses @robinsk's ReasoningEncryptedValue concern)
Correct event constructor arguments — ReasoningStartEvent/ReasoningEndEvent receive required message_id, ReasoningMessageStartEvent uses role="assistant" per protocol spec
Tests match implementation — all test assertions verify the actual reasoning event types, not stale CustomEvent assertions
_emit_mcp_tool_result mirrors _emit_tool_result cleanup — resets tool_call_id/tool_call_name, closes open text messages, resets accumulated_text (addresses Copilot bot review)
Logs warning on missing call_id in _emit_mcp_tool_result (addresses @moonbox3's feedback)
No Fixes #4213 in docstrings (addresses @moonbox3's feedback)
No unused flow param on _emit_text_reasoning (addresses @moonbox3's feedback)

…ning Fixes microsoft#4213 — `_emit_content()` in the AG-UI layer only handled `text`, `function_call`, `function_result`, `function_approval_request`, `usage`, and `oauth_consent_request` content types. Foundry MCP content types (`mcp_server_tool_call`, `mcp_server_tool_result`) and `text_reasoning` fell through unhandled, producing no SSE events for AG-UI consumers. Added three new handler functions wired into `_emit_content()`: - `_emit_mcp_tool_call`: emits TOOL_CALL_START + TOOL_CALL_ARGS and tracks in FlowState for MESSAGES_SNAPSHOT inclusion - `_emit_mcp_tool_result`: emits TOOL_CALL_END + TOOL_CALL_RESULT with full FlowState cleanup mirroring `_emit_tool_result` - `_emit_text_reasoning`: emits the protocol-defined reasoning event sequence (ReasoningStart → MessageStart → MessageContent → MessageEnd → ReasoningEnd) with ReasoningEncryptedValueEvent for protected_data

markwallace-microsoft · 2026-03-18T10:34:50Z

Python Test Coverage Report •

File	Stmts	Miss	Cover	Missing
packages/ag-ui/agent_framework_ag_ui
_run_common.py	244	10	95%	309–310, 316–321, 531–532
TOTAL	27142	3218	88%

Python Unit Test Overview

Tests	Skipped	Failures	Errors	Time
5289	20 💤	0 ❌	0 🔥	1m 23s ⏱️

Copilot

Pull request overview

Adds missing AG-UI SSE event emission for Foundry/Responses streaming content items that previously fell through _emit_content() unhandled, ensuring MCP tool execution and text_reasoning are visible to AG-UI consumers.

Changes:

Add _emit_mcp_tool_call() and _emit_mcp_tool_result() to map MCP tool call/result content into standard TOOL_CALL_* AG-UI events and snapshot-tracked FlowState entries.
Add _emit_text_reasoning() to emit protocol-defined reasoning events, including encrypted reasoning payload emission when protected_data is present.
Add/expand unit tests to cover MCP tool call/result emission and _emit_content() routing for the new content types.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
python/packages/ag-ui/agent_framework_ag_ui/_run_common.py	Adds new emitters for MCP tool call/result and `text_reasoning`, and wires them into `_emit_content()` dispatch.
python/packages/ag-ui/tests/ag_ui/test_run.py	Adds test coverage for MCP tool call/result events, reasoning events, and `_emit_content()` routing for the new content types.

You can also share your feedback on Copilot code review. Take the survey.

Exercises the full POST → SSE bytes → parse → validate pipeline for mcp_server_tool_call, mcp_server_tool_result, text_reasoning, and ReasoningEncryptedValueEvent content through FastAPI TestClient.

moonbox3

Automated Code Review

Reviewers: 4 | Confidence: 92%

✓ Correctness

The diff adds three new content-type handlers (_emit_mcp_tool_call, _emit_mcp_tool_result, _emit_text_reasoning) and corresponding tests. The implementation correctly mirrors existing patterns for tool call/result and reasoning event emission. All ag_ui.core event types used (ReasoningStartEvent, etc.) exist in the installed ag-ui-protocol package with matching field signatures. FlowState tracking for MCP tool calls follows the same structure as regular tool calls. The Pydantic min_length=1 constraint on ReasoningMessageContentEvent.delta is respected by guarding with if text:. Tests are comprehensive and cover edge cases (missing IDs, empty content, protected_data only). No correctness issues found.

✓ Security Reliability

The diff adds three new event emitter functions (_emit_mcp_tool_call, _emit_mcp_tool_result, _emit_text_reasoning) and corresponding routing in _emit_content, along with comprehensive tests. The new code faithfully mirrors existing patterns: argument serialization via make_json_safe, FlowState cleanup in _emit_mcp_tool_result matching _emit_tool_result, proper fallback ID generation, and early-return guards for missing call_id. No injection risks, resource leaks, unsafe deserialization, or secrets were found. The protected_data field is an opaque passthrough by design. All new code paths are well-tested including edge cases.

✓ Test Coverage

Test coverage for the new MCP tool call, MCP tool result, and text reasoning features is thorough. Unit tests cover the main happy paths, edge cases (missing call_id, missing server_name, missing arguments, generated IDs), flow state tracking, and FlowState cleanup symmetry with the existing _emit_tool_result. SSE round-trip integration tests verify events survive encoding/parsing. Two minor gaps exist: (1) no test for _emit_mcp_tool_result when output is None (the implementation defaults to empty string, but this path is not exercised), and (2) no test for the fallback tool_name="mcp_tool" when tool_name is None in _emit_mcp_tool_call. Neither gap is blocking since the implementation handles both cases and they are unlikely failure paths via normal factory usage.

✗ Design Approach

The MCP tool call/result and text reasoning handlers are well-structured and the test coverage is thorough. One functional omission stands out: _emit_mcp_tool_result does not accept or forward the predictive_handler parameter, so MCP tool results will never trigger the apply_pending_updates() call or emit a StateSnapshotEvent, unlike the equivalent _emit_tool_result path. This silently breaks state continuity for any agent that uses predictive state updates. Additionally, _emit_mcp_tool_call does not populate flow.tool_call_id/flow.tool_call_name, yet _emit_mcp_tool_result unconditionally clears them — if a standard function_call is being tracked in those fields when an MCP result arrives, the FlowState is corrupted, though this edge case may not occur in practice given how tools are sequenced.

Flagged Issues

_emit_mcp_tool_result silently drops the predictive_handler: it has no parameter for it and _emit_content does not pass it (unlike _emit_tool_result and _emit_approval_request). Any agent using predictive state will never receive a StateSnapshotEvent after an MCP tool result, causing stale client-side state. Fix: add predictive_handler: PredictiveStateHandler | None = None to _emit_mcp_tool_result, mirror the apply_pending_updates() + StateSnapshotEvent block from _emit_tool_result, and forward the handler in _emit_content.

Suggestions

_emit_mcp_tool_call never sets flow.tool_call_id or flow.tool_call_name, but _emit_mcp_tool_result unconditionally resets both to None. If a function_call previously wrote those fields and an MCP result arrives first, they are cleared prematurely. For consistency, _emit_mcp_tool_call should set flow.tool_call_id = tool_call_id and flow.tool_call_name = display_name, mirroring _emit_tool_call.
Add a test for _emit_mcp_tool_result when output is None to verify the empty-string default: Content.from_mcp_server_tool_result(call_id="x") and assert result_event.content == "".
Add a test for the fallback tool_name when tool_name is None (Content(type="mcp_server_tool_call", call_id="x")) to verify it falls back to "mcp_tool".
In TestEmitMcpToolCall.test_tracks_in_flow_state, also assert the stored arguments value (flow.tool_calls_by_id["mcp_call_2"]["function"]["arguments"]) to verify string-typed arguments are stored verbatim.

Automated review by moonbox3's agents

python/packages/ag-ui/agent_framework_ag_ui/_run_common.py

python/packages/ag-ui/tests/ag_ui/test_run.py

…ft#4213) - Add predictive_handler parameter to _emit_mcp_tool_result and mirror the apply_pending_updates + StateSnapshotEvent block from _emit_tool_result - Forward predictive_handler from _emit_content to _emit_mcp_tool_result - Add assertion for stored arguments in MCP tool call test - Add test for predictive handler state snapshot after MCP tool result Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

TaoChenOSU

Automated Code Review

Reviewers: 4 | Confidence: 90%

✓ Correctness

The diff adds three new content-type handlers (_emit_mcp_tool_call, _emit_mcp_tool_result, _emit_text_reasoning) and comprehensive tests. The implementations correctly mirror existing patterns (e.g., _emit_tool_call / _emit_tool_result), properly handle edge cases (missing call_id, missing arguments, empty text), and respect the ag-ui-protocol event type contracts. The ReasoningMessageContentEvent's min_length=1 constraint on delta is correctly guarded by the if text: check. FlowState cleanup in _emit_mcp_tool_result faithfully mirrors _emit_tool_result. Tests cover all branches including routing through _emit_content. No correctness issues found.

✓ Security Reliability

The diff adds three new event emitters (_emit_mcp_tool_call, _emit_mcp_tool_result, _emit_text_reasoning) and their routing in _emit_content, plus comprehensive tests. The new code closely mirrors existing patterns (e.g., _emit_tool_call / _emit_tool_result) for argument serialization via make_json_safe, call_id validation, FlowState cleanup, and predictive handler integration. No injection risks, unsafe deserialization, secrets, or resource leaks were found. All externally-sourced values are passed as typed fields to Pydantic event models rather than interpolated into commands or queries. Input validation is present (missing call_id returns early in _emit_mcp_tool_result). The protected_data pass-through is by design—the field carries opaque encrypted data and the encryption responsibility lies with producers.

✓ Test Coverage

Test coverage for the three new emit functions (_emit_mcp_tool_call, _emit_mcp_tool_result, _emit_text_reasoning) is thorough. Unit tests cover happy paths, edge cases (missing call_id, missing server_name, missing arguments, empty text, generated IDs), flow state mutations, JSON serialization of non-string values, predictive handler integration, and flow cleanup parity with _emit_tool_result. Routing through _emit_content is verified for all three new content types. Full HTTP SSE round-trip tests cover MCP tool calls, reasoning events, and encrypted reasoning. Two minor edge cases lack explicit tests: the tool_name fallback to 'mcp_tool' when None, and None output in _emit_mcp_tool_result defaulting to empty string. These are defensive fallbacks unlikely to occur through normal factory methods, so they are non-blocking suggestions only.

✗ Design Approach

The implementation is functionally correct and well-tested, but has two design concerns worth noting. The most significant is that _emit_mcp_tool_result is a near-verbatim copy of the existing _emit_tool_result, differing only in content.output vs content.result. The code even comments that it "Mirrors the FlowState cleanup performed by _emit_tool_result", acknowledging the duplication. This creates a maintenance trap: future bug-fixes in _emit_tool_result's flow-state cleanup (like the #3568 fix already present) will not automatically propagate. The second concern is a leaky abstraction: embedding server_name into the AG-UI tool_call_name string (e.g. "brave/search") bleeds MCP infrastructure details into the AG-UI protocol layer, forcing downstream consumers to parse a /-delimited string to recover the server origin.

Flagged Issues

_emit_mcp_tool_result duplicates ~40 lines of _emit_tool_result's logic, differing only in reading content.output instead of content.result. Extract a shared private helper (e.g. _emit_tool_result_common(call_id, raw_result, flow, predictive_handler)) and call it from both functions, or normalize the Content field so both result types use the same attribute. As-is, any future fix to _emit_tool_result's flow-state cleanup (like the existing #3568 fix) must be manually propagated to _emit_mcp_tool_result.

Suggestions

Embedding server_name into tool_call_name via f"{server_name}/{tool_name}" is a leaky abstraction: MCP-specific routing context bleeds into the AG-UI protocol layer, and '/' is not a reserved separator in tool names so this can be ambiguous (e.g. server_name containing '/'). Consider using just tool_name for tool_call_name and carrying server_name as structured data in the ToolCallArgsEvent payload or a companion CustomEvent. If the current encoding is kept, validate or sanitize server_name.
_emit_mcp_tool_call does not include the duplicate-argument-replay guard that _emit_tool_call has (the #4194 fix). The docstring notes MCP calls arrive as complete items, but if a provider ever replays them, arguments could double in MESSAGES_SNAPSHOT. A brief comment noting this intentional omission would help future maintainers.
Add unit tests for two untested defensive fallbacks: (1) _emit_mcp_tool_call when tool_name is None, verifying the fallback to 'mcp_tool' display name; (2) _emit_mcp_tool_result when output is None, verifying the result defaults to an empty string.
Consider testing that _emit_mcp_tool_call correctly passes flow.message_id as parent_message_id on the ToolCallStartEvent, to verify the parent-child relationship is preserved.

Automated review by TaoChenOSU's agents

python/packages/ag-ui/agent_framework_ag_ui/_run_common.py

python/packages/ag-ui/tests/ag_ui/test_run.py

- Extract _emit_tool_result_common shared helper to eliminate duplication between _emit_tool_result and _emit_mcp_tool_result - Remove server_name prefix from tool_call_name in _emit_mcp_tool_call; display_name now equals tool_name directly - Add test for tool_name fallback to 'mcp_tool' when tool_name is None - Add test for output=None fallback to empty string in _emit_mcp_tool_result Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 18, 2026 10:32

markwallace-microsoft added the python label Mar 18, 2026

Copilot started reviewing on behalf of moonbox3 March 18, 2026 10:33 View session

Copilot AI reviewed Mar 18, 2026

View reviewed changes

Add HTTP round-trip tests for MCP tool and reasoning SSE events

0f2cf4e

Exercises the full POST → SSE bytes → parse → validate pipeline for mcp_server_tool_call, mcp_server_tool_result, text_reasoning, and ReasoningEncryptedValueEvent content through FastAPI TestClient.

moonbox3 self-assigned this Mar 18, 2026

moonbox3 added the ag-ui label Mar 18, 2026

moonbox3 mentioned this pull request Mar 18, 2026

Python: Emit AG-UI events for MCP tool calls, results, and text reasoning #4233

Closed

moonbox3 commented Mar 18, 2026

View reviewed changes

python/packages/ag-ui/agent_framework_ag_ui/_run_common.py Outdated Show resolved Hide resolved

python/packages/ag-ui/agent_framework_ag_ui/_run_common.py Show resolved Hide resolved

python/packages/ag-ui/tests/ag_ui/test_run.py Show resolved Hide resolved

Copilot and others added 2 commits March 18, 2026 10:50

Apply pre-commit auto-fixes

ed735b3

eavanvalkenburg approved these changes Mar 18, 2026

View reviewed changes

TaoChenOSU reviewed Mar 18, 2026

View reviewed changes

Copilot and others added 2 commits March 18, 2026 22:22

Address review feedback for microsoft#4213: review comment fixes

45b8e9e

moonbox3 requested a review from TaoChenOSU March 18, 2026 23:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python: Emit AG-UI events for MCP tool calls, results, and text reasoning#4760

Python: Emit AG-UI events for MCP tool calls, results, and text reasoning#4760
moonbox3 wants to merge 6 commits intomicrosoft:mainfrom
moonbox3:4213-fix

moonbox3 commented Mar 18, 2026 •

edited

Loading

Uh oh!

markwallace-microsoft commented Mar 18, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

moonbox3 left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

TaoChenOSU left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

moonbox3 commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Improvements over #4233

Uh oh!

markwallace-microsoft commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Python Unit Test Overview

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

moonbox3 left a comment

Choose a reason for hiding this comment

Automated Code Review

✓ Correctness

✓ Security Reliability

✓ Test Coverage

✗ Design Approach

Flagged Issues

Suggestions

Uh oh!

Uh oh!

Uh oh!

Uh oh!

TaoChenOSU left a comment

Choose a reason for hiding this comment

Automated Code Review

✓ Correctness

✓ Security Reliability

✓ Test Coverage

✗ Design Approach

Flagged Issues

Suggestions

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

moonbox3 commented Mar 18, 2026 •

edited

Loading

markwallace-microsoft commented Mar 18, 2026 •

edited

Loading