.NET: Fix AG-UI multimodal user message handling #4761
thoemmi wants to merge 3 commits into microsoft:main
Fixes microsoft#3729 Add typed AG-UI user input content models and converters so user messages can round-trip either plain text or multimodal content arrays. Update AG-UI chat message conversion, ASP.NET Core request binding, and AGUIChatClient serialization to preserve text, binary data, URLs, and hosted file references. Add unit and integration coverage for serializer behavior, endpoint binding, and end-to-end multimodal requests.
Pull request overview
Fixes AG-UI (.NET) handling of spec-compliant multimodal user messages where content can be either a string or an input-content array, unblocking ASP.NET Core JSON binding and preserving multimodal inputs through AG-UI ↔ ChatMessage conversions.
Changes:
- Add typed AG-UI input-content models (`AGUIInputContent` + text/binary variants) and `System.Text.Json` converters to support `user.content` as either string or array.
- Update AG-UI message conversion helpers to map multimodal user inputs to/from `ChatMessage.Contents` (text, base64 binary, URL, hosted file id).
- Add unit + integration regression tests covering the reported payload shape and round-tripping behavior.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIUserMessageJsonConverter.cs | New converter to deserialize/serialize AGUIUserMessage.content as string or array. |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIUserMessage.cs | Adds converter attribute and InputContents backing model for multimodal content. |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIInputContent.cs | New polymorphic base type for multimodal user input parts. |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIInputContentJsonConverter.cs | New discriminator-based converter for AGUIInputContent (`type`: `text` or `binary`). |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUITextInputContent.cs | New model for type: "text" input items. |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIBinaryInputContent.cs | New model for type: "binary" input items (id/url/data + metadata). |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIJsonSerializerContext.cs | Registers new input-content types for source-generated serialization. |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIChatMessageExtensions.cs | Updates mapping to support multimodal user messages ↔ ChatMessage.Contents. |
| dotnet/tests/Microsoft.Agents.AI.AGUI.UnitTests/AGUIJsonSerializerContextTests.cs | Adds serialization round-trip coverage for AGUIUserMessage content arrays. |
| dotnet/tests/Microsoft.Agents.AI.AGUI.UnitTests/AGUIChatMessageExtensionsTests.cs | Adds mapping tests for text+binary, URL, and hosted-file user content. |
| dotnet/tests/Microsoft.Agents.AI.AGUI.UnitTests/AGUIChatClientTests.cs | Adds client serialization test ensuring multimodal user messages are emitted as arrays. |
| dotnet/tests/Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.UnitTests/AGUIEndpointRouteBuilderExtensionsTests.cs | Adds ASP.NET Core handler test validating binding + conversion of content arrays. |
| dotnet/tests/Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.IntegrationTests/ForwardedPropertiesTests.cs | Adds end-to-end /agent integration test for multimodal user payload acceptance. |
Respond to review comments on PR microsoft#4761 by enforcing role="user" for AGUIUserMessage serialization and deserialization, and by preserving user message name metadata across AG-UI and ChatMessage conversions. Add regression tests for invalid roles, forced user-role serialization, and Name/AuthorName round-tripping for both text-only and multimodal user messages.
Pull request overview
Fixes .NET AG-UI handling of user.content so user messages can round-trip as either a plain string or a multimodal input-content array (per AG-UI spec), preventing ASP.NET Core binding/mapping failures for multimodal payloads (fixes #3729).
Changes:
- Added typed multimodal user input content models plus JSON converters to support `content` as string-or-array for `AGUIUserMessage`.
- Updated AG-UI ↔ `ChatMessage` conversion and `AGUIChatClient` serialization to preserve text, binary data, URLs, and hosted file references for user messages.
- Added regression tests across shared unit tests, ASP.NET Core unit tests, and an integration test posting a multimodal payload through `/agent`.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIUserMessageJsonConverter.cs | Implements custom string-or-array parsing/writing for AGUIUserMessage.content. |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIUserMessage.cs | Attaches converter and adds InputContents to model multimodal user content. |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIInputContent.cs | Introduces polymorphic base type for multimodal user input items. |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIInputContentJsonConverter.cs | Adds discriminator-based polymorphic (de)serialization for input content items. |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUITextInputContent.cs | Adds text input content type. |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIBinaryInputContent.cs | Adds binary input content type (data/url/id + mimeType/filename). |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIJsonSerializerContext.cs | Extends source-gen context with new models for serialization. |
| dotnet/src/Microsoft.Agents.AI.AGUI/Shared/AGUIChatMessageExtensions.cs | Maps multimodal user contents between AG-UI messages and ChatMessage contents. |
| dotnet/tests/Microsoft.Agents.AI.AGUI.UnitTests/AGUIJsonSerializerContextTests.cs | Adds round-trip serialization test for input-content array user messages. |
| dotnet/tests/Microsoft.Agents.AI.AGUI.UnitTests/AGUIChatMessageExtensionsTests.cs | Adds mapping tests for text/data/url/hosted-file user content and name mapping. |
| dotnet/tests/Microsoft.Agents.AI.AGUI.UnitTests/AGUIChatClientTests.cs | Adds AGUIChatClient serialization coverage for multimodal user messages; minor Encoding cleanup. |
| dotnet/tests/Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.UnitTests/AGUIEndpointRouteBuilderExtensionsTests.cs | Adds ASP.NET Core binding/mapping test for multimodal user message payloads. |
| dotnet/tests/Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.IntegrationTests/ForwardedPropertiesTests.cs | Adds /agent integration test verifying multimodal user messages reach the agent. |
```csharp
string? discriminator = discriminatorElement.GetString();

AGUIInputContent? result = discriminator switch
{
    "text" => jsonElement.Deserialize(options.GetTypeInfo(typeof(AGUITextInputContent))) as AGUITextInputContent,
    "binary" => DeserializeBinaryInputContent(jsonElement, options),
    _ => throw new JsonException($"Unknown AGUIInputContent type discriminator: '{discriminator}'")
};
```
Leaving this one open for now. I agree case-insensitive discriminator matching would improve tolerance for non-canonical client payloads, but I treated it as a compatibility enhancement rather than a correctness fix for #3729. The current implementation emits canonical lowercase type values and accepts the spec shape we are targeting in this PR, so I kept the scope to the reported multimodal user-message bug and the review comments that addressed concrete data loss / invalid output cases.
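For reference, the case-insensitive matching discussed in this thread could look roughly like the sketch below. It is not part of this PR; `NormalizeDiscriminator` is a hypothetical helper that maps any casing of the two known discriminators to the canonical lowercase value and rejects everything else.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Usage: non-canonical casing is folded to the canonical value.
Console.WriteLine(NormalizeDiscriminator("Binary"));

// Hypothetical helper (not in the PR): tolerant discriminator matching.
static string NormalizeDiscriminator(string? raw)
{
    if (string.Equals(raw, "text", StringComparison.OrdinalIgnoreCase))
    {
        return "text";
    }

    if (string.Equals(raw, "binary", StringComparison.OrdinalIgnoreCase))
    {
        return "binary";
    }

    // Same failure mode as the converter above: unknown values are rejected.
    throw new JsonException("Unknown AGUIInputContent type discriminator: '" + raw + "'");
}
```

The canonical value returned here could then feed the existing `switch`, so the emitted `type` values stay lowercase either way.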
```csharp
if (string.IsNullOrEmpty(binaryContent.Id) &&
    string.IsNullOrEmpty(binaryContent.Url) &&
    string.IsNullOrEmpty(binaryContent.Data))
{
    throw new JsonException("Binary input content must provide at least one of 'id', 'url', or 'data'.");
}
```
Leaving this one open for now as well. I agree that validating exactly one of id / url / data would make the contract stricter, but I did not want to change accepted payload semantics in this PR without explicitly taking on that behavior change. The current implementation is deterministic because downstream mapping already applies explicit precedence (data > url > id), so I kept this fix scoped to correctness issues in the existing bug report rather than tightening validation rules.
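To illustrate the precedence described above (data > url > id), here is a minimal sketch. `SelectBinarySource` is a hypothetical helper, not the PR's mapping code; it only shows why a payload that sets more than one field still resolves deterministically.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Usage: both 'url' and 'id' are set, so 'url' wins.
var picked = SelectBinarySource(id: "file-1", url: "https://example.com/a.png", data: null);
Console.WriteLine(picked.Kind + ": " + picked.Value);

// Hypothetical helper (not in the PR): picks one source using data > url > id.
static (string Kind, string Value) SelectBinarySource(string? id, string? url, string? data)
{
    if (!string.IsNullOrEmpty(data))
    {
        return ("data", data);
    }

    if (!string.IsNullOrEmpty(url))
    {
        return ("url", url);
    }

    if (!string.IsNullOrEmpty(id))
    {
        return ("id", id);
    }

    // Mirrors the validation shown above: at least one field is required.
    throw new JsonException("Binary input content must provide at least one of 'id', 'url', or 'data'.");
}
```

Tightening the contract to exactly one field would mean rejecting the usage example above instead of resolving it, which is the behavior change the author chose not to take on here.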
Respond to the remaining review comments on PR microsoft#4761 by throwing for unsupported user AI content, omitting absent mime types from outbound binary input content, and making the integration test fake agent capture received messages in both RunCoreAsync and RunCoreStreamingAsync. Also standardize the AGUIUserMessageJsonConverter missing-content JsonException punctuation and add regression coverage for unsupported user content and hosted files without media types.
Motivation and Context
AG-UI supports user messages whose `content` is either a plain string or a multimodal input-content array. The .NET AG-UI implementation currently assumes `user.content` is always a string, which causes spec-compliant multimodal requests to fail during ASP.NET Core binding and AG-UI request mapping.
This PR fixes #3729.
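The string-or-array shape can be sketched with `System.Text.Json` as follows. This is an illustration only, not the PR's `AGUIUserMessageJsonConverter`: `DescribeUserContent` is a hypothetical helper, and the `text` and `mimeType` property names on the array items are assumed from the AG-UI input-content shape.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Usage: a plain-string payload goes through the same helper as an array payload.
Console.WriteLine(string.Join(", ", DescribeUserContent("{\"role\":\"user\",\"content\":\"hi\"}")));

// Hypothetical helper: branches on the JSON kind of 'content'.
static List<string> DescribeUserContent(string messageJson)
{
    using JsonDocument doc = JsonDocument.Parse(messageJson);
    JsonElement content = doc.RootElement.GetProperty("content");
    var parts = new List<string>();

    if (content.ValueKind == JsonValueKind.String)
    {
        // Legacy shape: content is a single text string.
        parts.Add("text:" + content.GetString());
    }
    else if (content.ValueKind == JsonValueKind.Array)
    {
        // Multimodal shape: each item carries a 'type' discriminator.
        foreach (JsonElement item in content.EnumerateArray())
        {
            string? type = item.GetProperty("type").GetString();
            if (type == "text")
            {
                parts.Add("text:" + item.GetProperty("text").GetString());
            }
            else if (type == "binary")
            {
                // 'mimeType' is treated as optional in this sketch.
                string mime = item.TryGetProperty("mimeType", out JsonElement m)
                    ? m.GetString() ?? "unknown"
                    : "unknown";
                parts.Add("binary:" + mime);
            }
            else
            {
                throw new JsonException("Unknown content type '" + type + "'.");
            }
        }
    }
    else
    {
        throw new JsonException("'content' must be a string or an array.");
    }

    return parts;
}
```

A model that binds `content` as `string` only fails on the array branch, which is the binding failure reported in #3729.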
Description
This change adds typed AG-UI user input content models and converters so `AGUIUserMessage` can round-trip either plain text or multimodal content arrays.
It updates AG-UI chat message conversion and `AGUIChatClient` serialization so text, binary data, URLs, and hosted file references are preserved end-to-end for user messages.
It also adds regression coverage in shared unit tests, ASP.NET Core unit tests, and an integration test that posts the reported multimodal payload shape through the `/agent` endpoint.
Contribution Checklist
Verification
- `dotnet format src/Microsoft.Agents.AI.AGUI/Microsoft.Agents.AI.AGUI.csproj`
- `dotnet format tests/Microsoft.Agents.AI.AGUI.UnitTests/Microsoft.Agents.AI.AGUI.UnitTests.csproj`
- `dotnet format tests/Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.UnitTests/Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.UnitTests.csproj`
- `dotnet format tests/Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.IntegrationTests/Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.IntegrationTests.csproj`
- `dotnet vstest tests\Microsoft.Agents.AI.AGUI.UnitTests\bin\Debug\net10.0\Microsoft.Agents.AI.AGUI.UnitTests.dll`
- `dotnet vstest tests\Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.UnitTests\bin\Debug\net10.0\Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.UnitTests.dll`
- `dotnet vstest tests\Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.IntegrationTests\bin\Debug\net10.0\Microsoft.Agents.AI.Hosting.AGUI.AspNetCore.IntegrationTests.dll`
Note: these AG-UI test projects currently require `dotnet vstest` in this repo because `dotnet test` hits the current MTP/VSTest mismatch from `global.json`.