gocloud.dev's SQS driver batches Send() calls internally via SendMessageBatch (up to 10 messages) but does not enforce SQS's 1 MB batch size limit. When log entries carry large payloads the batch can exceed 1 MB, causing SQS to reject the entire batch. This cascades into a permanent retry loop because the log entry never gets persisted and the retry scheduler cannot find a prior attempt. Set MaxBatchByteSize to 1 MB so gocloud splits oversized batches before they reach SQS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
971064b to ebe2e42
)

// sqsMaxBatchBytes is the maximum total payload size for an SQS SendMessageBatch
// request. SQS rejects batches exceeding 256 KB per message and 1 MB per batch.
nit: According to the SQS docs:
"The minimum message size is 1 byte (1 character). The maximum is 1,048,576 bytes (1 MiB)."
A single message can now be up to 1 MiB as well, so the limit is 1 MiB for both a single message and a batch of up to 10 messages.
Hi @samadalishah, thanks for checking in and reviewing the PR!
I think 1 MB is a hard limit of SQS, so a single message exceeding 1 MB cannot be sent to SQS at all. I'm not sure there's anything we can do about that here.
Yes, all good here! 👍
I think this should be considered for issue #663
As we are adding Event, Attempt, and Destination to the LogEntry, the attempt can sometimes be huge (in my error example, the webhook destination responded to a failure with 13K lines of HTML 😅). So either the Attempt's ResponseData should be truncated or the message should be cleaned from deliverymq-retry, but again, that relates to the issue, not this PR.
Noted, and definitely something we can consider supporting better. You make a good point that the size of the Attempt's ResponseData is something we need to account for. cc @alexbouchardd
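For reference, the truncation option raised in this thread could look roughly like the sketch below. It is only an illustration: truncateUTF8 and the 64 KiB cap are hypothetical, and it assumes ResponseData can be treated as a string, which the thread does not confirm.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// maxResponseBytes is a hypothetical cap; the right value for this
// project would have to be decided in issue #663.
const maxResponseBytes = 64 * 1024

// truncateUTF8 (hypothetical helper) returns s cut to at most maxBytes
// bytes without splitting a multi-byte rune. The second result reports
// whether anything was cut off.
func truncateUTF8(s string, maxBytes int) (string, bool) {
	if maxBytes < 0 {
		maxBytes = 0
	}
	if len(s) <= maxBytes {
		return s, false
	}
	cut := maxBytes
	// Back up until s[cut] starts a rune, so s[:cut] ends on a boundary.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut], true
}

func main() {
	// Stand-in for a large HTML error page returned by a webhook destination.
	body := make([]byte, 200*1024)
	for i := range body {
		body[i] = 'x'
	}
	out, truncated := truncateUTF8(string(body), maxResponseBytes)
	fmt.Println(len(out), truncated)
}
```

Returning the truncated flag lets the caller record that the stored response is incomplete rather than silently dropping data.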
Summary
gocloud.dev's SQS topic driver batches Send() calls via SendMessageBatch (up to 10 messages per batch) but does not enforce SQS's 1 MB total batch size limit; it relies on SQS to reject oversized batches at the API level.

When log entries carry large payloads (e.g., large request/response bodies), the combined batch can exceed 1 MB, causing SQS to reject the entire batch with BatchRequestTooLong. This failure cascades: the log entry never gets persisted, and the retry scheduler can't find a prior attempt in the logstore, triggering an infinite retry loop (see #663).

This PR sets MaxBatchByteSize: 1_048_576 on the gocloud TopicOptions so the batcher splits oversized batches before they reach SQS. If a single message exceeds 1 MB, gocloud returns ErrMessageTooLarge instead of a cryptic SQS batch error.

Changes
TopicOptions with BatcherOptions.MaxBatchByteSize set to 1 MB when opening SQS topics in queue_awssqs.go

Test plan
🤖 Generated with Claude Code
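The splitting behaviour that the Summary attributes to MaxBatchByteSize can be illustrated with a small stand-alone sketch. This is not gocloud's batcher code: splitBatches and errMessageTooLarge are hypothetical names, and only message body bytes are counted here, whereas SQS also counts message attributes toward the size limit.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	maxBatchCount = 10        // SendMessageBatch accepts at most 10 entries
	maxBatchBytes = 1_048_576 // 1 MiB total payload per batch
)

// errMessageTooLarge is a stand-in for the error a batcher would return
// when one message alone exceeds the byte limit.
var errMessageTooLarge = errors.New("message exceeds maximum batch byte size")

// splitBatches (hypothetical) groups message bodies, in order, into
// batches that respect both the count limit and the byte limit.
// maxCount and maxBytes are assumed to be positive.
func splitBatches(msgs [][]byte, maxCount, maxBytes int) ([][][]byte, error) {
	var batches [][][]byte
	var cur [][]byte
	curBytes := 0
	for i, m := range msgs {
		if len(m) > maxBytes {
			return nil, fmt.Errorf("message %d (%d bytes): %w", i, len(m), errMessageTooLarge)
		}
		// Close the current batch if adding m would break either limit.
		if len(cur) == maxCount || curBytes+len(m) > maxBytes {
			batches = append(batches, cur)
			cur, curBytes = nil, 0
		}
		cur = append(cur, m)
		curBytes += len(m)
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches, nil
}

func main() {
	// Three 400 KiB payloads: two fit together (800 KiB), while a third
	// would push the batch past 1 MiB and so starts a new one.
	payload := make([]byte, 400*1024)
	batches, err := splitBatches([][]byte{payload, payload, payload}, maxBatchCount, maxBatchBytes)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, b := range batches {
		fmt.Printf("batch %d: %d message(s)\n", i, len(b))
	}
}
```

With these inputs the sketch produces two batches, of two messages and one message. A single payload larger than 1 MiB fails up front with the wrapped errMessageTooLarge, mirroring the clearer per-message error the Summary describes.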