Summary
When using type: "anthropic" with a BYOK provider config, the SDK sends max_tokens: 8192 (DEFAULT_MAX_OUTPUT_TOKENS) regardless of the model being used. The SDK's built-in model catalog (which correctly lists max_output_tokens: 32000 for claude-opus-4.5, 65536 for claude-sonnet-4.6, etc.) is not consulted for BYOK providers.
This causes 87% of agentic tool-use sessions to terminate prematurely because Claude's responses are truncated before the tool_use content block is emitted.
Note: This may be a regression or an incomplete fix from #955 and #931, which were closed as completed on April 6; the underlying issue persists on SDK 0.2.0. The specific gap is that getEffectiveMaxTokens() does not look up the model catalog for BYOK providers.
Reproduction
const session = await client.createSession({
  model: "claude-opus-4.5",
  provider: {
    type: "anthropic",
    baseUrl: "https://my-proxy/v1",
    apiKey: "my-key",
  },
  tools: [myTool],
});
await session.sendAndWait({ prompt: "Use the tool to create a file with substantial content" });
// Session ends after 1 turn — tool_use block truncated at 8192 output tokens
Expected behavior
The SDK should resolve max_output_tokens from its built-in model catalog when the model name matches a known entry (e.g., claude-opus-4.5 → 32000). The DEFAULT_MAX_OUTPUT_TOKENS fallback should only apply when the model is genuinely unknown.
The catalog already exists in the SDK:
// From app.js
["claude-opus-4.5", { max_prompt_tokens: 168000, max_context_window_tokens: 200000, max_output_tokens: 32000 }]
["claude-opus-4.6", { max_prompt_tokens: 168000, max_context_window_tokens: 200000, max_output_tokens: 32000 }]
["claude-sonnet-4.6", { max_prompt_tokens: 168000, max_context_window_tokens: 200000, max_output_tokens: 65536 }]
Actual behavior
getEffectiveMaxTokens() returns this.options?.maxOutputTokens ?? DEFAULT_MAX_OUTPUT_TOKENS (8192). For BYOK providers, this.options.maxOutputTokens is never populated from the model catalog.
Impact
When Claude generates a response exceeding 8192 tokens (common for tool-calling agents that write substantial content):
- Response is truncated: stop_reason becomes "max_tokens" instead of "tool_use"
- The tool_use content block is incomplete or missing entirely
- finish_reason maps to "length" → agent loop stops instead of continuing
- Session emits session.idle prematurely, so the agent never executes the tool
This is a silent failure — the session completes without error, but the agent did not accomplish its task.
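Until the SDK reports this condition itself, a caller with access to the raw provider response could at least detect it. The following is a minimal sketch, assuming the response follows the Anthropic Messages API shape (stop_reason, content, usage); the function name detectTruncatedToolUse is hypothetical and not part of the SDK.

```javascript
// Hypothetical helper: flags a Messages API response that was cut off by the
// output-token cap, and whether a tool_use block made it into the response.
function detectTruncatedToolUse(response) {
  const blocks = Array.isArray(response.content) ? response.content : [];
  const hasToolUse = blocks.some((block) => block.type === "tool_use");
  const hitCap = response.stop_reason === "max_tokens";
  return {
    // True whenever the cap was hit, even if a partial tool_use block is present.
    truncated: hitCap,
    // Cap hit and no tool_use block at all: the agent loop would stop without acting.
    lostToolCall: hitCap && !hasToolUse,
    outputTokens: response.usage?.output_tokens ?? null,
  };
}

// Example input modeled on the failure described above.
const example = detectTruncatedToolUse({
  stop_reason: "max_tokens",
  content: [{ type: "text", text: "I'll create the file now..." }],
  usage: { input_tokens: 1200, output_tokens: 8192 },
});
console.log(example);
```

A check like this only makes the failure visible; it does not recover the lost tool call.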
Evidence
In a 54-trial evaluation run with claude-opus-4.5 via a BYOK proxy:
- 28 responses hit exactly 8192 output tokens with 0 tool calls (truncated)
- 19 responses were small enough to fit (tool calls for metadata lookups, not content creation)
- 7 responses happened to be concise enough to fit the create tool call within 8192 tokens
Suggested fix
In getEffectiveMaxTokens(), look up the model name in the built-in catalog before falling back to DEFAULT_MAX_OUTPUT_TOKENS:
getEffectiveMaxTokens() {
  const catalogMax = this.getModelCatalogMaxOutput(this.model); // new lookup
  const e = this.options?.maxOutputTokens ?? catalogMax ?? DEFAULT_MAX_OUTPUT_TOKENS;
  return this.isThinkingEnabled() ? Math.max(e, MIN_THINKING_BUDGET + 1) : e;
}
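To make the intended precedence concrete, here is a standalone sketch of the same resolution order that runs outside the SDK. MODEL_CATALOG, resolveMaxOutputTokens, and the MIN_THINKING_BUDGET value of 1024 are assumptions made for this example; the three catalog entries are copied from the excerpt in this issue, and the SDK's real internals may differ.

```javascript
const DEFAULT_MAX_OUTPUT_TOKENS = 8192;
const MIN_THINKING_BUDGET = 1024; // assumed value for this sketch

// Entries copied from the catalog excerpt quoted in this issue.
const MODEL_CATALOG = new Map([
  ["claude-opus-4.5", { max_output_tokens: 32000 }],
  ["claude-opus-4.6", { max_output_tokens: 32000 }],
  ["claude-sonnet-4.6", { max_output_tokens: 65536 }],
]);

// Hypothetical standalone version of the proposed resolution order:
// explicit option first, then the catalog entry, then the default.
function resolveMaxOutputTokens(model, options = {}, thinkingEnabled = false) {
  const catalogMax = MODEL_CATALOG.get(model)?.max_output_tokens;
  const resolved = options.maxOutputTokens ?? catalogMax ?? DEFAULT_MAX_OUTPUT_TOKENS;
  return thinkingEnabled ? Math.max(resolved, MIN_THINKING_BUDGET + 1) : resolved;
}

console.log(resolveMaxOutputTokens("claude-opus-4.5")); // 32000
console.log(resolveMaxOutputTokens("some-unknown-model")); // 8192
console.log(resolveMaxOutputTokens("claude-sonnet-4.6", { maxOutputTokens: 4096 })); // 4096
```

An explicit maxOutputTokens still wins over the catalog, so callers who deliberately cap output keep that behavior.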
Workaround
Proxy servers can enforce a max_tokens floor by inspecting and overriding the value in the request body before forwarding to the upstream provider.
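As a rough sketch of that workaround, a proxy could rewrite the parsed JSON request body before forwarding it. The function name enforceMaxTokensFloor and the per-model floor table are hypothetical; the values mirror the catalog entries quoted in this issue and would need to be checked against the upstream provider's actual limits, as would the model names the SDK sends on the wire.

```javascript
// Hypothetical per-model floors, mirroring the catalog entries quoted in this issue.
const MAX_TOKENS_FLOOR = new Map([
  ["claude-opus-4.5", 32000],
  ["claude-opus-4.6", 32000],
  ["claude-sonnet-4.6", 65536],
]);

// Returns a copy of the request body with max_tokens raised to the floor for
// known models. Unknown models pass through unchanged.
function enforceMaxTokensFloor(body) {
  const floor = MAX_TOKENS_FLOOR.get(body.model);
  if (floor === undefined) return body;
  const requested = Number.isInteger(body.max_tokens) ? body.max_tokens : 0;
  return { ...body, max_tokens: Math.max(requested, floor) };
}

const forwarded = enforceMaxTokensFloor({
  model: "claude-opus-4.5",
  max_tokens: 8192,
  messages: [{ role: "user", content: "Use the tool to create a file" }],
});
console.log(forwarded.max_tokens); // 32000
```

One trade-off: the proxy cannot tell the SDK's default of 8192 apart from a cap the caller chose on purpose, so a floor like this overrides both.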
Environment
- @github/copilot-sdk: 0.2.0
- Models affected: Any Claude model via BYOK type: "anthropic" where the model's actual max output exceeds 8192