Expand E2E coverage to include BYOK #1084

If we had a suitable OpenAI/Anthropic token to use in CI, we could set up an E2E run that goes through BYOK.

I think this would work by setting a flag/param (e.g., an env var) that means:

 * In E2E tests, all sessions get a `provider` injected that configures them to use BYOK with the token
 * That provider sets the endpoint to be our record/replay proxy
 * Our proxy has a separate snapshot store for BYOK+OpenAI/Anthropic, but otherwise uses the same logic as the CAPI proxy: resolve calls from the snapshot, or pass them on to the underlying OpenAI/Anthropic endpoint with the token and capture the result
   * Note this means the proxy would have to be expanded to handle Anthropic-formatted data as well as OpenAI. We likely need a base implementation plus per-provider specializations.
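
A rough sketch of how the pieces above might fit together, not a description of the existing proxy. Every name here (`E2E_BYOK_PROVIDER`, `E2E_BYOK_TOKEN`, `ProviderConfig`, `ProviderAdapter`, `createRecordReplayProxy`) is hypothetical, and the snapshot store is modelled as an in-memory `Map` purely for illustration. The upstream call is passed in as a function so the shared record/replay logic stays provider-agnostic.

```typescript
// Hypothetical names throughout; this is a sketch, not the real proxy.
type ProviderKind = "openai" | "anthropic";

interface ProviderConfig {
  kind: ProviderKind;
  baseUrl: string; // points at the record/replay proxy, not the real provider
  apiKey: string;
}

// Reads the (assumed) env flag. Returns undefined when BYOK mode is off,
// so the E2E harness can fall back to its normal CAPI setup.
function byokProviderFromEnv(
  env: Record<string, string | undefined>,
  proxyUrl: string,
): ProviderConfig | undefined {
  const kind = env["E2E_BYOK_PROVIDER"];
  if (kind !== "openai" && kind !== "anthropic") return undefined;
  return { kind, baseUrl: proxyUrl, apiKey: env["E2E_BYOK_TOKEN"] ?? "" };
}

interface ProxyRequest { path: string; body: Record<string, unknown>; }
interface ProxyResponse { status: number; body: unknown; }

// Per-provider specialization: where to forward, how to authenticate,
// and which request fields identify a snapshot.
interface ProviderAdapter {
  kind: ProviderKind;
  upstreamBaseUrl: string;
  authHeaders(token: string): Record<string, string>;
  snapshotKey(req: ProxyRequest): string;
}

const openAiAdapter: ProviderAdapter = {
  kind: "openai",
  upstreamBaseUrl: "https://api.openai.com/v1",
  authHeaders: (token) => ({ Authorization: `Bearer ${token}` }),
  snapshotKey: (req) =>
    JSON.stringify(["openai", req.path, req.body["model"], req.body["messages"], req.body["tools"]]),
};

const anthropicAdapter: ProviderAdapter = {
  kind: "anthropic",
  upstreamBaseUrl: "https://api.anthropic.com/v1",
  authHeaders: (token) => ({ "x-api-key": token, "anthropic-version": "2023-06-01" }),
  // Anthropic carries the system prompt as a top-level field, so it is part of the key.
  snapshotKey: (req) =>
    JSON.stringify(["anthropic", req.path, req.body["model"], req.body["system"], req.body["messages"], req.body["tools"]]),
};

type Forward = (
  adapter: ProviderAdapter,
  req: ProxyRequest,
  token: string,
) => Promise<ProxyResponse>;

// Base implementation shared by all providers: replay on a snapshot hit,
// otherwise forward upstream and capture the result.
function createRecordReplayProxy(
  adapter: ProviderAdapter,
  store: Map<string, ProxyResponse>,
  forward: Forward,
  token: string,
) {
  return async function handle(req: ProxyRequest): Promise<ProxyResponse> {
    const key = adapter.snapshotKey(req);
    const cached = store.get(key);
    if (cached !== undefined) return cached;
    const fresh = await forward(adapter, req, token);
    store.set(key, fresh);
    return fresh;
  };
}
```

In a real run, `forward` would presumably issue an HTTP request to `adapter.upstreamBaseUrl + req.path` with `adapter.authHeaders(token)`, and the store would be backed by snapshot files rather than a `Map`. Which request fields belong in the snapshot key is a guess here and would need to match whatever the CAPI proxy already does.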

### Why not share snapshots with the CAPI variant?

We could, but then nothing would prove that any of these requests has ever succeeded against a real Anthropic/OpenAI endpoint. For example, Anthropic might reject certain tool calls that OpenAI accepts; with a shared snapshot we would not detect this.

Obviously, because we're replaying from snapshots, we only observe the underlying provider's response when we first generate the snapshots or refresh them. But that's enough to prove the provider accepted our requests at least once, which is a lot more than never having seen them accepted.
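
One way to keep the stores separate, sketched under assumptions: the directory layout, the `Variant` names, and both helper functions below are made up for illustration and do not reflect how snapshots are stored today.

```typescript
// Hypothetical layout: one snapshot directory per variant, so a BYOK run can
// never be satisfied by a response that was recorded against CAPI.
type Variant = "capi" | "byok-openai" | "byok-anthropic";

function snapshotPath(root: string, variant: Variant, testName: string): string {
  const safe = testName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into "-"
    .replace(/^-+|-+$/g, ""); // trim leading/trailing "-"
  return `${root}/${variant}/${safe}.json`;
}

// The real provider is only contacted when a snapshot is first generated
// or when a refresh is explicitly requested.
function shouldCallUpstream(refresh: boolean, hasSnapshot: boolean): boolean {
  return refresh || !hasSnapshot;
}
```

With this layout the same test resolves to a different file per variant, so each provider has to have accepted the request at least once before its snapshot exists.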
