Replace R2 llms-full.txt with middlecache proxy (request-time) #29301

Merged: mvvmm merged 4 commits into production from llms-full-middlecache-request-time on Mar 25, 2026

@mvvmm mvvmm commented Mar 24, 2026

Summary

  • Replaces the VENDORED_MARKDOWN R2 bucket binding with a MIDDLECACHE R2 binding pointing to the middlecache bucket (same account)
  • The worker reads llms-full.txt files directly from the middlecache R2 bucket via internal binding (no HTTP overhead)
  • R2 keys follow the middlecache path convention: v1/cloudflare-docs-llms-full/{path}

Approach

Request-time R2 read: When a request for /llms-full.txt or /{product}/llms-full.txt arrives, the worker maps the pathname to an R2 key (e.g. /workers/llms-full.txt → v1/cloudflare-docs-llms-full/workers/llms-full.txt) and reads directly from the middlecache R2 bucket. This is a local IPC call with no DNS/TLS/HTTP overhead.
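
The pathname-to-key mapping described above can be sketched roughly as follows. This is an illustration only, not the code in worker/index.ts: the helper name `llmsFullKey`, the `KEY_PREFIX` constant, and the regular expression are all hypothetical, and the root key shown assumes that {path} is simply the pathname without its leading slash.

```typescript
// Hypothetical constant; the prefix itself comes from the PR description.
const KEY_PREFIX = "v1/cloudflare-docs-llms-full";

// Hypothetical helper: returns the R2 key for a pathname, or null when the
// pathname is not /llms-full.txt or /{product}/llms-full.txt.
function llmsFullKey(pathname: string): string | null {
  // Optional single path segment (the product), then the fixed file name.
  const match = /^\/(?:([^/]+)\/)?llms-full\.txt$/.exec(pathname);
  if (!match) return null;

  const product = match[1];
  return product
    ? `${KEY_PREFIX}/${product}/llms-full.txt`
    : `${KEY_PREFIX}/llms-full.txt`; // assumed key for the root file
}
```

Returning null for non-matching paths lets the caller fall through to its normal routing instead of querying the bucket for unrelated requests.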

This is one of two approaches being tested. Compare with #29302, which fetches per-product files at build time and serves them as static assets.

Changes

  • worker/index.ts: Replace R2 .get() on VENDORED_MARKDOWN with .get() on MIDDLECACHE using the v1/cloudflare-docs-llms-full/ key prefix, with 404 handling
  • wrangler.toml: Replace VENDORED_MARKDOWN → vendored-markdown binding with MIDDLECACHE → middlecache binding
  • worker/worker-configuration.d.ts: Replace VENDORED_MARKDOWN: R2Bucket with MIDDLECACHE: R2Bucket in Env interface
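
As a rough sketch of what the worker/index.ts change amounts to, the read with 404 handling might look like the following. This is not the merged code. The `Bucket` and `BucketObject` interfaces are minimal stand-ins for the runtime's R2Bucket types so the example is self-contained, and the function name `serveLlmsFull`, the "Not Found" body, and the Content-Type header are assumptions.

```typescript
// Minimal stand-ins for the Workers R2 types (assumed shapes, trimmed to
// what this sketch uses). A string body is allowed only to ease local testing.
interface BucketObject {
  body: ReadableStream | string | null;
}
interface Bucket {
  get(key: string): Promise<BucketObject | null>;
}
interface Env {
  MIDDLECACHE: Bucket;
}

// Hypothetical handler: reads the object for a pathname such as
// /workers/llms-full.txt and returns 404 when the key does not exist.
async function serveLlmsFull(pathname: string, env: Env): Promise<Response> {
  const key = `v1/cloudflare-docs-llms-full${pathname}`;
  const object = await env.MIDDLECACHE.get(key);

  // The bucket returns null for a missing key.
  if (object === null) {
    return new Response("Not Found", { status: 404 });
  }

  // Pass the body through without buffering it first.
  return new Response(object.body, {
    headers: { "Content-Type": "text/plain; charset=utf-8" }, // assumed header
  });
}
```

Passing the object body straight into the Response avoids holding the whole file in memory, which matters for a large root file.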

Replaces the VENDORED_MARKDOWN R2 bucket binding with a request-time
proxy to middlecache. The worker now fetches llms-full.txt files from
https://middlecache.ced.cloudflare.com/v1/cloudflare-docs-llms-full/
and streams the response to the client.

This removes the dependency on the vendored-markdown R2 bucket.
@mvvmm mvvmm requested review from a team and kodster28 as code owners March 24, 2026 21:58
@github-actions

This pull request requires reviews from CODEOWNERS as it changes files that match the following patterns:

Pattern   Owners
*.ts      @cloudflare/content-engineering, @kodster28
*         @cloudflare/pcx-technical-writing

github-actions bot commented Mar 24, 2026

mvvmm added 3 commits March 24, 2026 18:00
Replace the HTTP proxy to middlecache with a direct R2 binding to the
middlecache bucket (same account). This eliminates the HTTP overhead
(DNS, TLS, connection) and reads directly from R2 via internal IPC,
which is faster, especially for the ~40 MB root llms-full.txt.

R2 bindings only work for Workers deployed in the same account. External
contributors running npm run dev locally would get empty R2 simulations
and 404s for all llms-full.txt requests. Revert to HTTP fetch from the
public middlecache URL, which works for everyone.

The request-time branch is a production-only approach: all llms-full.txt
requests go through the Worker, so using the R2 binding (direct IPC, no
HTTP overhead) is the right choice here. Local dev for this branch would
require the same account access regardless.
@mvvmm mvvmm marked this pull request as ready for review March 25, 2026 17:27
@mvvmm mvvmm merged commit 335bcf3 into production Mar 25, 2026
15 checks passed
@mvvmm mvvmm deleted the llms-full-middlecache-request-time branch March 25, 2026 18:04