
docs: Update Flash local testing docs with request format and docstring feature#564

Open
promptless[bot] wants to merge 2 commits into main from promptless/flash-local-testing-updates

Conversation


@promptless promptless bot commented Mar 23, 2026

Open this suggestion in Promptless to view citations and reasoning process

Updates Flash local development documentation to reflect recent changes from PR #215: corrects queue-based endpoint request format to use {"input": {...}} wrapper, and documents the new docstring surfacing feature that shows function docstrings in the startup table and Swagger UI.

Trigger Events

runpod/flash PR #215: feat: fully deployed environment

Promptless Research (11 files, 1 GitHub PR)

  • Reviewed runpod/flash#215
  • Viewed .long_term_context/product_knowledge/product_overview.md
  • Viewed .long_term_context/doc_workflow/client_instructions.md
  • Viewed flash/apps/apps-and-environments.mdx
  • Viewed flash/apps/deploy-apps.mdx
  • Viewed flash/cli/deploy.mdx
  • Viewed flash/apps/local-testing.mdx
  • Viewed flash/cli/run.mdx
  • Viewed flash/create-endpoints.mdx
  • Viewed flash/apps/requests.mdx
  • Viewed .long_term_context/style/client_style_guide.md
  • Viewed flash/quickstart.mdx

Agent Response

I've created a documentation suggestion based on PR #215 "feat: fully deployed environment". Here's a summary of the changes:

Documentation Updates

Files updated:

  • flash/apps/local-testing.mdx
  • flash/cli/run.mdx

Changes made:

  1. Request body format for queue-based endpoints: Updated curl and Python examples to use the {"input": {...}} wrapper format that matches the deployed endpoint behavior. Added notes explaining the difference between queue-based (requires wrapper) and load-balanced (direct JSON) endpoints.

  2. Docstring surfacing feature: Documented the new developer experience feature where Flash extracts the first line of function docstrings and displays them:

    • In the startup table's "Description" column when running flash run
    • In the Swagger UI as the endpoint summary at /docs

    Added code examples showing how to add docstrings to @Endpoint functions to make APIs self-documenting.
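The two request formats described in item 1 can be sketched as follows. This is a minimal illustration, not code from the PR: `build_request_body` is a hypothetical helper, and the payloads are the ones used in the curl examples in this conversation.

```python
import json


def build_request_body(payload: dict, queue_based: bool) -> str:
    """Serialize a payload for a local Flash endpoint (hypothetical helper)."""
    # Queue-based endpoints expect the payload nested under "input";
    # load-balanced endpoints take the payload as the top-level JSON object.
    body = {"input": payload} if queue_based else payload
    return json.dumps(body)


qb_body = build_request_body({"message": "Hello from Flash"}, queue_based=True)
lb_body = build_request_body({"data": "test"}, queue_based=False)

print(qb_body)  # {"input": {"message": "Hello from Flash"}}
print(lb_body)  # {"data": "test"}
```

Either string can be passed as the `-d` argument of the corresponding curl call.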

Note: The PR is currently open (not merged), so these documentation updates are ready to be reviewed alongside the code changes. The docs reflect the behavior that will be in place once the PR is merged.

Suggestion URL: https://app.gopromptless.ai/suggestions/30f90650-32ba-404c-afeb-0840fc92abb5


Tip: Use labels in the Promptless dashboard to categorize suggestions by release or team 🏷️

  curl -X POST http://localhost:8888/gpu_worker/runsync \
  -H "Content-Type: application/json" \
- -d '{"message": "Hello from Flash"}'
+ -d '{"input": {"message": "Hello from Flash"}}'
@promptless promptless bot Mar 23, 2026

Citation: PR #215 changed queue-based endpoint request format to use {"input": {...}} wrapper. See changes in src/runpod_flash/cli/commands/run.py where body.input is extracted from the request body.


Open [http://localhost:8888/docs](http://localhost:8888/docs) in your browser to access the interactive Swagger UI. You can test all endpoints directly from the browser.

Flash extracts the first line of each function's docstring and displays it as the endpoint description in the API explorer. Add docstrings to your `@Endpoint` functions to make your API self-documenting:

Citation: PR #215 adds docstring extraction via RemoteFunctionMetadata.docstring in scanner.py and displays it in run.py's startup table and Swagger UI summary using _escape_summary() helper.
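The extraction step described in this citation can be approximated with the standard library. The sketch below is an assumption-laden illustration, not the scanner.py implementation: `first_docstring_line` is a hypothetical helper, the docstring text is invented for the example, and the `@Endpoint` decorator is omitted because its signature is not shown in this conversation.

```python
import inspect
from typing import Callable


def first_docstring_line(func: Callable) -> str:
    """Return the first line of func's docstring, or "" if it has none."""
    doc = inspect.getdoc(func)  # None when the function has no docstring
    if not doc:
        return ""
    return doc.strip().splitlines()[0].strip()


def gpu_hello(input_data: dict) -> dict:
    """Return a greeting from the GPU worker.

    Only the line above would be surfaced as the endpoint description.
    """
    return {"message": "hello"}


def undocumented(input_data: dict) -> dict:
    return {}


print(first_docstring_line(gpu_hello))     # Return a greeting from the GPU worker.
print(repr(first_docstring_line(undocumented)))  # ''
```

Keeping the first docstring line short and descriptive matters here, since that single line is what would appear in the startup table and the Swagger UI summary.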

Auto-provision all Serverless endpoints on startup instead of lazily on first call. Eliminates cold-start delays during development.
</ResponseField>

## Endpoint descriptions from docstrings

Citation: PR #215 introduces function_docstrings in WorkerInfo dataclass and displays docstrings in the startup table's "Description" column and as Swagger UI summary via the summary= parameter.

  curl -X POST http://localhost:8888/gpu_worker/runsync \
  -H "Content-Type: application/json" \
- -d '{"message": "Hello from GPU!"}'
+ -d '{"input": {"message": "Hello from GPU!"}}'

Citation: PR #215 introduces make_wrapped_model() helper that wraps request body in an input envelope for queue-based endpoints, changing the request format from direct JSON to {"input": {...}}.
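On the receiving side, the envelope means a queue-based route has to unwrap `input` before calling the worker function. The sketch below shows that idea with plain dictionaries. It is not the `make_wrapped_model()` implementation, whose internals are not shown here; `call_queue_based` and `gpu_worker` are hypothetical names, and passing the inner payload as keyword arguments is an assumption based on the request examples in this PR.

```python
from typing import Any, Callable, Dict


def call_queue_based(func: Callable[..., Any], body: Dict[str, Any]) -> Any:
    """Unwrap the "input" envelope, then call func with the inner payload."""
    if not isinstance(body.get("input"), dict):
        raise ValueError('expected a body of the form {"input": {...}}')
    # Assumption: inner payload keys map onto the function's parameters.
    return func(**body["input"])


def gpu_worker(message: str) -> Dict[str, str]:
    """Hypothetical worker function used only for this sketch."""
    return {"echo": message}


result = call_queue_based(gpu_worker, {"input": {"message": "Hello from GPU!"}})
print(result)  # {'echo': 'Hello from GPU!'}
```

Sending the old, unwrapped body (`{"message": "Hello from GPU!"}`) to this sketch raises `ValueError`, which mirrors why the curl examples needed updating.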

@muhsinking muhsinking requested a review from deanq March 23, 2026 14:39

mintlify bot commented Mar 23, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| runpod-docs | 🟢 Ready | View Preview | Mar 23, 2026, 2:43 PM |


promptless bot commented Mar 23, 2026

I've added new changes to address this GitHub PR in commit 6673384

```

<Note>
Queue-based endpoints require the `{"input": {...}}` wrapper format to match the deployed endpoint behavior. Load-balanced endpoints accept direct JSON payloads.

Citation: Request format examples updated to match the corrections in PR #222. The QA report verified: QB routes use make_wrapped_model requiring the {"input": <inner>} envelope, and functions with input_data: dict parameter need {"input_data": {...}} as the inner payload. LB routes use make_input_model directly without the input wrapper.

  curl -X POST http://localhost:8888/gpu_worker/runsync \
  -H "Content-Type: application/json" \
- -d '{"message": "Hello from Flash"}'
+ -d '{"input": {"message": "Hello from Flash"}}'

Citation: Queue-based endpoint request format corrected based on PR #222 fix to skeleton_template/README.md. The format {"input": {"input_data": {...}}} matches the skeleton template's gpu_hello(input_data: dict) function signature.

# Call a load-balanced endpoint (lb_worker.py)
curl -X POST http://localhost:8888/lb_worker/process \
-H "Content-Type: application/json" \
-d '{"data": "test"}'

Citation: Load-balanced endpoint request format corrected based on PR #222 fix to skeleton_template/README.md. LB routes don't use the {"input": ...} wrapper—the payload maps directly to function parameters.
