Small Rust service that simulates token streaming.
- Accepts a prompt with POST /runs
- Starts streaming tokens with GET /runs/:id/stream (SSE)
- Sends tokens progressively instead of one big response
- Stops safely on timeout, disconnect, or slow consumers
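
The stop conditions in the list above can be sketched with nothing but the standard library. This is a minimal illustration, not the service's actual code: it assumes a producer thread and a bounded channel in place of whatever async runtime the service uses, and every name in it (`StopReason`, `start_run`, `buffer`, `max_duration`, `token_delay`) is hypothetical.

```rust
use std::sync::mpsc::{sync_channel, Receiver, TrySendError};
use std::thread;
use std::time::{Duration, Instant};

/// Why a run stopped. Illustrative names, not taken from the service.
#[derive(Debug, PartialEq)]
pub enum StopReason {
    Finished,
    Timeout,
    Disconnected,
    SlowConsumer,
}

/// Spawns a producer that emits one "token" per whitespace-separated word of
/// `prompt` into a bounded channel. Returns the receiving end and a handle
/// that yields the reason the producer stopped.
pub fn start_run(
    prompt: &str,
    buffer: usize,
    max_duration: Duration,
    token_delay: Duration,
) -> (Receiver<String>, thread::JoinHandle<StopReason>) {
    let (tx, rx) = sync_channel::<String>(buffer);
    let tokens: Vec<String> = prompt.split_whitespace().map(str::to_owned).collect();

    let handle = thread::spawn(move || {
        let deadline = Instant::now() + max_duration;
        for token in tokens {
            // Timeout: the run has exceeded its time budget.
            if Instant::now() >= deadline {
                return StopReason::Timeout;
            }
            // try_send never blocks, so a stalled reader cannot stall the producer.
            match tx.try_send(token) {
                Ok(()) => {}
                // The buffer is full: the consumer is not keeping up.
                Err(TrySendError::Full(_)) => return StopReason::SlowConsumer,
                // The receiver was dropped: the client went away.
                Err(TrySendError::Disconnected(_)) => return StopReason::Disconnected,
            }
            // Simulated generation latency between tokens.
            thread::sleep(token_delay);
        }
        StopReason::Finished
    });

    (rx, handle)
}
```

Reading `rx` until it closes yields the tokens in order, and joining the handle reports why the producer stopped. Using a non-blocking send is one way to make slow consumers visible; a real service might instead wait for a bounded time before giving up.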
Run it with:

    cargo run

Open http://localhost:8080 to use the simple test UI.
Start a run:

    POST /runs
    Content-Type: application/json

    { "prompt": "Explain async streaming" }

Response:

    { "id": "<uuid>" }

Stream its tokens:

    GET /runs/<uuid>/stream
    Accept: text/event-stream

SSE event types:
- token
- done
- error
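
For reference, here is a rough sketch of how the three event types could be framed on the wire. The `text/event-stream` framing itself (an `event:` line, `data:` lines, a terminating blank line) is standard, but the `StreamEvent` type, the `to_sse` method, and the payload shapes are assumptions made for illustration; the service's real payloads may differ.

```rust
/// The three event types listed above. Payload shapes are assumptions.
pub enum StreamEvent {
    Token(String),
    Done,
    Error(String),
}

impl StreamEvent {
    /// Serializes the event in the text/event-stream wire format:
    /// an `event:` line, one `data:` line per payload line, then a blank line.
    pub fn to_sse(&self) -> String {
        let (name, data) = match self {
            StreamEvent::Token(text) => ("token", text.as_str()),
            StreamEvent::Done => ("done", ""),
            StreamEvent::Error(message) => ("error", message.as_str()),
        };

        let mut out = format!("event: {}\n", name);
        // A payload containing newlines must be split across several `data:`
        // lines. An empty payload still gets one `data:` line, because
        // browsers do not dispatch an event that has no data line at all.
        for line in data.split('\n') {
            out.push_str("data: ");
            out.push_str(line);
            out.push('\n');
        }
        // The blank line terminates the event.
        out.push('\n');
        out
    }
}
```

On the client side, a browser `EventSource` would receive these through `addEventListener("token", ...)` and its `done` and `error` counterparts rather than through `onmessage`, since each event carries an explicit name. The sketch assumes payloads contain no bare carriage returns.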
- This is a demo generator, not a real LLM backend.
- Runs are in-memory and removed when finished.
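
The in-memory lifecycle described above might look roughly like the following. This is a sketch under stated assumptions: `RunStore` and its methods are hypothetical names, a run is reduced to its prompt, and a counter stands in for the UUIDs the service returns.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// In-memory registry of active runs, keyed by run id.
/// Nothing is persisted: a restart loses every run.
pub struct RunStore {
    next_id: AtomicU64,
    runs: Mutex<HashMap<String, String>>, // id -> prompt
}

impl RunStore {
    pub fn new() -> Self {
        RunStore {
            next_id: AtomicU64::new(1),
            runs: Mutex::new(HashMap::new()),
        }
    }

    /// Registers a run and returns its id.
    /// Assumption: a counter-based id replaces the service's UUID.
    pub fn create(&self, prompt: &str) -> String {
        let id = format!("run-{}", self.next_id.fetch_add(1, Ordering::Relaxed));
        self.runs.lock().unwrap().insert(id.clone(), prompt.to_owned());
        id
    }

    /// Removes a finished run and returns its prompt, or None if the id is unknown.
    pub fn finish(&self, id: &str) -> Option<String> {
        self.runs.lock().unwrap().remove(id)
    }

    /// Number of runs currently tracked.
    pub fn active(&self) -> usize {
        self.runs.lock().unwrap().len()
    }
}
```

Calling `finish` when a stream ends, for whatever reason, keeps the map from growing without bound, and a second `finish` for the same id simply returns `None`.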