manuelnongba/streaminferencegateway

Streaming Inference Gateway

Small Rust service that simulates token streaming.

What it does

  • Accepts a prompt with POST /runs
  • Starts streaming tokens with GET /runs/:id/stream (SSE)
  • Sends tokens progressively instead of in one large response
  • Stops the stream safely on timeout, client disconnect, or a slow consumer
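
The stop conditions listed above can be sketched with only the standard library. This is not the repository's code: `stream_tokens`, `StopReason`, the buffer size, and the deadline are hypothetical names and values chosen for illustration. A bounded channel stands in for the SSE connection, and a real service would more likely use an async runtime.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};
use std::time::{Duration, Instant};

/// Why a stream ended (hypothetical names, not taken from the repository).
#[derive(Debug, PartialEq)]
enum StopReason {
    Done,
    Timeout,
    Disconnected,
    SlowConsumer,
}

/// Push tokens one at a time into a bounded channel. Stop early when the
/// deadline has passed, the receiver was dropped, or the buffer is full
/// (the consumer is not keeping up).
fn stream_tokens(tokens: &[&str], tx: &SyncSender<String>, deadline: Instant) -> StopReason {
    for token in tokens {
        if Instant::now() >= deadline {
            return StopReason::Timeout;
        }
        match tx.try_send(token.to_string()) {
            Ok(()) => {}
            Err(TrySendError::Full(_)) => return StopReason::SlowConsumer,
            Err(TrySendError::Disconnected(_)) => return StopReason::Disconnected,
        }
    }
    StopReason::Done
}

fn main() {
    let tokens = ["Explain", "async", "streaming"];
    // Assumed buffer size of 8 and a 5 second deadline.
    let (tx, rx) = sync_channel::<String>(8);
    let deadline = Instant::now() + Duration::from_secs(5);
    let reason = stream_tokens(&tokens, &tx, deadline);
    drop(tx); // close the channel so the receive loop below terminates
    for token in rx {
        println!("token: {token}");
    }
    println!("stopped: {reason:?}");
}
```

Using `try_send` rather than a blocking `send` is what turns a full buffer into an explicit stop instead of an unbounded wait.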

Run it

cargo run

Open http://localhost:8080 to use the simple test UI.

API

Create run

POST /runs
Content-Type: application/json

{ "prompt": "Explain async streaming" }

Response:

{ "id": "<uuid>" }
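
As a rough client-side illustration of the exchange above, the sketch below builds the raw HTTP/1.1 request for POST /runs and pulls the id out of a response body. The helpers `json_escape`, `create_run_request`, and `extract_id` are hypothetical and use only the standard library. The id extraction is deliberately naive, and a real client would use an HTTP and a JSON library.

```rust
/// Escape a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the raw HTTP/1.1 request for POST /runs (hypothetical helper).
fn create_run_request(host: &str, prompt: &str) -> String {
    let body = format!("{{\"prompt\":\"{}\"}}", json_escape(prompt));
    format!(
        "POST /runs HTTP/1.1\r\nHost: {host}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{body}",
        body.len()
    )
}

/// Naive extraction of the "id" string value from a response body.
fn extract_id(body: &str) -> Option<&str> {
    let rest = &body[body.find("\"id\"")? + 4..];
    let rest = rest[rest.find(':')? + 1..].trim_start();
    let rest = rest.strip_prefix('"')?;
    Some(&rest[..rest.find('"')?])
}

fn main() {
    let request = create_run_request("localhost:8080", "Explain async streaming");
    println!("{request}");
    // Placeholder response body in the shape shown above.
    println!("{:?}", extract_id("{ \"id\": \"<uuid>\" }"));
}
```

The request string could be written to a `std::net::TcpStream` connected to localhost:8080 while the service is running.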

Stream run

GET /runs/<uuid>/stream
Accept: text/event-stream

SSE event types:

  • token
  • done
  • error
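
A client has to split the stream into these events. The following is a simplified sketch of an SSE frame parser that handles only the `event:` and `data:` fields. `SseEvent` and `parse_sse` are hypothetical names, and the payloads in `main` are made up because the data format of each event is not specified here.

```rust
/// One parsed Server-Sent Event (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct SseEvent {
    event: String,
    data: String,
}

/// Parse complete SSE frames from text. A blank line ends a frame; a frame
/// without any data line is discarded, and a trailing frame that is not yet
/// terminated by a blank line is not returned.
fn parse_sse(input: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut event = String::new();
    let mut data: Vec<&str> = Vec::new();
    for line in input.lines() {
        if line.is_empty() {
            if !data.is_empty() {
                events.push(SseEvent {
                    // "message" is the default event type when none is given.
                    event: if event.is_empty() { "message".to_string() } else { event.clone() },
                    data: data.join("\n"),
                });
            }
            event.clear();
            data.clear();
        } else if let Some(v) = line.strip_prefix("event:") {
            // A single leading space after the colon is not part of the value.
            event = v.strip_prefix(' ').unwrap_or(v).to_string();
        } else if let Some(v) = line.strip_prefix("data:") {
            data.push(v.strip_prefix(' ').unwrap_or(v));
        }
        // Comment lines (starting with ':') and other fields are ignored.
    }
    events
}

fn main() {
    // Hypothetical payloads for the token and done event types.
    let raw = "event: token\ndata: Hello\n\nevent: token\ndata: world\n\nevent: done\ndata: ok\n\n";
    for ev in parse_sse(raw) {
        println!("{}: {}", ev.event, ev.data);
    }
}
```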

Notes

  • This is a demo generator, not a real LLM backend.
  • Runs are in-memory and removed when finished.
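
One way an in-memory store with removal on completion might look is sketched below. `RunStore` and its methods are hypothetical and not taken from the repository. Ids are passed in by the caller here, whereas the service returns a UUID, which would need an extra crate to generate.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// In-memory map from run id to prompt, shareable across threads.
#[derive(Clone, Default)]
struct RunStore {
    runs: Arc<Mutex<HashMap<String, String>>>,
}

impl RunStore {
    /// Register a new run under the given id.
    fn create(&self, id: &str, prompt: &str) {
        self.runs.lock().unwrap().insert(id.to_string(), prompt.to_string());
    }

    /// Look up the prompt for a run, if the run still exists.
    fn prompt(&self, id: &str) -> Option<String> {
        self.runs.lock().unwrap().get(id).cloned()
    }

    /// Remove a finished run. Returns false if the id was unknown.
    fn finish(&self, id: &str) -> bool {
        self.runs.lock().unwrap().remove(id).is_some()
    }
}

fn main() {
    let store = RunStore::default();
    store.create("run-1", "Explain async streaming");
    println!("{:?}", store.prompt("run-1"));
    store.finish("run-1");
    // After finishing, the run is gone; nothing persists across restarts.
    println!("{:?}", store.prompt("run-1"));
}
```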
