Support server #44
Signed-off-by: kerthcet <kerthcet@gmail.com>
Pull request overview
Adds an Axum-based inference server to PUMA with OpenAI-compatible endpoints (chat completions, legacy completions, and model listing), exposed via a new puma serve CLI subcommand, plus integration tests and documentation to support Issue #43.
Changes:
- Introduces HTTP API router + handlers for `/v1/chat/completions`, `/v1/completions`, `/v1/models`, `/v1/models/:model`, and `/health`.
- Adds a new `serve` CLI command that starts the server using a `MockEngine` backend.
- Adds API integration tests, a manual test script, and docs updates (README + logging guidelines).
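The endpoint list above can be pictured as a small routing table. The following is a minimal, std-only sketch and not the PR's Axum router: the `Endpoint` enum and `route` function are hypothetical names, and the HTTP methods are assumptions based on the usual OpenAI conventions (`POST` for completions, `GET` for model listing and health), since the overview does not state them.

```rust
/// Endpoints named in the PR overview (hypothetical enum, for illustration only).
#[derive(Debug, PartialEq)]
enum Endpoint<'a> {
    ChatCompletions,
    Completions,
    ListModels,
    /// Carries the `:model` path parameter.
    GetModel(&'a str),
    Health,
}

/// Map an HTTP method and path to an endpoint, or `None` for unknown routes.
/// The methods used here are assumptions, not taken from the PR.
fn route<'a>(method: &str, path: &'a str) -> Option<Endpoint<'a>> {
    match (method, path) {
        ("POST", "/v1/chat/completions") => Some(Endpoint::ChatCompletions),
        ("POST", "/v1/completions") => Some(Endpoint::Completions),
        ("GET", "/v1/models") => Some(Endpoint::ListModels),
        ("GET", "/health") => Some(Endpoint::Health),
        // `/v1/models/:model` matches exactly one non-empty path segment.
        ("GET", p) => p
            .strip_prefix("/v1/models/")
            .filter(|id| !id.is_empty() && !id.contains('/'))
            .map(Endpoint::GetModel),
        _ => None,
    }
}
```

For example, `route("GET", "/v1/models/test-model")` yields `Some(Endpoint::GetModel("test-model"))`, while an unsupported method on a known path yields `None`.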
Reviewed changes
Copilot reviewed 24 out of 25 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| tests/cli_test.rs | Adds module-level doc comment for CLI integration tests. |
| tests/api_test.rs | New Axum-router integration tests for health/models/completions endpoints. |
| src/storage/storage_trait.rs | Requires ModelStorage to be Send + Sync for server state sharing. |
| src/main.rs | Registers new api/backend modules; tweaks default logging filters; minor ordering change. |
| src/lib.rs | Exposes internal modules publicly to support integration tests. |
| src/cli/serve.rs | Implements puma serve to start the Axum server. |
| src/cli/mod.rs | Exports new serve module. |
| src/cli/commands.rs | Adds SERVE subcommand + args and dispatches to server startup. |
| src/backend/mod.rs | Adds backend module structure and re-exports engine trait. |
| src/backend/engine.rs | Defines InferenceEngine trait and GenerateResponse. |
| src/backend/mock.rs | Implements MockEngine generate + streaming stubs. |
| src/api/types/request.rs | Adds OpenAI-ish request DTOs for chat/completions. |
| src/api/types/response.rs | Adds OpenAI-ish response/error DTOs. |
| src/api/types/mod.rs | Exposes request/response types. |
| src/api/mod.rs | Wires API submodules and re-exports create_router. |
| src/api/routes.rs | Defines router, shared state, CORS, and health endpoint. |
| src/api/chat.rs | Implements chat completions (SSE + non-streaming). |
| src/api/completions.rs | Implements legacy text completions. |
| src/api/models.rs | Implements list/get model endpoints. |
| hack/scripts/test_api.sh | Adds a manual curl/jq script for endpoint testing. |
| hack/README.md | Documents the hack/ directory and scripts. |
| README.md | Documents the new API server and serve command. |
| LOGGING.md | Adds logging conventions for CLI output vs internal logs. |
| Cargo.toml | Adds web/server dependencies and test dependencies. |
| Cargo.lock | Updates lockfile for new dependencies. |
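The table notes that `ModelStorage` now requires `Send + Sync` so it can be shared as server state. A minimal sketch of why that bound matters, using only the standard library: the trait method `model_count`, the `InMemoryStorage` type, and `count_from_workers` are hypothetical stand-ins, not the PR's actual definitions.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the PR's `ModelStorage` trait.
/// The `Send + Sync` supertraits let `Arc<dyn ModelStorage>` cross thread boundaries.
trait ModelStorage: Send + Sync {
    fn model_count(&self) -> usize;
}

/// Illustrative implementation backed by a plain vector.
struct InMemoryStorage {
    models: Vec<String>,
}

impl ModelStorage for InMemoryStorage {
    fn model_count(&self) -> usize {
        self.models.len()
    }
}

/// Query the shared storage from several worker threads,
/// much as concurrent request handlers would.
fn count_from_workers(storage: Arc<dyn ModelStorage>, workers: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let storage = Arc::clone(&storage);
            // Compiles only because `dyn ModelStorage` is `Send + Sync`.
            thread::spawn(move || storage.model_count())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Without the supertrait bound, the `thread::spawn` call would be rejected at compile time, which is the same kind of error an async server hits when its shared state is not thread-safe.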
Pull request overview
Copilot reviewed 24 out of 25 changed files in this pull request and generated 7 comments.
Pull request overview
Copilot reviewed 24 out of 25 changed files in this pull request and generated 15 comments.
    #[tokio::test]
    async fn test_chat_completion_non_streaming() {
        let (app, _temp_dir) = create_test_app();
        let request_body = json!({
            "model": "test-model",
There’s no test coverage for the streaming path of /v1/chat/completions ("stream": true). Add a test that asserts text/event-stream output and verifies the stream ends with [DONE].
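The assertions this comment asks for can be sketched without any web framework. The helpers below are hypothetical (`sse_data_payloads` and `stream_is_well_formed` are not part of the PR) and assume the response's `Content-Type` header and body have already been collected into strings. A real test would obtain them from the Axum response.

```rust
/// Extract the payload of every `data:` line from a server-sent-events body.
fn sse_data_payloads(body: &str) -> Vec<&str> {
    body.lines()
        .filter_map(|line| line.strip_prefix("data:"))
        .map(str::trim)
        .collect()
}

/// True when the response is an event stream that carries at least one
/// chunk before the final `[DONE]` sentinel.
fn stream_is_well_formed(content_type: &str, body: &str) -> bool {
    let payloads = sse_data_payloads(body);
    content_type.starts_with("text/event-stream")
        && payloads.len() >= 2
        && payloads.last() == Some(&"[DONE]")
}
```

A streaming test could then send `"stream": true`, read the full body, and assert `stream_is_well_formed(content_type, &body)`.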
    fn create_test_app() -> (axum::Router, TempDir) {
        let engine = Arc::new(MockEngine::new());
        let temp_dir = TempDir::new().unwrap();
        let registry = Arc::new(ModelRegistry::new(Some(temp_dir.path().to_path_buf())));
These tests type the app as axum::Router (default state ()), but the router produced by create_router carries typed state (AppState<E>). After fixing create_router’s return type, update the test helpers to use Router<AppState<MockEngine>> (or make make_json_request generic over the router state) so the tests compile.
Pull request overview
Copilot reviewed 24 out of 25 changed files in this pull request and generated 1 comment.
Signed-off-by: kerthcet <kerthcet@gmail.com>
Pull request overview
Copilot reviewed 24 out of 25 changed files in this pull request and generated 2 comments.
Signed-off-by: kerthcet <kerthcet@gmail.com>
/lgtm
What this PR does / why we need it
Which issue(s) this PR fixes
Fixes #43
Special notes for your reviewer
Does this PR introduce a user-facing change?