Conversation
…astructure/model-service into docs/model-service
📝 Walkthrough
This pull request introduces a comprehensive documentation system for a Ray Serve-based model service on Kubernetes, including architecture guides, deployment workflows, and getting-started resources. The CI/CD pipeline is extended to process documentation builds via MkDocs.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Model Service's discoverability and usability by introducing a comprehensive, structured documentation portal. The new documentation, built with MkDocs, covers essential aspects from quick deployment to detailed architectural insights and troubleshooting, making it easier for users to understand, deploy, and manage machine learning models. Accompanying these changes are updates to the main README, CI/CD pipelines for documentation validation, and proper dependency management for the documentation tools.
Code Review
The pull request introduces a documentation site for the Model Service that covers architecture, deployment, model integration, configuration, and troubleshooting, and it integrates the documentation build into the CI pipeline. The review comments suggest: clarifying the [namespace] placeholder used in the kubectl commands in README.md; replacing the placeholder URL in the configuration reference with the full URL; enabling the pymdownx.arithmatex extension to address a potential LaTeX rendering issue in configuration-reference.md; adding the JSON curl example that deployment-guide.md refers to but does not include; and adding a final newline to mkdocs.yml for consistency.
    ## Collaborate with your team
    ```bash
    kubectl apply -f ray-service.yaml -n [namespace]

The [namespace] placeholder is used in several kubectl commands in this README (e.g., lines 37, 38, and 44). For clarity, especially for new users, it would be beneficial to add a note explaining that this needs to be replaced with their target Kubernetes namespace, and provide an example like rationai-notebooks-ns. This is handled well in the docs/get-started/quick-start.md file.

    import_path: models.binary_classifier:app
    route_prefix: /prostate-classifier
    runtime_env:
      working_dir: https://.../model-service-master.zip

The working_dir URL contains a placeholder .... For better usability and consistency with other documentation files (like adding-models.md), it would be better to provide the full, valid URL.

Suggested change:

    - working_dir: https://.../model-service-master.zip
    + working_dir: https://gitlab.ics.muni.cz/rationai/infrastructure/model-service/-/archive/master/model-service-master.zip

    **What it is:** The desired average number of **ongoing (in-flight)** requests per replica. This is the **primary scaling driver**.

    **Formula:**
    $$ \text{Desired Replicas} = \left\lceil \frac{\text{Total Ongoing Requests}}{\text{target}\_{\text{ongoing}}\_\text{requests}} \right\rceil $$

    svc/rayservice-my-model-serve-svc 8000:8000
    ```

    The example model in this repository (`models/binary_classifier.py`) uses FastAPI ingress and expects a **compressed binary request body** (LZ4), not JSON. The JSON `curl` example below is valid for JSON-based models but does not apply to `BinaryClassifier`.

          alternate_style: true
      - tables
      - toc:
          permalink: true
    \ No newline at end of file

Pull request overview
This PR adds an MkDocs documentation site for the model-service and updates repository metadata/CI to build the docs alongside the existing Ray Serve + KubeRay RayService setup.
Changes:
- Adds a new MkDocs site (`mkdocs.yml`) and a full set of docs pages under `docs/` (quick start, guides, architecture).
- Updates `README.md` to match the current Ray Serve + KubeRay deployment model and reference model payload format.
- Adds a `docs` dependency group to `pyproject.toml` and updates `.gitlab-ci.yml` to include the docs build template.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| pyproject.toml | Adds a docs dependency group for MkDocs tooling. |
| mkdocs.yml | Introduces MkDocs Material configuration, nav, and Markdown extensions. |
| docs/index.md | Adds the documentation landing page content and navigation pointers. |
| docs/get-started/quick-start.md | Adds a Kubernetes quick start for deploying the RayService. |
| docs/guides/deployment-guide.md | Adds a detailed production deployment guide and ops considerations. |
| docs/guides/configuration-reference.md | Adds a reference for RayService / Serve knobs (autoscaling, backpressure, etc.). |
| docs/guides/adding-models.md | Adds guidance and examples for implementing and integrating models. |
| docs/guides/troubleshooting.md | Adds common failure modes and triage steps for RayService/Serve on K8s. |
| docs/architecture/overview.md | Adds a high-level architecture overview of the stack and scaling model. |
| docs/architecture/request-lifecycle.md | Documents end-to-end request flow and queueing points. |
| docs/architecture/queues-and-backpressure.md | Explains queueing controls (max_queued_requests, max_ongoing_requests). |
| docs/architecture/batching.md | Explains Ray Serve batching behavior and tuning considerations. |
| README.md | Replaces the GitLab template README with repo-specific usage and payload details. |
| .gitlab-ci.yml | Includes the MkDocs CI template and adds a deploy stage for docs build. |

            # Process data and return prediction
            result = self.predict(data)
            return {"prediction": result}

        def predict(self, data: dict):
            # Replace with your own inference logic
            return data

        def reconfigure(self, config: Config):
            self.threshold = config["threshold"]
            self.batch_size = config["batch_size"]
            print(f"Reconfigured: threshold={self.threshold}")

        @serve.batch(max_batch_size=32, batch_wait_timeout_s=0.1)
        async def predict_batch(self, inputs: list[np.ndarray]):
            batch = np.stack(inputs)
            outputs = self.model(batch)
            return outputs.tolist()

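
For reviewers unfamiliar with the two `@serve.batch` parameters in the snippet above: as I understand Ray Serve's batching, a batch is dispatched once `max_batch_size` requests are waiting, or once `batch_wait_timeout_s` has elapsed after the first request arrived, whichever comes first. The sketch below imitates that behavior with plain `asyncio` so it can run without Ray. It is not Ray Serve's implementation, and `MicroBatcher`, `submit`, and `double_all` are hypothetical names chosen for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional

BatchHandler = Callable[[List[Any]], Awaitable[List[Any]]]


class MicroBatcher:
    """Toy request batcher (hypothetical; not Ray Serve code)."""

    def __init__(self, handler: BatchHandler, max_batch_size: int = 32,
                 batch_wait_timeout_s: float = 0.1) -> None:
        self._handler = handler
        self._max_batch_size = max_batch_size
        self._timeout = batch_wait_timeout_s
        self._queue: Optional[asyncio.Queue] = None
        self._worker: Optional[asyncio.Task] = None

    async def submit(self, item: Any) -> Any:
        # Create the queue and worker lazily so they bind to the running loop.
        if self._queue is None:
            self._queue = asyncio.Queue()
            self._worker = asyncio.create_task(self._run())
        future = asyncio.get_running_loop().create_future()
        await self._queue.put((item, future))
        return await future

    async def _run(self) -> None:
        loop = asyncio.get_running_loop()
        while True:
            # Block until the first request of the next batch arrives.
            batch = [await self._queue.get()]
            deadline = loop.time() + self._timeout
            # Keep collecting until the batch is full or the wait time is used up.
            while len(batch) < self._max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self._queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            try:
                results = await self._handler([item for item, _ in batch])
                for (_, future), result in zip(batch, results):
                    future.set_result(result)
            except Exception as exc:
                # Propagate a handler failure to every caller in the batch.
                for _, future in batch:
                    if not future.done():
                        future.set_exception(exc)


seen_batch_sizes: List[int] = []


async def double_all(items: List[int]) -> List[int]:
    # Stand-in for a vectorized model call: one invocation per batch.
    seen_batch_sizes.append(len(items))
    return [2 * x for x in items]


async def _demo() -> List[int]:
    batcher = MicroBatcher(double_all, max_batch_size=4, batch_wait_timeout_s=0.05)
    return await asyncio.gather(*(batcher.submit(i) for i in range(4)))


demo_results = asyncio.run(_demo())
```

The trade-off this makes visible: a larger `max_batch_size` improves throughput only if requests arrive fast enough to fill the batch, while a larger `batch_wait_timeout_s` adds up to that much latency to the first request in each batch.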
    - `name`: logical app name (used in Ray dashboard/logs).
    - `import_path`: Python entrypoint (`module.path:variable`).
    - `route_prefix`: HTTP path under the Serve gateway.
    - `runtime_env`: dynamic environment setup (see [Managing Dependencies](../guides/adding-models.md#6-managing-dependencies)).

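
To make the `module.path:variable` form of `import_path` concrete, here is a minimal sketch of how such a string can be split and resolved with the standard library. This only illustrates the format; Ray Serve's own resolution logic may differ, and `split_import_path` and `resolve_import_path` are hypothetical helper names.

```python
import importlib
from typing import Any, Tuple


def split_import_path(import_path: str) -> Tuple[str, str]:
    """Split 'module.path:variable' into (module path, variable name)."""
    module_name, sep, attr = import_path.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module.path:variable', got {import_path!r}")
    return module_name, attr


def resolve_import_path(import_path: str) -> Any:
    """Import the module and return the named attribute from it."""
    module_name, attr = split_import_path(import_path)
    return getattr(importlib.import_module(module_name), attr)


# The value from the reviewed config splits into module and variable:
module_name, variable = split_import_path("models.binary_classifier:app")

# Resolution demonstrated on a standard-library target so it runs anywhere:
dumps = resolve_import_path("json:dumps")
```

A string without a colon, or with an empty side, raises `ValueError` here, which is the kind of early check that catches a mistyped entrypoint before deployment.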
    **Formula:**
    $$ \text{Desired Replicas} = \left\lceil \frac{\text{Total Ongoing Requests}}{\text{target}\_{\text{ongoing}}\_\text{requests}} \right\rceil $$

    **Note:** "Total Ongoing Requests" refers to the **concurrency** (number of requests currently being processed or waiting in the queue), _not_ the Requests Per Second (RPS).

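
A quick way to sanity-check the quoted formula is to evaluate it for a few concurrency values. The sketch below implements only the ceiling division from the formula; it assumes nothing about minimum or maximum replica bounds or any smoothing the autoscaler may apply, and `desired_replicas` is a hypothetical helper name.

```python
import math


def desired_replicas(total_ongoing_requests: int, target_ongoing_requests: float) -> int:
    """Ceiling of total in-flight requests divided by the per-replica target."""
    if target_ongoing_requests <= 0:
        raise ValueError("target_ongoing_requests must be positive")
    if total_ongoing_requests < 0:
        raise ValueError("total_ongoing_requests must not be negative")
    return math.ceil(total_ongoing_requests / target_ongoing_requests)


# 25 requests in flight with a target of 10 per replica: 2.5 rounds up to 3.
replicas_for_25 = desired_replicas(25, 10)
# Exactly 20 in flight fits in 2 replicas with no rounding.
replicas_for_20 = desired_replicas(20, 10)
```

Note that the input is concurrency, as the quoted note says: the same request rate yields more in-flight requests, and therefore more replicas, when each request takes longer to serve.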
        "http://localhost:8000/prostate-classifier-1/",
        data=payload,
        headers={"Content-Type": "application/octet-stream"},
        timeout=60,

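
For context on the binary request shown in the fragment above, here is a minimal standard-library sketch of building such a request: a compressed body sent with `Content-Type: application/octet-stream`. The compressor is passed in as a parameter because the exact LZ4 format the model expects is not stated in this conversation; `zlib.compress` is used below purely as a runnable stand-in and is not what `BinaryClassifier` accepts. `build_binary_request` is a hypothetical helper, and the request is only constructed, never sent.

```python
import urllib.request
import zlib
from typing import Callable


def build_binary_request(
    url: str,
    raw: bytes,
    compress: Callable[[bytes], bytes],
) -> urllib.request.Request:
    """Build a POST request whose body is the compressed payload."""
    body = compress(raw)
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


# Stand-in compressor only; swap in the LZ4 routine the model actually expects.
demo_request = build_binary_request(
    "http://localhost:8000/prostate-classifier-1/",
    b"example payload bytes",
    compress=zlib.compress,
)
```

Sending it would be a matter of passing `demo_request` to `urllib.request.urlopen` with a timeout, once the service is reachable through the port-forward.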
This PR adds and polishes MkDocs documentation for the model-service.

Changes:
- … (`docs/`): quick start, architecture overview, deployment guide, configuration reference, adding models, troubleshooting.
- … `RayService` setup
- … `pyproject.toml` (`[dependency-groups].docs`: mkdocs, mkdocs-material, pymdown-extensions).
- `.gitlab-ci.yml` → `docs:build` runs `mkdocs build --strict`.
- … `mkdocs.yml` as needed for the docs site.

Summary by CodeRabbit
Release Notes
Documentation
Chores