fix: /v2/health/ready returns 200 when Python backend stub is dead (#8604) by itsnothuy · Pull Request #473 · triton-inference-server/core

itsnothuy · 2026-02-24T06:34:19Z

Checklist

Commit Type:

Check the conventional commit type box here and add the label to the GitHub PR.

Related PRs:

fix: Enable detection of unresponsive or crashed Python backend stub process python_backend#423
Fix #8604: Implement stub restart for unhealthy Python backends python_backend#431 — Community attempt to fix the same issue from the Python backend side by restarting the stub process.

Where should the reviewer start?

src/server.cc, function InferenceServer::IsReady() — the only changed file. Look for the new block after the existing ModelReadyState::READY / "unloaded" check.

Test plan:

1. Start Triton with --strict-readiness and a Python backend model.
2. Verify /v2/health/ready returns 200.
3. Kill the Python backend stub process (kill -9 <stub_pid>).
4. Verify /v2/health/ready now returns 503.
5. Confirm existing CI tests pass (pre-commit hooks: clang-format, codespell, copyright).

Caveats:

Only affects strict_readiness_ mode (the default). Non-strict mode is unchanged.
Adds per-model ModelIsReady() calls during health checks, but TRITONBACKEND_ModelInstanceReady is designed to be lightweight (Python backend checks StubActive() via waitpid(WNOHANG), a non-blocking syscall ~1–2µs). Health probes typically run every 5–30s.
Backends that don't implement TRITONBACKEND_ModelInstanceReady are unaffected — the function pointer is nullptr and IsReady() returns Status::Success by default.
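
The last caveat, that a backend without the optional readiness hook is treated as ready, can be illustrated with a minimal sketch. `Status`, `Instance`, and `InstanceReadyFn` below are hypothetical stand-ins chosen for illustration; they are not Triton's actual types or signatures.

```cpp
#include <string>

// Hypothetical, simplified status type (not Triton's Status class).
struct Status {
  bool ok;
  std::string msg;
  static Status Success() { return {true, ""}; }
};

// Hypothetical signature for an optional per-instance readiness hook.
using InstanceReadyFn = Status (*)(void* instance_state);

struct Instance {
  // Stays nullptr when the backend does not export a readiness hook.
  InstanceReadyFn ready_fn = nullptr;
  void* state = nullptr;
};

// A missing hook means "no opinion", which is reported as ready.
Status IsReady(const Instance& inst)
{
  if (inst.ready_fn == nullptr) {
    return Status::Success();
  }
  return inst.ready_fn(inst.state);
}
```

With this shape, only backends that opt in by providing the hook can turn a health probe negative; every other backend behaves as it did before the change.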

Background

edit (need maintainers' clarification):

The per-model endpoint /v2/models/{model}/ready already correctly returns 503 when
a stub dies — it calls through ModelIsReady() → Model::IsReady() →
TritonModelInstance::IsReady() → TRITONBACKEND_ModelInstanceReady() → StubActive().
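
The final step of that chain is described as a non-blocking waitpid(WNOHANG) call. A minimal POSIX sketch of such a liveness check follows; `ChildIsAlive` is a hypothetical helper written for illustration, and the Python backend's real StubActive() may differ in detail.

```cpp
#include <sys/types.h>
#include <sys/wait.h>

// Returns true while the child process `pid` is still running.
// With WNOHANG, waitpid() never blocks:
//   0   -> the child exists and has not changed state (still alive)
//   pid -> the child exited or was killed, and has now been reaped
//   -1  -> error, e.g. ECHILD if the child was already reaped
bool ChildIsAlive(pid_t pid)
{
  int status = 0;
  return waitpid(pid, &status, WNOHANG) == 0;
}
```

Because the call returns immediately in every case, it is cheap enough to run on each health probe.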

The server-level endpoint /v2/health/ready does not, because
InferenceServer::IsReady() only checks ModelStates() — a lifecycle enum set at
load time that is never updated when a backend fails at runtime.

Call chain comparison:

# /v2/health/ready (BUG — before this fix)
HTTP GET /v2/health/ready
  → TRITONSERVER_ServerIsReady()
    → InferenceServer::IsReady()        ← checks ModelStates() only, never calls ModelIsReady()

# /v2/models/{name}/ready (already correct)
HTTP GET /v2/models/{name}/ready
  → TRITONSERVER_ServerModelIsReady()
    → InferenceServer::ModelIsReady()
      → Model::IsReady()
        → TritonModelInstance::IsReady()
          → TRITONBACKEND_ModelInstanceReady()  ← checks stub health via waitpid()

The fix adds the missing ModelIsReady() call inside IsReady() so the server-level
health endpoint also checks runtime backend health.
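
The shape of the fix can be sketched as follows. This is a simplified model built from the description above, not the code in src/server.cc: `ModelEntry`, its `backend_healthy` flag, `ServerIsReady`, and `HealthReadyStatusCode` are hypothetical names, and the behavior of non-strict mode is an assumption.

```cpp
#include <map>
#include <string>

enum class ModelReadyState { UNKNOWN, READY, UNAVAILABLE, LOADING, UNLOADING };

// Hypothetical per-model record used only for this sketch.
struct ModelEntry {
  ModelReadyState state;  // lifecycle state, set at load/unload time
  std::string reason;     // e.g. "unloaded" for an intentionally unloaded model
  bool backend_healthy;   // stand-in for the backend's runtime readiness answer
};

// Stand-in for the per-model check: lifecycle READY and backend still healthy.
bool ModelIsReady(const ModelEntry& m)
{
  return m.state == ModelReadyState::READY && m.backend_healthy;
}

bool ServerIsReady(
    const std::map<std::string, ModelEntry>& models, bool strict_readiness)
{
  if (!strict_readiness) {
    return true;  // assumption: non-strict mode ignores model state
  }
  for (const auto& kv : models) {
    const ModelEntry& m = kv.second;
    if (m.state == ModelReadyState::READY) {
      // The added step: a READY lifecycle state must also pass the runtime check.
      if (!ModelIsReady(m)) {
        return false;
      }
    } else if (!(m.state == ModelReadyState::UNAVAILABLE &&
                 m.reason == "unloaded")) {
      return false;  // existing lifecycle check
    }
  }
  return true;
}

// Maps readiness to the status code returned by /v2/health/ready.
int HealthReadyStatusCode(
    const std::map<std::string, ModelEntry>& models, bool strict_readiness)
{
  return ServerIsReady(models, strict_readiness) ? 200 : 503;
}
```

In this model, flipping `backend_healthy` to false for a READY model changes the strict-mode result from 200 to 503 while leaving the lifecycle state untouched, which mirrors the stub-crash scenario in the test plan.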

| Endpoint | Before | After |
| --- | --- | --- |
| `/v2/models/{model}/ready` | 503 (correct) | 503 (unchanged) |
| `/v2/health/ready` | 200 (bug) | 503 (fixed) |

Under strict_readiness_=true, IsReady() only checked lifecycle state (ModelStates()) but not runtime backend readiness. This meant that /v2/health/ready returned 200 even when a model's backend had crashed (e.g., Python backend stub process died), because the lifecycle still showed READY.

Add a runtime readiness check via ModelIsReady() for models in READY lifecycle state. ModelIsReady() calls model->IsReady(), which invokes TRITONBACKEND_ModelInstanceReady. In the Python backend this calls StubActive() -> waitpid(WNOHANG), a non-blocking microsecond-level check.

This ensures /v2/health/ready correctly reflects backend health, enabling orchestrators (K8s, etc.) to detect and restart unhealthy pods.

Fixes: triton-inference-server/server#8604

Copilot

Pull request overview

This pull request enhances the strict readiness check for Triton Inference Server by adding runtime backend health verification. Under strict_readiness_=true, the IsReady() function previously only checked the lifecycle state (ModelStates()) but not runtime backend readiness. This meant /v2/health/ready would return 200 even when a model's backend had crashed (e.g., Python backend stub process died) because the lifecycle still showed READY.

Changes:

Added runtime backend readiness check via ModelIsReady() for models in READY lifecycle state within the strict readiness evaluation
Enhanced logging to report when models fail runtime readiness checks despite being in READY lifecycle state

src/server.cc

whoisj · 2026-02-24T23:29:18Z

LGTM, @yinggeh please take a look at this as well. Thanks.

yinggeh · 2026-02-24T23:33:19Z

I thought we already fixed this bug in triton-inference-server/python_backend#423. @pskiran1 Can you clarify?

yinggeh · 2026-02-24T23:36:58Z

Please fill out the template pull_request_template_external_contrib.md

itsnothuy · 2026-02-25T03:33:15Z

> Please fill out the template pull_request_template_external_contrib.md

OK, I updated my PR description based on the pull_request_template_external_contrib.md template.

whoisj · 2026-02-25T19:34:18Z

@itsnothuy, I've not seen your signed Contributor License Agreement yet. Have you submitted it or are you covered as part of your employer or university?

itsnothuy · 2026-02-26T02:42:11Z

> @itsnothuy, I've not seen your signed Contributor License Agreement yet. Have you submitted it or are you covered as part of your employer or university?

My bad, I forgot. I just signed it and submitted it via email.

whoisj · 2026-03-03T20:40:16Z

> My bad, I forgot. I just signed it and submitted it via email.

You were approved today. Just waiting on @pskiran1 to review the change per @yinggeh's request.

itsnothuy · 2026-03-04T02:12:45Z

> > My bad, I forgot. I just signed it and submitted it via email.
>
> You were approved today. Just waiting on @pskiran1 to review the change per @yinggeh's request.

Noted. Please keep me updated.

src/server.cc

pskiran1 · 2026-03-04T10:34:08Z

> I thought we already fixed this bug in triton-inference-server/python_backend#423. @pskiran1 Can you clarify?

@yinggeh, this change targets the server-level health endpoint to report accurate status on /v2/health/ready, whereas my earlier changes only applied to /v2/models/{model}/ready.

yinggeh

LGTM. Thanks for the contribution

whoisj · 2026-03-17T18:56:27Z

Merged. Thank you very much for your contribution. 🎉

itsnothuy · 2026-03-18T12:49:24Z

> Merged. Thank you very much for your contribution. 🎉

Gladly!!

Copilot AI review requested due to automatic review settings February 24, 2026 06:34

Copilot started reviewing on behalf of itsnothuy February 24, 2026 06:34

Copilot AI reviewed Feb 24, 2026

itsnothuy changed the title ~~Check runtime backend readiness in IsReady() for strict mode~~ fix: /v2/health/ready returns 200 when Python backend stub is dead (#8604) Feb 24, 2026

itsnothuy mentioned this pull request Feb 24, 2026

Python Backend fails to restart on unhealthy state but health API remains 200 triton-inference-server/server#8604

Closed

whoisj reviewed Feb 24, 2026

whoisj requested a review from yinggeh February 24, 2026 23:29

whoisj added the PR: fix A bug fix label Feb 24, 2026

Update src/server.cc

1fd6685

Co-authored-by: J Wyman <jeremy.wyman@outlook.com>

whoisj approved these changes Feb 26, 2026

yinggeh requested a review from pskiran1 March 2, 2026 09:41

Merge branch 'main' into fix-8604-server-readiness

03f1b58

pskiran1 reviewed Mar 4, 2026

itsnothuy added 2 commits March 16, 2026 06:32

Merge branch 'main' into fix-8604-server-readiness

1cca45e

Merge branch 'main' into fix-8604-server-readiness

119ef5b

yinggeh approved these changes Mar 17, 2026

whoisj merged commit 53fc26e into triton-inference-server:main Mar 17, 2026
1 check passed

pskiran1 mentioned this pull request Mar 19, 2026

test: Fix L0_backend_python model_readiness subtest triton-inference-server/server#8709

Open

20 tasks
