fix(trainer): LocalJob.logs removes print side-effect and enables real-time streaming#436
fix(trainer): LocalJob.logs removes print side-effect and enables real-time streaming#436G26karthik wants to merge 1 commit intokubeflow:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
This PR fixes LocalJob.logs(follow=True) in the localprocess backend so it behaves like a true streaming iterator and no longer prints directly to stdout, aligning TrainerClient.get_job_logs(..., follow=True) with the documented “iterator of log lines” behavior.
Changes:
- Remove the hidden
print(..., flush=True)side-effect fromLocalJob.logs(follow=True). - Change
LocalJob.logs()to return anIterator[str]and yield log lines incrementally whenfollow=True. - Add unit tests covering
follow=Falseline yielding and asserting no stdout side-effect forfollow=True.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
kubeflow/trainer/backends/localprocess/job.py |
Refactors LocalJob.logs() to yield log lines (and stream when follow=True) without printing. |
kubeflow/trainer/backends/localprocess/backend_test.py |
Adds tests to validate non-follow behavior and ensure streaming doesn’t print to stdout. |
aa59baf to
1a19c55
Compare
…l-time streaming When follow=True, logs() was calling print() directly to stdout, causing duplicate output for any caller that also printed the lines. Additionally, it blocked until the job finished before returning, making the follow=True path effectively non-streaming. Fix: convert logs() to an Iterator[str] generator. The follow=True path yields lines from stream_logs() chunks in real-time without any side-effect. The follow=False path yields from self._stdout.splitlines() as before. Tests: added two new pytest cases to verify follow=False yields correct lines and follow=True produces no stdout side-effect. Fixes kubeflow#435 Signed-off-by: G26Karthik <karthik26092005@gmail.com>
|
Thanks for the review @copilot-pull-request-reviewer — great catch on the snapshot timing issue. Applied the suggestion: Also added a dedicated test |
|
@andreyvelich |
What this PR does
Fixes #435
LocalJob.logs(follow=True)had two bugs:print()side-effect — it calledprint(chunk, end="", flush=True)internally, so any caller that iterated the result would see duplicate output (once printed directly to stdout, once from the returned list).stream_logs()), then returned all lines as a single batch. Despite being namedfollow=True, no lines were yielded until the process completed.Changes
kubeflow/trainer/backends/localprocess/job.pyfrom collections.abc import Iteratorimportlogs()return type fromlist[str]toIterator[str]follow=False:yield from self._stdout.splitlines()(snapshot behavior preserved)follow=True:yield from chunk.splitlines()for each chunk fromstream_logs()— real-time, no side-effectThe existing
backend.pycaller (yield from _step.job.logs(follow=follow)) is fully compatible with a generator, so no changes needed there.kubeflow/trainer/backends/localprocess/backend_test.pyLocalJob,LocalBackendJobs,LocalBackendSteptest_get_job_logs_follow_false_yields_lines: verifies follow=False yields correct individual linestest_get_job_logs_follow_true_no_print_side_effect: usescapsys+patch.objectonstream_logsto assert no stdout side-effect and correct line streamingTesting
Signed-off-by: G26Karthik karthik26092005@gmail.com