Introduce job_submissions_limit for /api/runs/list #2883

Merged
r4victor merged 6 commits into master from pr_job_submissions_limit
Jul 10, 2025

@r4victor r4victor commented Jul 8, 2025

This PR introduces a job_submissions_limit parameter for the /api/runs/list endpoint that allows returning fewer job submissions per job, which significantly reduces response sizes and speeds up the API server. The CLI and UI are updated to request job_submissions_limit: 1 when listing runs, since they do not need all the submissions there.

The only downside so far: status_message calculation requires all the job submissions to be returned, so it may not return "retrying" if fewer job submissions are requested. This will be fixed in a separate PR since status_message calculation requires a revision.

Update: this PR also adds an include_jobs parameter that makes it possible to return no jobs at all (and so no job specs). job_submissions_limit: 0 still returns job specs, but without job submissions.
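
For illustration, a request body using the two new parameters might be built as below. This is a minimal sketch, not the actual client code: the POST method, the JSON body layout, and the helper name `build_list_runs_request` are assumptions, and only `job_submissions_limit` and `include_jobs` come from this PR. The request is constructed but never sent.

```python
import json
import urllib.request
from typing import Optional


def build_list_runs_request(
    server_url: str,
    token: str,
    job_submissions_limit: Optional[int] = None,
    include_jobs: bool = True,
) -> urllib.request.Request:
    """Build (but do not send) a request to /api/runs/list.

    Hypothetical helper: assumes the endpoint accepts a JSON body via POST.
    """
    body = {
        # None is assumed to mean "no limit" (the "null" case benchmarked in this PR).
        "job_submissions_limit": job_submissions_limit,
        # False asks the server to return no jobs at all, so no job specs either.
        "include_jobs": include_jobs,
    }
    return urllib.request.Request(
        url=f"{server_url}/api/runs/list",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# What a CLI/UI listing would ask for: at most one submission per job.
req = build_list_runs_request("http://localhost:8000", "<token>", job_submissions_limit=1)
print(req.get_method(), req.full_url, json.loads(req.data))
```

Passing `include_jobs=False` instead would correspond to the third benchmark below.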

Speedups should stack nicely with #2880.

"job_submissions_limit": null:

(venvt) ➜  stuff wrk http://localhost:8000/api/runs/list -H 'Authorization: Bearer-' -s bench_list.lua
Running 10s test @ http://localhost:8000/api/runs/list
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   655.67ms  185.66ms   1.21s    73.65%
    Req/Sec     8.41      4.51    29.00     83.96%
  148 requests in 10.10s, 19.30MB read
Requests/sec:     14.66
Transfer/sec:      1.91MB

"job_submissions_limit": 1:

(venvt) ➜  stuff wrk http://localhost:8000/api/runs/list -H 'Authorization: Bearer -' -s bench_list.lua
Running 10s test @ http://localhost:8000/api/runs/list
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   570.31ms  146.76ms   1.00s    70.41%
    Req/Sec    10.12      6.94    30.00     67.33%
  169 requests in 10.10s, 17.35MB read
Requests/sec:     16.74
Transfer/sec:      1.72MB

"include_jobs": false:

(venvt) ➜  stuff wrk http://localhost:8000/api/runs/list -H 'Authorization: Bearer -' -s bench_list.lua
Running 10s test @ http://localhost:8000/api/runs/list
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   333.30ms   83.78ms 662.51ms   75.50%
    Req/Sec    15.30      7.13    39.00     85.88%
  298 requests in 10.09s, 8.90MB read
Requests/sec:     29.54
Transfer/sec:      0.88MB

Speedups will be more significant when there are many submissions per job (retrying). In my benchmarks there are at most 2-3 submissions per job.
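
To put the three wrk runs side by side, the short script below derives the average response size and the throughput relative to the `null` baseline. It is only arithmetic over the figures quoted above; the dictionary keys and the `summarize` helper are illustrative names, and the "MB read" totals include response headers, so the per-response sizes are approximate.

```python
# Figures copied from the three wrk runs above.
runs = {
    "job_submissions_limit=null": {"requests": 148, "mb_read": 19.30, "rps": 14.66},
    "job_submissions_limit=1": {"requests": 169, "mb_read": 17.35, "rps": 16.74},
    "include_jobs=false": {"requests": 298, "mb_read": 8.90, "rps": 29.54},
}


def summarize(runs, baseline_key):
    """Compute per-response size and ratios against the baseline run."""
    base = runs[baseline_key]
    base_size = base["mb_read"] / base["requests"]
    summary = {}
    for name, run in runs.items():
        size = run["mb_read"] / run["requests"]
        summary[name] = {
            "mb_per_response": size,
            "size_vs_baseline": size / base_size,
            "throughput_vs_baseline": run["rps"] / base["rps"],
        }
    return summary


summary = summarize(runs, "job_submissions_limit=null")
for name, s in summary.items():
    print(
        f"{name}: {s['mb_per_response']:.3f} MB/response, "
        f"size x{s['size_vs_baseline']:.2f}, "
        f"throughput x{s['throughput_vs_baseline']:.2f}"
    )
```

On these numbers, job_submissions_limit: 1 yields responses roughly 21% smaller with about 1.14x the throughput, while include_jobs: false yields responses roughly 77% smaller with about 2x the throughput.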

@r4victor r4victor requested review from jvstme and olgenn July 8, 2025 11:19
Comment on lines +676 to +677
if job_submissions_limit == 0:
    job_submissions = []
Collaborator

I think this will result in no jobs returned at all, since the jobs list is populated from the latest submission. I assume the expected behavior would be to return jobs, just without job submissions.

Collaborator Author

I want to have some way to return no jobs at all, since they may be unnecessary when listing runs and expensive at the same time. Fixed job_submissions_limit: 0 and added another parameter, include_jobs.
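
The semantics settled on in this thread can be sketched as follows. This is not the code from the PR: `select_jobs` and the dictionary layout (`job_spec`, `job_submissions`) are hypothetical, and the sketch assumes submissions are ordered oldest first so that a limit keeps the most recent ones.

```python
from typing import Any, Dict, List, Optional


def select_jobs(
    jobs: List[Dict[str, Any]],
    job_submissions_limit: Optional[int] = None,
    include_jobs: bool = True,
) -> List[Dict[str, Any]]:
    """Trim job submissions per job, or drop jobs entirely.

    Hypothetical sketch: each job is a dict with "job_spec" and
    "job_submissions" (assumed to be ordered oldest first).
    """
    if not include_jobs:
        # include_jobs=False: no jobs at all, hence no job specs.
        return []
    result = []
    for job in jobs:
        submissions = job["job_submissions"]
        if job_submissions_limit is not None:
            if job_submissions_limit > 0:
                # Keep only the most recent submissions.
                submissions = submissions[-job_submissions_limit:]
            else:
                # A limit of 0 keeps the job spec but no submissions.
                # (A plain [-0:] slice would return the whole list.)
                submissions = []
        result.append({"job_spec": job["job_spec"], "job_submissions": submissions})
    return result


jobs = [{"job_spec": {"job_name": "train-0"}, "job_submissions": ["s1", "s2", "s3"]}]
print(select_jobs(jobs, job_submissions_limit=1))
print(select_jobs(jobs, job_submissions_limit=0))
print(select_jobs(jobs, include_jobs=False))
```

The explicit branch for a limit of 0 reflects the concern raised in the review: the job must still be returned, just with an empty submissions list.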

Comment on lines 660 to +661
include_job_submissions: bool = True,
job_submissions_limit: Optional[int] = None,
Collaborator

(nit) If you decide to address my previous comment, include_job_submissions=False and job_submissions_limit=0 will have the same effect. It looks like include_job_submissions is not used anywhere, so it can be dropped.

@r4victor r4victor force-pushed the pr_job_submissions_limit branch from 531c022 to 36f874b July 10, 2025 06:21
@r4victor r4victor merged commit ab5dfbf into master Jul 10, 2025
24 checks passed
@r4victor r4victor deleted the pr_job_submissions_limit branch July 10, 2025 06:34

jvstme commented Jul 10, 2025

@r4victor, dstack ps doesn't work with 0.19.18 servers now.

> dstack ps
Server validation error: 
{'detail': [{'loc': ['body', 'include_jobs'],
             'msg': 'extra fields not permitted',
             'type': 'value_error.extra'},
            {'loc': ['body', 'job_submissions_limit'],
             'msg': 'extra fields not permitted',
             'type': 'value_error.extra'}]}
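
The error above is what a server with strict request validation returns when a newer client sends fields it does not know. The stdlib-only sketch below reproduces that behavior and shows one possible client-side workaround, dropping fields the server is known not to support. Both helpers and the empty `OLD_SERVER_FIELDS` set are hypothetical, and this is not necessarily how the follow-up fix works.

```python
from typing import Any, Dict, List, Set


def validate_body(body: Dict[str, Any], known_fields: Set[str]) -> List[Dict[str, Any]]:
    """Return one error per unknown key, shaped like the errors shown above."""
    return [
        {
            "loc": ["body", key],
            "msg": "extra fields not permitted",
            "type": "value_error.extra",
        }
        for key in sorted(body)
        if key not in known_fields
    ]


def strip_unknown_fields(body: Dict[str, Any], known_fields: Set[str]) -> Dict[str, Any]:
    """Hypothetical client-side fallback: send only fields the server accepts."""
    return {key: value for key, value in body.items() if key in known_fields}


# Placeholder for the request schema of an older server; the real schema has
# other fields, but none of the two added in this PR.
OLD_SERVER_FIELDS: Set[str] = set()

new_client_body = {"job_submissions_limit": 1, "include_jobs": True}
print(validate_body(new_client_body, OLD_SERVER_FIELDS))
print(validate_body(strip_unknown_fields(new_client_body, OLD_SERVER_FIELDS), OLD_SERVER_FIELDS))
```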

@r4victor
Collaborator Author

Thanks, fixed: #2894

3 participants