package_sync: true
wait_for_workers: true
scheduler_vm_types: [m6i.large]
_n_worker_specs_per_host: 2
IIUC, a fair assessment would require us to double the machine sizes and halve the cluster sizes. Otherwise the same-host workers have much less memory to work with and are much more likely to run into spilling and OOM.
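To make the arithmetic concrete, here is a rough sketch with made-up numbers. The host counts and memory sizes are assumptions for illustration only, not the actual benchmark configuration, and `ClusterShape` is a hypothetical helper, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterShape:
    """Hypothetical description of a cluster layout (illustration only)."""

    n_hosts: int
    host_memory_gib: float
    workers_per_host: int

    @property
    def n_workers(self) -> int:
        # Total number of worker processes across all hosts.
        return self.n_hosts * self.workers_per_host

    @property
    def memory_per_worker_gib(self) -> float:
        # Workers on the same host split that host's memory.
        return self.host_memory_gib / self.workers_per_host

    @property
    def total_memory_gib(self) -> float:
        return self.n_hosts * self.host_memory_gib


# Assumed numbers: 10 hosts with 16 GiB each, one worker per host.
baseline = ClusterShape(n_hosts=10, host_memory_gib=16, workers_per_host=1)

# Only flipping to two workers per host: twice the workers, half the memory each.
naive = ClusterShape(n_hosts=10, host_memory_gib=16, workers_per_host=2)

# Doubling the machine size and halving the host count restores the baseline
# worker count and per-worker memory.
fair = ClusterShape(n_hosts=5, host_memory_gib=32, workers_per_host=2)

for name, shape in [("baseline", baseline), ("naive", naive), ("fair", fair)]:
    print(name, shape.n_workers, shape.memory_per_worker_gib, shape.total_memory_gib)
```

With these assumed numbers, the naive variant gives each worker half the memory of the baseline, while the adjusted variant keeps both the worker count and per-worker memory unchanged, so only the co-location of workers differs.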
I kicked off an A/B test with this: https://github.com/coiled/benchmarks/actions/runs/5667683049
The results of this A/B test are interesting but likely require a bit of further analysis.
We can see a couple of tests that regress significantly while others are getting much worse.
My best read without digging deeper is that most, if not almost all, of our tests require a bit of spilling, and that all test cases that spill behave really poorly, likely because the disk is just busy?
@fjetter you want to try run with 1/2 the number of

I'd like to see how this impacts perf of various benchmarks.