
improve scheduler actor #33

Draft
TheHeroBrine422 wants to merge 6 commits into main from caleb/fix-scheduler-actor-cache
Conversation

@TheHeroBrine422
Contributor

@TheHeroBrine422 TheHeroBrine422 commented May 1, 2026

Intended changes for this PR:

  • Improve the actor cache
    • Currently the actor cache is set up in a way that doesn't really support the scheduler updating it when messages happen. That makes the cache hard to rely on for up-to-date information, which means we can barely use it as a cache. A specific issue I hit was implementing affinity/placement groups: if we rely on the cache data when calculating affinity, race conditions could break the affinity rules.
    • I am replacing the generic version I made with one that is integrated into the scheduler directly, because otherwise I don't feel comfortable making direct modifications to the cache from other parts of the scheduler.
    • On top of this, the VM cache arguably needs slightly different code anyway, because we need the ability to query by either actor id or vm id, so it needed some modification to support that either way.
    • todo: have messages sent to the scheduler directly update the cache
  • Placement groups #28
  • Improve the scheduling algorithm
    • I realized that the current scheduling algorithm doesn't consider the resource requirements of the new VM. For example, if an agent had 15/16 vcpus scheduled and a new VM needed another 16 vcpus, it could theoretically still be scheduled onto that agent, which would be bad.
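The capacity check described above can be sketched as follows. This is a minimal illustration, not the actual odorobo code: `AgentCapacity`, `can_fit`, and the field names are all hypothetical stand-ins for whatever the scheduler actually tracks per agent.

```rust
// Hypothetical sketch: reject agents whose remaining capacity can't fit
// the new VM. `AgentCapacity` and its fields are illustrative names only.
#[derive(Debug, Clone, Copy)]
struct AgentCapacity {
    total_vcpus: u32,
    scheduled_vcpus: u32,
}

/// Returns true only if the agent has enough unscheduled vcpus for the new VM.
fn can_fit(agent: AgentCapacity, requested_vcpus: u32) -> bool {
    agent.total_vcpus.saturating_sub(agent.scheduled_vcpus) >= requested_vcpus
}

fn main() {
    // 15/16 vcpus already scheduled: a 16-vcpu VM must be rejected.
    let agent = AgentCapacity { total_vcpus: 16, scheduled_vcpus: 15 };
    assert!(!can_fit(agent, 16));
    assert!(can_fit(agent, 1));
}
```

The same shape extends to memory or any other resource dimension: the candidate agent is filtered out before scoring whenever the request exceeds the remaining headroom.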

Comment thread odorobo/src/actors/scheduler_actor.rs Fixed
vmid,
CachedVMActor {
    actor_ref: actor_ref.clone(),
    metadata,
Comment thread odorobo/src/actors/scheduler_actor.rs Fixed
this was done for two reasons:
1) the vm and agent actor caches have slightly different requirements,
specifically around keys (actor id vs vm id)
2) the scheduler needs direct control of the cache so the cache can be
updated in response to messages it forwards; being possibly up to a
second out of date in this data could cause problems. In theory the
scheduler could update the generic cache, but I don't think it makes
sense to use a generic cache if we need to modify it directly anyway.
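The dual-key requirement from point 1 can be sketched as a map keyed by vm id with a secondary index from actor id. This is an illustrative sketch only; `VmCache`, `CachedVm`, and the string-typed ids are assumptions, not the types in `scheduler_actor.rs`.

```rust
use std::collections::HashMap;

// Hypothetical sketch of a scheduler-owned VM cache that can be queried by
// either vm id or actor id. All names here are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct CachedVm {
    actor_id: String,
    metadata: String,
}

#[derive(Default)]
struct VmCache {
    by_vmid: HashMap<String, CachedVm>,
    // Secondary index: actor id -> vm id, kept in sync on insert.
    vmid_by_actor: HashMap<String, String>,
}

impl VmCache {
    fn insert(&mut self, vmid: String, vm: CachedVm) {
        self.vmid_by_actor.insert(vm.actor_id.clone(), vmid.clone());
        self.by_vmid.insert(vmid, vm);
    }

    fn get_by_vmid(&self, vmid: &str) -> Option<&CachedVm> {
        self.by_vmid.get(vmid)
    }

    fn get_by_actor(&self, actor_id: &str) -> Option<&CachedVm> {
        self.vmid_by_actor
            .get(actor_id)
            .and_then(|vmid| self.by_vmid.get(vmid))
    }
}

fn main() {
    let mut cache = VmCache::default();
    cache.insert(
        "vm-1".to_string(),
        CachedVm { actor_id: "actor-9".to_string(), metadata: "m".to_string() },
    );
    // Both lookup paths resolve to the same cached entry.
    assert!(cache.get_by_vmid("vm-1").is_some());
    assert_eq!(cache.get_by_actor("actor-9"), cache.get_by_vmid("vm-1"));
}
```

Because the scheduler owns this struct directly, it can update both maps in the same handler that forwards a message, which is exactly the property point 2 asks for.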

Signed-off-by: Caleb Jones <caleb@calebgj.io>
@TheHeroBrine422 TheHeroBrine422 force-pushed the caleb/fix-scheduler-actor-cache branch from e7305bf to ac0ee47 on May 1, 2026 17:04
Comment thread odorobo/src/actors/scheduler_actor.rs Fixed
Signed-off-by: Caleb Jones <caleb@calebgj.io>

match &rule.affinity_type {
    AffinityType::VirtualMachine(zone) => todo!(),
    AffinityType::Agent => lhs_values.push(agent.metadata.), // todo: this should be object metadata, but I just realized that field isn't included in the data we have in the cache
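Once object metadata is carried in the cache, the `Agent` arm could pull the rule's label value from it. A minimal sketch of that lookup, assuming a label-map shape for the metadata (`ObjectMetadata`, `labels`, and `agent_affinity_value` are all hypothetical names, not the actual odorobo types):

```rust
use std::collections::HashMap;

// Hypothetical sketch of reading an agent label for affinity evaluation.
// The real metadata type in the cache may differ; this only illustrates
// the lookup the todo comment above is pointing at.
#[derive(Debug, Clone)]
struct ObjectMetadata {
    labels: HashMap<String, String>,
}

/// Collect the agent's value for the affinity rule's label key, if present.
fn agent_affinity_value(meta: &ObjectMetadata, label_key: &str) -> Option<String> {
    meta.labels.get(label_key).cloned()
}

fn main() {
    let mut labels = HashMap::new();
    labels.insert("zone".to_string(), "us-east-1a".to_string());
    let meta = ObjectMetadata { labels };

    let mut lhs_values: Vec<String> = Vec::new();
    if let Some(v) = agent_affinity_value(&meta, "zone") {
        lhs_values.push(v);
    }
    assert_eq!(lhs_values, vec!["us-east-1a".to_string()]);
}
```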

