feat: add model-selector RequestProcessor plugin#97

Draft
noyitz wants to merge 2 commits into
llm-d:mainfrom
noyitz:feature/model-selector-plugin

@noyitz
Contributor

@noyitz noyitz commented May 15, 2026

What does this PR do?

Adds glue code to run the ModelSelector pipeline (Filter → Score → Pick) as a RequestProcessor plugin in the IPP pipeline. This allows model selection to be wired alongside other plugins like body-field-to-header in the request processing chain.

How it works:

  1. Plugin is configured with a static list of candidate model names
  2. On each request, ModelSelector.Select() runs the configured profile (Filter → Score → Pick)
  3. Selected model name is written to CycleState (selected-model key) and the request body model field
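The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Filter`, `Scorer`, `Picker`, and `CycleState` types, the `Select` and `ProcessRequest` signatures, and the `maxScore` helper are all hypothetical stand-ins chosen for the example. Only the `selected-model` key and the body `model` field come from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// NOTE: every type and function here is a hypothetical stand-in,
// not the project's real framework API.

// Filter drops candidates that must not be selected.
type Filter func(candidates []string) []string

// Scorer assigns a score to one candidate; higher is better.
type Scorer func(candidate string) float64

// Picker chooses one candidate given the scores and the candidate order.
type Picker func(scores map[string]float64, order []string) string

// CycleState is a minimal per-request key/value store.
type CycleState map[string]any

// selectedModelKey is the CycleState key named in the PR description.
const selectedModelKey = "selected-model"

// Select runs Filter -> Score -> Pick over the candidates.
func Select(candidates []string, filters []Filter, scorers []Scorer, pick Picker) (string, error) {
	for _, f := range filters {
		candidates = f(candidates)
	}
	if len(candidates) == 0 {
		return "", errors.New("no candidate models left after filtering")
	}
	scores := make(map[string]float64, len(candidates))
	for _, c := range candidates {
		scores[c] = 0 // with no scorers configured, every candidate scores 0
		for _, s := range scorers {
			scores[c] += s(c)
		}
	}
	return pick(scores, candidates), nil
}

// maxScore picks the highest-scoring candidate; ties go to the earliest one.
func maxScore(scores map[string]float64, order []string) string {
	best := order[0]
	for _, c := range order[1:] {
		if scores[c] > scores[best] {
			best = c
		}
	}
	return best
}

// ProcessRequest selects a model and records it in CycleState and the body.
func ProcessRequest(state CycleState, body map[string]any, candidates []string, pick Picker) error {
	model, err := Select(candidates, nil, nil, pick)
	if err != nil {
		return err
	}
	state[selectedModelKey] = model
	body["model"] = model
	return nil
}

func main() {
	state := CycleState{}
	body := map[string]any{"model": "placeholder"}
	if err := ProcessRequest(state, body, []string{"model-a", "model-b"}, maxScore); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(state[selectedModelKey], body["model"])
}
```

With no filters or scorers configured, as in this version, every candidate scores 0, so the outcome in this sketch depends entirely on the picker's tie-breaking rule.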

What's included:

  • pkg/plugins/modelselector/plugin.go — RequestProcessor wrapper with factory function
  • pkg/plugins/modelselector/plugin_test.go — 7 tests
  • Registration in cmd/runner/runner.go

Extension points:

  • Candidate source: Static list from config in this version. Future versions can source candidates from Datastore, CRD, or other dynamic sources.
  • Filters and scorers: Not configured in this version but the profile supports them — add via WithFilters() / WithScorers() when implementations are ready.
  • Picker: Uses a default max-score picker inline. Will switch to shared picker implementations once PR #74 (feat: implement max-score and weighted-random pickers) merges.
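
For readers unfamiliar with the pattern, `WithFilters()` / `WithScorers()` read like Go functional options. The sketch below shows how such options could attach filters and scorers to a profile. It is an assumption about the shape of the API: the `Profile`, `Option`, and `NewProfile` names, the `Name()` method, and the argument types are hypothetical, and only the two option names appear in this PR.

```go
package main

import "fmt"

// NOTE: hypothetical types for illustration; the real profile API may differ.

// Filter and Scorer are reduced to a name here to keep the sketch short.
type Filter interface{ Name() string }
type Scorer interface{ Name() string }

// Profile holds the configured pipeline stages.
type Profile struct {
	filters []Filter
	scorers []Scorer
}

// Option mutates a Profile during construction.
type Option func(*Profile)

// WithFilters appends filters to the profile.
func WithFilters(fs ...Filter) Option {
	return func(p *Profile) { p.filters = append(p.filters, fs...) }
}

// WithScorers appends scorers to the profile.
func WithScorers(ss ...Scorer) Option {
	return func(p *Profile) { p.scorers = append(p.scorers, ss...) }
}

// NewProfile builds a profile; with no options it has no filters or scorers.
func NewProfile(opts ...Option) *Profile {
	p := &Profile{}
	for _, opt := range opts {
		opt(p)
	}
	return p
}

// named is a trivial Filter/Scorer implementation used only in this example.
type named string

func (n named) Name() string { return string(n) }

func main() {
	bare := NewProfile() // matches this version: nothing configured
	full := NewProfile(
		WithFilters(named("example-filter")),
		WithScorers(named("example-scorer")),
	)
	fmt.Println(len(bare.filters), len(bare.scorers), len(full.filters), len(full.scorers))
}
```

The appeal of this shape is that adding filters or scorers later does not change the constructor signature, which fits the "add when implementations are ready" plan above.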

Why is this change needed?

This is the glue code needed to run the model-selector as a request plugin with one profile, as discussed with @shmuelk.

How was this tested?

  • Unit tests added/updated
  • Integration/e2e tests added/updated
  • Manual testing performed

7 tests covering:

  • Selected model written to body and CycleState
  • Selection from configured candidates only
  • Empty candidates rejected at startup
  • Factory config parsing
  • Invalid JSON rejection

Checklist

  • Commits are signed off (git commit -s) per DCO
  • Code follows project contributing guidelines
  • Tests pass locally (make test)
  • Linters pass (make lint)
  • Documentation updated (if applicable)

Related Issues

Related to the model selector work: #65, #73
Related to profiles design: #15, PR #92

Adds glue code to run the ModelSelector pipeline (Filter → Score →
Pick) as a RequestProcessor plugin in the IPP pipeline.

The plugin:
- Implements framework.RequestProcessor
- Takes candidate model names from plugin config (static list in
  this version; future versions may source from Datastore, CRD,
  or other dynamic sources)
- Runs ModelSelector.Select() on each request
- Writes selected model name to CycleState and request body
- Uses a default max-score picker (replace with shared picker
  implementations once PR llm-d#74 merges)
- Filters and scorers can be added via WithFilters() / WithScorers()
  once implementations are available

Registered as "model-selector" plugin type in runner.go.

Signed-off-by: Noy Itzikowitz <nitzikow@redhat.com>
@github-actions github-actions Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 15, 2026
@nirrozenbaum
Collaborator

We need to connect the model selector as a request plugin differently.
Let's sync on this to make sure we're on the same page.

@nirrozenbaum
Collaborator

The plugin needs to take the datastore from the handle (PR #84).

ModelSelector plugin now reads candidate models from the Datastore
on each request instead of a static config list.

- Constructor takes datastore.Datastore instead of []string
- loadCandidateModels() reads all models from Datastore per request
- Factory function is a placeholder until Handle exposes Datastore
  (PR llm-d#84) — TODO added for wiring via handle.Datastore()
- Tests use datastore.NewStore() with pre-populated models

Signed-off-by: Noy Itzikowitz <nitzikow@redhat.com>
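
A rough sketch of what reading candidates from the Datastore on each request could look like is given below. The `ModelLister` interface, its `ModelNames()` method, and the in-memory `memStore` are hypothetical stand-ins, since the real `datastore.Datastore` interface is not shown in this conversation. Only the `loadCandidateModels` name comes from the commit message, and its signature here is assumed.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// NOTE: ModelLister and memStore are hypothetical stand-ins for the real
// datastore package; they exist only to make this sketch self-contained.

// ModelLister is the minimal capability the plugin needs from a datastore.
type ModelLister interface {
	ModelNames() []string
}

// memStore is a toy in-memory store (not safe for concurrent use).
type memStore struct {
	models map[string]struct{}
}

func newMemStore() *memStore {
	return &memStore{models: make(map[string]struct{})}
}

// Add registers a model name in the store.
func (s *memStore) Add(name string) { s.models[name] = struct{}{} }

// ModelNames returns all model names in sorted order for determinism.
func (s *memStore) ModelNames() []string {
	names := make([]string, 0, len(s.models))
	for n := range s.models {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// loadCandidateModels reads the current model set; calling it per request
// means models added after startup become candidates without a restart.
func loadCandidateModels(ds ModelLister) ([]string, error) {
	names := ds.ModelNames()
	if len(names) == 0 {
		return nil, errors.New("datastore contains no models")
	}
	return names, nil
}

func main() {
	store := newMemStore()
	store.Add("model-a")
	first, _ := loadCandidateModels(store)

	store.Add("model-b") // simulates a model appearing between requests
	second, _ := loadCandidateModels(store)

	fmt.Println(first, second)
}
```

One consequence of this design is that an empty candidate set can no longer be rejected once at startup, as it was with the static list; in this sketch it surfaces as a per-request error instead.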