test: add E2E tests for payload processor with Kind cluster CI #31
asaadbalum wants to merge 1 commit into
Unsigned commits detected! Please sign your commits. For instructions on how to set up GPG/SSH signing and verify your commits, please see GitHub Documentation. |
9da84d3 to 1b2b395
aradhalevy left a comment
Looks good, with some minor comments (we will also need the newly added check to run and pass first).
| name: llama-adapters | ||
| namespace: $E2E_NS | ||
| labels: | ||
| inference.llm-d.io/ipp-managed: "true" |
I believe it's supposed to be `llm-d.ai` instead of `llm-d.io`, as per #28.
After this fix, the tests pass for me locally.
| name: deepseek-adapters | ||
| namespace: $E2E_NS | ||
| labels: | ||
| inference.llm-d.io/ipp-managed: "true" |
| | Streaming routing | SSE chunks returned | | ||
| | Metrics | `bbr_info`, `bbr_success_total` | | ||
|
|
||
| ## Troubleshooting |
When I had a couple of other clusters set up in kind, Envoy tried to route requests to them. Please add a troubleshooting suggestion to run `kind delete clusters --all` first to clean the kind environment.
| containers: | ||
| - name: payload-processor | ||
| image: $E2E_IMAGE | ||
| imagePullPolicy: Never |
This should be `IfNotPresent` if we want to test on a cluster other than kind. That requires pushing an image to ghcr.io and might need some more changes, so it can be dealt with in another issue / PR if you prefer to keep this PR for kind only.
Keeping it for now; will address it in a follow-up PR.
|
|
||
| - name: Run E2E tests | ||
| run: | | ||
| E2E_IMAGE=ghcr.io/llm-d/llm-d-inference-payload-processor:e2e \ |
You don't use the Makefile / script here. I think it would be better to use them so there is a single source of truth.
1b2b395 to 9de9955
|
Your PR is large. Please consider breaking it into multiple PRs. The |
aradhalevy left a comment
LGTM.
I think this is fine even though it is a large PR, as all the code is needed and relevant to this minimal e2e test suite.
|
|
||
| var ( | ||
| testConfig *testutils.TestConfig | ||
| ppImage string |
nit: it would be better to align the name with `ipp` rather than `pp` (`ipp` was the agreed acronym).
| | Base model routing | Pool routing via header | | ||
| | LoRA adapter routing | ConfigMap adapter lookup | | ||
| | Streaming routing | SSE chunks returned | | ||
| | Metrics | `bbr_info`, `bbr_success_total` | |
as a follow-up, we should rename all metrics to use ipp instead of bbr.
not a blocker.
| | Streaming routing | SSE chunks returned | | ||
| | Metrics | `bbr_info`, `bbr_success_total` | | ||
|
|
||
| ## Troubleshooting |
this should probably go to a separate troubleshooting guide.
the quickstart guide should be quick and simple :)
in other words, the simplest explanation of the green path.
| @@ -0,0 +1,164 @@ | |||
| # Llama model server simulator | |||
| apiVersion: apps/v1 | |||
can you explain the separation between e2e-deployment and deepseek-model-server?
I see deepseek has deployment + svc.
here I see deployment + svc for a llama plus adapter of deepseek + llama + many other CRs.
not sure I understand the separation.
| - '!**/*.md' | ||
| - '!LICENSE' | ||
| - '!OWNERS' | ||
|
|
can you move this logic into the file `ci-pr-checks.yaml` (and, while you are at it, remove the Python lint and build jobs at the end of that file)?
9de9955 to 3c561d1
|
@shmuelk can you please review this PR when you have time? |
|
@nirrozenbaum I took a very quick look at this PR. I don't like its structure. This E2E test looks a lot more like the old IGW E2E test than like the scheduler's E2E test. @roytman restructured the end-to-end test and the development environment on Kind to use the same K8S YAML and config YAML files where possible. Following that idea here will make it easier to put together a development environment on Kind.
@asaadbalum can you please take a look at @shmuelk's feedback and work towards making the e2e tests work the way they do in llm-d scheduler (or, under its new name, llm-d router)?
Adds end-to-end tests that deploy a complete stack on a Kind cluster:
Envoy proxy (v1.33, FULL_DUPLEX_STREAMED ext_proc), Payload Processor,
Llama and DeepSeek model-server simulators, and adapter ConfigMaps.
Kubernetes manifests live under deploy/ following the llm-d-router
pattern: shared components (deploy/components/) and environment-specific
infrastructure (deploy/environments/dev/e2e-infra/). Test code references
these manifests via relative paths with ${VAR} substitution.
Tests cover base-model routing, LoRA adapter resolution, streaming
requests, and ipp_* metrics exposure.
Signed-off-by: Asaad Balum <asaad.balum@gmail.com>
3c561d1 to 2a0668c
|
cc for another pair of eyes: @noyitz |
Summary
Add end-to-end tests that deploy a complete Envoy + Payload Processor + model-server-simulator stack on a Kind cluster and validate core functionality through the actual ext_proc gRPC pipeline.
- `model` field extraction from `/v1/chat/completions` and `/v1/completions` bodies routes Llama and DeepSeek requests to the correct pools via the `X-Gateway-Base-Model-Name` header.
- Streaming requests (`"stream": true`) return SSE `text/event-stream` chunks through the full Envoy → Payload Processor → model-server pipeline.
- `ipp_info` and `ipp_success_total` Prometheus metrics are populated after traffic flows.
- … `ci-pr-checks.yaml`, skipping docs-only changes. Removed unused `python-lint` and `container-build` jobs.

Manifest structure
Kubernetes manifests live under `deploy/` following the llm-d-router pattern: shared components (`deploy/components/`) and environment-specific infrastructure (`deploy/environments/dev/e2e-infra/`). Each component directory includes a `kustomization.yaml`. Test code references these manifests via relative paths with `${VAR}` substitution, enabling reuse for both E2E tests and local Kind development.

The E2E Envoy configuration mirrors production:
- `request_body_mode: FULL_DUPLEX_STREAMED`
- `allow_mode_override: true`, `failure_mode_allow: false`
- `request_trailer_mode: SEND`, `response_trailer_mode: SKIP`

New files
- `deploy/components/ipp/deployment.yaml`
- `deploy/components/ipp/service.yaml`
- `deploy/components/ipp/rbac.yaml`
- `deploy/components/ipp/kustomization.yaml`
- `deploy/components/model-server/llama/deployment.yaml`
- `deploy/components/model-server/llama/kustomization.yaml`
- `deploy/components/model-server/deepseek/deployment.yaml`
- `deploy/components/model-server/deepseek/kustomization.yaml`
- `deploy/environments/dev/e2e-infra/envoy.yaml`
- `deploy/environments/dev/e2e-infra/client.yaml`
- `test/e2e/e2e_suite_test.go`
- `test/e2e/e2e_test.go`
- `test/e2e/README.md`
- `test/e2e/TROUBLESHOOTING.md`
- `hack/test-e2e.sh`

Modified files
- `.github/workflows/ci-pr-checks.yaml`: … `e2e` job, removed `python-lint` and `container-build`
- `Makefile`: … `test-e2e`, `image-build-local`, `image-kind` targets

Test plan
- … (`7 Passed | 0 Failed`)
- … (`go test ./...`)
- … (`go vet` + build tags)
- … `kubectl exec curl` for each scenario
- … (`make image-kind && make test-e2e`)

Closes #14