Merged
59 changes: 0 additions & 59 deletions inference-platforms/vllm/Dockerfile

This file was deleted.

13 changes: 8 additions & 5 deletions inference-platforms/vllm/README.md
@@ -1,7 +1,7 @@
# vLLM

This shows how to use the [vLLM OpenTelemetry POC][otel-poc] to export
OpenTelemetry traces from vLLM requests to its OpenAI compatible endpoint.
This shows how to export OpenTelemetry traces from [vLLM][vllm] requests to
its OpenAI compatible endpoint.

## Prerequisites

@@ -28,13 +28,16 @@ Once vLLM is running, use [uv][uv] to make an OpenAI request via
uv run --exact -q --env-file env.local ../chat.py
```

Or, for the OpenAI Responses API:

```bash
uv run --exact -q --env-file env.local ../chat.py --use-responses-api
```
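The `env.local` file referenced by `--env-file` is not shown in this diff. Based on the variables the Compose command expands, it presumably defines at least the following; the model name and endpoint here are placeholder assumptions, not values from the repository:

```shell
# Hypothetical env.local contents -- the real file may differ.
CHAT_MODEL=Qwen/Qwen2.5-0.5B-Instruct                    # assumed model name
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=grpc://localhost:4317 # assumed OTLP endpoint
```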

## Notes

* This does not yet support metrics, and there is no GitHub issue tracking it.
* This does not yet support logs, and there is no GitHub issue tracking it.
* Until [this issue][openai-responses] is resolved, don't use `--use-responses-api`.

---
[otel-poc]: https://github.com/vllm-project/vllm/blob/main/examples/online_serving/opentelemetry/README.md
[vllm]: https://docs.vllm.ai/en/latest/features/opentelemetry.html
[uv]: https://docs.astral.sh/uv/getting-started/installation/
[openai-responses]: https://github.com/vllm-project/vllm/issues/14721
14 changes: 12 additions & 2 deletions inference-platforms/vllm/docker-compose.yml
@@ -1,8 +1,18 @@
services:
  vllm:
    container_name: vllm
    build:
      context: .
    image: vllm/vllm-openai-cpu:v0.17.0
    entrypoint: []
    # Serve args from the prior Dockerfile CMD:
    # https://github.com/elastic/observability-examples/blob/139feb0f/inference-platforms/vllm/Dockerfile#L59
    command:
      - sh
      - -c
      - >
        vllm serve $$CHAT_MODEL
        --max-model-len=8192
        --enforce-eager
        --otlp-traces-endpoint=$$OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
    env_file:
      - env.local
    ports:
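Note the `$$` in the Compose `command`: Compose reduces `$$` to a literal `$`, so the variables are expanded by `sh` inside the container at runtime rather than interpolated by Compose. A quick way to see the resulting invocation, using placeholder values for the model and endpoint:

```shell
# Simulate what the container's shell receives: Compose turns $$VAR into $VAR,
# and sh expands it from the container environment (placeholder values here).
export CHAT_MODEL=my-model
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=grpc://otel-collector:4317
sh -c 'echo vllm serve $CHAT_MODEL --otlp-traces-endpoint=$OTEL_EXPORTER_OTLP_TRACES_ENDPOINT'
# prints: vllm serve my-model --otlp-traces-endpoint=grpc://otel-collector:4317
```

Without the doubled `$`, Compose would try to interpolate the variables itself at `docker compose up` time and warn if they were unset in the host environment.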