Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
40 changes: 40 additions & 0 deletions examples/10_Agentic_Inference/qwen_agentic_benchmark.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
name: "qwen-agentic-benchmark"
version: "1.0"
type: "online"

model_params:
name: "Qwen/Qwen3.6-35B-A3B"
temperature: 1.0
top_k: 20
top_p: 0.95
repetition_penalty: 1.0
presence_penalty: 1.5
max_new_tokens: 8192
chat_template_kwargs:
preserve_thinking: true

datasets:
- name: agentic_combined
type: performance
path: /path/to/agentic_combined.jsonl
accuracy_config:
eval_method: agentic_inference_inline # required benchmark default.
agentic_inference:
enable_salt: true # do not change.
inject_tool_delay: true # do not change.
Comment on lines +22 to +24

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To ensure compliance with the benchmark invariants and to make the configuration complete, please explicitly specify num_trajectories_to_issue and stop_issuing_on_first_user_complete under agentic_inference.

    agentic_inference:
      enable_salt: true # do not change.
      inject_tool_delay: true # do not change.
      num_trajectories_to_issue: 990 # Should be integer multiple of dataset trajectory count.
      stop_issuing_on_first_user_complete: false # required benchmark default.


Comment on lines +22 to +25
settings:
runtime:
min_duration_ms: 0
max_duration_ms: 36000000

load_pattern:
type: agentic_inference
target_concurrency: 8 # Submission-specific concurrency.
Comment on lines +31 to +33

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The settings.client configuration is missing. For official agentic benchmark runs, the client settings warmup_connections: 0 and max_idle_time: 0.5 are required invariants to ensure consistent and comparable performance results.

  load_pattern:
    type: agentic_inference
    target_concurrency: 8 # Submission-specific concurrency.

  client:
    warmup_connections: 0
    max_idle_time: 0.5


Comment on lines +31 to +34
endpoint_config:
endpoints:
- "http://localhost:30000"
api_type: openai

report_dir: logs/qwen_agentic
Loading