
Add reasoning_effort parameter support for API calls#32

Open
spillai wants to merge 1 commit into main from claude/add-reasoning-config-Qj6T8

Conversation

@spillai
Contributor

@spillai commented Mar 15, 2026

Summary

This PR adds support for the reasoning_effort parameter in API calls, allowing users to control the reasoning effort level when making completion requests. This is useful for models that support extended reasoning capabilities.

Key Changes

  • Added reasoning_effort field to InputInfo dataclass to store the parameter
  • Updated call_completion_async() and call_completion() functions to accept and pass reasoning_effort parameter to the API via reasoning.effort in kwargs
  • Modified _call_with_retry_async() to propagate reasoning_effort through retry logic
  • Updated run_benchmark() to accept and pass reasoning_effort to completion calls
  • Added --reasoning-effort CLI argument with choices: none, low, medium, high
  • Updated print_config() to display the reasoning effort setting when configured
  • Threaded reasoning_effort parameter through the benchmark execution pipeline
  • Bumped version from 0.5.4 to 0.5.5
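The CLI wiring described above could look like the following sketch. The flag name and its choices come from this PR; the surrounding parser setup is illustrative scaffolding, not the actual `vlmbench/cli.py` code:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Minimal sketch; only the --reasoning-effort argument is from the PR.
    parser = argparse.ArgumentParser(prog="vlmbench")
    parser.add_argument(
        "--reasoning-effort",
        choices=["none", "low", "medium", "high"],
        default=None,  # optional: only sent to the API when explicitly set
        help="Reasoning effort level for models with extended reasoning.",
    )
    return parser


args = build_parser().parse_args(["--reasoning-effort", "medium"])
print(args.reasoning_effort)  # → medium
```

Leaving the default as `None` lets downstream code distinguish "not requested" from any explicit level, which matches the opt-in behavior described below.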

Implementation Details

  • The reasoning_effort parameter is optional (defaults to None) and only included in API requests when explicitly set
  • The parameter is passed to the API as {"reasoning": {"effort": reasoning_effort}} in the request kwargs
  • CLI argument is properly integrated with argument parsing and passed through to benchmark execution
  • Configuration display includes reasoning effort when set, maintaining clean output when not in use
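The opt-in request shape described above can be sketched as follows. The `{"reasoning": {"effort": ...}}` structure is from the PR description; the helper name and surrounding arguments are hypothetical:

```python
from typing import Any, Optional


def build_completion_kwargs(
    model: str,
    messages: list[dict[str, Any]],
    max_tokens: int,
    reasoning_effort: Optional[str] = None,
) -> dict[str, Any]:
    # Hypothetical helper mirroring the PR's behavior: the "reasoning"
    # field is included only when reasoning_effort is explicitly set.
    kwargs: dict[str, Any] = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
    }
    if reasoning_effort is not None:
        kwargs["reasoning"] = {"effort": reasoning_effort}
    return kwargs


# Unset: no "reasoning" key reaches the API request.
print("reasoning" in build_completion_kwargs("m", [], 256))  # → False
# Set: sent as {"reasoning": {"effort": ...}} per the PR description.
print(build_completion_kwargs("m", [], 256, "high")["reasoning"])  # → {'effort': 'high'}
```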

https://claude.ai/code/session_01Qy4Dkd9fibk9CFga2vV74R

Adds support for passing reasoning={effort: <level>} to the OpenAI-compatible
chat completions API, allowing benchmarks to compare thinking vs non-thinking
modes on models that support it (e.g. o-series, DeepSeek-R1).

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the benchmarking tool by introducing a reasoning_effort parameter, which provides granular control over the reasoning capabilities of models during API completion requests. This addition allows for more nuanced testing and evaluation of models that support varying levels of reasoning, integrating seamlessly into both the asynchronous and synchronous API call flows, as well as the command-line interface for user convenience.

Highlights

  • API Parameter Support: Added support for a new reasoning_effort parameter in API calls, allowing users to specify the reasoning level for completion requests.
  • CLI Integration: Introduced a new --reasoning-effort command-line argument with predefined choices (none, low, medium, high) to control this parameter via the CLI.
  • Core Logic Updates: Modified call_completion_async, call_completion, _call_with_retry_async, and run_benchmark functions to accept and propagate the reasoning_effort parameter throughout the benchmark execution pipeline.
  • Configuration Display: Updated the print_config function to display the configured reasoning_effort setting when it is explicitly set.
  • Version Bump: Incremented the project version from 0.5.4 to 0.5.5 to reflect these new features.


Changelog
  • vlmbench/cli.py
    • Added reasoning_effort field to the InputInfo dataclass.
    • Modified call_completion_async to accept and pass reasoning_effort to the API kwargs.
    • Modified call_completion to accept and pass reasoning_effort to the API kwargs.
    • Updated _call_with_retry_async to propagate the reasoning_effort parameter.
    • Modified run_benchmark to accept reasoning_effort and pass it to completion calls.
    • Updated print_config to include reasoning_effort in the displayed configuration.
    • Added --reasoning-effort argument to the CLI parser with specific choices.
    • Ensured reasoning_effort is passed from CLI arguments to the benchmark execution logic.
  • vlmbench/version.py
    • Updated the __version__ string from '0.5.4' to '0.5.5'.


@gemini-code-assist bot left a comment


Code Review

This pull request effectively adds support for the reasoning_effort parameter across the CLI, benchmarking logic, and API calls. The changes are well-threaded through the different functions. I have two suggestions to improve maintainability and the clarity of the configuration output. One is to address code duplication between call_completion and call_completion_async, and the other is to make the configuration key for reasoning effort more descriptive.

Comment thread vlmbench/cli.py
Comment on lines +1941 to +1942
if reasoning_effort is not None:
    kwargs["reasoning"] = {"effort": reasoning_effort}

Severity: medium

This logic for adding the reasoning parameter is also present in call_completion_async. This adds to the existing code duplication between call_completion and call_completion_async. These two functions are nearly identical and could be refactored to improve maintainability. For example, the kwargs dictionary construction could be extracted into a shared helper function.

Comment thread vlmbench/cli.py
# Config
tbl.add_row("max_tokens", str(max_tokens))
if reasoning_effort is not None:
    tbl.add_row("reasoning", reasoning_effort)

Severity: medium

For clarity and consistency with the parameter name (reasoning_effort) and the CLI argument (--reasoning-effort), it would be better to use 'reasoning_effort' as the key in the configuration output table instead of just 'reasoning'.

Suggested change:
- tbl.add_row("reasoning", reasoning_effort)
+ tbl.add_row("reasoning_effort", reasoning_effort)
