Add reasoning_effort parameter support for API calls #32
Conversation
Adds support for passing reasoning={effort: <level>} to the OpenAI-compatible
chat completions API, allowing benchmarks to compare thinking vs non-thinking
modes on models that support it (e.g. o-series, DeepSeek-R1).
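For context, a minimal sketch of the request kwargs this PR produces (the model name and message are illustrative, not from the PR):

```python
# Illustrative only: the shape of the extra field added to the request kwargs.
reasoning_effort = "high"  # e.g. "low", "medium", or "high" when set

kwargs = {
    "model": "deepseek-r1",  # example model name, an assumption
    "messages": [{"role": "user", "content": "Solve step by step: 17 * 23"}],
}
if reasoning_effort is not None:
    # Omitted entirely when no effort level is requested
    kwargs["reasoning"] = {"effort": reasoning_effort}
```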
https://claude.ai/code/session_01Qy4Dkd9fibk9CFga2vV74R
Summary of Changes (Gemini Code Assist): This pull request enhances the benchmarking tool by introducing a reasoning_effort parameter.
Code Review
This pull request effectively adds support for the reasoning_effort parameter across the CLI, benchmarking logic, and API calls. The changes are well-threaded through the different functions. I have two suggestions to improve maintainability and the clarity of the configuration output. One is to address code duplication between call_completion and call_completion_async, and the other is to make the configuration key for reasoning effort more descriptive.
```python
if reasoning_effort is not None:
    kwargs["reasoning"] = {"effort": reasoning_effort}
```
This logic for adding the reasoning parameter is also present in call_completion_async. This adds to the existing code duplication between call_completion and call_completion_async. These two functions are nearly identical and could be refactored to improve maintainability. For example, the kwargs dictionary construction could be extracted into a shared helper function.
```python
# Config
tbl.add_row("max_tokens", str(max_tokens))
if reasoning_effort is not None:
    tbl.add_row("reasoning", reasoning_effort)
```
For clarity and consistency with the parameter name (reasoning_effort) and the CLI argument (--reasoning-effort), it would be better to use 'reasoning_effort' as the key in the configuration output table instead of just 'reasoning'.
```diff
- tbl.add_row("reasoning", reasoning_effort)
+ tbl.add_row("reasoning_effort", reasoning_effort)
```
Summary
This PR adds support for the `reasoning_effort` parameter in API calls, allowing users to control the reasoning effort level when making completion requests. This is useful for models that support extended reasoning capabilities.

Key Changes
- Added a `reasoning_effort` field to the `InputInfo` dataclass to store the parameter
- Updated `call_completion_async()` and `call_completion()` to accept the `reasoning_effort` parameter and pass it to the API via `reasoning.effort` in kwargs
- Updated `_call_with_retry_async()` to propagate `reasoning_effort` through the retry logic
- Updated `run_benchmark()` to accept and pass `reasoning_effort` to completion calls
- Added a `--reasoning-effort` CLI argument with choices: `none`, `low`, `medium`, `high`
- Updated `print_config()` to display the reasoning effort setting when configured
- Threaded the `reasoning_effort` parameter through the benchmark execution pipeline

Implementation Details
- The `reasoning_effort` parameter is optional (defaults to `None`) and is only included in API requests when explicitly set
- When set, it is passed as `{"reasoning": {"effort": reasoning_effort}}` in the request kwargs

https://claude.ai/code/session_01Qy4Dkd9fibk9CFga2vV74R
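The CLI wiring described above can be sketched roughly as follows (the mapping of the literal choice `"none"` to an unset value is an assumption about the PR's behavior, not confirmed by the diff):

```python
import argparse

# Illustrative sketch of the --reasoning-effort CLI argument
parser = argparse.ArgumentParser()
parser.add_argument(
    "--reasoning-effort",
    choices=["none", "low", "medium", "high"],
    default=None,
    help="Reasoning effort to request from the model; omit to leave unset",
)

args = parser.parse_args(["--reasoning-effort", "high"])
# Assumption: the literal choice "none" means "do not send the parameter"
reasoning_effort = None if args.reasoning_effort in (None, "none") else args.reasoning_effort
```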