
Refresh Responses API support and add latest OpenAI models #38

Open
nibzard wants to merge 1 commit into simonw:main from nibzard:feat/responses-refresh

Conversation


@nibzard nibzard commented Mar 17, 2026

Summary

This consolidates the stale OpenAI backlog into one patch:

  • adds the missing current model IDs, including gpt-5.4, gpt-5.4-pro, gpt-5.3-codex, the gpt-5.2* and gpt-5.1* families, missing snapshots like o3-pro-2025-06-10, and the missing gpt-4 / gpt-4-turbo Responses models
  • switches current system prompts to the instructions parameter
  • passes reasoning options via the nested reasoning object, including reasoning_summary plus newer effort values like minimal and xhigh
  • uses previous_response_id when a stored prior response is available, with fallback to full history when chaining is unsafe
  • adds OpenAI-hosted web_search tool options and documents them in the README
  • updates tests to cover model registration, request shaping, web search configuration, and response chaining, and isolates the test plugin environment so unrelated installed LLM plugins do not interfere

Issues

This is intended to resolve:

It also covers the request-shaping side of #9 by exposing hosted web_search, but it does not attempt to solve any remaining upstream annotation/citation display gaps in llm core.

Testing

  • pytest -q


nibzard commented Mar 17, 2026

@simonw you could use a prompt like this one (also vibed from the session log) to recreate this:

You are working in the `simonw/llm-openai-plugin` repository. Your task is to produce one consolidated, production-quality patch that refreshes the
  plugin’s OpenAI Responses API support and closes the stale backlog around missing modern OpenAI model support.

  Context:
  - This plugin implements OpenAI models for LLM using the Responses API.
  - As of March 17, 2026, the repo backlog is fragmented across multiple open issues and overlapping PRs.
  - The work should be consolidated into one coherent change, not split into separate partial fixes.
  - Before changing code, inspect the current repo, open issues, and open PRs, and verify current OpenAI model availability against official OpenAI
  documentation and/or the live Models API.
  - Do not guess on “latest” model names. Verify them.

  Primary goal:
  Bring the plugin up to date with the current OpenAI Responses API model catalog and request shape, while preserving existing behavior and adding
  focused tests.

  Issues/PR themes to consolidate:
  - Add support for the latest OpenAI models, including GPT-5.4.
  - Implement broader missing model coverage, including older Responses-compatible GPT models that were absent.
  - Switch from system prompt messages to the Responses API `instructions` parameter.
  - Support reasoning summaries and ensure reasoning options are actually passed via the nested `reasoning` object.
  - Use `previous_response_id` for conversation chaining when safe.
  - Add support for OpenAI-hosted platform tools, especially `web_search`.
  - Update README/documentation to match the new capabilities.
  - Add or update tests so the patch is reviewable and reliable.

  Implementation requirements:

  1. Refresh model registration
  - Update the model registry in `llm_openai.py` to include current Responses-compatible OpenAI models verified as of March 17, 2026.
  - Include, at minimum:
    - `gpt-5.4`
    - `gpt-5.4-2026-03-05`
    - `gpt-5.4-pro`
    - `gpt-5.4-pro-2026-03-05`
    - `gpt-5.3-chat-latest`
    - `gpt-5.3-codex`
    - `gpt-5.2`
    - `gpt-5.2-2025-12-11`
    - `gpt-5.2-chat-latest`
    - `gpt-5.2-codex`
    - `gpt-5.2-pro`
    - `gpt-5.2-pro-2025-12-11`
    - `gpt-5.1`
    - `gpt-5.1-2025-11-13`
    - `gpt-5.1-chat-latest`
    - `gpt-5.1-codex`
    - `gpt-5.1-codex-mini`
    - `gpt-5.1-codex-max`
    - `gpt-5-chat-latest`
    - `o3-pro-2025-06-10`
  - Also restore/add missing older Responses-compatible models where appropriate, including:
    - `gpt-4`
    - `gpt-4-turbo`
    - `gpt-4-turbo-2024-04-09`
    - missing snapshot variants such as `gpt-4o-*`, `gpt-4o-mini-*`, `o1-*`, `o3-mini-*` if they are confirmed available and appropriate.
  - Preserve capability flags correctly:
    - `vision`
    - `reasoning`
    - `streaming`
    - `schemas`
  - Be careful with pro/reasoning models that should not advertise unsupported schema behavior.
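A rough sketch of the registry shape these requirements imply (`ModelSpec`, the flag defaults, and `spec_for` are illustrative assumptions, not the plugin's actual structures):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelSpec:
    model_id: str
    vision: bool = False
    reasoning: bool = False
    streaming: bool = True
    schemas: bool = True

MODELS = [
    ModelSpec("gpt-5.4", vision=True, reasoning=True),
    # Pro/reasoning models must not advertise schema or streaming
    # support they lack.
    ModelSpec("gpt-5.4-pro", vision=True, reasoning=True,
              streaming=False, schemas=False),
    ModelSpec("gpt-4-turbo", vision=True),
]

def spec_for(model_id):
    """Look up a registered model, raising KeyError if unknown."""
    for spec in MODELS:
        if spec.model_id == model_id:
            return spec
    raise KeyError(model_id)
```

Keeping the flags on one declarative record per model keeps the "pro models should not advertise schemas" rule reviewable at a glance.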

  2. Responses API prompt handling
  - Stop sending system prompts as `"role": "system"` input messages for current prompts.
  - Use the Responses API `instructions` parameter instead.
  - Preserve conversation behavior correctly.
  - If a conversation is reconstructed without `previous_response_id`, earlier conversation history may still need legacy system messages for historical
  entries; current prompt handling should use `instructions`.
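As a sketch of that split (the helper name and dict shapes here are assumptions, not the plugin's real code):

```python
def build_prompt_kwargs(system_prompt, history_items, user_text):
    """Shape one turn for the Responses API: the current system prompt
    travels as `instructions`, while replayed history keeps whatever
    roles it was recorded with (possibly legacy system messages).
    """
    input_items = list(history_items)
    input_items.append({"role": "user", "content": user_text})
    kwargs = {"input": input_items}
    if system_prompt:
        # Current system prompt rides in `instructions`,
        # not as a {"role": "system"} input message.
        kwargs["instructions"] = system_prompt
    return kwargs
```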

  3. Reasoning options
  - Ensure reasoning options are passed via a nested `reasoning` object, not top-level loose fields.
  - Support:
    - `reasoning_effort`
    - `reasoning_summary`
  - Supported effort values should include current values such as:
    - `minimal`
    - `low`
    - `medium`
    - `high`
    - `xhigh`
  - Supported summary values should include:
    - `auto`
    - `concise`
    - `detailed`
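A minimal sketch of shaping the nested object (the accepted value lists below are taken from this task description and should be re-verified against current OpenAI documentation):

```python
EFFORT_VALUES = {"minimal", "low", "medium", "high", "xhigh"}
SUMMARY_VALUES = {"auto", "concise", "detailed"}

def reasoning_payload(effort=None, summary=None):
    """Build the nested `reasoning` object, or None if nothing is set."""
    if effort and effort not in EFFORT_VALUES:
        raise ValueError(f"unsupported reasoning effort: {effort}")
    if summary and summary not in SUMMARY_VALUES:
        raise ValueError(f"unsupported reasoning summary: {summary}")
    payload = {}
    if effort:
        payload["effort"] = effort
    if summary:
        payload["summary"] = summary
    return payload or None
```

Returning `None` when neither option is set keeps the request free of an empty `reasoning: {}` field.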

  4. previous_response_id chaining
  - Implement safe use of `previous_response_id` in conversations.
  - If the previous response exists and was stored, and the current prompt is eligible, use `previous_response_id` and send only the current prompt
  payload.
  - If chaining is unsafe, especially when `store=False`, fall back to sending full conversation history.
  - Preserve behavior when system instructions change across turns.
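The chaining rule above reduces to a small decision function; this is a sketch under the assumptions stated in this section, not the plugin's actual logic:

```python
def chain_strategy(prev_response_id, prev_was_stored, store):
    """Decide between `previous_response_id` chaining and full replay.

    Chain only when a prior response exists, it was stored server-side,
    and the current request is itself stored; otherwise return an empty
    dict so the caller falls back to sending full conversation history.
    """
    if prev_response_id and prev_was_stored and store:
        return {"previous_response_id": prev_response_id}
    return {}
```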

  5. Platform tools / web search
  - Add options for OpenAI-hosted tools in the plugin, centered on `web_search`.
  - Support:
    - enabling hosted tools, e.g. `openai_tools`
    - domain filtering for web search
    - live vs cached search control
    - search context size
    - optional inclusion of web search sources in the response JSON
  - Merge hosted platform tools with user-defined LLM tools cleanly.
  - Validate obvious constraints such as the maximum number of allowed domains if applicable.
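A hedged sketch of the tool builder (the option names beyond `"type": "web_search"` and the 20-domain cap are assumptions to verify against the live Responses API reference):

```python
MAX_ALLOWED_DOMAINS = 20  # assumed cap; verify against OpenAI docs

def web_search_tool(allowed_domains=None, search_context_size=None):
    """Build a hosted `web_search` entry for the request's `tools` array."""
    tool = {"type": "web_search"}
    if allowed_domains:
        if len(allowed_domains) > MAX_ALLOWED_DOMAINS:
            raise ValueError(
                f"at most {MAX_ALLOWED_DOMAINS} allowed domains")
        tool["filters"] = {"allowed_domains": list(allowed_domains)}
    if search_context_size:
        tool["search_context_size"] = search_context_size
    return tool
```

Hosted tool entries built this way can then be appended to whatever list of user-defined LLM tool definitions the plugin already assembles.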

  6. Documentation
  - Update `README.md` so it no longer describes the plugin as effectively only useful for `o1-pro`.
  - Show that the plugin is now useful for current Responses API access, latest models, reasoning controls, conversation chaining, and hosted tools.
  - Regenerate any model list block in the README if it is auto-generated.
  - Add a short web search usage example and document the relevant options.

  7. Tests
  - Add/update focused tests in `tests/test_openai.py` for:
    - new model registration
    - pro model schema capability behavior
    - `instructions` request shaping
    - reasoning request shaping
    - web search request shaping
    - validation of web search domain limits
    - `previous_response_id` request shaping and fallback behavior
  - Keep or preserve existing recorded API tests.
  - If the test environment loads unrelated installed LLM plugins and that breaks VCR cassettes, isolate plugin loading in test setup so only this plugin
  participates during test runs.
  - Make sure the full suite passes.
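One plausible isolation approach, assuming llm's `LLM_LOAD_PLUGINS` environment variable restricts which plugins load (check llm's plugin documentation before relying on it); in a conftest this would typically be applied via pytest's `monkeypatch.setenv`:

```python
import os

PLUGIN_NAME = "llm-openai-plugin"  # this repository's plugin

def isolated_env(base=None):
    """Return an environment mapping in which llm loads only this plugin."""
    env = dict(base if base is not None else os.environ)
    env["LLM_LOAD_PLUGINS"] = PLUGIN_NAME
    return env
```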

  Files likely to change:
  - `llm_openai.py`
  - `README.md`
  - `tests/test_openai.py`
  - `tests/conftest.py`

  Process requirements:
  - Inspect the current code before editing.
  - Review the existing open issues and PRs and fold in the compatible parts rather than duplicating stale or conflicting approaches.
  - Verify current OpenAI model names using official OpenAI sources before changing the registry.
  - Prefer a single coherent patch over multiple partial changes.
  - Do not add unrelated features such as a new image-generation CLI command or a “retrieve stored responses later” CLI unless strictly necessary; those
  are separate design tasks.
  - Keep the patch pragmatic and reviewable.

  Validation:
  - Run the test suite locally.
  - Expected final validation command:
    - `pytest -q`
  - The patch is complete only when tests pass and the README reflects the new behavior.

  Deliverable:
  - Produce one final integrated patch suitable for a PR titled roughly:
    - `Refresh Responses API support and add latest OpenAI models`

  In the final summary, clearly state:
  - which backlog items were effectively covered,
  - which exact model families were added,
  - that tests pass,
  - and which larger backlog items were intentionally left out because they require separate CLI/product design.

@toruvieI

I support this, and I offer to help test it.

