
Add logprobs support #127

Open
jschweiz wants to merge 6 commits into main from add-logprobs-support-2

Conversation

@jschweiz
Member

Adding support for LogProbs in WayFlow: #126

@jschweiz jschweiz requested a review from a team March 27, 2026 15:58
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Mar 27, 2026
@jschweiz jschweiz requested a review from sonleoracle March 27, 2026 15:58
@dhilloulinoracle
Contributor

Internal regression failed: Build ID #763

@dhilloulinoracle
Contributor

Internal regression succeeded 🍏: Build ID #764

@jschweiz jschweiz changed the title Add logprobs support 2 Add logprobs support Mar 27, 2026
if llm_config == GROK_OCI_API_KEY_CONFIG:
    pytest.skip("OCI grok returns empty logprobs")
if llm_config == COHERE_OCI_API_KEY_CONFIG:
    pytest.skip("Gtp-OSS does not support returning logprobs")
Contributor


the string seems wrong

    pytest.skip("OCI grok returns empty logprobs")
if llm_config == COHERE_OCI_API_KEY_CONFIG:
    pytest.skip("Gtp-OSS does not support returning logprobs")
if llm_config == COHERE_OCI_API_KEY_CONFIG:
Contributor


the if seems the same as the previous one?

def test_hosted_llm_can_return_logprobs_if_supported(llm_config):

    if llm_config == OLLAMA_MODEL_CONFIG:
        pytest.skip("Ollama hosted models sometimes does not return logprobs")
Contributor


instead of skipping, we might want to check that we get an expected exception or expected default output

token: str
"""The literal text of the generated token."""
logprob: float
"""The log probability assigned to the generated token."""
Contributor


what's the range of values? should be documented and normalized across providers
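For context on the range question: token log probabilities are natural-log values in (-inf, 0], where 0.0 means the model assigned probability 1.0 to the token. A minimal sketch (plain Python, not WayFlow API; the function name is made up for illustration):

```python
import math

# A log probability is ln(p) for p in (0, 1], so it is always <= 0.
# Providers that return positive values here are returning something else.
def logprob_to_probability(logprob: float) -> float:
    """Convert a token log probability back to a plain probability."""
    return math.exp(logprob)

print(logprob_to_probability(0.0))   # 1.0 (the model was certain)
print(logprob_to_probability(-1.0))  # ~0.3679
```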

generation_config:
    Optional generation arguments for the LLM generation in this step. See ``LlmGenerationConfig`` for available parameters.
top_logprobs:
    If not None, the step will return the top logprobs for each token.
Contributor


for each of the top ``top_logprobs`` tokens?

Python script/notebook for this guide.

Generation parameters, such as temperature, top-p, and the maximum number of output tokens, are important for achieving the desired performance with Large Language Models (LLMs).
Generation parameters, such as temperature, top-p, the maximum number of output tokens, and token log probabilities, are important for achieving the desired performance with Large Language Models (LLMs).
Member


Suggested change
Generation parameters, such as temperature, top-p, the maximum number of output tokens, and token log probabilities, are important for achieving the desired performance with Large Language Models (LLMs).
Generation parameters, such as temperature, top-p, the maximum number of output tokens, and per-token log-probabilities, are important for achieving the desired performance with Large Language Models (LLMs).


The :ref:`LLM generation config <llmgenerationconfig>` is the set of parameters that control the output of a :ref:`Large Language Model (LLM) <llmmodel>` in WayFlow.
These parameters include the maximum number of tokens to generate (``max_tokens``), the sampling ``temperature``, and the probability threshold for nucleus sampling (``top_p``).
These parameters include the maximum number of tokens to generate (``max_tokens``), the sampling ``temperature``, the probability threshold for nucleus sampling (``top_p``), and optional token log probabilities (``top_logprobs``).
Member


Suggested change
These parameters include the maximum number of tokens to generate (``max_tokens``), the sampling ``temperature``, the probability threshold for nucleus sampling (``top_p``), and optional token log probabilities (``top_logprobs``).
These parameters include the maximum number of tokens to generate (``max_tokens``), the sampling ``temperature``, the probability threshold for nucleus sampling (``top_p``), and optional per-token log-probabilities (``top_logprobs``).


new_logprob = TextTokenLogProb(
    token=max_log_prob_token, logprob=max_log_prob, top_logprobs=top_log_probs
)
Member


TextTokenLogProb is supposed to represent the actual emitted token and its logprob. Here we’re deriving it from top_logprobs by picking the max-probability candidate, which is only the argmax token, not necessarily the returned token. Under sampling, those can differ. OCI already gives us the canonical emitted token/logprob in choice_dict.logprobs.tokens and choice_dict.logprobs.token_logprobs; top_logprobs should only populate the alternate candidates.
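A minimal sketch of the distinction, using a hypothetical payload modeled on the OCI shape described above (`tokens`, `token_logprobs`, `top_logprobs` as parallel per-position fields; the exact structure may differ):

```python
# Hypothetical OCI-style logprobs payload: the emitted token and its logprob
# come from `tokens`/`token_logprobs`; `top_logprobs` lists the candidates,
# including the argmax, which under sampling need not be the emitted token.
logprobs = {
    "tokens": ["world"],                                # actually emitted
    "token_logprobs": [-1.2],                           # its logprob
    "top_logprobs": [{"Hello": -0.4, "world": -1.2}],   # alternate candidates
}

for i, emitted in enumerate(logprobs["tokens"]):
    candidates = logprobs["top_logprobs"][i]
    argmax = max(candidates, key=candidates.get)
    # Picking the max-probability candidate here would report "Hello",
    # even though the model sampled and returned "world".
    print(emitted, logprobs["token_logprobs"][i], argmax)
```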

if "summary" not in generation_config.extra_args["reasoning"]:
    generation_config.extra_args["reasoning"]["summary"] = "auto"

kwargs.update(generation_config.extra_args)
Member


if generation_config.extra_args contains "include", this would overwrite the kwargs["include"].append("message.output_text.logprobs") you set above. Please check if the same problem with overriding exists for the chat completions processor.
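One way to avoid the clobbering, sketched with a hypothetical helper (not the WayFlow API): merge list-valued keys such as "include" instead of letting `dict.update` replace them wholesale.

```python
def merge_generation_kwargs(kwargs: dict, extra_args: dict) -> dict:
    """Merge extra_args into kwargs, concatenating list values such as
    "include" rather than overwriting them. A sketch, not the real code."""
    merged = dict(kwargs)
    for key, value in extra_args.items():
        if isinstance(merged.get(key), list) and isinstance(value, list):
            # Preserve entries already set, e.g. "message.output_text.logprobs".
            merged[key] = merged[key] + [v for v in value if v not in merged[key]]
        else:
            merged[key] = value
    return merged

kwargs = {"include": ["message.output_text.logprobs"]}
extra = {"include": ["reasoning.encrypted_content"], "temperature": 0.2}
merged = merge_generation_kwargs(kwargs, extra)
print(merged)
```

With plain `kwargs.update(extra)`, the "include" entry set for logprobs would be replaced by the user's list; the merge keeps both.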

if "usage" in json_object and json_object["usage"] is not None:
    raw_usage = json_object["usage"]
    token_usage = self._extract_usage(raw_usage)
yield StreamChunkType.TEXT_CHUNK, Message(
Member


Should we not attach logprobs here too? Same for the responses processor. Like delta["logprobs"]?
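In spirit, the suggestion looks like this hedged sketch (the chunk shape and field names are hypothetical; the real streaming delta and Message fields may differ):

```python
def logprobs_from_delta(delta: dict):
    """Pull per-token logprobs out of a streaming delta, if the provider
    sent any; return None otherwise. Hypothetical chunk shape."""
    logprobs = delta.get("logprobs")
    if not logprobs:
        return None
    # A list of {token, logprob, top_logprobs} entries for this chunk,
    # to be attached to the yielded Message alongside the text.
    return logprobs.get("content")

delta = {"content": "Hi", "logprobs": {"content": [{"token": "Hi", "logprob": -0.1}]}}
print(logprobs_from_delta(delta))
print(logprobs_from_delta({"content": "Hi"}))  # None when the delta has no logprobs
```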

)
logger.warning(warning_message)
raise ValueError(warning_message)
outputs[self.LOGPROBS] = text_chunk.logprobs
Member


Would this overwrite a user-defined output property also called "logprobs"?


Add token logprobs support with the `top_logprobs` generation config parameter and support returning
logprobs in the `PromptExecutionStep`.

Member


Please add a

for more information please read the guide on :ref:`how to ... <ref_to_section on request token log probabilities>`

Member

@sonleoracle sonleoracle left a comment


Please rebase, and please ensure this works for GeminiModel or raise
