Conversation
Internal regression failed: Build ID #763
Internal regression succeeded 🍏: Build ID #764
    if llm_config == GROK_OCI_API_KEY_CONFIG:
        pytest.skip("OCI grok returns empty logprobs")
    if llm_config == COHERE_OCI_API_KEY_CONFIG:
        pytest.skip("Gtp-OSS does not support returning logprobs")
the string seems wrong
        pytest.skip("OCI grok returns empty logprobs")
    if llm_config == COHERE_OCI_API_KEY_CONFIG:
        pytest.skip("Gtp-OSS does not support returning logprobs")
    if llm_config == COHERE_OCI_API_KEY_CONFIG:
this `if` condition seems the same as the previous one?
def test_hosted_llm_can_return_logprobs_if_supported(llm_config):

    if llm_config == OLLAMA_MODEL_CONFIG:
        pytest.skip("Ollama hosted models sometimes does not return logprobs")
instead of skipping, we might want to check that we get an expected exception or expected default output
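One way to make this suggestion concrete — a minimal sketch with a hypothetical `generate` stub standing in for the hosted-model call, asserting on an assumed fallback behaviour instead of skipping:

```python
# Hypothetical stand-in for the hosted-model call; the real test would hit Ollama.
def generate(prompt):
    # Simulate a provider that omits logprobs in its response.
    return {"text": "hello", "logprobs": None}

def test_logprobs_absent_yields_expected_default():
    # Instead of pytest.skip, assert the documented fallback behaviour:
    # when the provider returns no logprobs, the field is present but None.
    result = generate("hi")
    assert "logprobs" in result
    assert result["logprobs"] is None

test_logprobs_absent_yields_expected_default()
```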
token: str
"""The literal text of the generated token."""
logprob: float
"""The log probability assigned to the generated token."""
what's the range of values? should be documented and normalized across providers
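For the docstring, the mathematical range is fixed regardless of provider: a log probability is the natural log of a probability, so valid values lie in (-inf, 0]. A tiny illustration:

```python
import math

# A log probability is ln(p) for some probability p in (0, 1], so its range is
# (-inf, 0]: p = 1.0 maps to logprob 0.0; smaller probabilities are negative.
def logprob_to_prob(logprob: float) -> float:
    return math.exp(logprob)

assert logprob_to_prob(0.0) == 1.0          # certainty
assert 0.0 < logprob_to_prob(-2.3) < 1.0    # any negative logprob is a valid prob
```

Whether every provider actually respects this range (rather than e.g. returning placeholder values) is the normalization question and would need checking per backend.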
generation_config:
    Optional generation arguments for the LLM generation in this step. See ``LlmGenerationConfig`` for available parameters.
top_logprobs:
    If not None, the step will return the top logprobs for each token.
for each of the top ``top_logprobs`` tokens?
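An illustrative shape (assumed, not taken from WayFlow's source) that would resolve the ambiguity: with ``top_logprobs=2``, each generated token carries the two highest-probability candidates alongside its own logprob.

```python
# Illustrative only: one entry per generated token, each listing the
# top_logprobs=2 highest-probability candidate tokens at that position.
token_entry = {
    "token": "Paris",
    "logprob": -0.1,
    "top_logprobs": [
        {"token": "Paris", "logprob": -0.1},
        {"token": "London", "logprob": -2.9},
    ],
}
assert len(token_entry["top_logprobs"]) == 2
```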
Python script/notebook for this guide.

Generation parameters, such as temperature, top-p, and the maximum number of output tokens, are important for achieving the desired performance with Large Language Models (LLMs).
Generation parameters, such as temperature, top-p, the maximum number of output tokens, and token log probabilities, are important for achieving the desired performance with Large Language Models (LLMs).
Suggested change:
- Generation parameters, such as temperature, top-p, the maximum number of output tokens, and token log probabilities, are important for achieving the desired performance with Large Language Models (LLMs).
+ Generation parameters, such as temperature, top-p, the maximum number of output tokens, and per-token log-probabilities, are important for achieving the desired performance with Large Language Models (LLMs).
The :ref:`LLM generation config <llmgenerationconfig>` is the set of parameters that control the output of a :ref:`Large Language Model (LLM) <llmmodel>` in WayFlow.
These parameters include the maximum number of tokens to generate (``max_tokens``), the sampling ``temperature``, and the probability threshold for nucleus sampling (``top_p``).
These parameters include the maximum number of tokens to generate (``max_tokens``), the sampling ``temperature``, the probability threshold for nucleus sampling (``top_p``), and optional token log probabilities (``top_logprobs``).
Suggested change:
- These parameters include the maximum number of tokens to generate (``max_tokens``), the sampling ``temperature``, the probability threshold for nucleus sampling (``top_p``), and optional token log probabilities (``top_logprobs``).
+ These parameters include the maximum number of tokens to generate (``max_tokens``), the sampling ``temperature``, the probability threshold for nucleus sampling (``top_p``), and optional per-token log-probabilities (``top_logprobs``).
continue

new_logprob = TextTokenLogProb(
    token=max_log_prob_token, logprob=max_log_prob, top_logprobs=top_log_probs
TextTokenLogProb is supposed to represent the actual emitted token and its logprob. Here we’re deriving it from top_logprobs by picking the max-probability candidate, which is only the argmax token, not necessarily the returned token. Under sampling, those can differ. OCI already gives us the canonical emitted token/logprob in choice_dict.logprobs.tokens and choice_dict.logprobs.token_logprobs; top_logprobs should only populate the alternate candidates.
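A sketch of the fix this comment suggests, using plain dicts with hypothetical field names that mirror the response shape described above (`tokens` / `token_logprobs` / `top_logprobs`), rather than the real OCI SDK objects:

```python
# Build one entry per emitted token from the canonical fields, instead of
# taking the argmax of top_logprobs (which can differ under sampling).
def build_token_logprobs(logprobs_dict):
    out = []
    for token, logprob, candidates in zip(
        logprobs_dict["tokens"],           # canonical emitted tokens
        logprobs_dict["token_logprobs"],   # their logprobs
        logprobs_dict["top_logprobs"],     # alternate candidates only
    ):
        out.append({"token": token, "logprob": logprob, "top_logprobs": candidates})
    return out

sampled = build_token_logprobs({
    "tokens": ["Hi"],
    "token_logprobs": [-1.2],
    "top_logprobs": [[{"token": "Hello", "logprob": -0.4}]],
})
# The emitted token ("Hi") is preserved even though the argmax candidate is "Hello".
```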
if "summary" not in generation_config.extra_args["reasoning"]:
    generation_config.extra_args["reasoning"]["summary"] = "auto"

kwargs.update(generation_config.extra_args)
if generation_config.extra_args contains "include", this would overwrite the kwargs["include"].append("message.output_text.logprobs") you set above. Please check if the same problem with overriding exists for the chat completions processor.
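A defensive merge along the lines this comment asks for — a sketch with illustrative names, not WayFlow's actual API — that extends list-valued keys like `"include"` instead of letting `extra_args` clobber them:

```python
# Merge extra_args into kwargs without clobbering list-valued keys such as
# "include": list values are concatenated (deduplicated), scalars still override.
def merge_generation_kwargs(kwargs, extra_args):
    merged = dict(kwargs)
    for key, value in extra_args.items():
        if isinstance(merged.get(key), list) and isinstance(value, list):
            merged[key] = merged[key] + [v for v in value if v not in merged[key]]
        else:
            merged[key] = value
    return merged

merged = merge_generation_kwargs(
    {"include": ["message.output_text.logprobs"]},
    {"include": ["reasoning.encrypted_content"], "temperature": 0.2},
)
assert merged["include"] == [
    "message.output_text.logprobs",
    "reasoning.encrypted_content",
]
```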
if "usage" in json_object and json_object["usage"] is not None:
    raw_usage = json_object["usage"]
    token_usage = self._extract_usage(raw_usage)
yield StreamChunkType.TEXT_CHUNK, Message(
Should we not attach logprobs here too? Same for the responses processor. Like `delta["logprobs"]`?
)
logger.warning(warning_message)
raise ValueError(warning_message)
outputs[self.LOGPROBS] = text_chunk.logprobs
Would this overwrite a user-defined output property also called "logprobs"?
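One way to guard against the collision this comment raises — illustrative helper and names only, not WayFlow's code — is to refuse to write the reserved key over a user-declared output:

```python
# Reject the reserved "logprobs" key if the user already declared an output
# with that name, instead of silently overwriting it.
def set_logprobs_output(outputs, logprobs, user_defined_keys):
    if "logprobs" in user_defined_keys:
        raise ValueError(
            "'logprobs' is reserved by the step; rename the user-defined output"
        )
    outputs["logprobs"] = logprobs
    return outputs

outputs = set_logprobs_output({}, [{"token": "a", "logprob": -0.1}], set())
assert "logprobs" in outputs
```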
Add token logprobs support with the `top_logprobs` generation config parameter and support returning
logprobs in the `PromptExecutionStep`.
Please add a "for more information, please read the guide on :ref:`how to ... <ref_to_section on request token log probabilities>`" note.
sonleoracle left a comment:
Please rebase, and please ensure this works for GeminiModel or raise
Adding support for LogProbs in Wayflow: #126