Check for stop_reason: max_tokens when using structured data #315

@chendaniely

When the structured output has a lot of text to extract, chat_structured() fails with a JSONDecodeError.

With the help of Claude, I found that the cause was chat_structured() having a max token count for the response. I changed my code to add a kwargs={"max_tokens": 16000} argument.

From:

dat = chat.chat_structured(pdf, data_model=CommentResults)

To:

dat = chat.chat_structured(pdf, data_model=CommentResults, kwargs={"max_tokens": 16000})

The JSON response should come back with a "stop_reason: max_tokens" value, so chatlas could surface this to the user. A friendlier error would point the user to adjusting the model's max_tokens value instead of failing with a JSONDecodeError.
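A minimal sketch of the kind of check being requested, assuming the provider response exposes the raw text and a stop_reason field. FakeCompletion, TruncatedOutputError, and parse_structured_text are hypothetical names used only for illustration; they are not part of chatlas or the Anthropic SDK, and the standard-library json module stands in for orjson.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeCompletion:
    """Hypothetical stand-in for a provider response (not a chatlas class)."""

    text: str
    stop_reason: str


class TruncatedOutputError(ValueError):
    """Hypothetical error for output that was cut off by the token limit."""


def parse_structured_text(completion: FakeCompletion) -> dict:
    # Inspect the stop reason before parsing: truncated JSON would
    # otherwise surface as an opaque decode error.
    if completion.stop_reason == "max_tokens":
        raise TruncatedOutputError(
            "The response was cut off because it reached the max_tokens limit. "
            'Try a larger value, e.g. kwargs={"max_tokens": 16000}.'
        )
    return json.loads(completion.text)


# A complete response parses normally.
complete = FakeCompletion(text='{"comments": ["first"]}', stop_reason="end_turn")
print(parse_structured_text(complete))

# A truncated response raises the friendlier error instead of a decode error.
truncated = FakeCompletion(text='{"comments": ["first", "sec', stop_reason="max_tokens")
try:
    parse_structured_text(truncated)
except TruncatedOutputError as err:
    print(err)
```

Checking the stop reason first means the user sees the actual cause (the token limit) rather than a symptom (malformed JSON at some character offset).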

Full traceback of the JSONDecodeError

JSONDecodeError: unexpected end of data: line 1 column 20679 (char 20678)
Cell In[53], line 1
----> 1 dat = chat.chat_structured(pdf, data_model=CommentResults)

File ~/.venv/lib/python3.14/site-packages/chatlas/_chat.py:1443, in Chat.chat_structured(self, data_model, echo, stream, kwargs, *args)
   1407 def chat_structured(
   1408     self,
   1409     *args: Content | str,
   (...)
   1413     kwargs: Optional[SubmitInputArgsT] = None,
   1414 ) -> BaseModelT:
   1415     """
   1416     Extract structured data.
   1417 
   (...)
   1441         An instance of the provided `data_model` containing the extracted data.
   1442     """
-> 1443     dat = self._submit_and_extract_data(
   1444         *args,
   1445         data_model=data_model,
   1446         echo=echo,
   1447         stream=stream,
   1448         kwargs=kwargs,
   1449     )
   1450     return data_model.model_validate(dat)

File ~/.venv/lib/python3.14/site-packages/chatlas/_chat.py:1497, in Chat._submit_and_extract_data(self, data_model, echo, stream, kwargs, *args)
   1485 response = ChatResponse(
   1486     self._submit_turns(
   1487         user_turn(*args, prior_turns=self.get_turns()),
   (...)
   1493     )
   1494 )
   1496 with display:
-> 1497     for _ in response:
   1498         pass
   1500 turn = self.get_last_turn()

File ~/.venv/lib/python3.14/site-packages/chatlas/_chat.py:3216, in ChatResponse.__next__(self)
   3215 def __next__(self) -> str:
-> 3216     chunk = next(self._generator)
   3217     self.content += chunk  # Keep track of accumulated content
   3218     return chunk

File ~/.venv/lib/python3.14/site-packages/chatlas/_chat.py:2801, in Chat._submit_turns(self, user_turn, echo, stream, data_model, kwargs, content_mode, controller)
   2792 else:
   2793     response = self.provider.chat_perform(
   2794         stream=False,
   2795         turns=[*self._turns, user_turn],
   (...)
   2798         kwargs=all_kwargs,
   2799     )
-> 2801     turn = self.provider.value_turn(
   2802         response, has_data_model=data_model is not None
   2803     )
   2804     if turn.text:
   2805         emit(turn.text)

File ~/.venv/lib/python3.14/site-packages/chatlas/_provider_anthropic.py:567, in AnthropicProvider.value_turn(self, completion, has_data_model)
    566 def value_turn(self, completion, has_data_model):
--> 567     return self._as_turn(completion, has_data_model)

File ~/.venv/lib/python3.14/site-packages/chatlas/_provider_anthropic.py:811, in AnthropicProvider._as_turn(self, completion, has_data_model)
    809 if content.type == "text":
    810     if uses_new_output_format:
--> 811         contents.append(ContentJson(value=orjson.loads(content.text)))
    812     else:
    813         contents.append(ContentText(text=content.text))

FWIW, if you set max_tokens to a very high value (e.g., 32_000), you end up with a different error:

ValueError: Streaming is required for operations that may take longer than 10 minutes. See https://github.com/anthropics/anthropic-sdk-python#long-requests for more details
