
[fix] http_options leaking to model init & NPU tensor serialization failure over HTTP #109

Draft
kevssim wants to merge 3 commits into modelscope:main from kevssim:main

Conversation

@kevssim (Collaborator) commented Mar 6, 2026

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

Fix two bugs in the HTTP serving path:

1. http_options leaking into model __init__

http_options was unconditionally injected into args for every app type in ServerLauncher._deploy_app(), so it was forwarded to Qwen3ForCausalLM.__init__() and triggered a TypeError. Fixed by guarding the injection with import_path == 'server'.
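A minimal sketch of the guard, assuming the names given in the description above; the launcher body here is a toy stand-in, not the actual src/twinkle/server/launcher.py code:

```python
class ServerLauncher:
    """Toy model of the launcher; only the arg-injection logic matters here."""

    def _deploy_app(self, import_path, args, http_options=None):
        # Before the fix, http_options was merged into args for every app
        # type, so model apps forwarded it into the model __init__
        # (e.g. Qwen3ForCausalLM) and raised a TypeError. Only the HTTP
        # server app actually understands http_options.
        if import_path == 'server' and http_options is not None:
            args = {**args, 'http_options': http_options}
        return args


launcher = ServerLauncher()
server_args = launcher._deploy_app('server', {'port': 8000}, {'host': '0.0.0.0'})
model_args = launcher._deploy_app('model', {'model_id': 'Qwen3'}, {'host': '0.0.0.0'})
```

With the guard in place, only the 'server' app's args carry the http_options key; model apps receive their original kwargs untouched.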

2. NPU tensor serialization failure on CPU-only client

The /forward_backward HTTP endpoint returned raw NPU tensors, which Ray could not deserialize on a CPU-only client (no NPU available). Fixed by adding forward_backward_http, which converts all tensors and arrays to plain Python scalars and lists via _to_cpu_safe_output() before returning.

@gemini-code-assist (Contributor) commented

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses two critical bugs in the HTTP serving path. It resolves an issue where http_options were inadvertently passed to model initialization, causing TypeError exceptions. Additionally, it fixes a problem with NPU tensor serialization, ensuring that outputs from the /forward_backward HTTP endpoint are correctly converted to CPU-compatible Python objects, preventing deserialization failures on client machines without NPU hardware.

Highlights

  • http_options injection fix: Prevented http_options from being incorrectly passed to model __init__ methods by guarding its injection to only 'server' app types, resolving TypeError issues.
  • NPU tensor serialization for HTTP: Implemented a mechanism to convert NPU tensors and NumPy arrays to CPU-safe Python scalars/lists before returning them from the /forward_backward HTTP endpoint, addressing deserialization failures on CPU-only clients.


Changelog
  • cookbook/client/twinkle/self_congnition.py
    • Updated loss logging to extract 'loss' specifically from the output dictionary.
  • src/twinkle/model/transformers/transformers.py
    • Added a static method _to_cpu_safe_output to convert nested data structures containing PyTorch tensors or NumPy arrays into CPU-safe Python types.
    • Introduced forward_backward_http as a new remote function that wraps forward_backward and applies _to_cpu_safe_output to its results.
  • src/twinkle/server/launcher.py
    • Modified the logic for injecting http_options into application arguments, ensuring it only occurs for 'server' type applications.
  • src/twinkle/server/twinkle/model.py
    • Updated the /forward_backward HTTP endpoint to call the new forward_backward_http method, ensuring outputs are CPU-safe.

@gemini-code-assist bot left a comment

Code Review

This pull request introduces two important bug fixes: it prevents http_options from leaking into model initializers, and it resolves a tensor serialization issue over HTTP by introducing a new forward_backward_http method. The bug fixes themselves are well implemented, but a critical security vulnerability was identified in the /upload_to_hub endpoint: when the twinkle:// prefix is not used, the lack of path validation allows arbitrary directory exfiltration. This requires immediate attention; all user-supplied paths should be strictly validated and restricted to authorized directories. There is also a minor suggestion to make the new _to_cpu_safe_output helper more concise.

Comment on lines +541 to +547
@remote_function(dispatch='slice_dp', collect='mean')
def forward_backward_http(self, *, inputs: Union[InputFeature, List[InputFeature], Trajectory, List[Trajectory]],
                          **kwargs):
    """HTTP-safe forward/backward that materializes outputs before they leave the worker."""
    outputs = self.forward_backward(inputs=inputs, **kwargs)
    return self._to_cpu_safe_output(outputs)

A collaborator replied:
forward_backward_http breaks the consistency of the interface, and transformers.py should not be modified directly for this. For special operations required by the server, inherit this class and override the method. You can refer to the writing style of tinker:

super().backward(adapter_name=adapter_name, **kwargs)
# shape (batch_size, seq_len, vocab_size)
logits = outputs['logits'].detach()
logps = outputs.get('logps', None)
if logps is not None:
    logps = logps.detach().cpu()
results = self._get_forward_output(inputs, logits, logps)
return [results, loss]
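The subclass-and-override approach the reviewer suggests could be sketched as follows. The class names and the toy base class are illustrative, not from the repository; the point is that the public forward_backward interface stays unchanged while the server-side subclass makes its outputs HTTP-safe:

```python
class TransformersModel:
    """Toy stand-in for the base model class in transformers.py."""

    def forward_backward(self, *, inputs, **kwargs):
        # Pretend these outputs contain device-bound (NPU) tensors.
        return {'loss': 0.25}


def _to_cpu_safe_output(obj):
    # Placeholder: the real helper recursively converts tensors/arrays
    # to plain Python scalars and lists.
    return obj


class HttpSafeTransformersModel(TransformersModel):
    """Server-side subclass: same interface, but outputs are
    materialized before they leave the worker."""

    def forward_backward(self, *, inputs, **kwargs):
        outputs = super().forward_backward(inputs=inputs, **kwargs)
        return _to_cpu_safe_output(outputs)
```

This keeps transformers.py generic: the HTTP-specific conversion lives only in the server's subclass, and callers of forward_backward see one consistent signature.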

@kevssim kevssim marked this pull request as draft March 6, 2026 08:31
