Qwen3-VL-2B export #68

Open
carinapeng wants to merge 9 commits into apple:main from carinapeng:carina/qwen3vl-export

@carinapeng

Contributor

CoreAI export for Qwen3-VL-2B

Add Qwen3-VL vision-language model export support.

- gpu/qwen3_vl.py: Qwen3VLForCausalLM (text decoder, input_ids) and
  Qwen3VLForCausalLMEmbeddings (inputs_embeds variant for VLM fusion)
- primitives/macos/cache_scatter.py: slice_scatter-based explicit KV cache
  (avoids stateful mutable_slice_update Metal kernel issues)
- export_qwen3vl_explicit_kv.py: text decoder export (inputs_embeds, explicit KV)
- export_vision_encoder_224.py: vision encoder export (448x448 -> 196 visual tokens)
- registry.py: register qwen3_vl model entry
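The explicit KV cache idea in the list above can be sketched as follows. This is a minimal illustration, not the code in cache_scatter.py: plain Python lists stand in for tensors, and the names `slice_scatter_rows` and `decode_step` are hypothetical. The point it shows is that an out-of-place slice write returns a new cache instead of mutating the old one, so the cache can travel as an ordinary model input and output rather than as hidden state.

```python
from typing import List, Tuple

Row = List[float]  # one cached key (or value) vector for a single position


def slice_scatter_rows(cache: List[Row], new_rows: List[Row], start: int) -> List[Row]:
    """Return a copy of `cache` with rows [start, start + len(new_rows)) replaced.

    The input `cache` is never modified; the caller receives a fresh cache.
    """
    end = start + len(new_rows)
    if start < 0 or end > len(cache):
        raise ValueError(f"write [{start}, {end}) is outside cache of length {len(cache)}")
    # Copy every row so the result shares no mutable storage with the inputs.
    return (
        [list(r) for r in cache[:start]]
        + [list(r) for r in new_rows]
        + [list(r) for r in cache[end:]]
    )


def decode_step(
    k_cache: List[Row], v_cache: List[Row], new_k: Row, new_v: Row, position: int
) -> Tuple[List[Row], List[Row]]:
    """One hypothetical decode step: caches go in, updated caches come out."""
    return (
        slice_scatter_rows(k_cache, [new_k], position),
        slice_scatter_rows(v_cache, [new_v], position),
    )


# Usage: a 4-position cache with 2-dimensional keys and values (toy sizes).
max_ctx, head_dim = 4, 2
k_cache = [[0.0] * head_dim for _ in range(max_ctx)]
v_cache = [[0.0] * head_dim for _ in range(max_ctx)]

k1, v1 = decode_step(k_cache, v_cache, [1.0, 2.0], [3.0, 4.0], position=0)
k2, v2 = decode_step(k1, v1, [5.0, 6.0], [7.0, 8.0], position=1)
```

Because each step returns new caches, the caller decides what to feed into the next step, and earlier caches such as `k_cache` and `k1` stay unchanged.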

carinapeng requested a review from stikves on June 29, 2026 17:35
Comment thread python/src/coreai_models/primitives/macos/cache_scatter.py
Comment thread python/export_qwen3vl.py
Usage:
cd <repo-root>
uv run python python/export_qwen3vl.py [--max-ctx 4096] [--num-layers N]

Contributor

Shall we use the existing pattern?

uv run coreai.vlm.export qwen3-vl

or similar

And also update:

  • Model registry
  • A new model specific subfolder (or models/vlm, or models/qwen)
  • README?

Contributor Author

We don't have a coreai.vlm.* namespace yet, at least not in pyproject, so this is a bigger refactor. Could we handle it in a follow-up, since the scope of this PR is first-model support for the VLM infra?

Contributor

It should be a simple refactor, and it will help with discoverability.

Comment thread python/src/coreai_models/models/gpu/__init__.py Outdated

stikves left a comment

Contributor

Overall this looks good, but we might want to move the model to a better home:

models/qwen3-vl or models/vlm?
