Conversation
zpitroda commented Apr 29, 2025
- Updated models from Qwen 2.5 to Qwen 3 equivalents
- Updated transformers and torch python packages
Updated llama.cpp for Qwen3 support
README.md (outdated)

```diff
 For model deployment, we utilized [llama.cpp](https://github.com/ggml-org/llama.cpp), which provides efficient inference capabilities.
-Our base models primarily come from the [Qwen2.5](https://huggingface.co/Qwen) series.
+Our base models primarily come from the [Qwen3](https://huggingface.co/Qwen) series.
```
I am not sure what Second Me's model update policy is. As a community user, I definitely want to use Qwen3 given its SOTA capabilities (but I haven't tested it yet, so there may be some ins and outs).
Huge thanks for your work!
On top of that, it would be nice to add Qwen 3 support on top of the existing Qwen 2.5 support, i.e., add it as a newly supported model instead of replacing Qwen 2.5.
I was wondering that as well. I'm testing right now to make sure Qwen 3 doesn't break anything, but I don't know whether the Second Me team actually wants to update. I can also keep the 2.5 models and add the option for 3 as well, if that's preferable?
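For illustration, keeping both families selectable could be a small registry mapping display names to Hugging Face repo ids; the names and structure below are hypothetical, not Second Me's actual config:

```python
# Hypothetical registry supporting Qwen 2.5 and Qwen 3 side by side
# instead of replacing one family with the other.
SUPPORTED_MODELS = {
    "Qwen2.5-7B-Instruct": "Qwen/Qwen2.5-7B-Instruct",
    "Qwen3-8B": "Qwen/Qwen3-8B",
}

def resolve_repo_id(name: str) -> str:
    if name not in SUPPORTED_MODELS:
        raise ValueError(f"Unsupported model: {name}")
    return SUPPORTED_MODELS[name]

print(resolve_repo_id("Qwen3-8B"))  # -> Qwen/Qwen3-8B
```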
Updated convert_hf_to_gguf script and gguf-py package to support qwen3 models
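For reference, converting a local Qwen3 checkpoint with the updated script would look roughly like the sketch below; the paths and quantization type are placeholders, not necessarily what this PR uses.

```python
# Rough sketch of invoking llama.cpp's convert_hf_to_gguf.py on a local
# Qwen3 checkout (shelled out from Python; paths are placeholders).
import subprocess

subprocess.run(
    [
        "python", "convert_hf_to_gguf.py",
        "/path/to/Qwen3-8B",           # local Hugging Face model directory
        "--outfile", "qwen3-8b.gguf",  # GGUF output path
        "--outtype", "q8_0",           # quantization type
    ],
    check=True,
)
```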
Disabled thinking mode and updated backend dockerfile to work with new llama.cpp
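One prompt-level way to disable thinking is Qwen3's `enable_thinking` switch in the chat template; a minimal sketch, not necessarily how this PR does it:

```python
# Minimal sketch: turn off Qwen3's thinking mode via the chat template.
# Model id and message content are placeholders.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # suppress the <think>...</think> reasoning block
)
print(prompt)
```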
Everything is working except that during inference it outputs an extra (empty) `<think></think>` block.
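For context, one post-processing workaround is to strip the block from the generated text; this is illustrative only, not the PR's actual fix:

```python
import re

# Illustrative helper: remove a leading (often empty) <think>...</think>
# block that Qwen3 may emit before its answer.
def strip_think_block(text: str) -> str:
    return re.sub(r"<think>.*?</think>\s*", "", text, count=1, flags=re.DOTALL)

print(strip_think_block("<think>\n\n</think>\n\nHello!"))  # -> "Hello!"
```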
Added Qwen 2.5 models back along with 3
Hi, I've tested it a bit; it fails when downloading models...
Sorry about that! The base_dir variable was accidentally indented into the `if` block above it, but it should be working now.
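Roughly what the bug looked like, reconstructed for illustration (names are hypothetical, not the actual Second Me code):

```python
# Before the fix, the base_dir assignment was indented inside the if
# block, so base_dir was undefined whenever the branch was skipped and
# the model download failed.
DEFAULT_DIR = "models"

def resolve_base_dir(custom_dir=None):
    if custom_dir:
        base_dir = custom_dir  # previously the only assignment
    else:
        base_dir = DEFAULT_DIR  # fix: base_dir is set on every path
    return base_dir

print(resolve_base_dir())  # -> "models"
```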
Hi, it works now. So once the conflict is resolved, it will be added to the develop branch :)
@kevin-mindverse sounds good! I think I have a temporary solution for the extra block being output until the llama.cpp PR is merged; I'll try to have that and the conflict fixes pushed tomorrow.
OK, I actually quickly pushed what I think should do it. It shouldn't break anything, but I haven't double-checked that it's fixed.
@yingapple I'm away from my computer this week and unable to test, but it should hopefully now only add no_think flags when using a Qwen3 model.
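Roughly the gating described, as a hypothetical sketch (the function and names are illustrative, not the PR's actual code):

```python
# Append Qwen3's /no_think soft switch to the user prompt only when the
# selected model is from the Qwen3 family.
def maybe_add_no_think(prompt: str, model_name: str) -> str:
    if "qwen3" in model_name.lower():
        return prompt + " /no_think"
    return prompt

print(maybe_add_no_think("Summarize this document.", "Qwen3-8B"))
# -> "Summarize this document. /no_think"
```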