Qwen 3 1.7B Offline tool calling Android #165
Open
vkkhare wants to merge 11 commits into NimbleEdge:main from
Signed-off-by: Varun Khare <varun.khare@nimbledgehq.ai>
add tokenizer-cpp add jinja template for qwen and dict support for tokenizer:from_json Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
Signed-off-by: Varun Khare <varunkhare1234@gmail.com>
jpuneet reviewed Jul 25, 2025
        f"{library_stubs_dir}/src_gen",
        coreruntime_dir,
    ],
    ["cp", "-r", f"{library_stubs_dir}/src_template", f"{library_stubs_dir}/src_gen"],
jpuneet (Contributor):
Is this an accidental change? cp -R is the portable form, compared to cp -r.
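One way to sidestep the cp -r versus cp -R question entirely is to copy the directory from Python instead of shelling out. The following is only a sketch under assumptions: the helper name copy_template and its parameters are hypothetical and do not appear in the PR, and it assumes the build script runs on Python 3.8 or newer.

```python
import shutil
from pathlib import Path


def copy_template(src_template: Path, src_gen: Path) -> None:
    """Recursively copy src_template into src_gen without invoking cp.

    Hypothetical helper, not part of the PR's build script.
    """
    # dirs_exist_ok=True (Python 3.8+) merges into an existing destination
    # instead of raising FileExistsError.
    shutil.copytree(src_template, src_gen, dirs_exist_ok=True)
```

Note that the semantics differ slightly from cp when the destination already exists: cp -R src dst places src inside dst, while this sketch merges the contents of src into dst.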
jpuneet reviewed Jul 25, 2025
Comment on lines 16 to 28
   * Compares two data types and returns the one with higher precedence
   * for automatic type promotion in operations. The precedence order is:
-  * BOOLEAN (0) < INT32 (3) < INT64 (4) < FLOAT (5) < DOUBLE (6)
+  * BOOLEAN (0) < INT32 (3) < INT64 (4) < FLOAT16 (4.5) < FLOAT (5) < DOUBLE (6)
   *
   * @param dataType1 First data type to compare
   * @param dataType2 Second data type to compare
   * @return The data type with higher precedence
   */
  inline int get_max_dataType(int dataType1, int dataType2) {
    std::map<int, int> _typeScore = {
-       {DATATYPE::BOOLEAN, 0}, {DATATYPE::INT32, 3}, {DATATYPE::INT64, 4},
-       {DATATYPE::FLOAT, 5}, {DATATYPE::DOUBLE, 6},
+       {DATATYPE::BOOLEAN, 0}, {DATATYPE::INT32, 3}, {DATATYPE::INT64, 4},
+       {DATATYPE::FLOAT16, 45}, {DATATYPE::FLOAT, 5}, {DATATYPE::DOUBLE, 6},
    };
jpuneet (Contributor):
This doesn't look correct.
vkkhare (Contributor, Author):
yup, will update this
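The reviewer does not say what is wrong, but one plausible reading is that the score map holds integers, so a FLOAT16 score of 45 outranks DOUBLE (6), which contradicts the documented position of 4.5 between INT64 and FLOAT. A minimal sketch of one possible fix, written in Python for brevity, is to scale every rank by 10 so the ordering stays integral. The DataType enum and its values below are hypothetical stand-ins, not the real DATATYPE constants.

```python
from enum import IntEnum


class DataType(IntEnum):
    # Hypothetical stand-ins for the DATATYPE constants; the numeric
    # values are arbitrary IDs, not the real enum values.
    BOOLEAN = 1
    INT32 = 2
    INT64 = 3
    FLOAT16 = 4
    FLOAT = 5
    DOUBLE = 6


# Ranks scaled by 10 so FLOAT16 (45) sits between INT64 (40) and FLOAT (50).
TYPE_SCORE = {
    DataType.BOOLEAN: 0,
    DataType.INT32: 30,
    DataType.INT64: 40,
    DataType.FLOAT16: 45,
    DataType.FLOAT: 50,
    DataType.DOUBLE: 60,
}


def get_max_data_type(a: DataType, b: DataType) -> DataType:
    """Return whichever of the two types has the higher promotion rank."""
    return a if TYPE_SCORE[a] >= TYPE_SCORE[b] else b
```

With these ranks, FLOAT16 combined with INT64 promotes to FLOAT16, while FLOAT16 combined with FLOAT or DOUBLE promotes to the wider type.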
# This is the 1st commit message:
add support for dictionary indexing in onnx executor
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
# This is the commit message #2:
add dictionary input support to model.run() for kv_cache
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
# This is the commit message #3:
add fp16 support in delitepy
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
# This is the commit message #4:
Qwen with tool calling functional in delitePy
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
# This is the commit message #5:
Implemented enumerate and next in DelitePy (NimbleEdge#162)
* Implemented enumerate and next in DelitePy
Signed-off-by: Atul Jain <atul.jain@nimbleedgehq.ai>
* Cosmetics
Signed-off-by: Puneet Jindal <puneet.jindal@nimbleedgehq.ai>
---------
Signed-off-by: Puneet Jindal <puneet.jindal@nimbleedgehq.ai>
Co-authored-by: Atul Jain <atul.jain@nimbleedgehq.ai>
Co-authored-by: Puneet Jindal <puneet.jindal@nimbleedgehq.ai>
Signed-off-by: Puneet Jindal <puneet.jindal@nimbleedgehq.ai>
Signed-off-by: Puneet Jindal <puneet.jindal@nimbleedgehq.ai>
Signed-off-by: Varun Khare <varun.khare@nimbledgehq.ai>
modular qwen demo structure
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
wip handle attention cache
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
resume from last postion for multi-step run
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
Signed-off-by: Varun Khare <varun.khare@nimbledgehq.ai>
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
Description
This PR adds support for the Qwen 3 1.7B model with tool calling, uses the native ONNX Runtime instead of onnxruntime_genai, and adds an export script for the model enhancements.
Key Features Added
Cpp bindings
Delitepy Bindings
FP16 Support
Binary operations now support the FP16 data type, with values stored as uint16_t.
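To illustrate what carrying FP16 through uint16_t means, the sketch below treats a 16-bit integer as the raw IEEE 754 half-precision bit pattern, widens it to a float for the arithmetic, and narrows the result back. It is written in Python for brevity and is not the PR's C++ implementation; the function names are hypothetical.

```python
import struct


def fp16_bits_to_float(bits: int) -> float:
    """Reinterpret a 16-bit pattern as an IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def float_to_fp16_bits(value: float) -> int:
    """Round a float to half precision and return its raw 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def add_fp16(a_bits: int, b_bits: int) -> int:
    """Add two FP16 values given as raw bits; compute in float, store as bits."""
    total = fp16_bits_to_float(a_bits) + fp16_bits_to_float(b_bits)
    return float_to_fp16_bits(total)
```

For example, 0x3C00 is the bit pattern of 1.0 and 0x4000 is the bit pattern of 2.0, so add_fp16(0x3C00, 0x3C00) yields 0x4000.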
Kotlin Interface
Generated output is streamed back from Python and consumed on the Kotlin side by subscribing to Kotlin Flows.
Qwen Demo Setup
The Qwen demo uses zip-based modules in delitePy.
Tool Calling Features
<tool_call> XML tags
Checklist:
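For the <tool_call> XML tags listed under Tool Calling Features, the sketch below shows one way the model output could be scanned for tool calls. It assumes each tag wraps a JSON object with "name" and "arguments" keys, which is the convention Qwen chat templates commonly use; the PR does not show its own parser, so the function name and the skip-on-error policy are assumptions.

```python
import json
import re

# Non-greedy match so several <tool_call> blocks in one response are kept apart.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list:
    """Return the JSON payload of every well-formed <tool_call> block in text."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(payload))
        except json.JSONDecodeError:
            # Skip malformed payloads rather than aborting the whole response.
            continue
    return calls
```

A caller would typically dispatch on the "name" field of each returned dictionary and feed the tool result back to the model as the next turn.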