feat(transformers): add bitnet model #1416
iugoood wants to merge 1 commit into mindspore-lab:master
Conversation
Summary of Changes (Gemini Code Assist): This pull request introduces the BitNet model into the mindone/transformers framework. It provides the full implementation of the BitNet architecture, including its core components (RMSNorm, MLP, Attention, and Rotary Embedding), adapted for MindSpore. The integration also extends to the auto-configuration and auto-modeling systems, making the BitNet model readily available within the library. Additionally, a test suite has been added to validate the MindSpore implementation against its PyTorch counterpart.
Code Review
This pull request adds the BitNet model. The implementation looks solid, but there are a few areas for improvement. The model currently depends on an un-vendored BitNetConfig from the transformers library, which should be included for self-containment. The tests are incomplete, lacking coverage for BitNetForCausalLM and support for MindSpore's Graph mode, which is crucial for performance. Additionally, there are some minor documentation issues and a wildcard import that should be addressed for better code quality and maintainability.
```python
from ...processing_utils import Unpack
from ...utils import TransformersKwargs, can_return_tuple
from ...utils.generic import check_model_inputs
from transformers.models.bitnet.configuration_bitnet import BitNetConfig
```
The model configuration BitNetConfig is imported from the transformers library, which creates an external dependency. To ensure this model is self-contained within the mindone library, please vendor the configuration_bitnet.py file into this pull request, similar to how other models are structured in this repository.
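As a reference for the vendoring suggestion, here is a minimal sketch of what a vendored `configuration_bitnet.py` might contain. It mirrors the usual Hugging Face config pattern; the field names follow the upstream `BitNetConfig`, but the defaults shown are illustrative assumptions, not values taken from this PR.

```python
# Hypothetical vendored config sketch; mirrors the Hugging Face config pattern.
# Defaults below are illustrative assumptions, not values from this PR.
class BitNetConfig:
    model_type = "bitnet"

    def __init__(
        self,
        vocab_size=128256,
        hidden_size=2560,
        num_hidden_layers=30,
        num_attention_heads=20,
        num_key_value_heads=5,
        rms_norm_eps=1e-5,
        **kwargs,
    ):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.num_hidden_layers = num_hidden_layers
        self.num_attention_heads = num_attention_heads
        self.num_key_value_heads = num_key_value_heads
        self.rms_norm_eps = rms_norm_eps
        # Keep any extra kwargs so checkpoints with additional fields still load.
        for key, value in kwargs.items():
            setattr(self, key, value)
```

Vendoring the config this way removes the runtime dependency on the `transformers` package for this model.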
```python
from tests.transformers_tests.models.modeling_common import floats_numpy, ids_numpy

DTYPE_AND_THRESHOLDS = {"fp32": 5e-4, "fp16": 5e-3, "bf16": 5e-2}
MODES = [1]  # Graph mode is not supported yet
```
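For context on the mode constants: in MindSpore, `GRAPH_MODE` is 0 and `PYNATIVE_MODE` is 1, so `MODES = [1]` restricts the tests to PyNative mode. A sketch of how the matrix could be extended once Graph mode works (the helper below is an assumption, not test code from this PR):

```python
# Sketch: extending the test matrix once Graph mode is supported.
# In MindSpore, GRAPH_MODE == 0 and PYNATIVE_MODE == 1.
DTYPE_AND_THRESHOLDS = {"fp32": 5e-4, "fp16": 5e-3, "bf16": 5e-2}
MODES = [0, 1]  # hypothetical: cover both Graph and PyNative modes

def set_execution_mode(mode: int) -> None:
    # Deferred import so this sketch can be read without mindspore installed.
    import mindspore as ms
    ms.set_context(mode=mode)
```

Each parametrized test would then call `set_execution_mode(mode)` before running the forward pass and comparing outputs against the PyTorch reference within the per-dtype thresholds.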
```python
        {
            "last_hidden_state": 0,
        },
    ],
```
```python
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from .modeling_bitnet import *
```
Force-pushed 2e61bfd to 612cc3f
Force-pushed 612cc3f to 660032e
Add
1. Add the BitNet model
2. Add unit tests (UT)
P.S.: Quantized models cannot be validated.
Usage
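A hypothetical usage sketch, assuming the `mindone.transformers` API mirrors Hugging Face `transformers`; the class name, checkpoint id, and keyword arguments are assumptions, not taken from this PR:

```python
# Hypothetical usage sketch, assuming the mindone.transformers API mirrors
# Hugging Face transformers. The class name, checkpoint id, and kwargs are
# assumptions, not from this PR.
def generate_sample(prompt: str = "Hello") -> str:
    # Deferred imports so this sketch can be read without the libraries installed.
    import mindspore as ms
    from transformers import AutoTokenizer
    from mindone.transformers import BitNetForCausalLM  # assumed class name

    model_id = "microsoft/bitnet-b1.58-2B-4T"  # assumed checkpoint id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = BitNetForCausalLM.from_pretrained(model_id, mindspore_dtype=ms.bfloat16)

    inputs = tokenizer(prompt, return_tensors="np")
    output_ids = model.generate(ms.Tensor(inputs["input_ids"]), max_new_tokens=32)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```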
Performance
Experiments were run on Ascend Atlas 800T A2 machines with MindSpore 2.6.0.