AntNLP/DVAGen

DVAGen: Dynamic Vocabulary Augmented Generation

DVAGen is a fully open-source, unified framework designed for training, evaluation, and visualization of dynamic vocabulary-augmented language models.

Figure: DVAGen Framework

Updates

  • [2025/07/18] DVAGen v0.1.0 released!

Quick Start

Setup

Download the repository and install DVAGen:

git clone https://github.com/AntNLP/DVAGen.git
cd DVAGen
pip install -e .

Note: if you intend to retrieve supporting documents on the GPU with FAISSRetriever, the faiss-gpu package is required. faiss-gpu contains both GPU and CPU indices and may be incompatible with the CPU-only package (faiss-cpu). For further information, please refer to the FAISS documentation.
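
Because faiss-gpu and faiss-cpu may conflict, it can help to check which one is present before using FAISSRetriever. The snippet below is a minimal sketch that uses only the Python standard library; the helper names `installed_faiss_distributions` and `describe_faiss_install` are illustrative and not part of DVAGen. It only looks for the pip distributions named `faiss-cpu` and `faiss-gpu`, so FAISS builds installed another way (for example via conda) will not be detected.

```python
from importlib import metadata

# Distribution names as published on PyPI (assumption: FAISS was installed with pip).
FAISS_DISTRIBUTIONS = ("faiss-cpu", "faiss-gpu")


def installed_faiss_distributions():
    """Return {distribution name: version} for each FAISS package that is installed."""
    found = {}
    for name in FAISS_DISTRIBUTIONS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # this distribution is not installed
    return found


def describe_faiss_install(found):
    """Turn the mapping above into a one-line, human-readable status."""
    if not found:
        return "no FAISS package installed"
    if len(found) > 1:
        return "conflict: " + ", ".join(sorted(found)) + " are both installed"
    name, version = next(iter(found.items()))
    return f"{name} {version}"


if __name__ == "__main__":
    print(describe_faiss_install(installed_faiss_distributions()))
```

If the script reports a conflict, uninstalling one of the two packages before running retrieval is the likely fix.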

Inference and Training with DVAGen

To launch a CLI or WebUI tool for chatting, use the following command:

dvagen chat --config_path examples/chat.yaml

To evaluate a model on various tasks, use the following command:

dvagen eval --config_path examples/eval.yaml

By default, we use DeepSpeed to launch the training script. To train a model, use the following command:

dvagen train [deepspeed_args] --config_path examples/train.yaml

An example training command:

dvagen train --num_gpus 1 \
             --num_nodes 1 \
             --master_addr "localhost" \
             --master_port 9901 \
             --config_path examples/train.yaml
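
When training is launched from a script rather than typed by hand, the example command above can be assembled programmatically. The following is a minimal sketch: `build_train_command` is a hypothetical helper, not part of DVAGen, and it uses only the flags shown in the example command, with that example's values as defaults.

```python
import shlex


def build_train_command(config_path, num_gpus=1, num_nodes=1,
                        master_addr="localhost", master_port=9901):
    """Assemble the `dvagen train` argument list from the README example."""
    return [
        "dvagen", "train",
        "--num_gpus", str(num_gpus),
        "--num_nodes", str(num_nodes),
        "--master_addr", master_addr,
        "--master_port", str(master_port),
        "--config_path", config_path,
    ]


if __name__ == "__main__":
    cmd = build_train_command("examples/train.yaml", num_gpus=4)
    # Print the command for inspection; shlex.join quotes arguments where needed.
    print(shlex.join(cmd))
    # To actually launch training: subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting problems if the config path contains spaces.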

Details of the configuration files are available in the examples/README.md file.

About

[EMNLP 25 Demo] Dynamic Vocabulary Augmented Generation
