tvosch/VRAM-estimator
VRAM/GPU Memory estimator for LLMs

This repo estimates the VRAM (GPU memory) required to train a large language model (LLM). Supports:

  • ZeRO stages
  • Providing a HuggingFace Hub repository id (example: meta-llama/Meta-Llama-3-8B)
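As a rough illustration of the kind of arithmetic involved (this is a sketch, not the repo's actual formula), per-GPU memory for mixed-precision Adam training is often approximated as 16 bytes per parameter for model states, with ZeRO stages 1–3 progressively sharding optimizer states, gradients, and weights across GPUs. Activations, fragmentation, and framework overhead are ignored here, which is why real numbers (like the table below) come out higher:

```python
def estimate_vram_gb(num_params, num_gpus=1, zero_stage=0):
    """Rough per-GPU VRAM for mixed-precision Adam (model states only)."""
    weights = 2 * num_params   # fp16 weights
    grads = 2 * num_params     # fp16 gradients
    optim = 12 * num_params    # fp32 master weights + Adam momentum/variance
    if zero_stage >= 1:
        optim /= num_gpus      # ZeRO-1 shards optimizer states
    if zero_stage >= 2:
        grads /= num_gpus      # ZeRO-2 also shards gradients
    if zero_stage >= 3:
        weights /= num_gpus    # ZeRO-3 also shards the weights
    return (weights + grads + optim) / 2**30

print(f"{estimate_vram_gb(1.1e9):.1f} GB")       # 1.1B params, 1 GPU, ZeRO-0
print(f"{estimate_vram_gb(1.1e9, 8, 3):.1f} GB") # same model sharded over 8 GPUs
```

Under these assumptions a 1.1B-parameter model needs about 16.4 GB of model-state memory on a single GPU, dropping to roughly an eighth of that with ZeRO-3 across 8 GPUs.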

TODO:

  • Add LoRA/QLoRA for finetuning purposes
  • Make UI with gradio/streamlit and add sliders
  • Support for reversing the problem: given GPUs how large of a model can I train
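One way the "reversed" estimator in the TODO list could be sketched (a hypothetical helper, using the same 16-bytes-per-parameter mixed-precision Adam assumption as above and ignoring activations):

```python
def max_trainable_params(gpu_mem_gb, num_gpus=1, zero_stage=0):
    """Upper bound on trainable parameters for a per-GPU memory budget."""
    # 2 weight + 2 gradient + 12 optimizer bytes per parameter;
    # ZeRO stages shard successively more of these states across GPUs.
    if zero_stage >= 3:
        bytes_per_param = (2 + 2 + 12) / num_gpus
    elif zero_stage == 2:
        bytes_per_param = 2 + (2 + 12) / num_gpus
    elif zero_stage == 1:
        bytes_per_param = 2 + 2 + 12 / num_gpus
    else:
        bytes_per_param = 2 + 2 + 12
    return gpu_mem_gb * 2**30 / bytes_per_param

# A single 80 GB GPU bounds the model at roughly 5.4B parameters (ZeRO-0)
print(f"{max_trainable_params(80) / 1e9:.1f}B")
```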

Getting Started

If you only supply your own numbers and no HuggingFace config is given, the prerequisites and installation can be skipped.

Prerequisites

  • transformers (only necessary for automatic HuggingFace hub model parsing)

Installation

python -m venv venv
source venv/bin/activate
pip install transformers

Usage

python vram_estimator_old.py --micro_batch_size 1 --num_gpus 1 --repo_id TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3

When a repo_id is given, its config values override the corresponding argument parser values.
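The override behaviour can be sketched as follows. The helper name is illustrative, and the hard-coded dict stands in for values that would normally be parsed from the model's config.json on the Hub (TinyLlama-1.1B uses a hidden size of 2048 and 22 layers); the assumption is simply that config fields with the same name replace the parsed defaults:

```python
import argparse

def apply_repo_config(args, config):
    """Let config values from the Hub take precedence over CLI defaults."""
    for key, value in config.items():
        if hasattr(args, key):
            setattr(args, key, value)
    return args

parser = argparse.ArgumentParser()
parser.add_argument("--hidden_size", type=int, default=4096)
parser.add_argument("--num_hidden_layers", type=int, default=32)
args = parser.parse_args([])

# Stand-in for values parsed from the repo's config.json
tinyllama_config = {"hidden_size": 2048, "num_hidden_layers": 22}
args = apply_repo_config(args, tinyllama_config)
print(args.hidden_size)  # 2048, not the CLI default of 4096
```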

Notes

The VRAM usage of some models has been confirmed experimentally. See the table below:

| Repo id / Model name | Micro batch size | Number of GPUs | ZeRO stage | Gradient checkpointing | Estimated VRAM (per GPU) | Actual VRAM (per GPU) |
|---|---|---|---|---|---|---|
| TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3 | 1 | 1 | 0 | False | 34.3 GB | 33.5 GB |
