Repack a local Hugging Face model directory into smaller safetensors shards without loading the model through transformers.
This repository is intended for cases where you want to:
- keep the original tensor names and weights intact
- split large safetensors files into smaller shards
- preserve auxiliary files like config.json, tokenizer files, and the model card
- upload the repacked model to the Hugging Face Hub
The repacker works directly on the safetensors shard files and regenerates model.safetensors.index.json. This avoids dropping weights that can be lost when round-tripping through from_pretrained(...).save_pretrained(...).
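To confirm that no tensor names were dropped, the two index files can be compared after a repack. The following is a minimal sketch, not part of this repository: the helper names `load_weight_map` and `diff_tensor_names` are hypothetical, and it assumes both directories contain a `model.safetensors.index.json` with the usual `weight_map` key.

```python
import json
from pathlib import Path

INDEX_NAME = "model.safetensors.index.json"


def load_weight_map(model_dir):
    """Return the tensor-name -> shard-file mapping from a directory's index file."""
    index_path = Path(model_dir) / INDEX_NAME
    with index_path.open(encoding="utf-8") as f:
        return json.load(f)["weight_map"]


def diff_tensor_names(original_dir, repacked_dir):
    """Return (missing, extra): names lost by, and names added to, the repacked index."""
    original = set(load_weight_map(original_dir))
    repacked = set(load_weight_map(repacked_dir))
    return sorted(original - repacked), sorted(repacked - original)
```

Calling `diff_tensor_names("my-model", "my-model-repacked")` should return two empty lists when every tensor name survived. This only compares names in the index; it does not compare tensor values.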
Requirements:
- Python >=3.13,<3.14
- uv
Install the project environment:

    uv sync

Install the development tools as well, including huggingface_hub for the Hub CLI:

    uv sync --dev

Basic usage:

    uv run python repack_hf_model.py my-model my-model-repacked --max-shard-size 1900MB

You can also run the installed script entrypoint:

    uv run repack-hf-model my-model my-model-repacked --max-shard-size 1900MB

This will:
- read the original model.safetensors.index.json
- rewrite the tensors into new smaller shard files
- write a new model.safetensors.index.json
- copy non-weight files into the output directory
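The steps above can be illustrated with a small planning sketch. It is not the code in `repack_hf_model.py`: the function names are hypothetical, the size parser assumes decimal units (1 MB = 10^6 bytes), the grouping is a simple greedy pass in tensor order, and the `model-00001-of-0000N.safetensors` naming and `total_size` metadata follow the common Hugging Face convention rather than anything this README states.

```python
import re

# Assumption: decimal units. The actual tool may interpret "MB" differently.
_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9}


def parse_size(text):
    """Convert a string such as '1900MB' into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*(KB|MB|GB)\s*", text.upper())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def plan_shards(tensor_sizes, max_shard_bytes):
    """Greedily group tensor names, in order, into shards within the byte limit.

    A single tensor larger than the limit still gets a shard of its own.
    """
    shards, current, current_bytes = [], [], 0
    for name, size in tensor_sizes.items():
        # Start a new shard when adding this tensor would exceed the limit.
        if current and current_bytes + size > max_shard_bytes:
            shards.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        shards.append(current)
    return shards


def build_index(tensor_sizes, shards):
    """Build an index dict mapping every tensor name to its shard file name."""
    total = len(shards)
    weight_map = {}
    for i, names in enumerate(shards, start=1):
        filename = f"model-{i:05d}-of-{total:05d}.safetensors"
        for name in names:
            weight_map[name] = filename
    return {
        "metadata": {"total_size": sum(tensor_sizes.values())},
        "weight_map": weight_map,
    }
```

Here `tensor_sizes` is a dict of tensor name to size in bytes. Planning on sizes alone keeps the sketch free of any tensor library; a real repacker would also have to read and write the tensor data itself.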
Check the authenticated user:

    uv run python -m huggingface_hub.cli.hf auth whoami

Log in:

    uv run python -m huggingface_hub.cli.hf auth login

Upload a large model folder:

    uv run python -m huggingface_hub.cli.hf upload-large-folder \
        my-org/my-model-repacked \
        my-model-repacked \
        --type model \
        --num-workers 4 \
        --no-bars

Show CLI help:

    uv run python -m huggingface_hub.cli.hf --help

Notes:
- The repository currently targets local model directories that already use safetensors and include model.safetensors.index.json.
- Large uploads create resumable state under .cache/huggingface/ inside the uploaded folder while upload-large-folder is running.
- If uv sync warns about hardlinks and falls back to copies, that usually means the cache and project directory are on different filesystems.
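The hardlink note can be checked with a short script. This is a rough sketch with a hypothetical helper name, `same_filesystem`; it compares the device IDs that the operating system reports for two paths. Hardlinks cannot cross filesystems, so a mismatch would explain the fallback to copies, although matching IDs alone do not guarantee that hardlinks will succeed.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True when both paths report the same device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

A typical call would pass the uv cache directory and the project directory, for example `same_filesystem(cache_dir, ".")`, where `cache_dir` is whatever location your uv installation uses for its cache.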