
feat: add UnifiedLoader with config-driven loader dispatch #70

Open
ABNER-1 wants to merge 1 commit into foundation-model-stack:main from ABNER-1:unified_loader

Conversation

ABNER-1 (Contributor) commented Apr 27, 2026

Summary

Implements #65.

Add a config-driven UnifiedLoader that automatically dispatches to the appropriate loader class (SafeTensorsFileLoader or ThreeFSLoader) based on a YAML/JSON configuration file.

Configuration

  • docs/configuration.md (new): Comprehensive configuration guide with 5 usage examples
  • README.md: Add Configuration section linking to docs

Usage

from fastsafetensors import UnifiedLoader

loader = UnifiedLoader(pg, hf_weights_files, device="cuda:0")
for key, tensor in loader.iterate_weights():
    process(key, tensor)
loader.close()
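The explicit close() above can be guarded so cleanup runs even if iteration fails. A minimal sketch of that pattern using a stand-in object (the real UnifiedLoader is only assumed to expose the iterate_weights()/close() surface shown in the snippet):

```python
from contextlib import closing

# Stand-in with the iterate_weights()/close() surface assumed above;
# this illustrates the cleanup pattern, not the real loader internals.
class StubLoader:
    def __init__(self):
        self.closed = False

    def iterate_weights(self):
        # Yield (key, tensor-like) pairs, as the usage snippet does.
        yield "layer.0.weight", [0.0, 1.0]

    def close(self):
        self.closed = True

with closing(StubLoader()) as loader:
    weights = dict(loader.iterate_weights())
# close() has now run, even if the loop body had raised
```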

Configuration via fastsafetensors.yaml or FASTSAFETENSORS_CONFIG env var.
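One plausible discovery order, sketched here as an assumption (the PR text does not spell out precedence): an explicit FASTSAFETENSORS_CONFIG path wins over a fastsafetensors.yaml in the working directory, and built-in defaults apply when neither exists. The actual load_config() logic may differ.

```python
import os

# Hypothetical sketch of config-file discovery; find_config_path is an
# illustrative helper, not part of the fastsafetensors API.
def find_config_path():
    env_path = os.environ.get("FASTSAFETENSORS_CONFIG")
    if env_path:
        return env_path                    # explicit override via env var
    if os.path.exists("fastsafetensors.yaml"):
        return "fastsafetensors.yaml"      # working-directory config
    return None                            # fall back to built-in defaults
```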

Default Config (GDS)

When no config file is present, UnifiedLoader uses the following defaults:

loader: "base"
framework: "pytorch"
debug_log: false
set_numa: true
disable_cache: true

parallel:
  use_pipeline: false
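Since the loader accepts YAML or JSON, the same defaults can equivalently be written as JSON (shown here for completeness; content is identical to the YAML above):

```json
{
  "loader": "base",
  "framework": "pytorch",
  "debug_log": false,
  "set_numa": true,
  "disable_cache": true,
  "parallel": { "use_pipeline": false }
}
```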

The default call chain:

UnifiedLoader(pg, files, device)
  ├─ load_config() → no config file → LoaderConfig()
  ├─ _resolve_loader_class("base") → SafeTensorsFileLoader
  ├─ get_extension_config("base") → {}
  │   └─ process_extension_config({}) → {nogds: False}
  ├─ SafeTensorsFileLoader(pg, device, nogds=False, ...)
  ├─ create_parallel_kwargs() → {queue_size: -1}  (use_pipeline=False)
  └─ PipelineParallel(pg, loader, files, queue_size=-1)
       └─ serial: copy_files → broadcast → copy_files → ...

Note: The base loader defaults to copier_type: "gds" (GPU Direct Storage), matching the behavior of calling SafeTensorsFileLoader directly with nogds=False.

3FS Config Example

To use the 3FS high-performance I/O loader, create a fastsafetensors.yaml:

loader: "3fs"
framework: "pytorch"

"3fs":
  mount_point: "/mnt/3fs"

parallel:
  use_pipeline: true
  max_concurrent_producers: 1
  queue_size: 0
  use_tqdm_on_load: true

The 3FS call chain:

UnifiedLoader(pg, files, device)
  ├─ load_config() → fastsafetensors.yaml → LoaderConfig(loader="3fs")
  ├─ _resolve_loader_class("3fs") → ThreeFSLoader
  ├─ get_extension_config("3fs") → {mount_point: "/mnt/3fs"}
  │   └─ process_extension_config({mount_point: "/mnt/3fs"}) → {mount_point: "/mnt/3fs"}
  ├─ ThreeFSLoader(pg, device, mount_point="/mnt/3fs", ...)
  ├─ create_parallel_kwargs() → {max_concurrent_producers: 1, queue_size: 0, ...}
  └─ PipelineParallel(pg, loader, files, max_concurrent_producers=1, queue_size=0)
       └─ pipeline: producer threads copy files concurrently

Note: If mount_point is omitted, ThreeFSLoader.process_extension_config will auto-infer it from file paths via fastsafetensor_3fs_reader.extract_mount_point().
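As an illustration only: one common way to infer a mount point is to take the longest shared directory prefix of the weight file paths. The real fastsafetensor_3fs_reader.extract_mount_point() may use different logic (e.g. consulting the mount table), so the helper below is a hypothetical stand-in.

```python
import os

# Hypothetical stand-in for mount-point inference; the actual
# extract_mount_point() in fastsafetensor_3fs_reader may behave differently.
def infer_common_prefix(files):
    # Longest shared directory prefix of all weight file paths.
    return os.path.commonpath(files)

infer_common_prefix([
    "/mnt/3fs/models/llama/a.safetensors",
    "/mnt/3fs/models/llama/b.safetensors",
])
```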

ABNER-1 force-pushed the unified_loader branch 5 times, most recently from 88f2d15 to 0a75403 on April 27, 2026 04:07
Signed-off-by: yuanyuxing.yyx <yuanyuxing.yyx@alibaba-inc.com>
takeshi-yoshimura (Collaborator) left a comment


please resolve my comments. thanks!

Comment thread pyproject.toml
]
dependencies = [
"typer>=0.9.0",
"pyyaml>=6.0",
Collaborator

I want to avoid new dependencies. Can we only use JSON?

return cls


class UnifiedLoader:
Collaborator

I am not sure if UnifiedLoader is correct naming from user's perspective. AutoLoader?

Comment thread docs/configuration.md

The base loader extension defaults to `copier_type: "gds"` (GPU Direct Storage).

## Default Call Chain
Collaborator

This section is too detailed for users. Maybe you can just delete this section.

Comment thread docs/configuration.md
loader.close()
```

No config file. Uses `loader="base"`, `nogds`, serial mode.
Collaborator

gds?

Comment thread examples/run_unified.py
print("=== Way 1: Default config ===")
loader = UnifiedLoader(pg, args.files, device=args.device)
for key, tensor in loader.iterate_weights():
    print(f"    {key}: shape={tensor.shape}")
Collaborator

Please use logger instead of print in this file.

Comment thread examples/run_unified.py
# --- Way 2: Config file in working directory ---
# Place a fastsafetensors.yaml in the working directory:
#
# loader: "threefs"
Collaborator

3fs?
