Problem
--matformer-load-strategy auto only checks for a local -tier{N} path. If the checkpoint is remote (Hugging Face), it downloads the universal repo and never attempts a tiered repo. Small-tier nodes can run out of memory (OOM) unless operators manually provide tiered checkpoints.
Refs:
shared/client/src/state/init.rs (local-only tier path detection)
Expected
Auto strategy should prefer tiered checkpoints when available (local or remote), or offer an explicit override to point to tiered repos.
Possible Approach
- Add --matformer-tier-repo or a template (e.g., {repo}-tier{tier}) and try that before universal.
- For auto, attempt remote repo_id-tier{tier} if the local tier dir is missing.
- Optional fallback: download universal and slice locally (with clear warning), if enabled.
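
A minimal sketch of the resolution order proposed above (local tier dir, then remote tier repo, then universal). Everything here is hypothetical and not taken from shared/client/src/state/init.rs: the CheckpointSource enum, the function names, and the local_exists / remote_exists closures are illustrative stand-ins for the real filesystem and Hub checks. The local slice-and-warn fallback is left out.

```rust
use std::path::{Path, PathBuf};

/// Where the checkpoint would be loaded from (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub enum CheckpointSource {
    /// A tiered checkpoint directory already present on disk.
    LocalTier(PathBuf),
    /// A tiered repo that exists remotely, e.g. "org/model-tier2".
    RemoteTier(String),
    /// No tiered checkpoint found; fall back to the universal repo.
    Universal(String),
}

/// Expand a template such as "{repo}-tier{tier}" into a concrete name.
pub fn expand_tier_template(template: &str, repo: &str, tier: u8) -> String {
    template
        .replace("{repo}", repo)
        .replace("{tier}", &tier.to_string())
}

/// Resolve the checkpoint for the auto strategy.
/// Order: local tier dir -> remote tier repo -> universal repo.
/// The existence checks are injected as closures (an assumption of this
/// sketch) so the ordering can be exercised without disk or network access.
pub fn resolve_auto(
    repo_id: &str,
    tier: u8,
    template: &str,
    local_root: &Path,
    local_exists: impl Fn(&Path) -> bool,
    remote_exists: impl Fn(&str) -> bool,
) -> CheckpointSource {
    let tier_name = expand_tier_template(template, repo_id, tier);

    // 1. Prefer a tiered checkpoint that is already on disk.
    let local = local_root.join(&tier_name);
    if local_exists(&local) {
        return CheckpointSource::LocalTier(local);
    }

    // 2. Otherwise try the tiered repo before the universal one.
    if remote_exists(&tier_name) {
        return CheckpointSource::RemoteTier(tier_name);
    }

    // 3. Nothing tiered is available: use the universal repo.
    CheckpointSource::Universal(repo_id.to_string())
}
```

Whether step 3 should be a silent fallback, a warning, or a hard error on small-tier nodes is the open question the acceptance criteria need to settle.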
Acceptance Criteria
- Tiered checkpoints are automatically preferred when available.
- Clear behavior when tiered repos are missing (fallback or error).
- Documentation update for recommended tiered repo layout.