Skip to content

feat(packing): add opt-in packing_strategy (binpack | sequential) for order-preserving packing#9598

Open
yuchenwang3 wants to merge 1 commit into
modelscope:mainfrom
yuchenwang3:feat/packing-sequential-strategy
Open

feat(packing): add opt-in packing_strategy (binpack | sequential) for order-preserving packing#9598
yuchenwang3 wants to merge 1 commit into
modelscope:mainfrom
yuchenwang3:feat/packing-sequential-strategy

Conversation

@yuchenwang3

Copy link
Copy Markdown
Contributor

What

The packing path uses binpacking.to_constant_volume (best-fit-decreasing), which reorders samples (by length) and changes which samples land in each pack. For recipes that rely on a sequential sampler — where the sample order and pack boundaries are themselves part of the recipe (e.g. domain weighting encoded in the sample order, as in StepFun Step-3.5) — this reordering breaks faithful reproduction.

This adds an opt-in packing_strategy:

  • binpack (default, unchanged): best-fit-decreasing bin packing.
  • sequential: order-preserving greedy packing (next-fit) — keep a single open pack, flush it when the next sample doesn't fit, and carry the trailing unfinished pack to the next batch. Sample order and pack boundaries follow the input order (use packing_num_proc=1 for a single global ordering).

Change

  • calculate_matched_group(..., strategy='binpack'): add the sequential (next-fit) branch; the default binpack path is unchanged.
  • PackingDataset / IterablePackingDataset: accept packing_strategy and pass it through to calculate_matched_group.
  • BaseArguments.packing_strategy: Literal['binpack', 'sequential'] = 'binpack', threaded through sft.py.
  • zh/en command-line docs.

Default behavior is unchanged (binpack).

Testing

Verified syntactically (py_compile) and traced the call path (args → sft.pyPackingDataset/IterablePackingDatasetcalculate_matched_group). I could not run the full training suite locally; relying on CI. The sequential algorithm (order-preserving next-fit with carry-over) was used in real Qwen3.5-35B SFT to reproduce a sequential-sampler recipe.

Note

Named sequential (not first_fit) because the algorithm keeps a single open pack and flushes on overflow — i.e. next-fit — and because the user-facing property is order preservation. Happy to rename if maintainers prefer.

The packing path uses binpacking.to_constant_volume (best-fit-decreasing),
which reorders samples and changes pack composition. Recipes relying on a
sequential sampler (sample order / pack boundaries are part of the recipe, e.g.
domain weighting encoded in sample order) cannot be reproduced faithfully.

Add packing_strategy:
- 'binpack' (default, unchanged): best-fit-decreasing bin packing.
- 'sequential': order-preserving greedy packing (next-fit) - single open pack,
  flush on overflow, carry trailing pack to next batch; sample order and pack
  boundaries follow input order (use packing_num_proc=1 for global order).

Threaded through PackingDataset / IterablePackingDataset and BaseArguments;
default behavior unchanged. Adds zh/en docs.

Signed-off-by: yuchenwang3 <eang333cms@gmail.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new packing_strategy parameter, allowing users to choose between the default 'binpack' (best-fit-decreasing) and a new 'sequential' (order-preserving greedy next-fit) packing algorithm. This parameter is documented, added to the base arguments, and integrated into the dataset packing process. The review feedback suggests validating the strategy parameter in calculate_matched_group to raise a ValueError for unsupported strategies instead of silently falling back to 'binpack'.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread swift/dataset/packing.py
Comment on lines +16 to +19
def calculate_matched_group(sequences, packing_length: int, is_finished: bool = True, strategy: str = 'binpack'):
if len(sequences) == 0:
return [], []
# https://arxiv.org/pdf/2404.10830
if strategy == 'sequential':

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The strategy parameter is not validated. If an invalid strategy name is passed, the function will silently fall back to the default 'binpack' strategy. It is safer to explicitly validate the strategy and raise a ValueError if an unsupported strategy is provided.

Suggested change
def calculate_matched_group(sequences, packing_length: int, is_finished: bool = True, strategy: str = 'binpack'):
if len(sequences) == 0:
return [], []
# https://arxiv.org/pdf/2404.10830
if strategy == 'sequential':
def calculate_matched_group(sequences, packing_length: int, is_finished: bool = True, strategy: str = 'binpack'):
if strategy not in ('binpack', 'sequential'):
raise ValueError(f"Unknown packing strategy: {strategy}. Supported strategies are 'binpack' and 'sequential'.")
if len(sequences) == 0:
return [], []
if strategy == 'sequential':

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant