Skip to content

Fix implicit padding in split_between_processes when apply_padding=False and num_samples < num_processes#4052

Open
3manifold wants to merge 1 commit into
huggingface:mainfrom
3manifold:fix-implicit-padding
Open

Fix implicit padding in split_between_processes when apply_padding=False and num_samples < num_processes#4052
3manifold wants to merge 1 commit into
huggingface:mainfrom
3manifold:fix-implicit-padding

Conversation

@3manifold
Copy link
Copy Markdown
Contributor

@3manifold 3manifold commented May 27, 2026

What does this PR do?

Function split_between_processes applies implicit padding even if apply_padding=False. This results to redundant calculations during distributed inference since the same inference (i.e. prompt) is called twice etc. Apart from wasting resources, this can cause conflicts among competing processes, for instance, disk I/O.

The same bug is duplicated across two code branches of the same function. Both stem from the identical incorrect assumption — "a process always receives at least one element" — and both produce identical wrong behaviour (silent padding when apply_padding=False). The only difference is syntax:

  • Bug 1 (list/Tensor): guards with result = inputs[-1:]
  • Bug 2 (Dataset): guards with start_index = len(inputs) - 1

How to reproduce

Example with 1 sample and 2 GPUs

Process start_index end_index Normal slice
0 0 1 inputs[0:1][sample_0]
1 1 1 start_index >= len(inputs) triggers! → inputs[-1:] = [sample_0]

Process 1's start_index (1) equals len(inputs) (1), triggering the guard. Instead of returning an empty slice, it silently returns the last element — regardless of apply_padding.

def _split_values(inputs, start_index, end_index):
    if isinstance(inputs, (list, tuple, torch.Tensor)):
        if start_index >= len(inputs):      
            result = inputs[-1:]              # ← always returns last element
        else:
            result = inputs[start_index:end_index]
        if apply_padding:
            ...  

resolves #4053

Who can review?

@BenjaminBossan @SunMarc

Comment thread src/accelerate/state.py
result = inputs[-1:]
else:
result = inputs[start_index:end_index]
result = inputs[start_index:end_index]
Copy link
Copy Markdown
Contributor Author

@3manifold 3manifold May 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python slicing returns [] naturally for out-of-bounds, which is the correct behavior when apply_padding=False.

Comment thread src/accelerate/state.py
result = pad_across_processes(tensorized_result, pad_index=inputs[-1])
else:
result += [result[-1]] * (num_samples_per_process + (1 if num_extras > 0 else 0) - len(result))
result += [inputs[-1]] * (num_samples_per_process + (1 if num_extras > 0 else 0) - len(result))
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use inputs[-1] instead of result[-1]: result may be empty when this process received no samples.

Comment thread src/accelerate/state.py
if end_index > len(inputs):
end_index = len(inputs)
result_idcs = list(range(start_index, end_index))
clamped_start = min(start_index, len(inputs))
Copy link
Copy Markdown
Contributor Author

@3manifold 3manifold May 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clamp both indices to valid range without artificially inflating start_index to len-1 when apply_padding is False.

Comment thread src/accelerate/state.py
result_idcs = list(range(clamped_start, clamped_end))
if apply_padding:
result_idcs += [end_index - 1] * (
result_idcs += [len(inputs) - 1] * (
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pad using the last real element (len(inputs)-1), not end_index-1 which can be wrong when result_idcs is empty.

@3manifold 3manifold force-pushed the fix-implicit-padding branch from 3ce7d52 to e81785a Compare May 27, 2026 13:16
@3manifold 3manifold marked this pull request as ready for review May 27, 2026 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Implicit padding when splitting input between processes while padding flag is disabled

1 participant