Skip to content

Bug +fix: padding_side not set for tokenizers with existing pad_token, causing empty responses #70

@accemlcc

Description

@accemlcc

Bug: padding_side not set for tokenizers with existing pad_token, causing empty responses

Problem

When a tokenizer already has a pad_token defined (e.g., allenai/Olmo-3-7B-Instruct), padding_side is not set to "left", causing empty responses during batched generation.

Decoder-only models require left-padding for generation. With right-padding (the default), models see [PROMPT] [PAD] [PAD] and stop generating immediately, returning empty strings.

Impact

  • Empty responses are counted as "non-refusals" in is_refusal() (no markers found → success)
  • Results in falsely low refusal rates (e.g., 5/100 instead of 90/100)
  • Only affects batch_size > 1 (no padding needed for single prompts)
  • batch_size=1 shows correct results, larger batches show near-zero refusals

Reproduction

Test with any model that has its own pad_token (doesn't use EOS fallback):

# batch_size=1: Correct refusal count (e.g., 10/10)
# batch_size=128: Near-zero refusals, all empty responses counted as "success"

Tested with allenai/Olmo-3-7B-Instruct and squ11z1/Hypnos-i1-8B.

Current Code (model.py:50-52)

if self.tokenizer.pad_token is None:
    self.tokenizer.pad_token = self.tokenizer.eos_token
    self.tokenizer.padding_side = "left"  # ← Only set when pad_token is None!

Suggested Fix

if self.tokenizer.pad_token is None:
    self.tokenizer.pad_token = self.tokenizer.eos_token

# CRITICAL: Always use left padding for decoder-only models during generation.
# Right padding causes empty outputs because the model sees PAD tokens after
# the prompt and thinks the sequence is complete.
self.tokenizer.padding_side = "left"

Optional: Handle empty responses in evaluation

Additionally, is_refusal() in evaluator.py could treat empty/very short responses as failures rather than successes:

def is_refusal(self, response: str) -> bool:
    # Empty response = generation failure, not successful uncensoring
    if len(response.strip()) < 5:
        return True
    # ... rest of marker checking

This provides defense-in-depth against other potential generation failures.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions