Bug report in otb project #77

@hkr04

Description

In UnderThinkingBench, some items have no "source_dataset" key in their metadata. They have a "source" key instead, with the value "aime" or "hmmt". These items seem to belong to the UnderThinkingBench-Math subset.

UnderThinkingBench entry:

RAM/projects/otb/eval.py

Lines 39 to 40 in 264c047

elif subset == "underthinking-bench" or "underthinking" in subset:
    acc = eval_underthink(row)

The error (a KeyError for the missing "source_dataset" key) is raised here:

def eval_underthink(row, find_last_box: bool = False) -> float:
    puzzle = json.loads(row["metadata"])["source_dataset"]
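
A minimal sketch of how the failure can be reproduced outside the repo. The row below is hypothetical and only mimics the shape described above (a JSON metadata string with "source" but no "source_dataset"); it is not taken from the dataset.

```python
import json

# Hypothetical row shaped like the math items described above:
# the metadata has "source" but no "source_dataset".
row = {"metadata": json.dumps({"source": "hmmt"})}

metadata = json.loads(row["metadata"])
try:
    # Same lookup that eval_underthink performs on its first line.
    puzzle = metadata["source_dataset"]
    failed = False
except KeyError as exc:
    failed = True
    missing_key = exc.args[0]  # name of the key that was not found

print(failed, missing_key if failed else None)
```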

Possible solution in eval.py:

import json

# ...

elif subset == "underthinking-bench" or "underthinking" in subset:
    metadata = json.loads(row["metadata"])
    if "source" in metadata and metadata["source"] in ["aime", "hmmt"]:
        acc = eval_math(row, tokenizer, model_name)
    else:
        acc = eval_underthink(row)
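
The same routing idea as a self-contained sketch, for illustration only. The helper name `pick_evaluator`, the constant `MATH_SOURCES`, and the sample rows are hypothetical and not part of eval.py; the sketch assumes only that `row["metadata"]` is a JSON string, as in the snippets above. It returns the evaluator's name instead of calling it, so it runs without the repo.

```python
import json

# Values of "source" reported in this issue for the math items.
MATH_SOURCES = {"aime", "hmmt"}


def pick_evaluator(row: dict) -> str:
    """Return the name of the evaluator a row should be sent to (hypothetical helper)."""
    metadata = json.loads(row["metadata"])
    # Math items carry "source" instead of "source_dataset";
    # .get() returns None when the key is absent, so no KeyError is raised.
    if metadata.get("source") in MATH_SOURCES:
        return "eval_math"
    return "eval_underthink"


# Hypothetical rows covering both metadata shapes.
math_row = {"metadata": json.dumps({"source": "aime"})}
puzzle_row = {"metadata": json.dumps({"source_dataset": "some_puzzle"})}

print(pick_evaluator(math_row))
print(pick_evaluator(puzzle_row))
```

Using `metadata.get("source")` rather than an explicit `"source" in metadata` check keeps the condition to one expression and behaves the same for rows that lack the key.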
