
Added RoboReward model#3354

Open
int-smart wants to merge 29 commits into huggingface:main from int-smart:feat/add-robo-reward

Conversation

@int-smart

Title

Added RoboReward, as described in the paper arXiv:2601.00675

Type / Scope

  • Type: Feature

Summary / Motivation

  • One-paragraph description of what changes and why.
  • Why this change is needed and any trade-offs or design notes.

Related issues

What changed

  • Short, concrete bullets of the modifications (files/behaviour).
  • Short note if this introduces breaking changes and migration steps.

How was this tested (or how to run locally)

  • Tests added: test_robo_reward.py

Example:

  • Ran the relevant tests:

    pytest tests/rewards/test_robo_reward.py -v

Checklist (required before merge)

  • Linting/formatting run (pre-commit run -a)
  • All tests pass locally (pytest)
  • Documentation updated
  • CI is green

Reviewer notes

  • Anything the reviewer should focus on (performance, edge-cases, specific files) or general notes.
  • Anyone in the community is free to review the PR.

@github-actions github-actions bot added the documentation, policies, tests, configuration, processor, and examples labels Apr 11, 2026
@xianglunkai

Very nice!
Could you give some examples of VLA or RL demos?

@s1lent4gnt s1lent4gnt self-assigned this Apr 12, 2026
@int-smart
Author

@xianglunkai Could you explain more about what you mean by VLA demos?

@philipmit
Collaborator

Hi @int-smart, thanks for your work on this! The following tests are failing. I think the first failure is caused by the __post_init__ function in configuration_robo_reward.py (where input_features is updated to include the provided image_key). The other two failures appear to come from issues in the _make_mock_vlm_and_processor helper used in the test code.

FAILED tests/rewards/test_robo_reward.py::test_robo_reward_config_validate_features_missing_key - Failed: DID NOT RAISE <class 'ValueError'>
FAILED tests/rewards/test_robo_reward.py::test_compute_reward_shape_single_frame - StopIteration
FAILED tests/rewards/test_robo_reward.py::test_compute_reward_score_mapping - StopIteration
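For the first failure: if `__post_init__` silently inserts the provided `image_key` into `input_features`, a config constructed with a missing key is "repaired" before validation can run, so the `ValueError` the test expects is never raised. A minimal sketch of the validate-instead-of-patch approach, using hypothetical field names based on the comment above (the real `configuration_robo_reward.py` may differ):

```python
from dataclasses import dataclass, field

@dataclass
class RoboRewardConfig:
    # Hypothetical fields inferred from the discussion; the real config may differ.
    image_key: str = "observation.images.top"
    input_features: dict = field(default_factory=dict)

    def __post_init__(self):
        # Adding image_key to input_features here would mask a missing feature,
        # so test_robo_reward_config_validate_features_missing_key never sees a
        # ValueError. Validate the provided features instead of patching them.
        if self.image_key not in self.input_features:
            raise ValueError(
                f"image_key '{self.image_key}' not found in input_features: "
                f"{sorted(self.input_features)}"
            )

# A config missing the image key now fails fast:
try:
    RoboRewardConfig(input_features={"observation.state": "float32"})
except ValueError as err:
    print(f"raised: {err}")
```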

…ard calculation. Corrected the exhaustion of the iterator.
@philipmit
Collaborator

@int-smart Great, all tests are passing now! New issue: I get the error below when trying your quick-start code in robo_reward.mdx. The error doesn't happen when testing a batch with just one image or one video.

batch = {
    "observation.images.top": torch.rand(2, 3, 480, 640),   # (B, C, H, W)
    "observation.language_instruction": [
        "pick up the red cube",
        "place the block on the tray",
    ],
}
rewards = model.compute_reward(batch)  # tensor([0.75, 0.50])

Error

ValueError: expected sequence of length 489 at dim 1 (got 490)
...
ValueError: Unable to create tensor, you should probably activate padding with 'padding=True' to have batched tensors with the same length.
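The error suggests the two instructions tokenize to different lengths (489 vs. 490 tokens) and the processor is asked to build a single rectangular tensor without padding. A minimal, library-independent illustration of the failure mode and the usual fix (the `pad_batch` helper is purely illustrative):

```python
# Two tokenized instructions of unequal length cannot be stacked into one
# rectangular batch; this is what triggers the "activate padding" ValueError.
seqs = [[1] * 489, [2] * 490]  # stand-ins for token-id lists of length 489 / 490

def pad_batch(sequences, pad_id=0):
    """Right-pad every sequence to the batch maximum; return ids + attention mask."""
    max_len = max(len(s) for s in sequences)
    ids = [s + [pad_id] * (max_len - len(s)) for s in sequences]
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in sequences]
    return ids, mask

ids, mask = pad_batch(seqs)
assert all(len(row) == 490 for row in ids)  # now rectangular
```

In practice the fix is probably to pass `padding=True` to the processor call inside `compute_reward` (e.g. `processor(text=instructions, images=images, padding=True, return_tensors="pt")`), so batches with differing instruction lengths are padded and masked automatically.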


Development

Successfully merging this pull request may close these issues.

Reward Models: call for contributions

5 participants