DexHand Lab pro: event-hardened tactile dexterity benchmark #501

Open
ducthuykh1009 wants to merge 14 commits into Faraday-Future-AI:main from ducthuykh1009:codex/dexhand-event-final-polish

@ducthuykh1009

Registration UUID: 2555924c-74a4-4788-be61-1f1e65bf3f44

Project: DexHand Lab pro
Submission folder: submissions/dexhand_lab

Summary
This PR submits the event-hardened DexHand Lab dexterous manipulation benchmark, a hand-only MuJoCo submission built around a human-like five-finger robotic hand. It covers blind tactile active perception, tactile pose estimation, adaptive regrasp, precision assembly, 224-degree cap rotation, 9x load hold, tactile combination lock manipulation, no-crush vial uncap and sample delivery, microsuture threading, and stress evaluation, and it ships with a judge-readable evidence pack.

Latest update in this PR

  • Removed stale score/PR backup files from the submitted folder so judge evidence does not point to old PRs or old leaderboard state.
  • Replaced hardcoded PR target metadata with current-branch upstream PR metadata in outputs/submission_readiness_report.json.
  • Added validator event-hygiene gates for summary/readiness agreement, stale PR target detection, deprecated backup detection, SRT timing inside the generated demo duration, and embedded stress-eval summary evidence.
  • Kept the generated demo in the event-required 1-3 minute window: duration_s = 85.25.
  • Preserved all existing DexHand evidence: cap rotation, blind tactile mode, tactile pose assembly, combination lock, vial task, microsuture task, contact-causality audit, stress eval, hardware replay audit, and 45-gate task suite.
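
The SRT-timing and demo-duration gates listed above could be checked roughly as follows. This is a minimal sketch, not the validator's actual code: the names `parse_srt_end_times`, `srt_within_duration`, and `duration_in_window` are hypothetical, and it assumes standard SRT cue lines of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`.

```python
import re

# Matches an SRT cue timing line and captures the parts of the END timestamp,
# e.g. "00:00:01,000 --> 00:00:04,500" captures ("00", "00", "04", "500").
_CUE_RE = re.compile(
    r"\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def parse_srt_end_times(srt_text: str) -> list[float]:
    """Return the end time, in seconds, of every cue found in the SRT text."""
    ends = []
    for h, m, s, ms in _CUE_RE.findall(srt_text):
        ends.append(int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0)
    return ends


def srt_within_duration(srt_text: str, duration_s: float) -> bool:
    """True if at least one cue exists and no cue ends after the video does."""
    ends = parse_srt_end_times(srt_text)
    return bool(ends) and max(ends) <= duration_s


def duration_in_window(duration_s: float, lo_s: float = 60.0, hi_s: float = 180.0) -> bool:
    """True if the demo length falls inside the 1-3 minute window."""
    return lo_s <= duration_s <= hi_s
```

With the reported `duration_s = 85.25`, `duration_in_window(85.25)` holds, and any subtitle file whose last cue ends at or before 85.25 s would pass `srt_within_duration`.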

How to run

python submissions/dexhand_lab/run_demo.py
python submissions/dexhand_lab/run_demo.py --episodes 3 --seed 42 --no-video --difficulty medium
python submissions/dexhand_lab/run_stress_eval.py --seeds 32
python submissions/dexhand_lab/validate_submission.py

Validation run completed locally

  • python -m py_compile submissions/dexhand_lab/run_demo.py submissions/dexhand_lab/validate_submission.py submissions/dexhand_lab/quality_gate.py submissions/dexhand_lab/run_stress_eval.py submissions/dexhand_lab/arena_task_suite.py
  • python -m unittest discover -s submissions/dexhand_lab/tests -p "test_*.py"
  • python submissions/dexhand_lab/run_demo.py
  • python submissions/dexhand_lab/run_stress_eval.py --seeds 32
  • python submissions/dexhand_lab/arena_task_suite.py
  • python submissions/dexhand_lab/microsuture_benchmark.py
  • python submissions/dexhand_lab/run_demo.py --episodes 3 --seed 42 --no-video --difficulty medium
  • python submissions/dexhand_lab/validate_submission.py

Current evidence highlights

  • validation_passed: true
  • submission_readiness_pass: true
  • rules_alignment_pass: true
  • rubric_readiness_pass: true
  • task gates: 45/45
  • object_snap_events: 0
  • contact_causality_pass: true
  • cap_rotation_success: true
  • cap_rotation_achieved_deg: 224.0
  • load_hold_success: true
  • load_hold_x: 9.0
  • feedback_success_rate: 1.0
  • baseline_success_rate: 0.59375
  • microsuture_threading_success: true
  • demo_video_duration_rule_pass: true
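
A reviewer who wants to spot-check these values could compare a loaded report against the expected fields. The sketch below is illustrative only: it assumes the keys above appear at the top level of a single JSON object (the real layout of the output files may differ), and `EXPECTED` and `check_evidence` are hypothetical names.

```python
import math

# Expected values copied from the highlights above. That they sit at the top
# level of one JSON object is an assumption made for illustration.
EXPECTED = {
    "validation_passed": True,
    "object_snap_events": 0,
    "contact_causality_pass": True,
    "cap_rotation_achieved_deg": 224.0,
    "load_hold_x": 9.0,
    "feedback_success_rate": 1.0,
    "baseline_success_rate": 0.59375,
}


def check_evidence(summary: dict, expected: dict = EXPECTED) -> list[str]:
    """Return human-readable mismatches; an empty list means all fields agree."""
    problems = []
    for key, want in expected.items():
        if key not in summary:
            problems.append(f"missing: {key}")
            continue
        got = summary[key]
        if isinstance(want, bool):
            # Booleans must match exactly; 1 or "true" do not count as True.
            ok = got is want
        elif isinstance(got, (int, float)) and not isinstance(got, bool):
            # Numeric fields are compared with a small tolerance.
            ok = math.isclose(got, want, rel_tol=1e-9, abs_tol=1e-12)
        else:
            ok = False
        if not ok:
            problems.append(f"{key}: expected {want!r}, got {got!r}")
    return problems
```

Passing the result of `json.load` on a report file to `check_evidence` yields a list of mismatches to review. As a side note, 0.59375 equals 19/32, which would be consistent with 19 baseline successes over the 32 stress seeds, although the PR does not state how the rate was computed.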

Files to inspect first

  • submissions/dexhand_lab/media/demo.mp4
  • submissions/dexhand_lab/media/keyframes.png
  • submissions/dexhand_lab/outputs/summary.json
  • submissions/dexhand_lab/outputs/validator_report.json
  • submissions/dexhand_lab/outputs/submission_readiness_report.json
  • submissions/dexhand_lab/outputs/event_rules_report.json
  • submissions/dexhand_lab/JUDGE_BRIEF.md
  • submissions/dexhand_lab/EVIDENCE_INDEX.md
  • submissions/dexhand_lab/dataset/task_suite_report.json
  • submissions/dexhand_lab/dataset/contact_causality_report.json

Honest scope
DexHand Lab uses simulation-native MuJoCo/contact evidence and deterministic contact-aware controllers. It does not claim learned RL, real camera perception, perfect contact physics, or physical hardware execution. Hardware-related files are replay/audit artifacts for transfer analysis, not a real hardware trial.

@ducthuykh1009

Author

Update pushed on codex/dexhand-event-final-polish.

Latest cleanup:

  • Refreshed PR_DESCRIPTION.md so it matches the current DexHand evidence: 85.25s demo, 45/45 gates, validator/readiness/rules pass, and current task list.
  • Removed stale PR-number literals from the validator scan path while keeping stale-target protection active.
  • Confirmed no stale PR links, old score text, encoding artifacts, or copied-source references remain in the submission scan.
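
Stale-target protection without hardcoded PR numbers could look something like the sketch below, where the current PR number is passed in at run time rather than written into the scanner. This is an assumption about the approach, not the validator's actual implementation; `find_stale_pr_refs` is a hypothetical helper, and it only looks at GitHub-style `/pull/<number>` URLs.

```python
import re

# Matches GitHub pull request URLs and captures the PR number.
_PR_URL_RE = re.compile(r"github\.com/[\w.-]+/[\w.-]+/pull/(\d+)")


def find_stale_pr_refs(text: str, current_pr: int) -> list[int]:
    """Return the sorted PR numbers linked in the text that differ from current_pr."""
    found = {int(n) for n in _PR_URL_RE.findall(text)}
    return sorted(found - {current_pr})
```

An empty result for every scanned file would correspond to the "no stale PR links" statement above.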

Validation rerun:

  • python -m py_compile submissions/dexhand_lab/validate_submission.py submissions/dexhand_lab/run_demo.py
  • python submissions/dexhand_lab/validate_submission.py

Current final evidence remains: validator pass, submission readiness pass, rules alignment pass, 45/45 gates, object_snap_events=0, contact_causality_pass=true, cap_rotation_success=true, load_hold_success=true, microsuture_threading_success=true.
