DexHand Lab Pro: Tactile Microsuture Dexterity Benchmark #477

Open
ducthuykh1009 wants to merge 12 commits into Faraday-Future-AI:main from ducthuykh1009:codex/dexhand-microsuture-95

@ducthuykh1009

Registration UUID: 2555924c-74a4-4788-be61-1f1e65bf3f44

Project name: DexHand Lab Pro
Submission folder: submissions/dexhand_lab

Summary:
This update keeps DexHand Lab as a hand-only MuJoCo dexterous manipulation benchmark and adds a visible high-dexterity task: tactile microsuture threading. The five-finger hand performs an index probe, thumb-index-middle tripod pinch with ring stabilizer, two controlled needle passes through a tissue-pad target, thread-loop pull, and tension-limited no-tear verification. The task is shown in the generated demo video and logged in machine-readable evidence files.

What changed:

  • Added microsuture needle, visible thread loop, tissue pad, entry/exit eyelet markers, and close-up suture camera to the MJCF scene.
  • Added MICROSUTURE_THREADING grasp primitive with finger roles, closure sequence, stability conditions, and recovery strategy.
  • Added microsuture object classification and affordance metadata.
  • Added SUTURE_* phases to the main demo, HUD, narration, contact timeline, trajectory, summary, final report, and judge replay index.
  • Added microsuture_benchmark.py plus dataset/microsuture_threading_report.json, dataset/microsuture_threading_trace.csv, and outputs/microsuture_scorecard.json.
  • Expanded the deterministic verification suite from 39 gates to 45 gates.
  • Tightened the contact-causality/no-snap audit so that the generated evidence reports zero pre-verification motion events.
  • Updated README, JUDGE_BRIEF, rubric_scorecard.json, submission_manifest.json, validator, and tests.
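
As a rough illustration of what a grasp primitive with finger roles, a closure sequence, and a stability condition could look like, here is a minimal sketch. It is not the code in this PR: the `GraspPrimitive` class, its field names, the closure order, the recovery label, and the role given to the little finger are all assumptions made for the example. Only the tripod pinch (thumb, index, middle) and the ring stabilizer come from the summary above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraspPrimitive:
    """Hypothetical container for one grasp primitive's metadata."""

    name: str
    finger_roles: dict        # finger name -> role label
    closure_sequence: tuple   # order in which fingers close
    recovery_strategy: str

    def required_fingers(self):
        """Fingers that must be in contact: every role except 'idle'."""
        return {f for f, role in self.finger_roles.items() if role != "idle"}

    def is_stable(self, fingers_in_contact):
        """Stable when every required finger reports contact."""
        return self.required_fingers() <= set(fingers_in_contact)


MICROSUTURE_THREADING = GraspPrimitive(
    name="MICROSUTURE_THREADING",
    finger_roles={
        "thumb": "pinch",
        "index": "pinch",
        "middle": "pinch",
        "ring": "stabilizer",
        "little": "idle",  # assumption: the PR does not state this finger's role
    },
    closure_sequence=("index", "thumb", "middle", "ring"),  # assumed order
    recovery_strategy="reopen_and_regrasp",  # assumed label
)
```

With this sketch, `MICROSUTURE_THREADING.is_stable({"thumb", "index", "middle", "ring"})` is true, while a two-finger contact set is rejected because the stabilizer and the third pinch finger are missing.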

Validation performed:

  • python submissions/dexhand_lab/run_demo.py
  • python submissions/dexhand_lab/run_demo.py --episodes 3 --seed 42 --no-video --difficulty medium
  • python submissions/dexhand_lab/run_stress_eval.py --seeds 32
  • python submissions/dexhand_lab/arena_task_suite.py
  • python submissions/dexhand_lab/microsuture_benchmark.py
  • python -m unittest discover -s submissions/dexhand_lab/tests -p "test_*.py"
  • python submissions/dexhand_lab/validate_submission.py
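
The commands above can also be run in sequence from one script. The following is a minimal sketch, assuming it is started from the repository root; the helper names `to_argv` and `run_validation` are hypothetical and not part of the submission.

```python
import shlex
import subprocess
import sys

# The seven commands listed above, copied verbatim.
VALIDATION_COMMANDS = [
    "python submissions/dexhand_lab/run_demo.py",
    "python submissions/dexhand_lab/run_demo.py --episodes 3 --seed 42 --no-video --difficulty medium",
    "python submissions/dexhand_lab/run_stress_eval.py --seeds 32",
    "python submissions/dexhand_lab/arena_task_suite.py",
    "python submissions/dexhand_lab/microsuture_benchmark.py",
    'python -m unittest discover -s submissions/dexhand_lab/tests -p "test_*.py"',
    "python submissions/dexhand_lab/validate_submission.py",
]


def to_argv(command):
    """Split a command line and swap 'python' for the current interpreter."""
    argv = shlex.split(command)
    if argv[0] == "python":
        argv[0] = sys.executable
    return argv


def run_validation(commands=VALIDATION_COMMANDS, runner=subprocess.run):
    """Run commands in order; return the first failing command in a list.

    An empty list means every command exited with status 0. The runner is
    injectable so the loop can be exercised without launching MuJoCo.
    """
    for command in commands:
        result = runner(to_argv(command))
        if result.returncode != 0:
            return [command]  # stop at the first failure
    return []
```

Calling `run_validation()` with no arguments launches the real commands, so it is best invoked deliberately, for example from an interactive session.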

Final generated metrics:

  • Overall task success: true
  • Task gates: 45/45
  • Demo duration: 85.25 s
  • Object snap events: 0
  • Contact-causality pass: true
  • Verified motion frame rate: 1.0
  • Average active fingers for dexterous grasps: 4.18272
  • Average multi-side contact score for dexterous grasps: 0.92919
  • Cap rotation: 224/224 deg
  • Load hold: 9.0x
  • Blind tactile classifier accuracy: 1.0
  • Precision assembly success: true
  • Combination lock success: true
  • Microsuture threading success: true
  • Microsuture passes: 2/2
  • Microsuture entry/exit error: 0.0024/0.0028 m
  • Microsuture tension: 0.42/0.65 N
  • Local readiness estimate: 100/100
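
To make the microsuture numbers concrete, the sketch below shows one way such values could be turned into a pass/fail decision. It assumes that "0.42/0.65 N" means measured tension over the allowed limit and that "0.0024/0.0028 m" means entry error and exit error. The 0.005 m position tolerance, the `MicrosutureResult` class, and the `microsuture_success` function are hypothetical and are not taken from the submission.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicrosutureResult:
    """Hypothetical record of the microsuture metrics reported above."""

    passes_done: int
    passes_required: int
    entry_error_m: float
    exit_error_m: float
    tension_n: float
    tension_limit_n: float


def microsuture_success(result, max_position_error_m=0.005):
    """Pass when all needle passes are done, both placement errors are
    within tolerance, and the thread tension stays at or below its limit.

    max_position_error_m is an assumed tolerance, not a value from the PR.
    """
    passes_ok = result.passes_done >= result.passes_required
    position_ok = max(result.entry_error_m, result.exit_error_m) <= max_position_error_m
    tension_ok = result.tension_n <= result.tension_limit_n
    return passes_ok and position_ok and tension_ok


# Values copied from the metrics list above.
reported = MicrosutureResult(
    passes_done=2,
    passes_required=2,
    entry_error_m=0.0024,
    exit_error_m=0.0028,
    tension_n=0.42,
    tension_limit_n=0.65,
)
```

Under these assumptions the reported values pass, and raising the tension above 0.65 N would fail the no-tear condition.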

Honest scope:
The project uses MuJoCo simulation-native state/contact information, fingertip touch sensors, and deterministic/contact-aware heuristic control. It does not claim learned RL, real camera perception, physical tactile hardware, perfect contact physics, physical surgery readiness, or real-world deployment.

@ducthuykh1009

Update pushed: aligned DexHand Lab evidence with event rules and fixed the readiness mismatch.

Changes in this update:

  • Fixed summary/report ordering so outputs/summary.json, submission_readiness_report.json, and validator_report.json agree.
  • Aligned the generated narration SRT and keyframe panel to the actual 85.25 s demo video, which falls within the required 1-3 minute window.
  • Updated keyframe selection to use real trajectory phases, making microsuture, vial, lock, assembly, tactile, and hardware-audit evidence visible in the correct segments.
  • Restored nested stress_eval_summary and dataset_stress_eval_path in summary.json while preserving top-level stress metrics.
  • Updated README, Judge Brief, rubric scorecard, and quality-gate presentation text to match the current generated demo.
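
The agreement between report files and the duration window mentioned above can be checked mechanically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the keys to compare must be supplied by the caller because the PR does not say which keys each file shares, and the 60 to 180 second bounds simply restate the 1-3 minute window.

```python
import json
from pathlib import Path


def load_reports(paths):
    """Read each JSON report into a dict keyed by its file name."""
    return {
        Path(p).name: json.loads(Path(p).read_text(encoding="utf-8"))
        for p in paths
    }


def find_disagreements(reports, keys):
    """Return {key: {report_name: value}} for keys whose values differ.

    A report that lacks a key is skipped for that key rather than being
    counted as a mismatch.
    """
    disagreements = {}
    for key in keys:
        seen = {name: data[key] for name, data in reports.items() if key in data}
        values = list(seen.values())
        if any(value != values[0] for value in values[1:]):
            disagreements[key] = seen
    return disagreements


def duration_in_window(seconds, low=60.0, high=180.0):
    """True when the demo length lies inside the 1-3 minute window."""
    return low <= seconds <= high
```

An empty result from `find_disagreements` means the compared keys match in every report that contains them, and `duration_in_window(85.25)` is true for the demo length stated above.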

Re-tested locally:

  • python submissions/dexhand_lab/run_demo.py
  • python submissions/dexhand_lab/run_demo.py --episodes 3 --seed 42 --no-video --difficulty medium
  • python submissions/dexhand_lab/run_stress_eval.py --seeds 32
  • python submissions/dexhand_lab/arena_task_suite.py
  • python submissions/dexhand_lab/microsuture_benchmark.py
  • python -m unittest discover -s submissions/dexhand_lab/tests -p "test_*.py"
  • python submissions/dexhand_lab/validate_submission.py

Current evidence highlights:

  • validator_passed=true
  • submission_readiness_pass=true
  • rules_alignment_pass=true
  • task gates: 45/45
  • object_snap_events=0
  • contact_causality_pass=true
  • cap_rotation_success=true
  • load_hold_success=true
  • microsuture_threading_success=true
