This project builds a rover platform capable of real-time visual perception and remote control. Workstreams include mechanical/actuator integration; camera and sensor selection and calibration; ROS node development for perception and control; OpenCV image-processing algorithms (filtering, feature detection, segmentation); and MediaPipe-based hand/gesture recognition for teleoperation. The system provides a continuous live video feed (e.g., via RTSP/WebRTC or ROS image transport) and maps recognized gestures to control commands (drive, stop, arm/gripper actuation, camera pan/tilt). Emphasis is placed on robust perception in outdoor/harsh lighting conditions, low-latency streaming, and safe control mechanisms.

Key features:
- Full-scale rover platform design (chassis, drive train, power management)
- OpenCV and MediaPipe perception pipelines
- Live video streaming from rover to operator
- Object grasping and manipulation
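As a sketch of how MediaPipe hand landmarks could feed gesture recognition, the snippet below classifies a coarse gesture from a finger-count heuristic. It assumes MediaPipe Hands' 21-landmark model (fingertips at indices 8, 12, 16, 20; PIP joints at 6, 10, 14, 18) with normalized image coordinates where y grows downward; the gesture labels themselves are illustrative, not part of the project spec.

```python
# Heuristic gesture classifier over MediaPipe-style hand landmarks.
# landmarks: list of 21 (x, y) tuples in normalized image coordinates.

FINGER_TIPS = [8, 12, 16, 20]   # index, middle, ring, pinky fingertips
FINGER_PIPS = [6, 10, 14, 18]   # corresponding PIP joints

def count_extended_fingers(landmarks):
    """Count fingers whose tip sits above (smaller y than) its PIP joint."""
    extended = 0
    for tip, pip in zip(FINGER_TIPS, FINGER_PIPS):
        if landmarks[tip][1] < landmarks[pip][1]:
            extended += 1
    return extended

def classify_gesture(landmarks):
    """Map the finger count to a coarse gesture label (hypothetical mapping)."""
    n = count_extended_fingers(landmarks)
    if n == 0:
        return "fist"        # could map to e.g. STOP
    if n >= 4:
        return "open_palm"   # could map to e.g. DRIVE
    return "unknown"
```

In a full pipeline the landmark list would come from `mediapipe.solutions.hands` on each camera frame; a real system would also debounce over several frames before issuing a command.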
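The "safe control mechanisms" goal can be illustrated with a dead-man safety layer between the recognizer and the drive commands: if no valid gesture is seen within a timeout window (e.g., the operator's hand leaves the frame), the rover falls back to STOP. The gesture names, command strings, and timeout value below are assumptions for illustration.

```python
import time

# Hypothetical gesture-to-command table; real mappings would come from the
# project's teleoperation design.
GESTURE_TO_COMMAND = {
    "open_palm": "STOP",
    "fist": "DRIVE_FORWARD",
    "point_left": "TURN_LEFT",
    "point_right": "TURN_RIGHT",
}

class SafeTeleop:
    """Dead-man safety layer: stop when valid gestures go stale."""

    def __init__(self, timeout_s=0.5, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable clock for testing
        self.last_seen = None       # time of last recognized gesture

    def update(self, gesture):
        """Return the command to issue for the latest recognition result."""
        cmd = GESTURE_TO_COMMAND.get(gesture)
        if cmd is not None:
            self.last_seen = self.clock()
            return cmd
        # Unrecognized gesture or dropout: stop once the last valid
        # gesture is older than the timeout, otherwise hold briefly.
        if self.last_seen is None or self.clock() - self.last_seen > self.timeout_s:
            return "STOP"
        return None  # hold the previous command for short dropouts
```

Injecting the clock keeps the timeout logic unit-testable; on the rover, the returned command would be published to the drive controller (e.g., as a ROS topic message).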