Occupancy grid image #822
base: dev
Conversation
Greptile Summary
Confidence Score: 3/5
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant InterpretMapSkill
    participant OccupancyGrid
    participant OccupancyGridImage
    participant QwenVlModel
    participant NavigationSkill
    User->>InterpretMapSkill: get_goal_position(description)
    InterpretMapSkill->>InterpretMapSkill: Retrieve latest costmap
    InterpretMapSkill->>OccupancyGridImage: from_occupancygrid(costmap, robot_pose)
    OccupancyGridImage->>OccupancyGrid: Convert grid to RGB image
    OccupancyGridImage->>OccupancyGridImage: _overlay_robot_pose()
    OccupancyGridImage->>OccupancyGridImage: Flip vertically & resize
    OccupancyGridImage-->>InterpretMapSkill: OccupancyGridImage with Image
    InterpretMapSkill->>QwenVlModel: query(image, prompt)
    QwenVlModel-->>InterpretMapSkill: JSON response with pixel coordinates
    InterpretMapSkill->>InterpretMapSkill: extract_coordinates()
    InterpretMapSkill->>OccupancyGridImage: is_free_space(x, y)
    OccupancyGridImage->>OccupancyGridImage: pixel_to_grid(x, y)
    OccupancyGridImage->>OccupancyGrid: Check grid[grid_y, grid_x]
    alt Point not in free space
        InterpretMapSkill->>OccupancyGridImage: get_closest_free_point(x, y)
        OccupancyGridImage-->>InterpretMapSkill: Closest free pixel coordinates
    end
    InterpretMapSkill->>OccupancyGridImage: pixel_to_world(x, y)
    OccupancyGridImage->>OccupancyGridImage: pixel_to_grid(x, y)
    OccupancyGridImage->>OccupancyGrid: grid_to_world(grid_point)
    OccupancyGrid-->>OccupancyGridImage: World coordinates (Vector3)
    OccupancyGridImage-->>InterpretMapSkill: goal_pose (Vector3)
    InterpretMapSkill-->>User: goal_pose
    User->>NavigationSkill: navigate_with_position(x, y, z)
    NavigationSkill->>NavigationSkill: Create PoseStamped goal
    NavigationSkill->>NavigationSkill: _navigate_to(goal_pose)
    NavigationSkill-->>User: Success message
```
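The coordinate conversions in the diagram (pixel → grid → world, plus the free-space check) can be sketched roughly as below. The function and parameter names mirror the diagram, but the exact signatures, the uniform `scale`, and the `free_threshold` value are assumptions, not the PR's actual implementation:

```python
import numpy as np

def pixel_to_grid(px: int, py: int, scale: float, grid_height: int) -> tuple[int, int]:
    # The rendered image is flipped vertically relative to the grid,
    # so undo the flip while scaling pixel coordinates back to grid indices.
    gx = int(px * scale)
    gy = grid_height - 1 - int(py * scale)
    return gx, gy

def grid_to_world(gx: int, gy: int, resolution: float,
                  origin_x: float, origin_y: float) -> tuple[float, float]:
    # Standard ROS OccupancyGrid convention: world = origin + cell * resolution.
    return origin_x + gx * resolution, origin_y + gy * resolution

def is_free(grid: np.ndarray, gx: int, gy: int, free_threshold: int = 0) -> bool:
    # A cell counts as free when its cost equals the free threshold
    # (0 in ROS costmaps); note the [row, col] == [gy, gx] indexing.
    return grid[gy, gx] == free_threshold
```

The vertical flip matters: image row 0 is the top of the picture, while grid row 0 is conventionally the map origin at the bottom, so forgetting it mirrors every goal across the horizontal axis.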
13 files reviewed, 2 comments
leshy
left a comment
just these small things, otherwise looks good
```python
max(10, int(min_dimension * 0.035))
max(1, int(min_dimension * 0.005))
```
These don't do anything. I guess they were unused variables which were removed by the linter?
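For context, this is the kind of leftover an automated "remove unused variable" fix can produce: the assignment disappears but the side-effect-free expression stays behind as a no-op. A hypothetical before/after (the variable name is invented):

```python
min_dimension = 200

# Before: the result was assigned, e.g. to size an arrow marker
# (hypothetical original variable name).
marker_size = max(10, int(min_dimension * 0.035))

# After the auto-fix, only the bare expressions remain; Python evaluates
# them and discards the results, so they do nothing.
max(10, int(min_dimension * 0.035))
max(1, int(min_dimension * 0.005))
```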
yeah these were for the arrow that's removed
```python
        cost value at the specified pixel
        """
        size = size or self.size
```
size is defined as a tuple and has a default value, so there's no need for the or.
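The reviewer's point, sketched with an assumed class and signature: when the parameter has a tuple default it is never `None`, so the `or` never does useful work — and it would even misfire on a falsy value like the empty tuple:

```python
class Grid:  # hypothetical stand-in for the real class
    def __init__(self) -> None:
        self.size = (1024, 1024)

    def render(self, size: tuple[int, ...] = (512, 512)) -> tuple[int, ...]:
        # 'size' always holds a tuple here, so 'size or self.size' never
        # falls through to self.size -- except for the surprising case of
        # an empty tuple, which is falsy and silently swaps in the default.
        size = size or self.size
        return size
```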
```python
Attributes:
    image_path (str): Path to the map image file.
    robot_pose (dict): Robot's pose in the map with keys 'position' (list of 3 floats - X Y Z) and 'orientation' (Quaternion).
    occupancy_grid (OccupancyGrid): Generated occupancy grid from the image.
    image (Image | None): Generated OccupancyGridImage from the occupancy grid.
```
Please convert this to Python types since mypy doesn't look at docstrings.
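One way to act on this is to move the types out of the docstring into class-level annotations that mypy actually checks. The field names follow the docstring; the class name and the stub types standing in for the real ROS/PIL classes are assumptions to keep the sketch self-contained:

```python
from dataclasses import dataclass, field
from typing import Optional

# Stand-ins for the real ROS / PIL types, only so this sketch runs on its own.
class OccupancyGrid: ...
class Image: ...

@dataclass
class MapFixture:  # hypothetical class name
    image_path: str
    robot_pose: dict  # {'position': [x, y, z], 'orientation': Quaternion}
    occupancy_grid: OccupancyGrid = field(default_factory=OccupancyGrid)
    image: Optional[Image] = None
```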
```python
width_scale = self.occupancy_grid.info.width / width
height_scale = self.occupancy_grid.info.height / height
return width_scale, height_scale
```
I've noticed you use 1024x1024 images by default. If the width scale and height scale are not the same that produces images which are squashed in a random direction, no? Don't models get confused by such images? Or are you telling the model which way the image is squished?
I don't default to 1024 x 1024 now. The aspect ratio is maintained and the max is set to 1024, so both scales will have the same value; will fix it.
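A minimal sketch of the fix described here: scale both dimensions by the same factor so the longer side becomes at most 1024, which preserves the aspect ratio and guarantees `width_scale == height_scale` when mapping back to the grid. The function name is an assumption:

```python
def fit_within(width: int, height: int, max_dim: int = 1024) -> tuple[int, int]:
    # One uniform scale factor keeps the aspect ratio intact; images
    # already within the limit are left at their original size.
    scale = min(1.0, max_dim / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))
```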
Too many files changed for review.


This PR is one of many for encoding maps for agents. #804
Adds
Evals and Results
A dataset of floorplans (only 2 right now, with variations) is used to generate grids and evaluated on point placement and map comprehension. This dataset can be populated by adding new floorplans with expected answers for queries.
`dimos/agents2/skills/interpret_map/eval/test_map_interpretability.yaml` has queries of varying difficulty to evaluate spatial reasoning. For now, the minimum pass rates for point placement and map comprehension are set to 0.25 and 0.7 respectively.

Run

For evals, run `pytest -s dimos/agents2/skills/interpret_map/eval/test_map_eval.py`.

Examples of successful point placement results:
Go to the conference table in the office
Second room to the robot’s left along the corridor
a point immediately behind the robot
second room to the robot’s left along the corridor
Debug

`debug_goal_placement_<map_id>_<query>.png` — the goal placed is marked with `+`.

Adding new maps and queries for eval

- `map_id`, `image_path`, `robot_pose.position` (in pixels) and `orientation` under `map_comprehension_tests` or `point_placement_tests` in `dimos/agents2/skills/interpret_map/eval/test_map_interpretability.yaml`.
- `dimos/agents2/skills/interpret_map/eval/annotate.py <image.png>` to create bounding boxes and question pairs. These are saved into a `questions.yaml`; copy them to the main testing yaml mentioned above.
The interpret_map_skill is meant to pull the map, place a goal based on the query, and return the world coordinates to navigate to.
Run

`dimos --replay run unitree-go2-agentic --extra-module interpret_map_skill`

In the CLI, queries like "get a goal right in front of the robot" or "get a goal to the northeast side of the map" can be asked.
A few observations