-
Notifications
You must be signed in to change notification settings - Fork 505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HITL] Gui-controlled spot and local multi-user support #1831
[HITL] Gui-controlled spot and local multi-user support #1831
Conversation
72bd002
to
9bcc8da
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments.
lin_speed: 10.0 | ||
# angular speed | ||
ang_speed: 10.0 | ||
# list of gui-controlled agents. Each list item should be a dict following the commented-out reference format below. Beware some apps only support one (or zero!) gui-controlled agent. For each Habitat env agent *without* a corresponding entry here, it will be policy-controlled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing lists of dicts this way is not great. E.g. there's no way to specify default lin_speed
and ang_speed
for each list item. We should probably switch to structured configs in the future (in which the habitat_hitl
config is defined in code, not this yaml).
@@ -52,6 +53,8 @@ def hitl_main(app_config, create_app_state_lambda=None): | |||
"Either run from habitat-lab directory or symlink data/ folder to your HITL app working directory" | |||
) | |||
|
|||
app_config = patch_config(app_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An unrelated fix. It appears that Habitat-lab always wants you to call patch_config
on a Habitat-lab config.
@@ -129,14 +132,6 @@ def hitl_headed_main(hitl_config, app_config, create_app_state_lambda): | |||
create_app_state_lambda, | |||
) | |||
|
|||
# sanity check if there are no agents with camera sensors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These checks are moved to hitl_driver.py
.
), | ||
) | ||
elif articulated_agent_type == "SpotRobot": | ||
agent_k = f"agent_{agent_index}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we see a fair amount of tricky Habitat-lab logic to get (1) the number of actions, and (2) which indices in the actions array correspond to base linear and angular vel.
|
||
habitat_hitl: | ||
gui_controlled_agents: | ||
- agent_index: 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the ordering of agents matters here. Here, user 0 will control agent 1 (the humanoid) and user 1 will control agent 0 (Spot).
key = key_remap[key] | ||
return key | ||
|
||
def _get_user_key_down(self, user_index, key): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comments in the PR description about this interface for encapsulating user-specific keys.
if self._held_obj_id is None: | ||
self._pick_helper.viz_objects() | ||
|
||
if ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For local multi-user testing, use the T
key to toggle the camera between users. Toggling the camera while holding an object is super-confusing so I disabled this.
@@ -98,7 +99,12 @@ def update_config( | |||
if measurement_name in task_config.measurements: | |||
task_config.measurements.pop(measurement_name) | |||
|
|||
sim_sensor_names = ["head_depth", "head_rgb"] | |||
# todo: decide whether to fix up config here versus validate config | |||
sim_sensor_names = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the long run, I think it's hacky to "fix up" yaml configs at runtime. It creates confusion, because the config used at runtime doesn't match what the user authored in the yaml. It would be cleaner to instead just assert on the requirements here (thus forcing the user to fix up their config yaml).
articulated_agent = self._env._sim.agents_mgr[agent_index].articulated_agent # type: ignore[attr-defined] | ||
|
||
# sloppy: derive turn scale. This is the change in yaw (in radians) corresponding to a base ang vel action of 1.0. See also Habitat-lab BaseVelAction. | ||
turn_scale = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I discuss this more in the PR description.
assert ( | ||
throw_vel is None or do_drop is None | ||
), "You can not set throw_velocity and drop_position at the same time" | ||
self._hint_walk_dir = walk_dir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is sloppy. I'm not actually using _hint_walk_dir
anywhere (yet). Likewise for _hint_distance_multiplier
and _hint_target_dir
.
drop_pos = None | ||
grasp_object_id = None | ||
throw_vel = None | ||
reach_pos = None | ||
|
||
self._has_grasp_preview = False | ||
|
||
# todo: implement grasping properly for each user. _held_obj_id, _has_grasp_preview, etc. must be tracked per user. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
f"Selected agent for GUI control is of type {sim_config.agents[gui_agent_key].articulated_agent_type}, " | ||
"but only KinematicHumanoid is supported at the moment." | ||
"but only KinematicHumanoid and SpotRobot are supported at the moment." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, are there some APIs that limit us to SpotRobot
here, or would any robot work without changing the HITL tool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with other articulated agent types.
if len(config.habitat_hitl.gui_controlled_agents) == len( | ||
config.habitat.simulator.agents | ||
): | ||
assert self.get_sim().renderer is None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
def get_gui_agent_controllers(self) -> List[Controller]: | ||
""" | ||
Return list of controllers indexed by user index. Beware the difference between user index and agent index. For example, user 0 may control agent 1. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beware the difference between user index and agent index.
I would have missed this detail. I think that the agent index
concept should be entirely abstracted away at the HITL level - this will be misleading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agent index concept should be entirely abstracted away at the HITL level
I doubt this will be feasible. We shouldn't think of the controller as completely encapsulating the agent. It is really just encapsulating how we give actions to the agent. E.g. if you want to know something particular about the agent state (e.g. arm position), you would probably still use the Habitat-lab agent API including knowing the particular agent index.
num_actions, | ||
base_vel_action_idx, | ||
num_base_vel_actions, | ||
turn_scale, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Types would be great here, as this is a dependency injection point.
# else: | ||
# arm_ctrlr = None | ||
# arm_action = None | ||
# grasp = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a small header comment to explain why this is commented?
turn_angle = self.angle_from_sim_obj_foward_dir_to_target_yaw( | ||
self._articulated_agent.sim_obj, | ||
self._cam_yaw + float(mn.Rad(mn.Deg(180))), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
180 degrees due to our camera convention
I may be confronted to a math error due to this. Could you provide some context on why the 180 degree rotation brings us to the desired result here?
Could it be that the asset was imported with a geo::CoordinateFrame
that doesn't match the sensor convention, which brings developers to flip the camera in "userspace" code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment in code. This is unrelated to sensor conventions and strictly due to camera_helper.py implementation details.
if len(action_names) == 0: | ||
raise ValueError("No active actions for human controller.") | ||
|
||
return {"action": action_names, "action_args": action_args} | ||
assert len(base_action) == self._num_base_vel_actions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we rename base_action
and type it? The current name doesn't convey that it can sometimes be a collection structure. This also makes it difficult to understand why this assert would break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't rename or type this, but I simplified the code. It is always a numpy array. The assert would break if _num_base_vel_actions (passed in to GuiRobotController.__init__
) doesn't match the base_velocity action space we find here in act()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I tested the changeset locally. It works as described.
self._cam_yaw = cam_yaw | ||
self._hint_target_dir = target_dir | ||
|
||
def angle_from_sim_obj_foward_dir_to_target_yaw(self, sim_obj, target_yaw): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: foward
Motivation and Context
python examples/hitl/rearrange_v2/rearrange_v2.py +experiment=gui_controlled_humanoid_and_spot
. Spot's keyboard controls includeF/V
for forward/back instead ofW/S
; similar alternates for grasping (N
) and opening (X
). The camera controls are shared across all users (e.g.A/D
to turn).habitat_hitl.gui_controlled_agents
anduser_index
.Rearrange_v2 multi-user application logic is very crude so far:
F
andV
keys inGuiRobotController
. Similar forGuiHumanoidController
. (This hard-coding behavior is not new to this PR.) Later, GuiControllers shouldn't have direct access togui_input
; they should just use things like_hint_walk_dir
.AppStateRearrangeV2._get_user_key_down
is currently how I encapsulate user-specific keys for grasping and opening. For the case of a local second user, I internally remap keys to other keys in a hard-coded way. In the long run, local multi-user is probably only for testing purposes, so this hackiness is fine. Later,_get_user_key_down
could encapsulate a second internalgui_input
from a remote user. Alternately, instead of the_get_user_key_down
interface, we may want to offer multiplegui_input
instances, one per user. (In the case of local multi-user, the second one could be built from the "real" one using similar key-remapping hackiness.)For GUI-controlled Spot, a few points:
turn_scale
is computed incontroller_helper.py
. Instead of the controller steering to match camera yaw, we can imagine a different paradigm: turn inputs are forwarded to the controller, the controller is free to steer the agent however it likes, and then the camera yaw would follow the agent yaw.How Has This Been Tested
Local testing on Macbook:
Types of changes
Checklist