
[HITL] Gui-controlled spot and local multi-user support #1831

Merged 6 commits into main from eundersander/gui_controlled_spot_and_multi_user on Mar 7, 2024

Conversation

@eundersander (Contributor) commented Mar 5, 2024

Motivation and Context

  • Add support for gui-controlled spot, e.g. python examples/hitl/rearrange_v2/rearrange_v2.py +experiment=gui_controlled_humanoid_and_spot. Spot's keyboard controls include F/V for forward/back instead of W/S; similar alternates for grasping (N) and opening (X). The camera controls are shared across all users (e.g. A/D to turn).
  • Add support for multiple local users. See references to habitat_hitl.gui_controlled_agents and user_index.

Rearrange_v2 multi-user application logic is very crude so far:

  • A lot of the hacks below stem from the fact that I have two local users but only one local display. This stuff will change when we have two remote users.
  • PickHelper doesn't do proper object-highlighting for two users. Later, we'll need a separate DebugLineRenderer per remote user (and probably a corresponding separate PickHelper per remote user).
  • Help text isn't customized per user. Later, we'll need separate help text per remote user.
  • Beware agent movement is tied to camera yaw. Later, we'll need a separate camera per remote user.
  • Spot forward/back movement is hard-coded to F and V keys in GuiRobotController. Similar for GuiHumanoidController. (This hard-coding behavior is not new to this PR.) Later, GuiControllers shouldn't have direct access to gui_input; they should just use things like _hint_walk_dir.
  • AppStateRearrangeV2._get_user_key_down is currently how I encapsulate user-specific keys for grasping and opening. For the case of a local second user, I internally remap keys to other keys in a hard-coded way. In the long run, local multi-user is probably only for testing purposes, so this hackiness is fine. Later, _get_user_key_down could encapsulate a second internal gui_input from a remote user. Alternately, instead of the _get_user_key_down interface, we may want to offer multiple gui_input instances, one per user. (In the case of local multi-user, the second one could be built from the "real" one using similar key-remapping hackiness.)
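The key-remapping idea above can be sketched roughly as follows. This is a hedged illustration, not the PR's actual code: the names `get_user_key_down`, `USER1_KEY_REMAP`, and the specific key pairs are hypothetical, loosely modeled on the W/S vs. F/V scheme described in the description.

```python
# Hypothetical sketch of remapping a second local user's keys onto one
# shared keyboard. User 1's logical keys map to physical alternates.
USER1_KEY_REMAP = {
    "W": "F",      # forward
    "S": "V",      # back
    "SPACE": "N",  # grasp
    "E": "X",      # open
}

def get_user_key_down(user_index: int, key: str, physical_keys_down: set) -> bool:
    """Return True if the given logical key is down for this user.

    For user 1, logical keys are remapped to their physical alternates
    before checking the shared keyboard state.
    """
    if user_index == 1:
        key = USER1_KEY_REMAP.get(key, key)
    return key in physical_keys_down
```

As the description notes, a cleaner long-term design might hide this behind one `gui_input`-like object per user, with the second local user's instance synthesized from the real one via the same remap table.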

For GUI-controlled Spot, a few points:

  1. I opt to use the Habitat-lab action space already configured for the Spot via the Habitat-lab config, in particular, the BaseVelocity skill. Lots of other skills are present in its action space and not used. We can imagine a different paradigm where we directly manipulate the articulated agent via Habitat-sim API.
  2. In terms of turning/steering the Spot, I stick to the same convention we use for the Humanoid: "turning" is actually just turning the camera, and then we pass cam_yaw into the controller. The controller then makes a best effort to steer the agent to match the cam yaw. For Spot, doing this accurately unfortunately requires tight coupling between the GuiRobotController class and the underlying BaseVelocity action. In particular, see how turn_scale is computed in controller_helper.py. Instead of the controller steering to match camera yaw, we can imagine a different paradigm: turn inputs are forwarded to the controller, the controller is free to steer the agent however it likes, and then the camera yaw would follow the agent yaw.
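The "steer to match camera yaw" convention in point 2 can be sketched as below. This is an illustrative assumption, not the PR's exact `BaseVelocity` coupling: the function names, the clamping, and the way `turn_scale` is applied are hypothetical, following the description's definition of `turn_scale` as the yaw change produced by an angular action of 1.0.

```python
import math

def signed_angle_to(target_yaw: float, agent_yaw: float) -> float:
    """Smallest signed angle (radians) from the agent heading to the target,
    normalized to (-pi, pi]."""
    return (target_yaw - agent_yaw + math.pi) % (2 * math.pi) - math.pi

def base_ang_vel_action(
    agent_yaw: float, cam_yaw: float, turn_scale: float, max_action: float = 1.0
) -> float:
    """Best-effort angular action so the agent turns toward the camera yaw.

    turn_scale is the change in yaw (radians) corresponding to a base
    angular-velocity action of 1.0, as described above.
    """
    angle = signed_angle_to(cam_yaw, agent_yaw)
    action = angle / turn_scale
    # Clamp so large heading errors saturate rather than overshoot.
    return max(-max_action, min(max_action, action))
```

In the alternative paradigm mentioned above, the data flow reverses: raw turn inputs go to the controller, and the camera follows the agent's yaw instead.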

How Has This Been Tested

Local testing on a MacBook:

python examples/hitl/rearrange_v2/rearrange_v2.py 
python examples/hitl/rearrange_v2/rearrange_v2.py +experiment=gui_controlled_spot_only
python examples/hitl/rearrange_v2/rearrange_v2.py +experiment=gui_controlled_humanoid_and_spot
python examples/hitl/rearrange/rearrange.py
python examples/hitl/basic_viewer/basic_viewer.py
python examples/hitl/pick_throw_vr/pick_throw_vr.py
python examples/hitl/minimal/minimal.py

Types of changes

  • [HITL]

Checklist

  • My code follows the code style of this project.
  • I have updated the documentation if required.
  • I have read the CONTRIBUTING document.
  • I have completed my CLA (see CONTRIBUTING)
  • I have added tests to cover my changes if required.

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Mar 5, 2024
@eundersander eundersander force-pushed the eundersander/gui_controlled_spot_and_multi_user branch from 72bd002 to 9bcc8da Compare March 5, 2024 02:57
@eundersander (author) left a comment:

Some comments.

lin_speed: 10.0
# angular speed
ang_speed: 10.0
# list of gui-controlled agents. Each list item should be a dict following the commented-out reference format below. Beware some apps only support one (or zero!) gui-controlled agent. For each Habitat env agent *without* a corresponding entry here, it will be policy-controlled.
eundersander (author):

Doing lists of dicts this way is not great. E.g. there's no way to specify default lin_speed and ang_speed for each list item. We should probably switch to structured configs in the future (in which the habitat_hitl config is defined in code, not this yaml).
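The structured-config direction suggested here could look roughly like the sketch below. This is a hedged illustration of the idea (per-item defaults defined in code), not the PR's config schema; the dataclass and field names are hypothetical, mirroring the yaml keys above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class GuiControlledAgentConfig:
    """One gui-controlled agent. Defaults apply per list item, which the
    plain yaml list-of-dicts cannot express."""
    agent_index: int = 0
    lin_speed: float = 10.0
    ang_speed: float = 10.0

@dataclass
class HabitatHitlConfig:
    # Agents without an entry here would remain policy-controlled.
    gui_controlled_agents: List[GuiControlledAgentConfig] = field(
        default_factory=list
    )
```

With a schema like this registered (e.g. via Hydra structured configs), a yaml item could omit `lin_speed`/`ang_speed` and still pick up the defaults.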

@@ -52,6 +53,8 @@ def hitl_main(app_config, create_app_state_lambda=None):
"Either run from habitat-lab directory or symlink data/ folder to your HITL app working directory"
)

app_config = patch_config(app_config)
eundersander (author):

An unrelated fix. It appears that Habitat-lab always wants you to call patch_config on a Habitat-lab config.

@@ -129,14 +132,6 @@ def hitl_headed_main(hitl_config, app_config, create_app_state_lambda):
create_app_state_lambda,
)

# sanity check if there are no agents with camera sensors
eundersander (author):

These checks are moved to hitl_driver.py.

),
)
elif articulated_agent_type == "SpotRobot":
agent_k = f"agent_{agent_index}"
eundersander (author):

Here we see a fair amount of tricky Habitat-lab logic to get (1) the number of actions, and (2) which indices in the actions array correspond to base linear and angular vel.


habitat_hitl:
gui_controlled_agents:
- agent_index: 1
eundersander (author):

Note the ordering of agents matters here. Here, user 0 will control agent 1 (the humanoid) and user 1 will control agent 0 (Spot).
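The ordering rule noted here (user index = position in the list, which may differ from `agent_index`) can be illustrated with a tiny sketch. The helper name is hypothetical; the mapping values come from the example above.

```python
# List position is the user index; the value is the agent that user controls.
gui_controlled_agents = [
    {"agent_index": 1},  # user 0 -> agent 1 (the humanoid)
    {"agent_index": 0},  # user 1 -> agent 0 (Spot)
]

def agent_for_user(user_index: int) -> int:
    """Look up which agent a given user controls, by list position."""
    return gui_controlled_agents[user_index]["agent_index"]
```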

key = key_remap[key]
return key

def _get_user_key_down(self, user_index, key):
eundersander (author):

See my comments in the PR description about this interface for encapsulating user-specific keys.

if self._held_obj_id is None:
self._pick_helper.viz_objects()

if (
eundersander (author):

For local multi-user testing, use the T key to toggle the camera between users. Toggling the camera while holding an object is super-confusing so I disabled this.

@@ -98,7 +99,12 @@ def update_config(
if measurement_name in task_config.measurements:
task_config.measurements.pop(measurement_name)

sim_sensor_names = ["head_depth", "head_rgb"]
# todo: decide whether to fix up config here versus validate config
sim_sensor_names = [
eundersander (author):

In the long run, I think it's hacky to "fix up" yaml configs at runtime. It creates confusion, because the config used at runtime doesn't match what the user authored in the yaml. It would be cleaner to instead just assert on the requirements here (thus forcing the user to fix up their config yaml).

articulated_agent = self._env._sim.agents_mgr[agent_index].articulated_agent # type: ignore[attr-defined]

# sloppy: derive turn scale. This is the change in yaw (in radians) corresponding to a base ang vel action of 1.0. See also Habitat-lab BaseVelAction.
turn_scale = (
eundersander (author):

I discuss this more in the PR description.

assert (
throw_vel is None or do_drop is None
), "You can not set throw_velocity and drop_position at the same time"
self._hint_walk_dir = walk_dir
eundersander (author):

This is sloppy. I'm not actually using _hint_walk_dir anywhere (yet). Likewise for _hint_distance_multiplier and _hint_target_dir.

@eundersander eundersander requested a review from 0mdc March 5, 2024 03:07
drop_pos = None
grasp_object_id = None
throw_vel = None
reach_pos = None

self._has_grasp_preview = False

# todo: implement grasping properly for each user. _held_obj_id, _has_grasp_preview, etc. must be tracked per user.
0mdc (reviewer):

👍

f"Selected agent for GUI control is of type {sim_config.agents[gui_agent_key].articulated_agent_type}, "
"but only KinematicHumanoid is supported at the moment."
"but only KinematicHumanoid and SpotRobot are supported at the moment."
0mdc (reviewer):

Out of curiosity, are there some APIs that limit us to SpotRobot here, or would any robot work without changing the HITL tool?

eundersander (author):

I'm not familiar with other articulated agent types.

if len(config.habitat_hitl.gui_controlled_agents) == len(
config.habitat.simulator.agents
):
assert self.get_sim().renderer is None
0mdc (reviewer):

👍

def get_gui_agent_controllers(self) -> List[Controller]:
"""
Return list of controllers indexed by user index. Beware the difference between user index and agent index. For example, user 0 may control agent 1.
"""
0mdc (reviewer):

Beware the difference between user index and agent index.

I would have missed this detail. I think that the agent index concept should be entirely abstracted away at the HITL level - this will be misleading.

@eundersander (author) replied Mar 6, 2024:

agent index concept should be entirely abstracted away at the HITL level

I doubt this will be feasible. We shouldn't think of the controller as completely encapsulating the agent. It is really just encapsulating how we give actions to the agent. E.g. if you want to know something particular about the agent state (e.g. arm position), you would probably still use the Habitat-lab agent API including knowing the particular agent index.

num_actions,
base_vel_action_idx,
num_base_vel_actions,
turn_scale,
0mdc (reviewer):

Types would be great here, as this is a dependency injection point.

# else:
# arm_ctrlr = None
# arm_action = None
# grasp = None
0mdc (reviewer):

Could you add a small header comment to explain why this is commented?

turn_angle = self.angle_from_sim_obj_foward_dir_to_target_yaw(
self._articulated_agent.sim_obj,
self._cam_yaw + float(mn.Rad(mn.Deg(180))),
)
@0mdc (reviewer) commented Mar 5, 2024:

180 degrees due to our camera convention

I may be confronted to a math error due to this. Could you provide some context on why the 180 degree rotation brings us to the desired result here?

Could it be that the asset was imported with a geo::CoordinateFrame that doesn't match the sensor convention, which brings developers to flip the camera in "userspace" code?

eundersander (author):

Added a comment in code. This is unrelated to sensor conventions and strictly due to camera_helper.py implementation details.

if len(action_names) == 0:
raise ValueError("No active actions for human controller.")

return {"action": action_names, "action_args": action_args}
assert len(base_action) == self._num_base_vel_actions
0mdc (reviewer):

Could we rename base_action and type it? The current name doesn't convey that it can sometimes be a collection structure. This also makes it difficult to understand why this assert would break.

eundersander (author):

I didn't rename or type this, but I simplified the code. It is always a numpy array. The assert would break if _num_base_vel_actions (passed in to GuiRobotController.__init__) doesn't match the base_velocity action space we find here in act().

@0mdc (reviewer) left a comment:

LGTM! I tested the changeset locally. It works as described.

self._cam_yaw = cam_yaw
self._hint_target_dir = target_dir

def angle_from_sim_obj_foward_dir_to_target_yaw(self, sim_obj, target_yaw):
0mdc (reviewer):

Typo: foward

@eundersander eundersander merged commit f5567b7 into main Mar 7, 2024
3 of 4 checks passed
@eundersander eundersander deleted the eundersander/gui_controlled_spot_and_multi_user branch March 7, 2024 01:57
dannymcy pushed a commit to dannymcy/habitat-lab that referenced this pull request Jul 8, 2024
HHYHRHY pushed a commit to SgtVincent/habitat-lab that referenced this pull request Aug 31, 2024