Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New info API for vectorized environments #2657 #2773

Merged
merged 49 commits into from
May 24, 2022

Conversation

gianlucadecola
Copy link
Contributor

This is a draft referring to issue #2657 in order to gain some feedback before moving on.
For the new info API I'm working to make the code more modular and backward-compatible with the old info in order to make switching to a different info format in the future eventually easier.

Default behaviour is still the same so
envs = gym.vector.make("CartPole-v1", asynchronous=False, num_envs=3)
will return info in the format:

[{'terminal_observation': array([ 0.1307018,  1.7346063, -0.2267962, -2.7928443], dtype=float32)},
 {},
 {}]

If we want to switch to another format
envs = gym.vector.make("CartPole-v1", asynchronous=False, num_envs=3, info_format="brax")
we get

{'terminal_observation': [None,
                          array([-0.09317806, -1.5342456 ,  0.22191852,  2.548524  ], dtype=float32),
                          None]}

If we then want to format info for example like proposed by @pseudo-rnd-thoughts

{
   "key": [
         (data, environment number) 
         ...
   ], 
   ...
}

All we need to do is registering a new VecEnvInfoStrategy

The same approach applies to wrappers like RecordEpisodeStatistics which get the info_strategy by the wrapped environments and process the info accordingly.

Current draft works on SyncVectorEnv only. If we agree on this approach I will continue implementing it with Async and other wrappers

@gianlucadecola gianlucadecola marked this pull request as ready for review April 25, 2022 22:51
@gianlucadecola gianlucadecola changed the title [Draft] New info API for vectorized environments #2657 New info API for vectorized environments #2657 Apr 25, 2022
@gianlucadecola
Copy link
Contributor Author

I think all changes are in place now; the new info API works for both async and sync vector envs; the changes are backward compatible with the previous gym version and, in the future, if we want to add different info formatting all we need to do is adding a new ProcessingStrategy

@pseudo-rnd-thoughts
Copy link
Contributor

Thanks for the PR, it looks good to me. I will get someone else to review as well.
Would you be able to add documentation for this PR please, https://github.com/Farama-Foundation/gym-docs

@gianlucadecola
Copy link
Contributor Author

gianlucadecola commented Apr 26, 2022

Add some fixes and solved a merge conflict

Thanks for the PR, it looks good to me. I will get someone else to review as well. Would you be able to add documentation for this PR please, https://github.com/Farama-Foundation/gym-docs

Yup, will update the docs 👍

@pseudo-rnd-thoughts
Copy link
Contributor

@gianlucadecola Sorry, I did a bit more complete review and found a couple more things that could be updated.

@gianlucadecola
Copy link
Contributor Author

@gianlucadecola Sorry, I did a bit more complete review and found a couple more things that could be updated.

No problem 👍 I'll take a look

@gianlucadecola gianlucadecola marked this pull request as draft May 7, 2022 18:08
@gianlucadecola
Copy link
Contributor Author

So I'm converting this as a draft since it will require a different approach based on: google/brax#194
The new default behavior will be the Brax format so this PR will introduce a breaking change. Instead of having different strategies a wrapper will be added to convert back the info to the "classic" format.

@gianlucadecola
Copy link
Contributor Author

Did the renaming and incorporate the info_processor into VecEnv

gym/vector/sync_vector_env.py Outdated Show resolved Hide resolved
gym/vector/vector_env.py Outdated Show resolved Hide resolved
gym/vector/vector_env.py Outdated Show resolved Hide resolved
gym/vector/vector_env.py Outdated Show resolved Hide resolved
gym/wrappers/record_episode_statistics.py Outdated Show resolved Hide resolved
gym/wrappers/record_episode_statistics.py Outdated Show resolved Hide resolved
gym/wrappers/vec_info_to_classic.py Outdated Show resolved Hide resolved
gym/wrappers/vec_info_to_classic.py Outdated Show resolved Hide resolved
gym/wrappers/record_episode_statistics.py Outdated Show resolved Hide resolved
gym/wrappers/record_episode_statistics.py Outdated Show resolved Hide resolved
gym/wrappers/record_episode_statistics.py Outdated Show resolved Hide resolved
gym/wrappers/vec_info_to_classic.py Outdated Show resolved Hide resolved
gym/wrappers/vec_info_to_classic.py Outdated Show resolved Hide resolved
Copy link
Contributor

@RedTachyon RedTachyon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a bunch of comments, it's on the right track - most comments are stylistic and nitpicks, with I think one or two about functionality.

@@ -211,6 +217,42 @@ def seed(self, seed=None):
"Please use `env.reset(seed=seed) instead in VectorEnvs."
)

def _add_info(self, infos: dict, info: dict, env_num: int) -> dict:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this and the next method are basically static, correct? The only thing is the number of envs which can be passed as an argument. I think it's better to extract them to separate functions.

gym/vector/vector_env.py Show resolved Hide resolved
gym/wrappers/record_episode_statistics.py Outdated Show resolved Hide resolved
gym/wrappers/record_episode_statistics.py Outdated Show resolved Hide resolved
gym/wrappers/record_episode_statistics.py Outdated Show resolved Hide resolved
gym/wrappers/vector_list_info.py Outdated Show resolved Hide resolved
tests/vector/test_vector_env_info.py Outdated Show resolved Hide resolved
tests/vector/test_vector_env_info.py Show resolved Hide resolved
for _ in range(ENV_STEPS):
action = env.action_space.sample()
_, _, dones, infos = env.step(action)
if any(dones):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is my understanding correct that the way this test is done, there's a reasonable chance that there will never be two environments terminating at the same time? This leaves us with a pretty important untested functionality. Please design a test for various cases, i.e. 1, 2, 3 (all) envs terminating at once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Will add other test cases 👍

tests/wrappers/test_vector_list_info.py Outdated Show resolved Hide resolved
@gianlucadecola
Copy link
Contributor Author

Hey, thanks a lot for the review, added a bunch of comments, will work on that soon 👍

@gianlucadecola
Copy link
Contributor Author

gianlucadecola commented May 23, 2022

@RedTachyon I've done some changes, when you have time could you please take a look? Particularly at:
#2773 (comment)
#2773 (comment)
#2773 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants