Evolutionary Reinforcement Learning for OpenAI Gym

Implementation of Augmented-Random-Search for OpenAI Gym environments in Python. Performance is defined as the sample efficiency of the algorithm, i.e. how high the average reward is after x episodes of interaction with the environment have been used for training. The paper can be found here: Simple random search provides a competitive approach to reinforcement learning

Augmented-Random-Search (ARS)

ARS is an Evolutionary Strategy where the policy is linear with weights w_p.

Given an observation s_t, an action a_t is chosen by:

Continuous case:
a_t = w_p * s_t

Discrete case:
a_t = softmax(w_p * s_t)
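
The two cases above can be sketched in plain Python as follows. This is a minimal illustration, not the repository's code: the names `linear_scores`, `softmax`, and `act` are hypothetical, the weights are assumed to be a list of rows (one row per action dimension), and for the discrete case the sketch assumes the most probable action is taken (sampling from the softmax probabilities would be the other reasonable reading).

```python
import math


def linear_scores(weights, state):
    """Multiply the weight matrix (a list of rows) by the observation vector."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def act(weights, state, discrete):
    """Choose an action from a linear policy."""
    scores = linear_scores(weights, state)
    if not discrete:
        # Continuous case: the action is the linear output itself.
        return scores
    probs = softmax(scores)
    # Assumption: act greedily on the softmax probabilities.
    return max(range(len(probs)), key=probs.__getitem__)
```

A real implementation would likely use NumPy for the matrix product; plain lists are used here only to keep the sketch dependency-free.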

The weights are mutated by adding i.i.d. normally distributed noise, scaled by α, to every weight:

w_new = w_p + α * N(0, 1)
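
As a rough sketch of the mutation step, assuming the weights are stored as a list of rows and that one independent standard-normal sample is drawn per weight (the function name `mutate` and the explicit `rng` argument are illustrative choices, not taken from the repository):

```python
import random


def mutate(weights, alpha, rng):
    """Return a mutated copy of the weights; the input is left unchanged."""
    return [
        [w + alpha * rng.gauss(0.0, 1.0) for w in row]  # fresh noise per weight
        for row in weights
    ]
```

Passing a seeded `random.Random` instance as `rng` makes the mutations reproducible, which helps when comparing runs.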

Then the policy weights w_p are updated in the direction of the best-performing mutated weights.
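
One simple way to read this update is sketched below. It is an assumption-laden illustration rather than the repository's method: it samples a handful of mutated candidates, keeps the one with the highest reward, and moves the current weights a fraction `step` of the way toward it. The names `improve`, `evaluate`, `step`, and `n_candidates` are hypothetical, and `evaluate` stands in for running episodes in the environment and returning the average reward. The update rule in the ARS paper is more elaborate (it evaluates paired positive and negative perturbations and averages over the top-performing directions), so treat this only as a picture of the general idea.

```python
import random


def improve(weights, evaluate, alpha=0.1, step=0.5, n_candidates=8, rng=None):
    """Perform one simplified update toward the best mutated candidate."""
    rng = rng or random.Random()
    best, best_reward = weights, evaluate(weights)
    for _ in range(n_candidates):
        candidate = [
            [w + alpha * rng.gauss(0.0, 1.0) for w in row] for row in weights
        ]
        reward = evaluate(candidate)
        if reward > best_reward:
            best, best_reward = candidate, reward
    # Move part of the way toward the best candidate. If no candidate
    # beat the current weights, best is weights and nothing changes.
    return [
        [w + step * (b - w) for w, b in zip(row, best_row)]
        for row, best_row in zip(weights, best)
    ]


def toy_reward(weights):
    """Stand-in for an environment rollout: highest when every weight is 1."""
    return -sum((w - 1.0) ** 2 for row in weights for w in row)


rng = random.Random(0)
w = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(200):
    w = improve(w, toy_reward, rng=rng)
```

On this toy objective the reward can only stay the same or increase from one update to the next, because the objective is concave; rewards from a real Gym environment are noisy and offer no such guarantee.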