Work on reinforcement learning in StarCraft II using pysc2.
MoveToBeacon is trained with policy gradient and a simple model with two Dense layers.
CollectMineralShards is trained with proximal policy optimization, convolutional layers, and selected_units.
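
The policy gradient setup mentioned above for MoveToBeacon can be illustrated with a minimal sketch in plain Python. It is not the code used in this project: the function names `discounted_returns` and `policy_gradient_loss` and the default `gamma` are assumptions made for illustration, and a REINFORCE-style update is assumed.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return the discounted return G_t for every step of one episode.

    `gamma` is an assumed discount factor, not a value taken from this project.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def policy_gradient_loss(log_probs, returns):
    """REINFORCE-style loss: negative mean of log-probability times return.

    `log_probs` holds the log-probability of each action that was taken,
    as produced by the policy model (hypothetical input here).
    """
    total = sum(lp * g for lp, g in zip(log_probs, returns))
    return -total / len(log_probs)


if __name__ == "__main__":
    episode_rewards = [0.0, 0.0, 1.0]  # e.g. the beacon is reached on the last step
    print(discounted_returns(episode_rewards, gamma=0.9))
```

Minimizing this loss raises the probability of actions that were followed by a high return, which is the idea behind the policy gradient method.
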
- Add autofit of observation length
- Add hyperparameters for PPO
- Add hyperparameters for TRPO
- Add description of results
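
One hyperparameter that PPO usually exposes is the clipping range of its surrogate objective. The sketch below shows how that value enters the objective. It is an assumption for illustration only: the name `clip_range` and its default of 0.2 are not taken from this project.

```python
def ppo_clipped_objective(ratios, advantages, clip_range=0.2):
    """Mean clipped surrogate objective of PPO over a batch.

    ratios      new_policy_prob / old_policy_prob for each taken action
    advantages  advantage estimate for each taken action
    clip_range  assumed hyperparameter; limits how far the ratio may move from 1
    """
    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Clamp the ratio into [1 - clip_range, 1 + clip_range].
        clipped = min(max(ratio, 1.0 - clip_range), 1.0 + clip_range)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped * advantage)
    return total / len(ratios)


if __name__ == "__main__":
    print(ppo_clipped_objective([1.5, 0.5], [1.0, -1.0], clip_range=0.2))
```

The objective is maximized during training; a smaller `clip_range` keeps each update closer to the previous policy.
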