The great bug hunt of 2020
Activity
- Daniel Lukats assigned to @dl337788
- Daniel Lukats changed milestone to %Official dates
- Daniel Lukats added Bug label
- Author Owner
Final performance on Breakout after a single run is 104.97, although the reward graph shows peaks of up to 400 episode reward. The algorithm is very unstable. The PPO paper reports a final performance of 274.8, which should be achievable once learning becomes more stable.
TODO:
- check loss calculation using open source implementations available on GitHub
- check hyperparameters, most importantly batch size, rollout length and number of updates
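As a reference point for both checks, here is a minimal NumPy sketch of the clipped surrogate loss from the PPO paper, together with the arithmetic that ties batch size, rollout length and number of updates together. The function names (`ppo_clip_loss`, `updates_per_rollout`) are hypothetical and not taken from this repository. The defaults are my reading of the Atari settings in the PPO paper (horizon 128, 8 actors, minibatch size 256, 3 epochs, clip range 0.1) and should be verified against the paper before relying on them.

```python
import numpy as np


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.1):
    """Clipped surrogate policy loss, negated so that it can be minimized.

    Hypothetical helper: inputs are 1-D arrays with one entry per sampled action.
    """
    # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
    ratio = np.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # The paper maximizes the element-wise minimum; return its negation as a loss.
    return -np.mean(np.minimum(unclipped, clipped))


def updates_per_rollout(rollout_length=128, num_envs=8, minibatch_size=256, epochs=3):
    """Number of gradient updates performed on one batch of rollouts.

    Defaults are assumed from the PPO paper's Atari settings, not from this repo.
    """
    batch_size = rollout_length * num_envs
    if batch_size % minibatch_size != 0:
        raise ValueError("batch size must be divisible by minibatch size")
    return epochs * (batch_size // minibatch_size)
```

If the implementation's loss differs from `ppo_clip_loss` on identical inputs (a flipped sign, or clipping applied after multiplying by the advantage), that is a likely source of instability. Likewise, an update count far above what `updates_per_rollout` returns means the policy is stepped much more often per batch than in the paper's setup.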
Edited by Daniel Lukats
- Daniel Lukats marked this issue as related to #71 (closed)
- Daniel Lukats marked this issue as related to #73 (closed)
- Daniel Lukats added In Progress label
- Daniel Lukats marked this issue as related to #78 (closed)
- Daniel Lukats marked this issue as related to #79 (closed)
- Daniel Lukats marked this issue as related to #80 (closed)
- Daniel Lukats marked this issue as related to #49 (closed)
- Daniel Lukats marked this issue as related to #82 (closed)
- Daniel Lukats marked this issue as related to #98 (closed)
- Author Owner
Closed with a1fa87e6
- Daniel Lukats closed
- Daniel Lukats removed In Progress label