Common Architecture
Activity
- Daniel Lukats assigned to @dl337788
- Daniel Lukats added Feature label
- Daniel Lukats marked this issue as related to #11 (closed)
- Author Owner
About half of the architecture outlined in "Playing Atari with Deep Reinforcement Learning" (Mnih et al., 2013) matches the architecture in https://github.com/openai/large-scale-curiosity/blob/master/utils.py#L133
- Author Owner
More implementations including sources in: https://github.com/openai/baselines/blob/master/baselines/ppo1/cnn_policy.py#L22
- Daniel Lukats added In Progress label
- Daniel Lukats marked this issue as related to #7 (closed)
- Author Owner
Logit distribution in https://github.com/openai/baselines/blob/master/baselines/common/distributions.py. Why no softmax?
- Author Owner
Some implementations sample actions with the so-called "Gumbel-max trick", which works directly on the logits.
- Author Owner
This might be equivalent to using the Categorical distribution from PyTorch, as shown in its Reinforcement Learning example.
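To illustrate the thread above: the Gumbel-max trick adds independent Gumbel noise to each logit and takes the argmax, which draws from the same distribution as sampling from the softmax of the logits, so no explicit softmax is needed. The following is a minimal standard-library sketch, not the code from baselines or PyTorch; the function names `softmax` and `gumbel_max_sample` and the example logits are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def softmax(logits):
    """Reference probabilities, computed only for comparison."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_max_sample(logits, rng):
    """Draw one action index via argmax(logit + Gumbel noise)."""
    best_index, best_value = 0, -math.inf
    for i, logit in enumerate(logits):
        # rng.random() lies in [0, 1); clamp away from 0 so log() is defined.
        u = max(rng.random(), 1e-12)
        # -log(-log(u)) is a standard Gumbel(0, 1) sample.
        value = logit - math.log(-math.log(u))
        if value > best_value:
            best_index, best_value = i, value
    return best_index


def empirical_frequencies(logits, n, seed=0):
    """Estimate the action distribution from n Gumbel-max draws."""
    rng = random.Random(seed)
    counts = Counter(gumbel_max_sample(logits, rng) for _ in range(n))
    return [counts[i] / n for i in range(len(logits))]


if __name__ == "__main__":
    logits = [1.0, 2.0, 0.5]  # hypothetical policy outputs
    print("softmax    :", [round(p, 3) for p in softmax(logits)])
    print("gumbel-max :", [round(f, 3) for f in empirical_frequencies(logits, 20000)])
```

With enough draws the two printed rows should agree closely, which is consistent with the guess that sampling through a categorical distribution built from the same logits gives the same behaviour.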
- Author Owner
A larger CNN stack is described in "Human-level control through deep reinforcement learning" (Mnih et al., 2015)
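For comparison, here is a small sketch of the shape arithmetic for both stacks. It assumes the layer sizes reported in the two papers (2013: 16 filters of 8x8 with stride 4, then 32 of 4x4 with stride 2, followed by 256 hidden units; 2015: 32 of 8x8 with stride 4, 64 of 4x4 with stride 2, 64 of 3x3 with stride 1, followed by 512 hidden units), an 84x84 input, and no padding. The helper names are hypothetical and not taken from the linked repositories.

```python
def conv_output_size(size, kernel, stride):
    """Spatial size after an unpadded ("valid") convolution."""
    return (size - kernel) // stride + 1


def flattened_features(input_size, layers):
    """Number of features fed into the first fully connected layer.

    layers is a list of (out_channels, kernel, stride) tuples.
    """
    size = input_size
    channels = None
    for out_channels, kernel, stride in layers:
        size = conv_output_size(size, kernel, stride)
        channels = out_channels
    return size * size * channels


# (out_channels, kernel, stride) per convolutional layer
DQN_2013 = [(16, 8, 4), (32, 4, 2)]
DQN_2015 = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]

if __name__ == "__main__":
    print(flattened_features(84, DQN_2013))  # 9 * 9 * 32 = 2592
    print(flattened_features(84, DQN_2015))  # 7 * 7 * 64 = 3136
```

The flattened size matters when the fully connected layer is declared explicitly, as in PyTorch, where the input feature count has to match the output of the convolutional stack.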
- Daniel Lukats added In Review label and removed In Progress label
- Daniel Lukats marked this issue as related to #20 (closed)
- Daniel Lukats mentioned in issue #20 (closed)
- Author Owner
Tests in 3dbac4a4
- Daniel Lukats closed
- Daniel Lukats reopened
- Daniel Lukats closed
- Daniel Lukats removed In Review label
- Daniel Lukats changed milestone to %Official dates
- Daniel Lukats marked this issue as related to #54 (closed)