PPO Batch Size
Activity
- Daniel Lukats added Bug label
- Daniel Lukats changed milestone to %Official dates
- Daniel Lukats assigned to @dl337788
- Daniel Lukats added In Progress label
- Daniel Lukats (Author, Owner) commented:
  - Horizon T
  - Number of parallel rollouts N
  - Minibatch size M <= T * N
  - Train for K epochs
  - Use torch samplers (see the sketch after this comment)
Edited by Daniel Lukats
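A minimal sketch of such a sampler-based minibatch loop, assuming a flattened rollout buffer of T * N transitions; the concrete values of T, N, M, K and the tensor names and shapes are illustrative placeholders, not taken from the repository.

```python
# Minimal sketch: draw minibatches of size M from a flattened rollout buffer
# of T * N transitions, repeating for K epochs. All values are placeholders.
import torch
from torch.utils.data import BatchSampler, SubsetRandomSampler

T, N, M, K = 128, 8, 256, 4            # horizon, parallel rollouts, minibatch size, epochs
assert M <= T * N

observations = torch.randn(T * N, 4)   # dummy rollout data for illustration
returns = torch.randn(T * N)

for epoch in range(K):
    # Fresh random permutation of all T * N indices each epoch.
    sampler = BatchSampler(SubsetRandomSampler(range(T * N)),
                           batch_size=M, drop_last=True)
    for indices in sampler:
        obs_batch = observations[indices]
        ret_batch = returns[indices]
        # ... compute the PPO loss on this minibatch and take an optimizer step
```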
- Daniel Lukats (Author, Owner) replied: We simply use a minibatch size of 32 per worker/environment, i.e. 32 * N in total. With 4 epochs and T = 128, 100% of the samples end up in minibatches. Some papers suggest using 3 epochs, in which case 25% of the samples would be missed in training.
Edited by Daniel Lukats
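As a sanity check on those percentages, a small calculation, assuming one minibatch of 32 samples per worker is drawn without replacement from the T = 128 steps in each epoch (this per-epoch draw is an assumption read from the comment above, not confirmed code):

```python
# Assumes one minibatch of 32 samples per worker per epoch, drawn without
# replacement from the T steps collected by that worker.
T = 128           # horizon per worker
minibatch = 32    # samples per worker per epoch

for K in (4, 3):  # number of training epochs
    coverage = min(K * minibatch, T) / T
    print(f"K={K}: {coverage:.0%} of samples used, {1 - coverage:.0%} missed")
# K=4: 100% of samples used, 0% missed
# K=3: 75% of samples used, 25% missed
```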
- Daniel Lukats closed via commit 21090f54
- Daniel Lukats removed In Progress label