Milestone (expired)
Official dates: Dec 11, 2019 – Jun 8, 2020
Unstarted Issues (open and unassigned): 0
Ongoing Issues (open and assigned): 1
Completed Issues (closed): 103
- Fix return calculation
- Effect of scaling input to [0, 1]
- Fix post processing
- Test Kostrikov's implementation
- Effect of setting model in training mode
- Effect of Kostrikov's third convolutional layer with 32 instead of 64 channels
- Parameters are not using max gradient norm
- Change batch size from percentages to absolute numbers
- Logging COSY screened scripts' output
- Tensorboard does not like large event files
- Effect of orthogonal initialization
- Effect of epsilon annealing on policy loss
- Bash script for mass-starting on COSY machines
- Tensorboard file naming scheme does not allow proper evaluation
- Policy performance evaluation
- Game Selection
- Update docker setup for tensorboard logging
- Readme Update
- Setup COSY-Lab
- Verify annealing of ɛ and α
- Effect of entropy bonus
- Tensorboard integration
- Use baselines environment wrappers
- Use grad norms to evaluate stability
- Use smaller value loss coefficient (c_1 = 0.5)
- Use smaller clip range (ɛ = 0.1)
- Use KL divergence to evaluate stability
- ReLU vs tanh activation
- Returns with gamma and lambda vs returns without lambda
- Return calculation doesn't bootstrap from last value
- Minimum vs maximum in value function loss calculation
- Double check loss calculation
- The great bug hunt of 2020
- Batch size once again
- Parallelization with shared memory
- Curiosity
- Logging mean_stats to console with no terminated episodes
- Fake done from EpisodicLifeEnv triggers attempt at logging episode data
- Mocking and deleting in Logger test_save and test_save_not_mocked do not work
- Effect of evaluation over the last 100 episodes vs last 100 time steps with terminating episodes
- Effect of reward clipping vs reward binning
- Font choice
- Environment parallelization with MPI or subprocessing
- Flatten multiple values per time step for batch forward pass
- Reset on done
- Goal Review #2
- Masking terminal states
- Effect of advantage normalization
- Flatten rollout time steps for batch determination
- Rollout generation with horizon time steps
- Multiple epochs without retain_graph=True
- Advantages should be normalized
- Faulty Probability Ratio Calculation
- PPO Batch Size
- EpisodicLifeEnv not resetting properly on loss of final life
- Logging Losses across multiple episodes
- CUDA 10.1 on Tesla VM
- Docker Image
- Effect of reward scaling
- Write Thesis
- Thesis Structure
- Reward Scaling Breaks CUDA
- Global Gradient Clipping
- Observation Normalization and Clipping
- Orthogonal Initialization
- Adam Annealing
- Reward Scaling
- Exploding Value Function
- PPO Optimizations
- PPO and Rollout integration tests
- Goal review #3
- PPO scaling epsilon
- Mismatch in number of states and number of actions in Rollout
- Performance Review
- Rollout/Storage Class
- Return calculation is backwards
- Refactor Policy Tests
- Evaluation
- Logging
- Entropy Bonus
- Postprocessing Implementation
- Include feedback from meeting 1 in README
- Value Function Loss
- Agent Parallelization
- Negative action head output breaks categorical initialization
- Experiment Setup
- PPO CLIP
- PPO KLPEN
- Inverse Dynamics Features
- Variational Autoencoder
- Random Features
- Gym State Channel Order
- REINFORCE Atari Test
- Background Lectures
- Goal Specification
- Common Architecture
- Shared Value Function + Policy Parameters
- Value Function Implementation
- Feature Extraction
- Grayscale Conversion
- Preprocessing Implementation
- GAE Implementation
- PPO Implementation
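
The items above name most of the moving parts of the PPO implementation: PPO CLIP, GAE Implementation, return bootstrapping, advantage normalization, the value loss coefficient c_1 = 0.5, and the clip range ɛ = 0.1. For reference, a minimal PyTorch sketch of those two pieces follows; it is not the project's code, and all function and variable names are illustrative assumptions.

```python
import torch


def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over a rollout of horizon time steps."""
    advantages = torch.zeros_like(rewards)
    gae = 0.0
    next_value = last_value  # bootstrap from the value of the last observed state
    for t in reversed(range(rewards.shape[0])):
        # Mask out the next value at terminal states (cf. "Masking terminal states").
        mask = 1.0 - dones[t]
        delta = rewards[t] + gamma * next_value * mask - values[t]
        gae = delta + gamma * lam * mask * gae
        advantages[t] = gae
        next_value = values[t]
    returns = advantages + values
    return advantages, returns


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, values, returns,
                  clip_range=0.1, value_coef=0.5, entropy=None, entropy_coef=0.01):
    """Clipped surrogate objective (PPO CLIP), illustrative only."""
    # Normalize advantages over the batch (cf. "Advantages should be normalized").
    advantages = (advantages - advantages.mean()) / (advantages.std() + 1e-8)

    # Probability ratio r_t = pi_new(a|s) / pi_old(a|s), computed in log space
    # (cf. "Faulty Probability Ratio Calculation").
    ratio = torch.exp(new_log_probs - old_log_probs)

    # Take the minimum of the unclipped and clipped surrogate terms
    # (cf. "Minimum vs maximum in value function loss calculation").
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_range, 1.0 + clip_range) * advantages
    policy_loss = -torch.min(unclipped, clipped).mean()

    # Value function loss, weighted by c_1 = 0.5.
    value_loss = value_coef * (returns - values).pow(2).mean()

    # Optional entropy bonus to encourage exploration (cf. "Entropy Bonus").
    entropy_bonus = entropy_coef * entropy.mean() if entropy is not None else 0.0

    return policy_loss + value_loss - entropy_bonus
```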