Prediction Error and Actor-Critic Hypotheses in the Brain.- Reviewing on-policy / off-policy critic learning in the context of Temporal Differences and Residual Learning.- Reward Function Design in Reinforcement Learning.- Exploration Methods in Sparse Reward Environments.- A Survey on Constraining Policy Updates Using the KL Divergence.- Fisher Information Approximations in Policy Gradient Methods.- Benchmarking the Natural gradient...