Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return. C Sherstan, DR Ashley, B Bennett, K Young, A White, M White, RS Sutton UAI, 63-72, 2018 | 19 | 2018 |
Directly estimating the variance of the λ-return using temporal-difference methods C Sherstan, B Bennett, K Young, DR Ashley, A White, M White, RS Sutton arXiv preprint arXiv:1801.08287, 2018 | 15 | 2018 |
Predicting Periodicity with Temporal Difference Learning K De Asis, B Bennett, RS Sutton arXiv preprint arXiv:1809.07435, 2018 | 2 | 2018 |
Incrementally Learning Functions of the Return B Bennett, W Chung, M Zaheer, V Liu arXiv preprint arXiv:1907.04651, 2019 | 1 | 2019 |
Back to Square One: Superhuman Performance in Chutes and Ladders Through Deep Neural Networks and Tree Search D Ashley, A Kanervisto, B Bennett arXiv preprint arXiv:2104.00698, 2021 | | 2021 |
Estimating Variance of Returns using Temporal Difference Methods B Bennett | | 2021 |
Nexting and State Discovery in Robot Microworlds J Modayil, A White, AR Mahmood, B Bennett, DCP Prauchner, RS Sutton RLDM 2013, 73, 2013 | | 2013 |