Hybrid reward architecture for reinforcement learning. H van Seijen, M Fatemi, J Romoff, R Laroche, T Barnes, J Tsang. Advances in Neural Information Processing Systems 30, 2017. Cited by 226.
A theoretical and empirical analysis of Expected Sarsa. H van Seijen, H van Hasselt, S Whiteson, M Wiering. 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement …, 2009. Cited by 216.
Reducing network agnostophobia. AR Dhamija, M Günther, T Boult. Advances in Neural Information Processing Systems 31, 2018. Cited by 209.
True online TD(λ). H van Seijen, RS Sutton. International Conference on Machine Learning, 692-700, 2014. Cited by 113.
True online temporal-difference learning. H van Seijen, AR Mahmood, PM Pilarski, MC Machado, RS Sutton. Journal of Machine Learning Research 17 (1), 5057-5096, 2016. Cited by 102.
A deeper look at planning as learning from replay. H van Seijen, RS Sutton. International Conference on Machine Learning, 2015. Cited by 62.
Systematic generalisation with group invariant predictions. F Ahmed, Y Bengio, H van Seijen, A Courville. International Conference on Learning Representations, 2021. Cited by 55.
Planning by prioritized sweeping with small backups. H van Seijen, RS Sutton. International Conference on Machine Learning, 361-369, 2013. Cited by 51.
Exploiting best-match equations for efficient reinforcement learning. H van Seijen, S Whiteson, H van Hasselt, M Wiering. Journal of Machine Learning Research 12 (6), 2011. Cited by 26.
Using a logarithmic mapping to enable lower discount factors in reinforcement learning. H van Seijen, M Fatemi, A Tavakoli. Advances in Neural Information Processing Systems 32, 2019. Cited by 25.
Multi-advisor reinforcement learning. R Laroche, M Fatemi, J Romoff, H van Seijen. arXiv preprint arXiv:1704.00756, 2017. Cited by 20.
On value function representation of long horizon problems. L Lehnert, R Laroche, H van Seijen. Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018. Cited by 18.
Effective multi-step temporal-difference learning for non-linear function approximation. H van Seijen. arXiv preprint arXiv:1608.05151, 2016. Cited by 18.
Efficient abstraction selection in reinforcement learning. H van Seijen, S Whiteson, L Kester. Computational Intelligence 30 (4), 657-699, 2014. Cited by 16.
Learning invariances for policy generalization. R Tachet, P Bachman, H van Seijen. arXiv preprint arXiv:1809.02591, 2018. Cited by 13.
Separation of concerns in reinforcement learning. H van Seijen, M Fatemi, J Romoff, R Laroche. arXiv preprint arXiv:1612.05159, 2016. Cited by 12.
Modular lifelong reinforcement learning via neural composition. JA Mendez, H van Seijen, E Eaton. arXiv preprint arXiv:2207.00429, 2022. Cited by 10.
Dead-ends and secure exploration in reinforcement learning. M Fatemi, S Sharma, H van Seijen, SE Kahou. International Conference on Machine Learning, 1873-1881, 2019. Cited by 10.
Switching between representations in reinforcement learning. H van Seijen, S Whiteson, L Kester. Interactive Collaborative Information Systems, 65-84, 2010. Cited by 10.
Forward actor-critic for nonlinear function approximation in reinforcement learning. V Veeriah, H van Seijen, RS Sutton. Proceedings of the 16th Conference on Autonomous Agents and MultiAgent …, 2017. Cited by 9.