When waiting is not an option: Learning options with a deliberation cost | J Harb, PL Bacon, M Klissarov, D Precup | AAAI 2018, 2017 | 67 | 2017 |
Learning options end-to-end for continuous action tasks | M Klissarov, PL Bacon, J Harb, D Precup | arXiv preprint arXiv:1712.00004, 2017 | 20 | 2017 |
Options of Interest: Temporal Abstraction with Interest Functions | K Khetarpal, M Klissarov, M Chevalier-Boisvert, PL Bacon, D Precup | AAAI 2020, 2020 | 7 | 2020 |
Variational state encoding as intrinsic motivation in reinforcement learning | M Klissarov, R Islam, K Khetarpal, D Precup | Task-Agnostic Reinforcement Learning Workshop at Proceedings of the …, 2019 | 1 | 2019 |
Diffusion-Based Approximate Value Functions | M Klissarov, D Precup | | 1 | 2018 |
Reward Propagation Using Graph Convolutional Networks | M Klissarov, D Precup | NeurIPS 2020, 2020 | | 2020 |