A. Rupam Mahmood
Title | Cited by | Year
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
RS Sutton, AR Mahmood, M White
Journal of Machine Learning Research 17, 2016
Cited by: 74 | Year: 2016
Weighted importance sampling for off-policy learning with linear function approximation
AR Mahmood, H van Hasselt, RS Sutton
Advances in Neural Information Processing Systems 27, 2014
Cited by: 51 | Year: 2014
True Online Temporal-Difference Learning
H van Seijen, AR Mahmood, PM Pilarski, MC Machado, RS Sutton
Journal of Machine Learning Research 17, 2016
Cited by: 45 | Year: 2016
Tuning-free step-size adaptation
AR Mahmood, RS Sutton, T Degris, PM Pilarski
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International …, 2012
Cited by: 34 | Year: 2012
A new Q(λ) with interim forward view and Monte Carlo equivalence
RS Sutton, AR Mahmood, D Precup, H van Hasselt
Cited by: 28 | Year: 2014
Off-policy TD(λ) with a true online equivalence
H van Hasselt, AR Mahmood, RS Sutton
Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence …, 2014
Cited by: 25 | Year: 2014
Multi-step Off-policy Learning Without Importance Sampling Ratios
AR Mahmood, H Yu, RS Sutton
arXiv preprint arXiv:1702.03006, 2017
Cited by: 21 | Year: 2017
Off-policy learning based on weighted importance sampling with linear computational complexity
AR Mahmood, RS Sutton
Proceedings of the 31st Conference on Uncertainty in Artificial Intelligence …, 2015
Cited by: 21 | Year: 2015
Emphatic temporal-difference learning
AR Mahmood, H Yu, M White, RS Sutton
arXiv preprint arXiv:1507.01569, 2015
Cited by: 19 | Year: 2015
Representation Search through Generate and Test
AR Mahmood, RS Sutton
Workshops at the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013
Cited by: 14 | Year: 2013
Benchmarking Reinforcement Learning Algorithms on Real-World Robots
AR Mahmood, D Korenkevych, G Vasan, W Ma, J Bergstra
arXiv preprint arXiv:1809.07731, 2018
Cited by: 9 | Year: 2018
Setting up a Reinforcement Learning Task with a Real-World Robot
AR Mahmood, D Korenkevych, BJ Komer, J Bergstra
arXiv preprint arXiv:1803.07067, 2018
Cited by: 9 | Year: 2018
Incremental Off-policy Reinforcement Learning Algorithms
A Mahmood
University of Alberta, 2017
Cited by: 6 | Year: 2017
Structure Learning of Causal Bayesian Networks: A Survey
A Mahmood
Department of Computing Science, University of Alberta, Edmonton, Canada …, 2011
Cited by: 6 | Year: 2011
Automatic step-size adaptation in incremental supervised learning
A Mahmood
University of Alberta, 2010
Cited by: 6 | Year: 2010
On Generalized Bellman Equations and Temporal-Difference Learning
H Yu, AR Mahmood, RS Sutton
Canadian Conference on Artificial Intelligence, 3-14, 2017
Cited by: 5 | Year: 2017
Autoregressive Policies for Continuous Control Deep Reinforcement Learning
D Korenkevych, AR Mahmood, G Vasan, J Bergstra
arXiv preprint arXiv:1903.11524, 2019
Cited by: 1 | Year: 2019
An Empirical Evaluation of True Online TD(λ)
H van Seijen, AR Mahmood, PM Pilarski, RS Sutton
arXiv preprint arXiv:1507.00353, 2015
Cited by: 1 | Year: 2015