A. Rupam Mahmood
Title
Cited by
Year
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
RS Sutton, AR Mahmood, M White
Journal of Machine Learning Research 17, 2016
Cited by 106 (2016)
Weighted importance sampling for off-policy learning with linear function approximation
AR Mahmood, H van Hasselt, RS Sutton
Advances in Neural Information Processing Systems 27, 2014
Cited by 69 (2014)
True Online Temporal-Difference Learning
H van Seijen, AR Mahmood, PM Pilarski, MC Machado, RS Sutton
Journal of Machine Learning Research 17, 2016
Cited by 59 (2016)
Tuning-free step-size adaptation
AR Mahmood, RS Sutton, T Degris, PM Pilarski
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International …, 2012
Cited by 40 (2012)
Off-policy TD(λ) with a true online equivalence
H van Hasselt, AR Mahmood, RS Sutton
Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence …, 2014
Cited by 33 (2014)
A new Q(λ) with interim forward view and Monte Carlo equivalence
RS Sutton, AR Mahmood, D Precup, H van Hasselt
Cited by 32 (2014)
Benchmarking Reinforcement Learning Algorithms on Real-World Robots
AR Mahmood, D Korenkevych, G Vasan, W Ma, J Bergstra
arXiv preprint arXiv:1809.07731, 2018
Cited by 28 (2018)
Multi-step Off-policy Learning Without Importance Sampling Ratios
AR Mahmood, H Yu, RS Sutton
arXiv preprint arXiv:1702.03006, 2017
Cited by 24 (2017)
Off-policy learning based on weighted importance sampling with linear computational complexity
AR Mahmood, RS Sutton
Proceedings of the 31st Conference on Uncertainty in Artificial Intelligence …, 2015
Cited by 21 (2015)
Emphatic temporal-difference learning
AR Mahmood, H Yu, M White, RS Sutton
arXiv preprint arXiv:1507.01569, 2015
Cited by 20 (2015)
Representation Search through Generate and Test
AR Mahmood, RS Sutton
Workshops at the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013
Cited by 17 (2013)
Setting up a Reinforcement Learning Task with a Real-World Robot
AR Mahmood, D Korenkevych, BJ Komer, J Bergstra
arXiv preprint arXiv:1803.07067, 2018
Cited by 10 (2018)
On Generalized Bellman Equations and Temporal-Difference Learning
H Yu, AR Mahmood, RS Sutton
Canadian Conference on Artificial Intelligence, 3-14, 2017
Cited by 10 (2017)
Incremental Off-policy Reinforcement Learning Algorithms
A Mahmood
University of Alberta, 2017
Cited by 7 (2017)
Structure Learning of Causal Bayesian Networks: A Survey
A Mahmood
Department of Computing Science, University of Alberta, Edmonton, Canada …, 2011
Cited by 7 (2011)
Automatic step-size adaptation in incremental supervised learning
A Mahmood
University of Alberta, 2010
Cited by 7 (2010)
Autoregressive Policies for Continuous Control Deep Reinforcement Learning
D Korenkevych, AR Mahmood, G Vasan, J Bergstra
arXiv preprint arXiv:1903.11524, 2019
Cited by 4 (2019)
An Empirical Evaluation of True Online TD(λ)
H van Seijen, AR Mahmood, PM Pilarski, RS Sutton
arXiv preprint arXiv:1507.00353, 2015
Cited by 1 (2015)
Real-time real-world reinforcement learning systems and methods
AR Mahmood, BJ Komer, D Korenkevych
US Patent App. 16/560,761, 2020
2020