Philip Thomas
Title
Cited by
Year
Value function approximation in reinforcement learning using the Fourier basis
G Konidaris, S Osentoski, P Thomas
Proceedings of the AAAI Conference on Artificial Intelligence 25 (1), 2011
Cited by: 293; Year: 2011
Data-efficient off-policy policy evaluation for reinforcement learning
P Thomas, E Brunskill
International Conference on Machine Learning, 2139-2148, 2016
Cited by: 235; Year: 2016
High-confidence off-policy evaluation
P Thomas, G Theocharous, M Ghavamzadeh
Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015
Cited by: 157; Year: 2015
High confidence policy improvement
P Thomas, G Theocharous, M Ghavamzadeh
International Conference on Machine Learning, 2380-2388, 2015
Cited by: 115; Year: 2015
Increasing the action gap: New operators for reinforcement learning
MG Bellemare, G Ostrovski, A Guez, P Thomas, R Munos
Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016
Cited by: 95; Year: 2016
Ad recommendation systems for life-time value optimization
G Theocharous, PS Thomas, M Ghavamzadeh
Proceedings of the 24th International Conference on World Wide Web, 1305-1310, 2015
Cited by: 92; Year: 2015
Bias in natural actor-critic algorithms
P Thomas
International Conference on Machine Learning, 441-448, 2014
Cited by: 90; Year: 2014
Safe reinforcement learning
PS Thomas
University of Massachusetts Libraries, 2015
Cited by: 52; Year: 2015
Learning action representations for reinforcement learning
Y Chandak, G Theocharous, J Kostas, S Jordan, P Thomas
International Conference on Machine Learning, 941-950, 2019
Cited by: 49; Year: 2019
Proximal reinforcement learning: A new theory of sequential decision making in primal-dual spaces
S Mahadevan, B Liu, P Thomas, W Dabney, S Giguere, N Jacek, I Gemp, ...
arXiv preprint arXiv:1405.6757, 2014
Cited by: 37; Year: 2014
Preventing undesirable behavior of intelligent machines
P Thomas, B Castro da Silva, A Barto, S Giguere, Y Brun, E Brunskill
Science 366 (6468), 999-1004, 2019
Cited by: 36; Year: 2019
Application of the actor-critic architecture to functional electrical stimulation control of a human arm
P Thomas, M Branicky, A van den Bogert, K Jagodnik
Proceedings of the... Innovative Applications of Artificial Intelligence …, 2009
Cited by: 36; Year: 2009
Using options and covariance testing for long horizon off-policy policy evaluation
ZD Guo, PS Thomas, E Brunskill
arXiv preprint arXiv:1703.03453, 2017
Cited by: 28; Year: 2017
Training an actor-critic reinforcement learning controller for arm movement using human-generated rewards
KM Jagodnik, PS Thomas, AJ van den Bogert, MS Branicky, RF Kirsch
IEEE Transactions on Neural Systems and Rehabilitation Engineering 25 (10 …, 2017
Cited by: 25; Year: 2017
Conjugate Markov Decision Processes
P Thomas, A Barto
International Conference on Machine Learning, 137-144, 2011
Cited by: 25; Year: 2011
Projected Natural Actor-Critic.
PS Thomas, W Dabney, S Giguere, S Mahadevan
NIPS, 2337-2345, 2013
Cited by: 24; Year: 2013
TDγ: Re-evaluating complex backups in temporal difference learning
G Konidaris, S Niekum, PS Thomas
Advances in Neural Information Processing Systems 24, 2402-2410, 2011
Cited by: 24; Year: 2011
Importance Sampling for Fair Policy Selection.
S Doroudi, PS Thomas, E Brunskill
Grantee Submission, 2017
Cited by: 23; Year: 2017
Some recent applications of reinforcement learning
AG Barto, PS Thomas, RS Sutton
Proceedings of the 18th Yale Workshop on Adaptive and Learning Systems, 2017
Cited by: 22; Year: 2017
Predictive Off-Policy Policy Evaluation for Nonstationary Decision Problems, with Applications to Digital Marketing.
PS Thomas, G Theocharous, M Ghavamzadeh, I Durugkar, E Brunskill
AAAI, 4740-4745, 2017
Cited by: 20; Year: 2017