Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
A Kumar, J Fu, G Tucker, S Levine
NeurIPS 2019, arXiv:1906.00949, 2019
Offline reinforcement learning: Tutorial, review, and perspectives on open problems
S Levine, A Kumar, G Tucker, J Fu
arXiv preprint arXiv:2005.01643, 2020
Graph Normalizing Flows
J Liu, A Kumar, J Ba, J Kiros, K Swersky
NeurIPS 2019, arXiv:1905.13177, 2019
Trainable calibration measures for neural networks from kernel mean embeddings
A Kumar, S Sarawagi, U Jain
International Conference on Machine Learning, 2805-2814, 2018
Diagnosing Bottlenecks in Deep Q-learning Algorithms
J Fu, A Kumar, M Soh, S Levine
International Conference on Machine Learning (ICML) 2019, https://arxiv.org …
D4RL: Datasets for deep data-driven reinforcement learning
J Fu, A Kumar, O Nachum, G Tucker, S Levine
arXiv preprint arXiv:2004.07219, 2020
Advantage-weighted regression: Simple and scalable off-policy reinforcement learning
XB Peng, A Kumar, G Zhang, S Levine
arXiv preprint arXiv:1910.00177, 2019
Calibration of Encoder Decoder Models for Neural Machine Translation
A Kumar, S Sarawagi
https://arxiv.org/abs/1903.00802, 2019
Conservative Q-learning for offline reinforcement learning
A Kumar, A Zhou, G Tucker, S Levine
arXiv preprint arXiv:2006.04779, 2020
DisCor: Corrective feedback in reinforcement learning via distribution correction
A Kumar, A Gupta, S Levine
arXiv preprint arXiv:2003.07305, 2020
Model inversion networks for model-based optimization
A Kumar, S Levine
arXiv preprint arXiv:1912.13464, 2019
Reward-conditioned policies
A Kumar, XB Peng, S Levine
arXiv preprint arXiv:1912.13465, 2019
COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning
A Singh, A Yu, J Yang, J Zhang, A Kumar, S Levine
arXiv preprint arXiv:2010.14500, 2020
Conservative Safety Critics for Exploration
H Bharadhwaj, A Kumar, N Rhinehart, S Levine, F Shkurti, A Garg
arXiv preprint arXiv:2010.14497, 2020
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning
A Ajay, A Kumar, P Agrawal, S Levine, O Nachum
arXiv preprint arXiv:2010.13611, 2020
The reach-avoid problem for constant-rate multi-mode systems
SN Krishna, A Kumar, F Somenzi, B Touri, A Trivedi
International Symposium on Automated Technology for Verification and …, 2017
Challenges and Tool Implementation of Hybrid Rapidly-Exploring Random Trees
S Bak, S Bogomolov, TA Henzinger, A Kumar
International Workshop on Numerical Software Verification, 83-89, 2017
One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL
S Kumar, A Kumar, S Levine, C Finn
arXiv preprint arXiv:2010.14484, 2020
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning
A Kumar, R Agarwal, D Ghosh, S Levine
arXiv preprint arXiv:2010.14498, 2020
