First Order Constrained Optimization in Policy Space Y Zhang, Q Vuong, K Ross Advances in Neural Information Processing Systems 33, 2020 | 121 | 2020 |
On-Policy Deep Reinforcement Learning for the Average-Reward Criterion Y Zhang, KW Ross International Conference on Machine Learning, 12535-12545, 2021 | 32 | 2021 |
Supervised policy update for deep reinforcement learning Q Vuong, Y Zhang, KW Ross International Conference on Learning Representations (ICLR), 2019 | 26 | 2019 |
Efficient entropy for policy gradient with multidimensional action space Y Zhang, QH Vuong, K Song, XY Gong, KW Ross International Conference on Learning Representations (ICLR) Workshop, 2018 | 11 | 2018 |
Aggressive Q-learning with ensembles: Achieving both high sample efficiency and high asymptotic performance Y Wu, X Chen, C Wang, Y Zhang, KW Ross arXiv preprint arXiv:2111.09159, 2021 | 7 | 2021 |
CW-ERM: Improving Autonomous Driving Planning with Closed-loop Weighted Empirical Risk Minimization E Kumar, Y Zhang, S Pini, S Stent, A Ferreira, S Zagoruyko, CS Perone arXiv preprint arXiv:2210.02174, 2022 | 1 | 2022 |
An adaptive sequential Monte Carlo approach to neural network training Y Zhang, B Hu 2015 IEEE International Conference on Industrial Technology (ICIT), 1619-1623, 2015 | 1 | 2015 |
On Policy Deep Reinforcement Learning—The Discounted and Average Reward Criteria Y Zhang New York University, 2022 | | 2022 |