On the global convergence rates of softmax policy gradient methods J Mei, C Xiao, C Szepesvari, D Schuurmans International Conference on Machine Learning, 6820-6829, 2020 | 298 | 2020 |
Locality preserving hashing K Zhao, H Lu, J Mei Proceedings of the AAAI Conference on Artificial Intelligence 28 (1), 2014 | 68 | 2014 |
Leveraging non-uniformity in first-order non-convex optimization J Mei, Y Gao, B Dai, C Szepesvari, D Schuurmans International Conference on Machine Learning, 7555-7564, 2021 | 60 | 2021 |
Escaping the Gravitational Pull of Softmax J Mei, C Xiao, B Dai, L Li, C Szepesvári, D Schuurmans Advances in Neural Information Processing Systems 33, 2020 | 57 | 2020 |
Maximum entropy Monte-Carlo planning C Xiao, R Huang, J Mei, D Schuurmans, M Müller Advances in Neural Information Processing Systems, 9520-9528, 2019 | 41 | 2019 |
On the optimality of batch policy optimization algorithms C Xiao, Y Wu, J Mei, B Dai, T Lattimore, L Li, C Szepesvari, ... International Conference on Machine Learning, 11362-11371, 2021 | 32 | 2021 |
On principled entropy exploration in policy optimization J Mei, C Xiao, R Huang, D Schuurmans, M Müller Proceedings of the 28th International Joint Conference on Artificial …, 2019 | 29 | 2019 |
Memory-Augmented Monte Carlo Tree Search C Xiao, J Mei, M Müller AAAI, 1455-1462, 2018 | 26 | 2018 |
On the global convergence rates of decentralized softmax gradient play in Markov potential games R Zhang, J Mei, B Dai, D Schuurmans, N Li Advances in Neural Information Processing Systems 35, 1923-1935, 2022 | 23 | 2022 |
Understanding the effect of stochasticity in policy optimization J Mei, B Dai, C Xiao, C Szepesvari, D Schuurmans Advances in Neural Information Processing Systems 34, 19339-19351, 2021 | 23 | 2021 |
The Role of Baselines in Policy Gradient Optimization J Mei, W Chung, V Thomas, B Dai, C Szepesvari, D Schuurmans Advances in Neural Information Processing Systems 35, 17818-17830, 2022 | 17 | 2022 |
Understanding and mitigating the limitations of prioritized experience replay Y Pan, J Mei, A Farahmand, M White, H Yao, M Rohani, J Luo Uncertainty in Artificial Intelligence, 1561-1571, 2022 | 17 | 2022 |
Identifying and Tracking Sentiments and Topics from Social Media Texts during Natural Disasters M Yang, J Mei, H Ji, W Zhao, Z Zhao, X Chen Proceedings of the 2017 Conference on Empirical Methods in Natural Language …, 2017 | 17 | 2017 |
Frequency-based Search-control in Dyna Y Pan, J Mei, A Farahmand arXiv preprint arXiv:2002.05822, 2020 | 16 | 2020 |
Understanding and Leveraging Overparameterization in Recursive Value Estimation C Xiao, B Dai, J Mei, OA Ramirez, R Gummadi, C Harris, D Schuurmans International Conference on Learning Representations, 2021 | 15 | 2021 |
Discovering author interest evolution in topic modeling M Yang, J Mei, F Xu, W Tu, Z Lu Proceedings of the 39th International ACM SIGIR conference on Research and …, 2016 | 15 | 2016 |
On unconstrained quasi-submodular function optimization J Mei, K Zhao, BL Lu Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015 | 8 | 2015 |
KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal T Kozuno, W Yang, N Vieillard, T Kitamura, Y Tang, J Mei, P Ménard, ... arXiv preprint arXiv:2205.14211, 2022 | 7 | 2022 |
On the Effect of Log-Barrier Regularization in Decentralized Softmax Gradient Play in Multiagent Systems R Zhang, J Mei, B Dai, D Schuurmans, N Li arXiv preprint arXiv:2202.00872, 2022 | 7 | 2022 |
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF S Cen, J Mei, K Goshvadi, H Dai, T Yang, S Yang, D Schuurmans, Y Chi, ... arXiv preprint arXiv:2405.19320, 2024 | 5 | 2024 |