QinBo Bai
Title
Cited by
Year
Deep learning-based channel estimation algorithm over time selective fading channels
Q Bai, J Wang, Y Zhang, J Song
IEEE Transactions on Cognitive Communications and Networking 6 (1), 125-134, 2019
Cited by: 149; Year: 2019
Achieving zero constraint violation for constrained reinforcement learning via primal-dual approach
Q Bai, AS Bedi, M Agarwal, A Koppel, V Aggarwal
Proceedings of the AAAI Conference on Artificial Intelligence 36 (4), 3682-3689, 2022
Cited by: 71; Year: 2022
Reinforcement learning for constrained Markov decision processes
A Gattami, Q Bai, V Aggarwal
International Conference on Artificial Intelligence and Statistics, 2656-2664, 2021
Cited by: 31; Year: 2021
Achieving zero constraint violation for constrained reinforcement learning via conservative natural policy gradient primal-dual algorithm
Q Bai, AS Bedi, V Aggarwal
Proceedings of the AAAI Conference on Artificial Intelligence 37 (6), 6737-6744, 2023
Cited by: 20; Year: 2023
Provably efficient model-free algorithm for MDPs with peak constraints
Q Bai, V Aggarwal, A Gattami
arXiv preprint arXiv:2003.05555, 2020
Cited by: 18*; Year: 2020
Regret guarantees for model-based reinforcement learning with long-term average constraints
M Agarwal, Q Bai, V Aggarwal
Uncertainty in Artificial Intelligence, 22-31, 2022
Cited by: 17; Year: 2022
A reinforcement learning framework for vehicular network routing under peak and average constraints
N Geng, Q Bai, C Liu, T Lan, V Aggarwal, Y Yang, M Xu
IEEE Transactions on Vehicular Technology 72 (5), 6753-6764, 2023
Cited by: 14; Year: 2023
Concave utility reinforcement learning with zero-constraint violations
M Agarwal, Q Bai, V Aggarwal
arXiv preprint arXiv:2109.05439, 2021
Cited by: 14; Year: 2021
Reinforcement learning for multi-objective and constrained Markov decision processes
A Gattami, Q Bai, V Aggarwal
arXiv preprint arXiv:1901.08978, 2019
Cited by: 14; Year: 2019
Regret analysis of policy gradient algorithm for infinite horizon average reward Markov decision processes
Q Bai, WU Mondal, V Aggarwal
Proceedings of the AAAI Conference on Artificial Intelligence 38 (10), 10980 …, 2024
Cited by: 12; Year: 2024
Joint optimization of multi-objective reinforcement learning with policy gradient based algorithm
Q Bai, M Agarwal, V Aggarwal
arXiv preprint arXiv:2105.14125, 2021
Cited by: 10; Year: 2021
Escaping saddle points for zeroth-order non-convex optimization using estimated gradient descent
Q Bai, M Agarwal, V Aggarwal
2020 54th Annual Conference on Information Sciences and Systems (CISS), 1-6, 2020
Cited by: 8; Year: 2020
Achieving zero constraint violation for concave utility constrained reinforcement learning via primal-dual approach
Q Bai, AS Bedi, M Agarwal, A Koppel, V Aggarwal
Journal of Artificial Intelligence Research 78, 975-1016, 2023
Cited by: 7; Year: 2023
Markov decision processes with long-term average constraints
M Agarwal, Q Bai, V Aggarwal
arXiv preprint arXiv:2106.06680, 2021
Cited by: 7; Year: 2021
Provably sample-efficient model-free algorithm for MDPs with peak constraints
Q Bai, V Aggarwal, A Gattami
Journal of Machine Learning Research 24 (60), 1-25, 2023
Cited by: 5; Year: 2023
Joint optimization of concave scalarized multi-objective reinforcement learning with policy gradient based algorithm
Q Bai, M Agarwal, V Aggarwal
Journal of Artificial Intelligence Research 74, 1565-1597, 2022
Cited by: 4; Year: 2022
Achieving zero constraint violation for constrained reinforcement learning via conservative natural policy gradient primal-dual algorithm
Q Bai, AS Bedi, V Aggarwal
arXiv preprint arXiv:2206.05850, 2022
Cited by: 3; Year: 2022
Model-free algorithm and regret analysis for MDPs with long-term constraints
Q Bai, V Aggarwal, A Gattami
arXiv preprint arXiv:2006.05961, 2020
Cited by: 1; Year: 2020
Constrained Reinforcement Learning with Average Reward Objective: Model-Based and Model-Free Algorithms
V Aggarwal, WU Mondal, Q Bai
arXiv preprint arXiv:2406.11481, 2024
Year: 2024
Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm
Q Bai, WU Mondal, V Aggarwal
arXiv preprint arXiv:2402.02042, 2024
Year: 2024