A General Approach to Multi-Armed Bandits Under Risk Criteria A Cassel, S Mannor, A Zeevi Proceedings of the 31st Conference On Learning Theory 75, 1295-1306, 2018 | 86 | 2018 |
Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently A Cassel, A Cohen, T Koren Proceedings of the 37th International Conference on Machine Learning 119 …, 2020 | 73 | 2020 |
Bandit linear control A Cassel, T Koren Advances in Neural Information Processing Systems 33, 8872-8882, 2020 | 15 | 2020 |
Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with $\sqrt{T}$ Regret AB Cassel, T Koren International Conference on Machine Learning, 1304-1313, 2021 | 11 | 2021 |
Efficient online linear control with stochastic convex costs and unknown dynamics AB Cassel, A Cohen, T Koren Conference on Learning Theory, 3589-3604, 2022 | 5 | 2022 |
Rate-optimal online convex optimization in adaptive linear control AB Cassel, A Peled-Cohen, T Koren Advances in Neural Information Processing Systems 35, 7410-7422, 2022 | 4 | 2022 |
A General Framework for Bandit Problems Beyond Cumulative Objectives A Cassel, S Mannor, A Zeevi arXiv preprint arXiv:1806.01380, 2018 | 3 | 2018 |
A General Framework for Bandit Problems Beyond Cumulative Objectives A Cassel, S Mannor, A Zeevi Mathematics of Operations Research 48 (4), 2196-2232, 2023 | 2 | 2023 |
Eluder-based Regret for Stochastic Contextual MDPs O Levy, A Cassel, A Cohen, Y Mansour arXiv preprint arXiv:2211.14932, 2022 | 2 | 2022 |
The Pendulum Arrangement: Maximizing the Escape Time of Heterogeneous Random Walks A Cassel, S Mannor, G Tennenholtz arXiv preprint arXiv:2007.13232, 2020 | 1 | 2020 |
Efficient rate optimal regret for adversarial contextual MDPs using online function approximation O Levy, A Cohen, A Cassel, Y Mansour International Conference on Machine Learning, 19287-19314, 2023 | | 2023 |
Counterfactual Optimism: Rate Optimal Regret for Stochastic Contextual MDPs O Levy, AB Cassel, A Cohen, Y Mansour CoRR, 2022 | | 2022 |