Gemini: A family of highly capable multimodal models G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023 | 2192 | 2023 |
Gemma: Open Models Based on Gemini Research and Technology G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ... arXiv preprint arXiv:2403.08295, 2024 | 724 | 2024 |
Acme: A research framework for distributed reinforcement learning MW Hoffman, B Shahriari, J Aslanides, G Barth-Maron, N Momchev, ... arXiv preprint arXiv:2006.00979, 2020 | 266 | 2020 |
Gemma 2: Improving Open Language Models at a Practical Size G Team, M Riviere, S Pathak, PG Sessa, C Hardin, S Bhupatiraju, ... arXiv preprint arXiv:2408.00118, 2024 | 193 | 2024 |
Primal Wasserstein Imitation Learning R Dadashi, L Hussenot, M Geist, O Pietquin International Conference on Learning Representations (ICLR), 2021 | 147 | 2021 |
A Geometric Perspective on Optimal Representations for Reinforcement Learning MG Bellemare, W Dabney, R Dadashi, A Ali Taïga, PS Castro, N Le Roux, ... Neural Information Processing Systems (NeurIPS), 2019 | 108 | 2019 |
Statistics and Samples in Distributional Reinforcement Learning M Rowland, R Dadashi, S Kumar, R Munos, MG Bellemare, W Dabney International Conference on Machine Learning (ICML), 2019 | 101 | 2019 |
What Matters for Adversarial Imitation Learning? M Orsini, A Raichuk, L Hussenot, D Vincent, R Dadashi, S Girgin, M Geist, ... Neural Information Processing Systems (NeurIPS), 2021 | 79 | 2021 |
The Value-Improvement Path: Towards Better Representations for Reinforcement Learning W Dabney, A Barreto, M Rowland, R Dadashi, J Quan, MG Bellemare, ... AAAI Conference on Artificial Intelligence, 2021 | 74 | 2021 |
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback P Roit, J Ferret, L Shani, R Aharoni, G Cideron, R Dadashi, M Geist, ... Annual Meeting of the Association for Computational Linguistics (ACL), 2023 | 70 | 2023 |
Offline Reinforcement Learning as Anti-Exploration S Rezaeifar*, R Dadashi*, N Vieillard, L Hussenot, O Bachem, O Pietquin, ... AAAI Conference on Artificial Intelligence, 2022 | 57 | 2022 |
WARM: On the Benefits of Weight Averaged Reward Models A Ramé, N Vieillard, L Hussenot, R Dadashi, G Cideron, O Bachem, ... arXiv preprint arXiv:2401.12187, 2024 | 47 | 2024 |
The Value Function Polytope in Reinforcement Learning R Dadashi, A Ali Taïga, N Le Roux, D Schuurmans, MG Bellemare International Conference on Machine Learning (ICML), 2019 | 47 | 2019 |
Offline Reinforcement Learning with Pseudometric Learning R Dadashi, S Rezaeifar, N Vieillard, L Hussenot, O Pietquin, M Geist International Conference on Machine Learning (ICML), 2021 | 41 | 2021 |
Continuous Control with Action Quantization from Demonstrations R Dadashi*, L Hussenot*, D Vincent, S Girgin, A Raichuk, M Geist, ... International Conference on Machine Learning (ICML), 2022 | 34 | 2022 |
Hyperparameter Selection for Imitation Learning L Hussenot, M Andrychowicz, D Vincent, R Dadashi, A Raichuk, ... International Conference on Machine Learning (ICML), 2021 | 22 | 2021 |
BOND: Aligning LLMs with Best-of-N Distillation PG Sessa, R Dadashi, L Hussenot, J Ferret, N Vieillard, A Ramé, ... arXiv preprint arXiv:2407.14622, 2024 | 14 | 2024 |
Show me the Way: Intrinsic Motivation from Demonstrations L Hussenot, R Dadashi, M Geist, O Pietquin International Conference on Autonomous Agents and Multiagent Systems (AAMAS …, 2020 | 12 | 2020 |
Learning Energy Networks with Generalized Fenchel-Young Losses M Blondel, F Llinares-López, R Dadashi, L Hussenot, M Geist Neural Information Processing Systems (NeurIPS), 2022 | 7 | 2022 |
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models A Botev, S De, SL Smith, A Fernando, GC Muraru, R Haroun, L Berrada, ... arXiv preprint arXiv:2404.07839, 2024 | 6 | 2024 |