Adam Gleave

Cited by

	All	Since 2019
Citations	3688	3590
h-index	15	14
i10-index	15	15

Citations per year
Year	Citations
2017	26
2018	55
2019	142
2020	340
2021	602
2022	867
2023	1265
2024	368

Public access (based on funding mandates)

2 articles available
0 articles not available

Co-authors

Antonin Raffin	DLR	Verified email at dlr.de
Ashley WD Hill	Research Engineer	Verified email at ensta-paristech.fr
Anssi Kanervisto	Researcher, FAIR Meta	Verified email at meta.com
Stuart Russell	Professor of Computer Science, University of California, Berkeley	Verified email at cs.berkeley.edu
Sergey Levine	UC Berkeley, Physical Intelligence	Verified email at eecs.berkeley.edu
Ionel Gog	Google	Verified email at google.com
Steven Hand	Cambridge	Verified email at cl.cam.ac.uk
Malte Schwarzkopf	Brown University	Verified email at cs.brown.edu
Robert N. M. Watson	Professor, Department of Computer Science and Technology, University of Cambridge	Verified email at cl.cam.ac.uk
Dylan Hadfield-Menell	Massachusetts Institute of Technology	Verified email at csail.mit.edu
Rohin Shah	Research Scientist, Google DeepMind	Verified email at deepmind.com
Sören Mindermann	University of Oxford, OATML	Verified email at cs.ox.ac.uk

Adam Gleave

CEO at FAR AI

Verified email at far.ai

Machine Learning, Deep RL


Title	Authors	Venue	Cited by	Year
Stable-baselines3: Reliable reinforcement learning implementations	A Raffin, A Hill, A Gleave, A Kanervisto, M Ernestus, N Dormann	Journal of Machine Learning Research 22 (268), 1-8, 2021	1720	2021
Stable baselines	A Hill, A Raffin, M Ernestus, A Gleave, A Kanervisto, R Traore, P Dhariwal, ...		879	2018
Adversarial policies: Attacking deep reinforcement learning	A Gleave, M Dennis, C Wild, N Kant, S Levine, S Russell	International Conference on Learning Representations, 2020	390	2020
Firmament: Fast, centralized cluster scheduling at scale	I Gog, M Schwarzkopf, A Gleave, RNM Watson, S Hand	12th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2016	276	2016
Inverse reinforcement learning for video games	A Tucker, A Gleave, S Russell	Deep Reinforcement Learning Workshop at NeurIPS, 2018	54	2018
Quantifying differences in reward functions	A Gleave, M Dennis, S Legg, S Russell, J Leike	International Conference on Learning Representations, 2021	52	2021
imitation: Clean imitation learning implementations	A Gleave, M Taufeeque, J Rocamonde, E Jenner, SH Wang, S Toyer, ...	arXiv preprint arXiv:2211.11972, 2022	51*	2022
Multi-task maximum entropy inverse reinforcement learning	A Gleave, O Habryka	GoalsRL Workshop at ICML, 2018	44	2018
Adversarial Policies Beat Superhuman Go AIs	TT Wang, A Gleave, T Tseng, N Belrose, J Miller, MD Dennis, Y Duan, ...	arXiv preprint arXiv:2211.00241, 2022	38*	2022
Active inverse reward design	S Mindermann, R Shah, A Gleave, D Hadfield-Menell	GoalsRL Workshop at ICML, 2018	28	2018
Understanding learned reward functions	EJ Michaud, A Gleave, S Russell	Deep Reinforcement Learning Workshop at NeurIPS, 2020	26	2020
Invariance in policy optimisation and partial identifiability in reward learning	JMV Skalse, M Farrugia-Roberts, S Russell, A Abate, A Gleave	International Conference on Machine Learning, 32033-32058, 2023	24	2023
Uncertainty estimation for language reward models	A Gleave, G Irving	arXiv preprint arXiv:2203.07472, 2022	21	2022
A primer on maximum causal entropy inverse reinforcement learning	A Gleave, S Toyer	arXiv preprint arXiv:2203.11409, 2022	18	2022
Making compression algorithms for Unicode text	A Gleave, C Steinruecken	Data Compression Conference, 2017	16	2017
On the fragility of learned reward functions	L McKinney, Y Duan, D Krueger, A Gleave	arXiv preprint arXiv:2301.03652, 2023	9	2023
Exploiting novel GPT-4 APIs	K Pelrine, M Taufeeque, M Zając, E McLean, A Gleave	arXiv preprint arXiv:2312.14302, 2023	8	2023
Preprocessing reward functions for interpretability	E Jenner, A Gleave	arXiv preprint arXiv:2203.13553, 2022	8	2022
DERAIL: Diagnostic Environments for Reward And Imitation Learning	P Freire, A Gleave, S Toyer, S Russell	Deep Reinforcement Learning Workshop at NeurIPS, 2020	8	2020
Reducing exploitability with population based training	P Czempin, A Gleave	arXiv preprint arXiv:2208.05083, 2022	5	2022
