Yuntao Bai

Cited by

	All	Since 2019
Citations	4126	4009
h-index	18	18
i10-index	19	19

Citations per year
Year	Citations
2016	16
2017	36
2018	50
2019	80
2020	110
2021	118
2022	277
2023	2131
2024	1276

Public access (based on funding mandates): 5 articles available, 0 articles not available


Anthropic, PBC

Verified email at anthropic.com

Machine Learning, Physics


Title	Authors	Venue	Cited by	Year
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models	A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...	arXiv preprint arXiv:2206.04615, 2022	737	2022
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback	Y Bai, A Jones, K Ndousse, A Askell, A Chen, N DasSarma, D Drain, ...	arXiv preprint arXiv:2204.05862, 2022	681	2022
Constitutional AI: Harmlessness from AI Feedback	Y Bai, S Kadavath, S Kundu, A Askell, J Kernion, A Jones, A Chen, ...	arXiv preprint arXiv:2212.08073, 2022	583	2022
Scattering forms and the positive geometry of kinematics, color and the worldsheet	N Arkani-Hamed, Y Bai, S He, G Yan	Journal of High Energy Physics 2018 (5), 1-78, 2018	278	2018
Language models (mostly) know what they know	S Kadavath, T Conerly, A Askell, T Henighan, D Drain, E Perez, ...	arXiv preprint arXiv:2207.05221, 2022	223	2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned	D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai, S Kadavath, B Mann, ...	arXiv preprint arXiv:2209.07858, 2022	214	2022
A General Language Assistant as a Laboratory for Alignment	A Askell, Y Bai, A Chen, D Drain, D Ganguli, T Henighan, A Jones, ...	arXiv preprint arXiv:2112.00861, 2021	214	2021
Positive geometries and canonical forms	N Arkani-Hamed, Y Bai, T Lam	Journal of High Energy Physics 2017 (11), 39, 2017	210	2017
In-context learning and induction heads	C Olsson, N Elhage, N Nanda, N Joseph, N DasSarma, T Henighan, ...	arXiv preprint arXiv:2209.11895, 2022	189	2022
Predictability and surprise in large generative models	D Ganguli, D Hernandez, L Lovitt, N DasSarma, T Henighan, A Jones, ...	arXiv preprint arXiv:2202.07785, 2022	172	2022
A mathematical framework for transformer circuits	N Elhage, N Nanda, C Olsson, T Henighan, N Joseph, B Mann, A Askell, ...	Transformer Circuits Thread, 2021	149	2021
Discovering Language Model Behaviors with Model-Written Evaluations	E Perez, S Ringer, K Lukošiūtė, K Nguyen, E Chen, S Heiner, C Pettit, ...	arXiv preprint arXiv:2212.09251, 2022	125	2022
The capacity for moral self-correction in large language models	D Ganguli, A Askell, N Schiefer, T Liao, K Lukošiūtė, A Chen, A Goldie, ...	arXiv preprint arXiv:2302.07459, 2023	92	2023
The amplituhedron from momentum twistor diagrams	Y Bai, S He	Journal of High Energy Physics 2015 (2), 65, 2015	69	2015
Gravitational-wave physics with Cosmic Explorer: limits to low-frequency sensitivity	ED Hall, K Kuns, JR Smith, Y Bai, C Wipf, S Biscans, RX Adhikari, K Arai, ...	Physical Review D 103 (12), 122004, 2021	62	2021
The amplituhedron and the one-loop Grassmannian measure	Y Bai, S He, T Lam	Journal of High Energy Physics 2016 (1), 112, 2016	49	2016
Measuring Progress on Scalable Oversight for Large Language Models	SR Bowman, J Hyun, E Perez, E Chen, C Pettit, S Heiner, K Lukošiūtė, ...	arXiv preprint arXiv:2211.03540, 2022	41	2022
A mathematical framework for transformer circuits	N Elhage, N Nanda, C Olsson, T Henighan, N Joseph, B Mann, A Askell, ...	Transformer Circuits Thread, 2021	24
Phase-sensitive optomechanical amplifier for quantum noise reduction in laser interferometers	Y Bai, G Venugopalan, K Kuns, C Wipf, A Markowitz, AR Wade, Y Chen, ...	Physical Review A 102 (2), 023507, 2020	13	2020
Positive Geometry of the S-Matrix	Y Bai	Princeton, NJ: Princeton University, 2018	1	2018


