Deep Ganguli
Anthropic
Verified email at cns.nyu.edu
Title · Cited by · Year
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...
arXiv preprint arXiv:2206.04615, 2022
Cited by 739 · 2022
Training a helpful and harmless assistant with reinforcement learning from human feedback
Y Bai, A Jones, K Ndousse, A Askell, A Chen, N DasSarma, D Drain, ...
arXiv preprint arXiv:2204.05862, 2022
Cited by 683 · 2022
The AI index 2021 annual report
D Zhang, S Mishra, E Brynjolfsson, J Etchemendy, D Ganguli, B Grosz, ...
arXiv preprint arXiv:2103.06312, 2021
Cited by 590* · 2021
Constitutional AI: Harmlessness from AI feedback
Y Bai, S Kadavath, S Kundu, A Askell, J Kernion, A Jones, A Chen, ...
arXiv preprint arXiv:2212.08073, 2022
Cited by 584 · 2022
Druid: A real-time analytical data store
F Yang, E Tschetter, X Léauté, N Ray, G Merlino, D Ganguli
Proceedings of the 2014 ACM SIGMOD international conference on Management of …, 2014
Cited by 238 · 2014
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
A Tamkin, M Brundage, J Clark, D Ganguli
arXiv preprint arXiv:2102.02503, 2021
Cited by 235* · 2021
Efficient sensory encoding and Bayesian inference with heterogeneous neural populations
D Ganguli, EP Simoncelli
Neural computation 26 (10), 2103-2134, 2014
Cited by 228 · 2014
Language models (mostly) know what they know
S Kadavath, T Conerly, A Askell, T Henighan, D Drain, E Perez, ...
arXiv preprint arXiv:2207.05221, 2022
Cited by 223 · 2022
Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned
D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai, S Kadavath, B Mann, ...
arXiv preprint arXiv:2209.07858, 2022
Cited by 214 · 2022
A general language assistant as a laboratory for alignment
A Askell, Y Bai, A Chen, D Drain, D Ganguli, T Henighan, A Jones, ...
arXiv preprint arXiv:2112.00861, 2021
Cited by 214 · 2021
In-context learning and induction heads
C Olsson, N Elhage, N Nanda, N Joseph, N DasSarma, T Henighan, ...
arXiv preprint arXiv:2209.11895, 2022
Cited by 189 · 2022
Predictability and surprise in large generative models
D Ganguli, D Hernandez, L Lovitt, A Askell, Y Bai, A Chen, T Conerly, ...
Proceedings of the 2022 ACM Conference on Fairness, Accountability, and …, 2022
Cited by 172 · 2022
A mathematical framework for transformer circuits
N Elhage, N Nanda, C Olsson, T Henighan, N Joseph, B Mann, A Askell, ...
Transformer Circuits Thread 1, 1, 2021
Cited by 167* · 2021
Discovering language model behaviors with model-written evaluations
E Perez, S Ringer, K Lukošiūtė, K Nguyen, E Chen, S Heiner, C Pettit, ...
arXiv preprint arXiv:2212.09251, 2022
Cited by 125 · 2022
Implicit encoding of prior probabilities in optimal neural populations
D Ganguli, E Simoncelli
Advances in neural information processing systems 23, 2010
Cited by 106 · 2010
The capacity for moral self-correction in large language models
D Ganguli, A Askell, N Schiefer, TI Liao, K Lukošiūtė, A Chen, A Goldie, ...
arXiv preprint arXiv:2302.07459, 2023
Cited by 93 · 2023
Towards measuring the representation of subjective global opinions in language models
E Durmus, K Nguyen, TI Liao, N Schiefer, A Askell, A Bakhtin, C Chen, ...
arXiv preprint arXiv:2306.16388, 2023
Cited by 62 · 2023
Neural and perceptual signatures of efficient sensory coding
D Ganguli, EP Simoncelli
arXiv preprint arXiv:1603.00058, 2016
Cited by 26 · 2016
Starfish: Open source image based transcriptomics and proteomics tools
S Axelrod, AJ Carr, J Freeman, D Ganguli, B Long, T Tung
J. Open Source Softw 6, 2440, 2018
Cited by 23* · 2018
Evaluating and mitigating discrimination in language model decisions
A Tamkin, A Askell, L Lovitt, E Durmus, N Joseph, S Kravec, K Nguyen, ...
arXiv preprint arXiv:2312.03689, 2023
Cited by 11 · 2023
Articles 1–20