PyTorch: An imperative style, high-performance deep learning library A Paszke, S Gross, F Massa, A Lerer, J Bradbury, G Chanan, T Killeen, ... Advances in Neural Information Processing Systems, 8024-8035, 2019 | 32643 | 2019 |
JAX: composable transformations of Python+NumPy programs J Bradbury, R Frostig, P Hawkins, MJ Johnson, C Leary, D Maclaurin, ... https://github.com/google/jax, 18, 2018 | 1583 | 2018 |
Pointer Sentinel Mixture Models S Merity, C Xiong, J Bradbury, R Socher ICLR 2017, 2016 | 1579 | 2016 |
PaLM: Scaling Language Modeling with Pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... arXiv preprint arXiv:2204.02311, 2022 | 1488 | 2022 |
Ask me anything: Dynamic memory networks for natural language processing A Kumar, O Irsoy, P Ondruska, M Iyyer, J Bradbury, I Gulrajani, V Zhong, ... ICML 2016, 2016 | 1462 | 2016 |
Learned in Translation: Contextualized Word Vectors B McCann, J Bradbury, C Xiong, R Socher NIPS 2017, 2017 | 1134 | 2017 |
Non-Autoregressive Neural Machine Translation J Gu, J Bradbury, C Xiong, VOK Li, R Socher ICLR 2018, 2018 | 673 | 2018 |
Quasi-Recurrent Neural Networks J Bradbury, S Merity, C Xiong, R Socher ICLR 2017, 2016 | 548 | 2016 |
Scaling language models: Methods, analysis & insights from training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 458 | 2021 |
OpenSpiel: A framework for reinforcement learning in games M Lanctot, E Lockhart, JB Lespiau, V Zambaldi, S Upadhyay, J Pérolat, ... arXiv preprint arXiv:1908.09453, 2019 | 183 | 2019 |
PaLI: A Jointly-Scaled Multilingual Language-Image Model X Chen, X Wang, S Changpinyo, AJ Piergiovanni, P Padlewski, D Salz, ... arXiv preprint arXiv:2209.06794, 2022 | 162 | 2022 |
PaLM 2 Technical Report R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... arXiv preprint arXiv:2305.10403, 2023 | 112 | 2023 |
Scaling Up Models and Data with t5x and seqio A Roberts, HW Chung, A Levskaya, G Mishra, J Bradbury, D Andor, ... arXiv preprint arXiv:2203.17189, 2022 | 64 | 2022 |
Efficiently Scaling Transformer Inference R Pope, S Douglas, A Chowdhery, J Devlin, J Bradbury, A Levskaya, ... MLSys 2023, 2022 | 38 | 2022 |
On Machine Learning and Programming Languages M Innes, S Karpinski, V Shah, D Barber, P Stenetorp, T Besard, ... SysML 2018, 2018 | 19 | 2018 |
A Flexible Approach to Automated RNN Architecture Generation M Schrimpf, S Merity, J Bradbury, R Socher ICLR Workshop 2018, 2018 | 19 | 2018 |
MetaMind Neural Machine Translation System for WMT 2016 J Bradbury, R Socher WMT 2016, 2016 | 18 | 2016 |
Towards Neural Machine Translation with Latent Tree Attention J Bradbury, R Socher SPNLP 2017, 2017 | 16 | 2017 |
Block-diagonal Hessian-free Optimization for Training Neural Networks H Zhang, C Xiong, J Bradbury, R Socher arXiv preprint arXiv:1712.07296, 2017 | 15 | 2017 |
Exploring the limits of Concurrency in ML Training on Google TPUs S Kumar, Y Wang, C Young, J Bradbury, N Kumar, D Chen, A Swing MLSys 2021, 2021 | 14 | 2021 |