Title | Authors | Venue | Cited by | Year |
Attention is all you need | A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ... | Advances in neural information processing systems, 5998-6008, 2017 | 5172 | 2017 |
One model to learn them all | L Kaiser, AN Gomez, N Shazeer, A Vaswani, N Parmar, L Jones, ... | arXiv preprint arXiv:1706.05137, 2017 | 151 | 2017 |
Tensor2Tensor for neural machine translation | A Vaswani, S Bengio, E Brevdo, F Chollet, AN Gomez, S Gouws, L Jones, ... | arXiv preprint arXiv:1803.07416, 2018 | 141 | 2018 |
The reversible residual network: Backpropagation without storing activations | AN Gomez, M Ren, R Urtasun, RB Grosse | Advances in neural information processing systems, 2214-2224, 2017 | 101 | 2017 |
Depthwise Separable Convolutions for Neural Machine Translation | L Kaiser, AN Gomez, F Chollet | International Conference on Learning Representations, 2018 | 75 | 2018 |
Unsupervised cipher cracking using discrete GANs | AN Gomez, S Huang, I Zhang, BM Li, M Osama, L Kaiser | arXiv preprint arXiv:1801.04883, 2018 | 21 | 2018 |
Learning Sparse Networks Using Targeted Dropout | AN Gomez, I Zhang, S Rao Kamalakara, D Madaan, K Swersky, Y Gal, ... | arXiv preprint arXiv:1905.13678, 2019 | 14* | 2019 |
The Difficulty of Training Sparse Neural Networks | U Evci, F Pedregosa, A Gomez, E Elsen | arXiv preprint arXiv:1906.10732, 2019 | 1 | 2019 |
Attention-based sequence transduction neural networks | NM Shazeer, AN Gomez, LM Kaiser, JD Uszkoreit, LO Jones, NJ Parmar, ... | US Patent 10,452,978, 2019 | | 2019 |
Benchmarking Bayesian Deep Learning with Diabetic Retinopathy Diagnosis | A Filos, S Farquhar, AN Gomez, TGJ Rudner, Z Kenton, L Smith, ... | | | 2019 |
RL: Generic reinforcement learning codebase in TensorFlow | BM Li, A Cowen-Rivers, P Kozakowski, D Tao, SR Kamalakara, ... | Journal of Open Source Software 4 (42), 1524, 2019 | | 2019 |