Devansh Arpit

Cited by

	All	Since 2019
Citations	4747	4409
h-index	21	21
i10-index	26	22

Citations per year

Year	Citations
2016	21
2017	66
2018	216
2019	359
2020	539
2021	698
2022	1023
2023	1286
2024	502

Public access

Based on funding mandates: 4 articles available, 0 articles not available

Co-authors

Yoshua Bengio	Professor of computer science, University of Montreal, Mila, IVADO, CIFAR	Verified email at umontreal.ca
Stanisław Jastrzębski	Chief Technology Officer & Chief Scientist @ Molecule.One	Verified email at molecule.one
Aaron Courville	Professor, DIRO, Université de Montréal, Mila, Cifar CAI chair	Verified email at umontreal.ca
Venu Govindaraju	SUNY Distinguished Professor, State University of New York, Buffalo	Verified email at buffalo.edu
Yingbo Zhou	Senior Research Director, Salesforce Research	Verified email at salesforce.com
Hung Q. Ngo	RelationalAI	Verified email at relational.ai
Chen Xing (星辰)	Salesforce Research	Verified email at salesforce.com
Ifeoma Nwogu	Computer Science and Engineering, University at Buffalo, SUNY	Verified email at buffalo.edu
Anoop M Namboodiri	Professor, IIIT Hyderabad	Verified email at iiit.ac.in
Yun Raymond Fu	NEU, COE Distinguished Professor; MAE, FNAI, FAAAS, FIEEE, FSPIE, FOSA, FIAPR	Verified email at neu.edu
Shuang Wu	Amazon.com	Verified email at amazon.com
Nils Napp	Electrical and Computer Engineering, Cornell University	Verified email at cornell.edu

Devansh Arpit

Rashi.ai

Verified email at rashi.ai

Deep Learning, NLP


Title	Authors	Venue	Cited by	Year
A closer look at memorization in deep networks	D Arpit, S Jastrzębski, N Ballas, D Krueger, E Bengio, MS Kanwal, ...	ICML 2017 (arXiv preprint arXiv:1706.05394), 2017	1820	2017
On the spectral bias of deep neural networks	N Rahaman, D Arpit, A Baratin, F Draxler, M Lin, FA Hamprecht, Y Bengio, ...	ICML 2019 (arXiv preprint arXiv:1806.08734), 2018	1114*	2018
Three factors influencing minima in SGD	S Jastrzębski, Z Kenton, D Arpit, N Ballas, A Fischer, Y Bengio, A Storkey	ICANN 2018 (arXiv preprint arXiv:1711.04623), 2017	488	2017
The Break-Even Point on Optimization Trajectories of Deep Neural Networks	S Jastrzebski, M Szymczak, S Fort, D Arpit, J Tabor, K Cho, K Geras	ICLR 2020 (arXiv preprint arXiv:2002.09572), 2020	144	2020
Normalization propagation: A parametric technique for removing internal covariate shift in deep networks	D Arpit, Y Zhou, BU Kota, V Govindaraju	ICML 2016 (arXiv preprint arXiv:1603.01431), 2016	141	2016
Residual connections encourage iterative inference	S Jastrzebski, D Arpit, N Ballas, V Verma, T Che, Y Bengio	ICLR 2018 (arXiv preprint arXiv:1710.04773), 2017	130	2017
A walk with SGD	C Xing, D Arpit, C Tsirigotis, Y Bengio	arXiv preprint arXiv:1802.08770, 2018	110	2018
Ensemble of averages: Improving model selection and boosting performance in domain generalization	D Arpit, H Wang, Y Zhou, C Xiong	NeurIPS 2022, 2021	93	2021
Why regularized auto-encoders learn sparse representation?	D Arpit, Y Zhou, H Ngo, V Govindaraju	ICML 2016 (arXiv preprint arXiv:1505.05561), 2015	91	2015
Deep Nets Don't Learn via Memorization	D Krueger, N Ballas, S Jastrzebski, D Arpit, MS Kanwal, T Maharaj, ...	ICLR 2017 Workshop, 2017	65	2017
Fraternal Dropout	K Zolna, D Arpit, D Suhubdy, Y Bengio	ICLR 2018 (arXiv preprint arXiv:1711.00066), 2017	60	2017
How to Initialize your Network? Robust Initialization for WeightNorm & ResNets	D Arpit, V Campos, Y Bengio	NeurIPS 2019, 2019	52	2019
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization	S Jastrzebski, D Arpit, O Astrand, G Kerg, H Wang, C Xiong, R Socher, ...	ICML 2021, 2020	47	2020
h-detach: Modifying the LSTM Gradient Towards Better Optimization	D Arpit, B Kanuparthi, G Kerg, NR Ke, I Mitliagkas, Y Bengio	ICLR 2019 (arXiv preprint arXiv:1810.03023), 2018	44	2018
Variational Bi-LSTMs	S Shabanian, D Arpit, A Trischler, Y Bengio	arXiv preprint arXiv:1711.05717, 2017	41	2017
Is joint training better for deep auto-encoders?	Y Zhou, D Arpit, I Nwogu, V Govindaraju	arXiv preprint arXiv:1405.1380, 2014	40	2014
BOLAA: Benchmarking and orchestrating LLM-augmented autonomous agents	Z Liu, W Yao, J Zhang, L Xue, S Heinecke, R Murthy, Y Feng, Z Chen, ...	arXiv preprint arXiv:2308.05960, 2023	35	2023
Finding Flatter Minima with SGD	S Jastrzębski, Z Kenton, D Arpit, N Ballas, A Fischer, Y Bengio, A Storkey	ICLR 2018 Workshop, 2018	34	2018
The benefits of over-parameterization at initialization in deep ReLU networks	D Arpit, Y Bengio	arXiv preprint arXiv:1901.03611, 2019	32	2019
Retroformer: Retrospective large language agents with policy gradient optimization	W Yao, S Heinecke, JC Niebles, Z Liu, Y Feng, L Xue, R Murthy, Z Chen, ...	arXiv preprint arXiv:2308.02151, 2023	26	2023

Articles 1–20
