Zeyuan Allen-Zhu
Meta AI / FAIR Labs
Verified email at csail.mit.edu
Title
Cited by
Year
LoRA: Low-rank adaptation of large language models
EJ Hu, Y Shen, P Wallis, Z Allen-Zhu, Y Li, S Wang, L Wang, W Chen
ICLR 2022: International Conference on Learning Representations, 2022
Cited by: 9732*; Year: 2022
A convergence theory for deep learning via over-parameterization
Z Allen-Zhu, Y Li, Z Song
ICML 2019: International Conference on Machine Learning, 2019
Cited by: 1677; Year: 2019
Is Q-learning Provably Efficient?
C Jin, Z Allen-Zhu, S Bubeck, MI Jordan
NIPS 2018: Neural Information Processing Systems, 2018
Cited by: 999; Year: 2018
Learning and generalization in overparameterized neural networks, going beyond two layers
Z Allen-Zhu, Y Li, Y Liang
NeurIPS 2019: Neural Information Processing Systems, 2019
Cited by: 898; Year: 2019
Katyusha: the first direct acceleration of stochastic gradient methods
Z Allen-Zhu
STOC 2017: Symposium on Theory of Computing, 19-23, 2017
Cited by: 711; Year: 2017
Variance reduction for faster non-convex optimization
Z Allen-Zhu, E Hazan
ICML 2016: International Conference on Machine Learning, 699-707, 2016
Cited by: 448; Year: 2016
Towards understanding ensemble, knowledge distillation and self-distillation in deep learning
Z Allen-Zhu, Y Li
ICLR 2023: International Conference on Learning Representations, 2023
Cited by: 444; Year: 2023
Linear coupling: An ultimate unification of gradient and mirror descent
Z Allen-Zhu, L Orecchia
ITCS 2017: Innovations in Theoretical Computer Science, 2017
Cited by: 393; Year: 2017
Finding approximate local minima faster than gradient descent
N Agarwal, Z Allen-Zhu, B Bullins, E Hazan, T Ma
STOC 2017: Symposium on Theory of Computing, 1195-1199, 2017
Cited by: 361*; Year: 2017
Byzantine Stochastic Gradient Descent
D Alistarh, Z Allen-Zhu, J Li
NIPS 2018: Neural Information Processing Systems, 2018
Cited by: 348; Year: 2018
A simple, combinatorial algorithm for solving SDD systems in nearly-linear time
JA Kelner, L Orecchia, A Sidford, ZA Zhu
STOC 2013: Symposium on Theory of Computing, 911-920, 2013
Cited by: 297; Year: 2013
Natasha 2: Faster Non-Convex Optimization Than SGD
Z Allen-Zhu
NIPS 2018: Neural Information Processing Systems, 2018
Cited by: 267; Year: 2018
Improved SVRG for non-strongly-convex or sum-of-non-convex objectives
Z Allen-Zhu, Y Yuan
ICML 2016: International Conference on Machine Learning, 1080-1089, 2016
Cited by: 240; Year: 2016
What Can ResNet Learn Efficiently, Going Beyond Kernels?
Z Allen-Zhu, Y Li
NeurIPS 2019: Neural Information Processing Systems, 2019
Cited by: 230; Year: 2019
Even faster accelerated coordinate descent using non-uniform sampling
Z Allen-Zhu, Z Qu, P Richtárik, Y Yuan
ICML 2016: International Conference on Machine Learning, 1110-1119, 2016
Cited by: 212; Year: 2016
On the convergence rate of training recurrent neural networks
Z Allen-Zhu, Y Li, Z Song
NeurIPS 2019: Neural Information Processing Systems, 2019
Cited by: 210; Year: 2019
Asymptotically optimal strategy-proof mechanisms for two-facility games
P Lu, X Sun, Y Wang, ZA Zhu
ACM-EC 2010: Conference on Economics and Computation, 315-324, 2010
Cited by: 208; Year: 2010
Feature purification: How adversarial training performs robust deep learning
Z Allen-Zhu, Y Li
FOCS 2021: Symposium on Foundations of Computer Science, 977-988, 2022
Cited by: 176; Year: 2022
Neon2: Finding Local Minima via First-Order Oracles
Z Allen-Zhu, Y Li
NIPS 2018: Neural Information Processing Systems, 2018
Cited by: 163; Year: 2018
LazySVD: Even faster SVD decomposition yet without agonizing pain
Z Allen-Zhu, Y Li
NIPS 2016: Neural Information Processing Systems, 974-982, 2016
Cited by: 146; Year: 2016