Aaron Mishkin
PhD Student, Stanford University
Painless stochastic gradient: Interpolation, line-search, and convergence rates
S Vaswani, A Mishkin, I Laradji, M Schmidt, G Gidel, S Lacoste-Julien
Advances in neural information processing systems 32, 2019
SLANG: Fast structured covariance approximations for Bayesian deep learning with natural gradient
A Mishkin, F Kunstner, D Nielsen, M Schmidt, ME Khan
Advances in Neural Information Processing Systems 31, 2018
Fast convex optimization for two-layer ReLU networks: Equivalent model classes and cone decompositions
A Mishkin, A Sahiner, M Pilanci
International Conference on Machine Learning, 15770-15816, 2022
To each optimizer a norm, to each norm its generalization
S Vaswani, R Babanezhad, J Gallego-Posada, A Mishkin, ...
arXiv preprint arXiv:2006.06821, 2020
Interpolation, Growth Conditions, and Stochastic Gradient Descent
A Mishkin
University of British Columbia, 2020
Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm
AV Ramesh, A Mishkin, M Schmidt, Y Zhou, JW Lavington, J She
arXiv preprint arXiv:2307.01169, 2023
Optimal Sets and Solution Paths of ReLU Networks
A Mishkin, M Pilanci
arXiv preprint arXiv:2306.00119, 2023
Fast Convergence of Greedy 2-Coordinate Updates for Optimizing with an Equality Constraint
AV Ramesh, A Mishkin, M Schmidt
OPT 2022: Optimization for Machine Learning (NeurIPS 2022 Workshop), 2022
The Solution Path of the Group Lasso
A Mishkin, M Pilanci
OPT 2022: Optimization for Machine Learning (NeurIPS 2022 Workshop), 2022
Web ValueCharts: Analyzing Individual and Group Preferences with Interactive, Web-based Visualizations
A Mishkin
How to make your optimizer generalize better
S Vaswani, R Babanezhad, J Gallego-Posada, A Mishkin, S Lacoste-Julien, ...