Representation tradeoffs for hyperbolic embeddings F Sala, C De Sa, A Gu, C Ré International Conference on Machine Learning, 4460-4469, 2018 | 304 | 2018 |

A kernel theory of modern data augmentation T Dao, A Gu, A Ratner, V Smith, C De Sa, C Ré International Conference on Machine Learning, 1528-1537, 2019 | 149 | 2019 |

Learning mixed-curvature representations in product spaces A Gu, F Sala, B Gunel, C Ré International Conference on Learning Representations, 2019 | 147 | 2019 |

No subclass left behind: Fine-grained robustness in coarse-grained classification problems N Sohoni, J Dunnmon, G Angus, A Gu, C Ré Advances in Neural Information Processing Systems 33, 19339-19352, 2020 | 100 | 2020 |

Efficiently modeling long sequences with structured state spaces A Gu, K Goel, C Ré arXiv preprint arXiv:2111.00396, 2021 | 92 | 2021 |

Learning fast algorithms for linear transforms using butterfly factorizations T Dao, A Gu, M Eichhorn, A Rudra, C Ré International Conference on Machine Learning, 1517-1527, 2019 | 69 | 2019 |

HiPPO: Recurrent memory with optimal polynomial projections A Gu, T Dao, S Ermon, A Rudra, C Ré Advances in Neural Information Processing Systems 33, 1474-1487, 2020 | 67 | 2020 |

Model patching: Closing the subgroup performance gap with data augmentation K Goel, A Gu, Y Li, C Ré arXiv preprint arXiv:2008.06775, 2020 | 66 | 2020 |

The power of deferral: maintaining a constant-competitive Steiner tree online A Gu, A Gupta, A Kumar Proceedings of the forty-fifth annual ACM symposium on Theory of Computing …, 2013 | 55 | 2013 |

From trees to continuous embeddings and back: Hyperbolic hierarchical clustering I Chami, A Gu, V Chatziafratis, C Ré Advances in Neural Information Processing Systems 33, 15065-15076, 2020 | 52 | 2020 |

It’s raw! Audio generation with state-space models K Goel, A Gu, C Donahue, C Ré International Conference on Machine Learning, 7616-7633, 2022 | 40 | 2022 |

Learning compressed transforms with low displacement rank A Thomas, A Gu, T Dao, A Rudra, C Ré Advances in Neural Information Processing Systems 31, 2018 | 40 | 2018 |

Combining recurrent, convolutional, and continuous-time models with linear state space layers A Gu, I Johnson, K Goel, K Saab, T Dao, A Rudra, C Ré Advances in Neural Information Processing Systems 34, 572-585, 2021 | 33 | 2021 |

Kaleidoscope: An efficient, learnable representation for all structured linear maps T Dao, NS Sohoni, A Gu, M Eichhorn, A Blonder, M Leszczynski, A Rudra, ... arXiv preprint arXiv:2012.14966, 2020 | 33 | 2020 |

Improving the gating mechanism of recurrent neural networks A Gu, C Gulcehre, T Paine, M Hoffman, R Pascanu International Conference on Machine Learning, 3800-3809, 2020 | 32 | 2020 |

A two-pronged progress in structured dense matrix vector multiplication C De Sa, A Gu, R Puttagunta, C Ré, A Rudra Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete …, 2018 | 23 | 2018 |

Diagonal state spaces are as effective as structured state spaces A Gupta, A Gu, J Berant Advances in Neural Information Processing Systems 35, 22982-22994, 2022 | 18 | 2022 |

On the parameterization and initialization of diagonal state space models A Gu, K Goel, A Gupta, C Ré Advances in Neural Information Processing Systems 35, 35971-35983, 2022 | 17 | 2022 |

HoroPCA: Hyperbolic dimensionality reduction via horospherical projections I Chami, A Gu, DP Nguyen, C Ré International Conference on Machine Learning, 1419-1429, 2021 | 17 | 2021 |

Catformer: Designing stable transformers via sensitivity analysis JQ Davis, A Gu, K Choromanski, T Dao, C Ré, C Finn, P Liang International Conference on Machine Learning, 2489-2499, 2021 | 9 | 2021 |