Albert Gu

According to our database, Albert Gu authored at least 43 papers between 2015 and 2024.

Bibliography

2024
On the Benefits of Memory for Modeling Time-Dependent PDEs.
CoRR, 2024

Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models.
CoRR, 2024

Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers.
CoRR, 2024

An Empirical Study of Mamba-based Language Models.
CoRR, 2024

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models.
CoRR, 2024

Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Augmenting Conformers With Structured State-Space Sequence Models For Online Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023
Mamba: Linear-Time Sequence Modeling with Selective State Spaces.
CoRR, 2023

Augmenting conformers with structured state space models for online speech recognition.
CoRR, 2023

Structured State Space Models for In-Context Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Resurrecting Recurrent Neural Networks for Long Sequences.
Proceedings of the International Conference on Machine Learning, 2023

Modelling Long Range Dependencies in $N$D: From Task-Specific to a General Purpose CNN.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

How to Train your HIPPO: State Space Models with Generalized Orthogonal Basis Projections.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Pretraining Without Attention.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

2022
S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces.
CoRR, 2022

Towards a General Purpose CNN for Long Range Dependencies in ND.
CoRR, 2022

S4ND: Modeling Images and Videos as Multidimensional Signals with State Spaces.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

On the Parameterization and Initialization of Diagonal State Space Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Diagonal State Spaces are as Effective as Structured State Spaces.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

It's Raw! Audio Generation with State-Space Models.
Proceedings of the International Conference on Machine Learning, 2022

Efficiently Modeling Long Sequences with Structured State Spaces.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Combining Recurrent, Convolutional, and Continuous-time Models with Linear State Space Layers.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Catformer: Designing Stable Transformers via Sensitivity Analysis.
Proceedings of the 38th International Conference on Machine Learning, 2021

HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections.
Proceedings of the 38th International Conference on Machine Learning, 2021

Model Patching: Closing the Subgroup Performance Gap with Data Augmentation.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

HiPPO: Recurrent Memory with Optimal Polynomial Projections.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Improving the Gating Mechanism of Recurrent Neural Networks.
Proceedings of the 37th International Conference on Machine Learning, 2020

Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps.
Proceedings of the 8th International Conference on Learning Representations, 2020

Sparse Recovery for Orthogonal Polynomial Transforms.
Proceedings of the 47th International Colloquium on Automata, Languages, and Programming, 2020

2019
Improving the Gating Mechanism of Recurrent Neural Networks.
CoRR, 2019

A Kernel Theory of Modern Data Augmentation.
Proceedings of the 36th International Conference on Machine Learning, 2019

Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations.
Proceedings of the 36th International Conference on Machine Learning, 2019

Learning Mixed-Curvature Representations in Product Spaces.
Proceedings of the 7th International Conference on Learning Representations, 2019

2018
A Two-pronged Progress in Structured Dense Matrix Vector Multiplication.
Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, 2018

Learning Compressed Transforms with Low Displacement Rank.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Representation Tradeoffs for Hyperbolic Embeddings.
Proceedings of the 35th International Conference on Machine Learning, 2018

Learning Invariance with Compact Transforms.
Proceedings of the 6th International Conference on Learning Representations, 2018

2016
The Power of Deferral: Maintaining a Constant-Competitive Steiner Tree Online.
SIAM J. Comput., 2016

Recurrence Width for Structured Dense Matrix Vector Multiplication.
CoRR, 2016

2015
Sprague-Grundy Values of the $\mathcal{R}$-Wythoff Game.
Electron. J. Comb., 2015

