Zhenmei Shi

Orcid: 0009-0007-6741-7598

According to our database, Zhenmei Shi authored at least 38 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Timeline: publications per year, 2019–2024.


Bibliography

2024
Theoretical Constraints on the Expressive Power of RoPE-based Tensor Attention Transformers.
CoRR, 2024

Fast Gradient Computation for RoPE Attention in Almost Linear Time.
CoRR, 2024

The Computational Limits of State-Space Models and Mamba via the Lens of Circuit Complexity.
CoRR, 2024

Curse of Attention: A Kernel-Based Perspective for Why Transformers Fail to Generalize on Time Series Forecasting and Beyond.
CoRR, 2024

On the Expressive Power of Modern Hopfield Networks.
CoRR, 2024

Circuit Complexity Bounds for RoPE-based Transformer Architecture.
CoRR, 2024

Advancing the Understanding of Fixed Point Iterations in Deep Neural Networks: A Detailed Analytical Study.
CoRR, 2024

Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent.
CoRR, 2024

Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix.
CoRR, 2024

HSR-Enhanced Sparse Attention Acceleration.
CoRR, 2024

Fine-grained Attention I/O Complexity: Comprehensive Analysis for Backward Passes.
CoRR, 2024

Looped ReLU MLPs May Be All You Need as Practical Programmable Computers.
CoRR, 2024

Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction.
CoRR, 2024

Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time.
CoRR, 2024

A Tighter Complexity Analysis of SparseGPT.
CoRR, 2024

Fast John Ellipsoid Computation with Differential Privacy Optimization.
CoRR, 2024

Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability.
CoRR, 2024

Differential Privacy of Cross-Attention with Provable Guarantee.
CoRR, 2024

Differential Privacy Mechanisms in Neural Tangent Kernel Regression.
CoRR, 2024

Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models.
CoRR, 2024

Toward Infinite-Long Prefix in Transformer.
CoRR, 2024

Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective.
CoRR, 2024

Tensor Attention Training: Provably Efficient Learning of Higher-order Transformers.
CoRR, 2024

Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers.
CoRR, 2024

Exploring the Frontiers of Softmax: Provable Optimization, Applications in Diffusion Model, and Beyond.
CoRR, 2024

Fourier Circuits in Neural Networks: Unlocking the Potential of Large Language Models in Mathematical Reasoning and Modular Arithmetic.
CoRR, 2024

Why Larger Language Models Do In-context Learning Differently?
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Domain Generalization via Nuclear Norm Regularization.
CoRR, 2023

A Graph-Theoretic Framework for Understanding Open-World Semi-Supervised Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Provable Guarantees for Neural Networks via Gradient Feature Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

When and How Does Known Class Help Discover Unknown Ones? Provable Understanding Through Spectral Analysis.
Proceedings of the International Conference on Machine Learning, 2023

The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Attentive Walk-Aggregating Graph Neural Networks.
Trans. Mach. Learn. Res., 2022

Deep Online Fused Video Stabilization.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

A Theoretical Analysis on Feature Learning in Neural Networks: Emergence from Inputs and Advantage over Fixed Features.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2019
SF-Net: Structured Feature Network for Continuous Sign Language Recognition.
CoRR, 2019

DAWN: Dual Augmented Memory Network for Unsupervised Video Object Tracking.
CoRR, 2019
