Zhenmei Shi

Orcid: 0009-0007-6741-7598

According to our database, Zhenmei Shi authored at least 38 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Timeline: publications per year, 2019–2024.


Bibliography

2024
Theoretical Constraints on the Expressive Power of RoPE-based Tensor Attention Transformers.
CoRR, 2024

Fast Gradient Computation for RoPE Attention in Almost Linear Time.
CoRR, 2024

The Computational Limits of State-Space Models and Mamba via the Lens of Circuit Complexity.
CoRR, 2024

Curse of Attention: A Kernel-Based Perspective for Why Transformers Fail to Generalize on Time Series Forecasting and Beyond.
CoRR, 2024

On the Expressive Power of Modern Hopfield Networks.
CoRR, 2024

Circuit Complexity Bounds for RoPE-based Transformer Architecture.
CoRR, 2024

Advancing the Understanding of Fixed Point Iterations in Deep Neural Networks: A Detailed Analytical Study.
CoRR, 2024

Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent.
CoRR, 2024

Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix.
CoRR, 2024

HSR-Enhanced Sparse Attention Acceleration.
CoRR, 2024

Fine-grained Attention I/O Complexity: Comprehensive Analysis for Backward Passes.
CoRR, 2024

Looped ReLU MLPs May Be All You Need as Practical Programmable Computers.
CoRR, 2024

Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction.
CoRR, 2024

Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time.
CoRR, 2024

A Tighter Complexity Analysis of SparseGPT.
CoRR, 2024

Fast John Ellipsoid Computation with Differential Privacy Optimization.
CoRR, 2024

Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability.
CoRR, 2024

Differential Privacy of Cross-Attention with Provable Guarantee.
CoRR, 2024

Differential Privacy Mechanisms in Neural Tangent Kernel Regression.
CoRR, 2024

Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models.
CoRR, 2024

Toward Infinite-Long Prefix in Transformer.
CoRR, 2024

Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective.
CoRR, 2024

Tensor Attention Training: Provably Efficient Learning of Higher-order Transformers.
CoRR, 2024

Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers.
CoRR, 2024

Exploring the Frontiers of Softmax: Provable Optimization, Applications in Diffusion Model, and Beyond.
CoRR, 2024

Fourier Circuits in Neural Networks: Unlocking the Potential of Large Language Models in Mathematical Reasoning and Modular Arithmetic.
CoRR, 2024

Why Larger Language Models Do In-context Learning Differently?
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Domain Generalization via Nuclear Norm Regularization.
CoRR, 2023

A Graph-Theoretic Framework for Understanding Open-World Semi-Supervised Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Provable Guarantees for Neural Networks via Gradient Feature Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

When and How Does Known Class Help Discover Unknown Ones? Provable Understanding Through Spectral Analysis.
Proceedings of the International Conference on Machine Learning, 2023

The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Attentive Walk-Aggregating Graph Neural Networks.
Trans. Mach. Learn. Res., 2022

Deep Online Fused Video Stabilization.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

A Theoretical Analysis on Feature Learning in Neural Networks: Emergence from Inputs and Advantage over Fixed Features.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2019
SF-Net: Structured Feature Network for Continuous Sign Language Recognition.
CoRR, 2019

DAWN: Dual Augmented Memory Network for Unsupervised Video Object Tracking.
CoRR, 2019
