Beidi Chen

According to our database, Beidi Chen authored at least 68 papers between 2016 and 2024.

Bibliography

2024
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference.
CoRR, 2024

MagicPIG: LSH Sampling for Efficient LLM Generation.
CoRR, 2024

Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild.
CoRR, 2024

Sirius: Contextual Sparsity with Correction for Efficient LLMs.
CoRR, 2024

MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding.
CoRR, 2024

MINI-SEQUENCE TRANSFORMER: Optimizing Intermediate Memory for Long Sequences Training.
CoRR, 2024

VcLLM: Video Codecs are Secretly Tensor Codecs.
CoRR, 2024

It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF.
CoRR, 2024

Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity.
CoRR, 2024

SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices.
CoRR, 2024

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution.
CoRR, 2024

Memory Mosaics.
CoRR, 2024

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding.
CoRR, 2024

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length.
CoRR, 2024

Prompt-prompted Mixture of Experts for Efficient LLM Generation.
CoRR, 2024

Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding.
CoRR, 2024

LLM Inference Unveiled: Survey and Roofline Model Insights.
CoRR, 2024

Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding.
CoRR, 2024

Learn To be Efficient: Build Structured Sparsity in Large Language Models.
CoRR, 2024

Q-Hitter: A Better Token Oracle for Efficient LLM Inference via Sparse-Quantized KV Cache.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Soft Prompt Recovers Compressed LLMs, Transferably.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

HexGen: Generative Inference of Large Language Model over Heterogeneous Environment.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

LoCoCo: Dropping In Convolutions for Long Context Compression.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Efficient Streaming Language Models with Attention Sinks.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment.
CoRR, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
CoRR, 2023

InRank: Incremental Low-Rank Learning.
CoRR, 2023

Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt.
CoRR, 2023

High-throughput Generative Inference of Large Language Models with a Single GPU.
CoRR, 2023

Modeling Scattering Coefficients using Self-Attentive Complex Polynomials with Image-based Representation.
CoRR, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks.
Proceedings of the International Conference on Machine Learning, 2023

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time.
Proceedings of the International Conference on Machine Learning, 2023

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU.
Proceedings of the International Conference on Machine Learning, 2023

Fast Algorithms for a New Relaxation of Optimal Transport.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

2022
Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees.
CoRR, 2022

Decentralized Training of Foundation Models in Heterogeneous Environments.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

HALOS: Hashing Large Output Space for Cheap Inference.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

Monarch: Expressive Structured Matrices for Efficient and Accurate Training.
Proceedings of the International Conference on Machine Learning, 2022

Pixelated Butterfly: Simple and Efficient Sparse Training for Neural Network Models.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Satellite Images and Deep Learning to Identify Discrepancy in Mailing Addresses with Applications to Census 2020 in Houston.
CoRR, 2021

Scatterbrain: Unifying Sparse and Low-rank Attention Approximation.
CoRR, 2021

Locality Sensitive Teaching.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Scatterbrain: Unifying Sparse and Low-rank Attention.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Tale of Two Efficient and Informative Negative Sampling Distributions.
Proceedings of the 38th International Conference on Machine Learning, 2021

SOLAR: Sparse Orthogonal Learned and Random Embeddings.
Proceedings of the 9th International Conference on Learning Representations, 2021

MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
A Constant-time Adaptive Negative Sampling.
CoRR, 2020

Climbing the WOL: Training for Cheaper Inference.
CoRR, 2020

SLIDE: In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems.
Proceedings of the Third Conference on Machine Learning and Systems, 2020

Angular Visual Hardness.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
Sub-Linear Privacy-Preserving Near-Neighbor Search.
IACR Cryptol. ePrint Arch., 2019

LSH-Sampling Breaks the Computation Chicken-and-Egg Loop in Adaptive Stochastic Gradient Estimation.
CoRR, 2019

SLIDE: In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems.
CoRR, 2019

Fast and Accurate Stochastic Gradient Estimation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

2018
Densified Winner Take All (WTA) Hashing for Sparse Datasets.
Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, 2018

LSH-Sampling Breaks the Computational Chicken-and-Egg Loop in Adaptive Stochastic Gradient Estimation.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Unique Entity Estimation with Application to the Syrian Conflict.
CoRR, 2017

2016
Sub-linear Privacy-preserving Search with Untrusted Server and Semi-honest Parties.
CoRR, 2016

Revisiting Winner Take All (WTA) Hashing for Sparse Datasets.
CoRR, 2016
