Mu Li

Orcid: 0000-0002-4433-2301

Affiliations:
  • Amazon, Palo Alto, CA, USA
  • Carnegie Mellon University, PA, USA (former)


According to our database, Mu Li authored at least 64 papers between 2014 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Online learning from capricious data streams via shared and new feature spaces.
Appl. Intell., October, 2024

Improving Semantic Segmentation via Efficient Self-Training.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2024

Multimodal Chain-of-Thought Reasoning in Language Models.
Trans. Mach. Learn. Res., 2024

2023
What Makes for Good Tokenizers in Vision Transformer?
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

RAF: Holistic Compilation for Deep Learning Model Training.
CoRR, 2023

LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation.
CoRR, 2023

GFM: Building Geospatial Foundation Models via Continual Pretraining.
CoRR, 2023

MixGen: A New Multi-Modal Data Augmentation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2023

Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

PreDiff: Precipitation Nowcasting with Latent Diffusion Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

XTab: Cross-table Pretraining for Tabular Transformers.
Proceedings of the International Conference on Machine Learning, 2023

AIM: Adapting Image Models for Efficient Video Action Recognition.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Learning Multimodal Data Augmentation in Feature Space.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Parameter-Efficient Fine-Tuning Design Spaces.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Automatic Chain of Thought Prompting in Large Language Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

A Cheaper and Better Diffusion Language Model with Soft-Masked Noise.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
MiCS: Near-linear Scaling for Training Gigantic Model on Public Cloud.
Proc. VLDB Endow., 2022

SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning.
CoRR, 2022

Are Multimodal Models Robust to Image and Text Perturbations?
CoRR, 2022

Visual Prompt Tuning for Test-time Domain Adaptation.
CoRR, 2022

Earthformer: Exploring Space-Time Transformers for Earth System Forecasting.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition.
Proceedings of the International Conference on Machine Learning, 2022

Removing Batch Normalization Boosts Adversarial Training.
Proceedings of the International Conference on Machine Learning, 2022

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

ResNeSt: Split-Attention Networks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

2021
Dive into Deep Learning.
CoRR, 2021

SelfNorm and CrossNorm for Out-of-Distribution Robustness.
CoRR, 2021

Progressive Coordinate Transforms for Monocular 3D Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Benchmarking Multimodal AutoML for Tabular Data with Text Fields.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Blending Anti-Aliasing into Vision Transformer.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference.
Proceedings of the Fourth Conference on Machine Learning and Systems, 2021

A Unified Efficient Pyramid Transformer for Semantic Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Video Contrastive Learning with Global Context.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

CrossNorm and SelfNorm for Generalization under Distribution Shifts.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Distiller: A Systematic Study of Model Distillation Methods in Natural Language Processing.
Proceedings of the Second Workshop on Simple and Efficient Natural Language Processing, 2021

Lorien: Efficient Deep Learning Workloads Delivery.
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021

2020
GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing.
J. Mach. Learn. Res., 2020

A Comprehensive Study of Deep Video Action Recognition.
CoRR, 2020

Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes.
CoRR, 2020

Improving Semantic Segmentation via Self-Training.
CoRR, 2020

AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data.
CoRR, 2020

FeatGraph: a flexible and efficient backend for graph neural network systems.
Proceedings of the International Conference for High Performance Computing, 2020

CSER: Communication-efficient SGD with Error Reset.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
On the Powerball Method: Variants of Descent Methods for Accelerated Optimization.
IEEE Control. Syst. Lett., 2019

GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing.
CoRR, 2019

Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources.
CoRR, 2019

Language Models with Transformers.
CoRR, 2019

Bag of Freebies for Training Object Detection Neural Networks.
CoRR, 2019

Optimizing CNN Model Inference on CPUs.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

A Unified Optimization Approach for CNN Model Inference on Integrated GPUs.
Proceedings of the 48th International Conference on Parallel Processing, 2019

Bag of Tricks for Image Classification with Convolutional Neural Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2017
Data Driven Resource Allocation for Distributed Learning.
Proceedings of the Workshops of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
On the Powerball Method.
CoRR, 2016

DiFacto: Distributed Factorization Machines.
Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, 2016

AdaDelay: Delay Adaptive Distributed Stochastic Optimization.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

2015
AdaDelay: Delay Adaptive Distributed Stochastic Convex Optimization.
CoRR, 2015

Graph Partitioning via Parallel Submodular Approximation to Accelerate Distributed Machine Learning.
CoRR, 2015

MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems.
CoRR, 2015

Inferring Movement Trajectories from GPS Snippets.
Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, 2015

Cuckoo Linear Algebra.
Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

2014
Scaling Distributed Machine Learning with the Parameter Server.
Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation, 2014

Communication Efficient Distributed Machine Learning with the Parameter Server.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Efficient mini-batch training for stochastic optimization.
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014

