Wei Lin
Orcid: 0000-0002-3003-0150
Affiliations:
- Alibaba Group, China
- Microsoft, Redmond, WA, USA (former)
According to our database, Wei Lin authored at least 98 papers between 2009 and 2024.
Bibliography
2024
Boosting the Convergence of Reinforcement Learning-Based Auto-Pruning Using Historical Data.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., February, 2024
CoRR, 2024
Adaptive Utilization of Cross-scenario Information for Multi-scenario Recommendation.
CoRR, 2024
PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations.
CoRR, 2024
AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework.
CoRR, 2024
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache.
CoRR, 2024
MonoNN: Enabling a New Monolithic Optimization Space for Neural Network Inference Tasks on Modern GPU-Centric Architectures.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024
HAP: SPMD DNN Training on Heterogeneous GPU Clusters with Automated Program Synthesis.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024
How to Trade Off the Quantity and Capacity of Teacher Ensemble: Learning Categorical Distribution to Stochastically Employ a Teacher for Distillation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
HAP: SPMD DNN Training on Heterogeneous GPU Clusters with Automated Program Synthesis.
Dataset, November, 2023
HAP: SPMD DNN Training on Heterogeneous GPU Clusters with Automated Program Synthesis.
Dataset, November, 2023
BladeDISC: Optimizing Dynamic Shape Machine Learning Workloads via Compiler Approach.
Proc. ACM Manag. Data, September, 2023
IEEE Trans. Parallel Distributed Syst., April, 2023
Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity.
Proc. VLDB Endow., 2023
GoldMiner: Elastic Scaling of Training Data Pre-Processing Pipelines for Deep Learning.
Proc. ACM Manag. Data, 2023
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity.
CoRR, 2023
Heterogeneous Knowledge Fusion: A Novel Approach for Personalized Recommendation via LLM.
CoRR, 2023
CoRR, 2023
Ada-Grouper: Accelerating Pipeline Parallelism in Preempted Network by Adaptive Group-Scheduling for Micro-Batches.
CoRR, 2023
Auto-Parallelizing Large Models with Rhino: A Systematic Approach on Production AI Platform.
CoRR, 2023
CoRR, 2023
EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs.
Proceedings of the International Conference for High Performance Computing, 2023
Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 2023
RECom: A Compiler Approach to Accelerating Recommendation Model Inference with Massive Embedding Columns.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
uGrapher: High-Performance Graph Operator Computation via Unified Abstraction for Graph Neural Networks.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
2022
IEEE Trans. Parallel Distributed Syst., 2022
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022
Proceedings of the 2022 USENIX Annual Technical Conference, 2022
MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022
Optimizing Federated Unsupervised Person Re-identification via Camera-aware Clustering.
Proceedings of the 24th IEEE International Workshop on Multimedia Signal Processing, 2022
Proceedings of the IEEE 33rd International Symposium on Software Reliability Engineering, 2022
Proceedings of the IEEE INFOCOM 2022, 2022
PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022
AStitch: enabling a new multi-dimensional optimization space for memory-intensive ML training and inference on modern SIMT architectures.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022
2021
IEEE Trans. Parallel Distributed Syst., 2021
Fangorn: Adaptive Execution Framework for Heterogeneous Workloads on Shared Clusters.
Proc. VLDB Endow., 2021
M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining.
CoRR, 2021
Towards a Better Tradeoff between Effectiveness and Efficiency in Pre-Ranking: A Learnable Feature Selection based Approach.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021
Explicit Semantic Cross Feature Learning via Pre-trained Graph Neural Networks for CTR Prediction.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021
MeLL: Large-scale Extensible User Intent Classification for Dialogue Systems with Meta Lifelong Learning.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021
Proceedings of the EuroMLSys@EuroSys 2021, 2021
Learning Effective and Efficient Embedding via an Adaptively-Masked Twins-based Layer.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021
EasyTransfer: A Simple and Scalable Deep Transfer Learning Platform for NLP Applications.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021
2020
CoRR, 2020
EasyTransfer - A Simple and Scalable Deep Transfer Learning Platform for NLP Applications.
CoRR, 2020
INT8 Winograd Acceleration for Conv1D Equipped ASR Models Deployed on Mobile Devices.
CoRR, 2020
CoRR, 2020
CoRR, 2020
Auto-MAP: A DQN Framework for Exploring Distributed Execution Plans for DNN Workloads.
CoRR, 2020
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020
One-shot Text Field labeling using Attention and Belief Propagation for Structure Information Extraction.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Proceedings of the Middleware '20: 21st International Middleware Conference, 2020
AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020
A History-Based Auto-Tuning Framework for Fast and High-Performance DNN Design on GPU.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Proceedings of the CoNEXT '20: The 16th International Conference on emerging Networking EXperiments and Technologies, 2020
Proceedings of the CoNEXT '20: The 16th International Conference on emerging Networking EXperiments and Technologies, 2020
2019
FusionStitching: Boosting Execution Efficiency of Memory Intensive Computations for DL Workloads.
CoRR, 2019
Proceedings of the IEEE International Symposium on Workload Characterization, 2019
Proceedings of the 2019 IEEE International Conference on Data Mining, 2019
Proceedings of the 2019 IEEE Hot Chips 31 Symposium (HCS), 2019
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019
Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019
2018
CoRR, 2018
FusionStitching: Deep Fusion and Code Generation for Tensorflow Computations on GPUs.
CoRR, 2018
Transfer Learning for Context-Aware Question Matching in Information-seeking Conversations in E-commerce.
CoRR, 2018
IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection.
CoRR, 2018
IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018
Transfer Learning for Context-Aware Question Matching in Information-seeking Conversations in E-commerce.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018
2016
Proceedings of the 13th USENIX Symposium on Networked Systems Design and Implementation, 2016
2015
IEEE Trans. Parallel Distributed Syst., 2015
2014
Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation, 2014
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014
Nondeterminism in MapReduce considered harmful? an empirical study on non-commutative aggregators in MapReduce programs.
Proceedings of the 36th International Conference on Software Engineering, 2014
2013
Proceedings of the 35th International Conference on Software Engineering, 2013
2012
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2012
Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, 2012
Optimizing Data Shuffling in Data-Parallel Computation by Understanding User-Defined Functions.
Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation, 2012
2010
Proceedings of the 1st ACM Symposium on Cloud Computing, 2010
2009
Proceedings of HotOS'09: 12th Workshop on Hot Topics in Operating Systems, 2009