Jun Yang

Affiliations:
  • NVIDIA Corp, Beijing, China
  • Alibaba Group, Computing Platform Department, Hangzhou, China


According to our database, Jun Yang authored at least 18 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.
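A collaborative distance such as an Erdős or Dijkstra number is the length of the shortest coauthorship path between two authors, which can be computed with a breadth-first search over the coauthorship graph. A minimal sketch, using a hypothetical toy graph rather than real coauthorship data:

```python
from collections import deque

def collab_distance(coauthors, start, target):
    """Shortest number of coauthorship hops between two authors
    (BFS over an undirected coauthorship graph)."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        author, dist = queue.popleft()
        for peer in coauthors.get(author, ()):
            if peer == target:
                return dist + 1
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, dist + 1))
    return None  # no path: the distance is undefined

# Hypothetical toy graph (illustrative names, not real coauthor data):
graph = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}
print(collab_distance(graph, "A", "D"))  # → 3
```

A distance of four, as listed above, means the shortest chain of coauthored papers linking the two authors has four hops.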

Bibliography

2024
Boosting the Convergence of Reinforcement Learning-Based Auto-Pruning Using Historical Data.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., February 2024

2022
Optimizing DNN Compilation for Distributed Training With Joint OP and Tensor Fusion.
IEEE Trans. Parallel Distributed Syst., 2022

Efficient Pipeline Planning for Expedited Distributed DNN Training.
Proceedings of IEEE INFOCOM, 2022

AStitch: enabling a new multi-dimensional optimization space for memory-intensive ML training and inference on modern SIMT architectures.
Proceedings of ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 2022

2021
DAPPLE: a pipelined data parallel approach for training large models.
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

DISC: A Dynamic Shape Compiler for Machine Learning Workloads.
Proceedings of EuroMLSys@EuroSys, 2021

2020
EasyTransfer - A Simple and Scalable Deep Transfer Learning Platform for NLP Applications.
CoRR, 2020

INT8 Winograd Acceleration for Conv1D Equipped ASR Models Deployed on Mobile Devices.
CoRR, 2020

FusionStitching: Boosting Memory Intensive Computations for Deep Learning Workloads.
CoRR, 2020

Auto-MAP: A DQN Framework for Exploring Distributed Execution Plans for DNN Workloads.
CoRR, 2020

Fast Training of Deep Learning Models over Multiple GPUs.
Proceedings of the Middleware '20: 21st International Middleware Conference, 2020

A History-Based Auto-Tuning Framework for Fast and High-Performance DNN Design on GPU.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Optimizing distributed training deployment in heterogeneous GPU clusters.
Proceedings of the CoNEXT '20: The 16th International Conference on emerging Networking EXperiments and Technologies, 2020

2019
FusionStitching: Boosting Execution Efficiency of Memory Intensive Computations for DL Workloads.
CoRR, 2019

Characterizing Deep Learning Training Workloads on Alibaba-PAI.
Proceedings of the IEEE International Symposium on Workload Characterization, 2019

2018
Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks.
CoRR, 2018

FusionStitching: Deep Fusion and Code Generation for Tensorflow Computations on GPUs.
CoRR, 2018

Efficient Deep Learning Inference Based on Model Compression.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018
