Youshan Miao

Orcid: 0000-0002-2395-9965

According to our database, Youshan Miao authored at least 27 papers between 2012 and 2024.

Bibliography

2024
Efficient Schedule Construction for Distributed Execution of Large DNN Models.
IEEE Trans. Parallel Distributed Syst., December, 2024

nnScaler: Constraint-Guided Parallelization Plan Generation for Deep Learning Training.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

Aceso: Efficient Parallel DNN Training through Iterative Bottleneck Alleviation.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024

2023
SuperScaler: Supporting Flexible DNN Parallelization via a Unified Abstraction.
CoRR, 2023

Adam Accumulation to Reduce Memory Footprints of Both Activations and Gradients for Large-Scale DNN Training.
Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023

2022
Breaking the computation and communication abstraction barrier in distributed machine learning workloads.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021
Efficient Data Loader for Fast Sampling-Based GNN Training on Large Graphs.
IEEE Trans. Parallel Distributed Syst., 2021

Dense-to-Sparse Gate for Mixture-of-Experts.
CoRR, 2021

ZIPPER: Exploiting Tile- and Operator-level Parallelism for General and Scalable Graph Neural Network Acceleration.
CoRR, 2021

CoCoNet: Co-Optimizing Computation and Communication for Distributed Machine Learning.
CoRR, 2021

CrossoverScheduler: Overlapping Multiple Distributed Training Applications in a Crossover Manner.
CoRR, 2021

Accelerating GNN training with locality-aware partial execution.
Proceedings of the APSys '21: 12th ACM SIGOPS Asia-Pacific Workshop on Systems, 2021

2020
Distributed Graph Computation Meets Machine Learning.
IEEE Trans. Parallel Distributed Syst., 2020

Architectural Implications of Graph Neural Networks.
IEEE Comput. Archit. Lett., 2020

Rammer: Enabling Holistic Deep Learning Compiler Optimizations with rTasks.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Motif-Preserving Temporal Network Embedding.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

PaGraph: Scaling GNN training on large graphs via computation-aware caching.
Proceedings of the SoCC '20: ACM Symposium on Cloud Computing, 2020

2019
NeuGraph: Parallel Deep Neural Network Computation on Large Graphs.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

Fast Distributed Deep Learning over RDMA.
Proceedings of the Fourteenth EuroSys Conference 2019, Dresden, Germany, March 25-28, 2019, 2019

2018
Towards Efficient Large-Scale Graph Neural Network Computing.
CoRR, 2018

RPC Considered Harmful: Fast Distributed Deep Learning on RDMA.
CoRR, 2018

2017
Tux²: Distributed Graph Computation for Machine Learning.
Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, 2017

2015
ImmortalGraph: A System for Storage and Analysis of Temporal Graphs.
ACM Trans. Storage, 2015

GraM: scaling graph computation to the trillions.
Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015

2014
Chronos: a graph engine for temporal graph analysis.
Proceedings of the Ninth Eurosys Conference 2014, 2014

2012
Kineograph: taking the pulse of a fast-changing and connected world.
Proceedings of the European Conference on Computer Systems, 2012

