Yulong Ao

According to our database, Yulong Ao authored at least 17 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Emu3: Next-Token Prediction is All You Need.
CoRR, 2024

Aquila2 Technical Report.
CoRR, 2024

AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies.
CoRR, 2024

2021
Adaptive SpMV/SpMSpV on GPUs for Input Vectors of Varied Sparsity.
IEEE Trans. Parallel Distributed Syst., 2021

End-to-end Adaptive Distributed Training on PaddlePaddle.
CoRR, 2021

AutoWM: a novel domain-specific tool for universal multi-/many-core accelerations of the WRF cloud microphysics.
Clust. Comput., 2021

2020
Solving a trillion unknowns per second with HPGMG on Sunway TaihuLight.
Clust. Comput., 2020

2019
Enabling Highly Efficient k-Means Computations on the SW26010 Many-Core Processor of Sunway TaihuLight.
J. Comput. Sci. Technol., 2019

2018
Extreme-Scale High-Order WENO Simulations of 3-D Detonation Wave with 10 Million Cores.
ACM Trans. Archit. Code Optim., 2018

Performance Optimization of the HPCG Benchmark on the Sunway TaihuLight Supercomputer.
ACM Trans. Archit. Code Optim., 2018

A Fast Sparse Triangular Solver for Structured-grid Problems on Sunway Many-core Processor SW26010.
Proceedings of the 47th International Conference on Parallel Processing, 2018

Extreme-Scale Realistic Stencil Computations on Sunway TaihuLight with Ten Million Cores.
Proceedings of the 18th IEEE/ACM International Symposium on Cluster, 2018

2017
26 PFLOPS Stencil Computations for Atmospheric Modeling on Sunway TaihuLight.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Towards Highly Efficient DGEMM on the Emerging SW26010 Many-Core Processor.
Proceedings of the 46th International Conference on Parallel Processing, 2017

2016
10M-core scalable fully-implicit solver for nonhydrostatic atmospheric dynamics.
Proceedings of the International Conference for High Performance Computing, 2016

2015
Pattern-Driven Hybrid Multi- and Many-Core Acceleration in the MPAS Shallow-Water Model.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Performance Evaluation of HPGMG on Tianhe-2: Early Experience.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

