Shigang Li
Orcid: 0000-0003-0022-7865
Affiliations:
- Chinese Academy of Sciences, Institute of Computing Technology, Beijing, China
- University of Science and Technology Beijing, China (PhD 2014)
According to our database, Shigang Li authored at least 60 papers between 2010 and 2024.
Links
Online presence:
- on orcid.org
- on dl.acm.org
Bibliography
2024
IEEE Trans. Parallel Distributed Syst., August, 2024
POSTER: ParGNN: Efficient Training for Large-Scale Graph Neural Network on GPU Clusters.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024
2023
AGCM-3DLF: Accelerating Atmospheric General Circulation Model via 3-D Parallelization and Leap-Format.
IEEE Trans. Parallel Distributed Syst., March, 2023
AutoDDL: Automatic Distributed Deep Learning with Asymptotically Optimal Communication.
CoRR, 2023
Proceedings of the International Conference for High Performance Computing, 2023
ANT-MOC: Scalable Neutral Particle Transport Using 3D Method of Characteristics on Multi-GPU Systems.
Proceedings of the International Conference for High Performance Computing, 2023
Proceedings of the International Conference for High Performance Computing, 2023
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023
PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023
Asynch-SGBDT: Train Stochastic Gradient Boosting Decision Trees in an Asynchronous Parallel Manner.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
2022
VenusAI: An artificial intelligence platform for scientific discovery on supercomputers.
J. Syst. Archit., 2022
Proceedings of the SC22: International Conference for High Performance Computing, 2022
Proceedings of the SC22: International Conference for High Performance Computing, 2022
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022
2021
Breaking (Global) Barriers in Parallel Stochastic Optimization With Wait-Avoiding Group Averaging.
IEEE Trans. Parallel Distributed Syst., 2021
Why Dataset Properties Bound the Scalability of Parallel Machine Learning Training Algorithms.
IEEE Trans. Parallel Distributed Syst., 2021
Proceedings of the International Conference for High Performance Computing, 2021
Chimera: efficiently training large-scale neural networks with bidirectional pipelines.
Proceedings of the International Conference for High Performance Computing, 2021
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Proceedings of the Fourth Conference on Machine Learning and Systems, 2021
2020
FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations.
J. Supercomput., 2020
J. Parallel Distributed Comput., 2020
The static parallel distribution algorithms for hybrid density-functional calculations in HONPAS package.
Int. J. High Perform. Comput. Appl., 2020
Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging.
CoRR, 2020
Taming unbalanced training workloads in deep learning with partial collective operations.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020
A Highly Efficient Dynamical Core of Atmospheric General Circulation Model based on Leap-Format.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020
2019
Correction to: FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations.
J. Supercomput., 2019
J. Parallel Distributed Comput., 2019
CoRR, 2019
OpenKMC: a KMC design for hundred-billion-atom simulation using millions of cores on Sunway Taihulight.
Proceedings of the International Conference for High Performance Computing, 2019
swMD: Performance Optimizations for Molecular Dynamics Simulation on Sunway Taihulight.
Proceedings of the 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2019
Using Gradient Based Multikernel Gaussian Process and Meta-Acquisition Function to Accelerate SMBO.
Proceedings of the 31st IEEE International Conference on Tools with Artificial Intelligence, 2019
2018
IEEE Trans. Parallel Distributed Syst., 2018
CoRR, 2018
Proceedings of the 47th International Conference on Parallel Processing, 2018
Massively Scaling the Metal Microscopic Damage Simulation on Sunway TaihuLight Supercomputer.
Proceedings of the 47th International Conference on Parallel Processing, 2018
AGCM3D: A Highly Scalable Finite-Difference Dynamical Core of Atmospheric General Circulation Model Based on 3D Decomposition.
Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems, 2018
2017
Hybrid-optimization strategy for the communication of large-scale Kinetic Monte Carlo simulation.
Comput. Phys. Commun., 2017
Asynchronous COMID: the theoretic basis for transmitted data sparsification tricks on Parameter Server.
CoRR, 2017
CoRR, 2017
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017
2016
ACM Trans. Archit. Code Optim., 2016
2015
Sci. China Inf. Sci., 2015
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015
Analyzing MPI-3.0 Process-Level Shared Memory: A Case Study with Stencil Computations.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015
2014
Clust. Comput., 2014
2013
Proceedings of the 21st Euromicro International Conference on Parallel, 2013
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013
2011
Proceedings of the International Conference on Computational Science, 2011
Extending Synchronization Constructs in OpenMP to Exploit Pipeline Parallelism on Heterogeneous Multi-core.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2011
Scheduling Multi-paradigm and Multi-grain Parallel Components on Heterogeneous Platforms.
Proceedings of the Sixth Chinagrid Annual Conference, ChinaGrid 2011, Dalian, Liaoning, 2011
2010
Proceedings of the Algorithms and Architectures for Parallel Processing, 2010