Yunquan Zhang
Orcid: 0000-0001-7520-9640
According to our database, Yunquan Zhang authored at least 148 papers between 2003 and 2024.
Bibliography
2024
IEEE Trans. Parallel Distributed Syst., September, 2024
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
Proceedings of the 38th ACM International Conference on Supercomputing, 2024
HAM-SpMSpV: an Optimized Parallel Algorithm for Masked Sparse Matrix-Sparse Vector Multiplications on multi-core CPUs.
Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, 2024
2023
MP-DPS: adaptive distributed training for deep learning based on node merging and path prediction.
CCF Trans. High Perform. Comput., December, 2023
Redesigning OpenKMC for Multi-Component Trillion-Atom Simulations on the New Sunway Supercomputer.
IEEE Trans. Parallel Distributed Syst., July, 2023
AGCM-3DLF: Accelerating Atmospheric General Circulation Model via 3-D Parallelization and Leap-Format.
IEEE Trans. Parallel Distributed Syst., March, 2023
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023
Asynch-SGBDT: Train Stochastic Gradient Boosting Decision Trees in an Asynchronous Parallel Manner.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
Proceedings of the 37th International Conference on Supercomputing, 2023
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023
2022
Publisher Correction: Smart scheduler: an adaptive NVM-aware thread scheduling approach on NUMA systems.
CCF Trans. High Perform. Comput., December, 2022
CCF Trans. High Perform. Comput., December, 2022
IEEE Trans. Parallel Distributed Syst., 2022
An Accurate and Efficient Large-Scale Regression Method Through Best Friend Clustering.
IEEE Trans. Parallel Distributed Syst., 2022
Trinity: Neural Network Adaptive Distributed Parallel Training Method Based on Reinforcement Learning.
Algorithms, 2022
Large-Scale Simulation of Quantum Computational Chemistry on a New Sunway Supercomputer.
Proceedings of the SC22: International Conference for High Performance Computing, 2022
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022
Proceedings of the 51st International Conference on Parallel Processing, 2022
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022
Aware: Adaptive Distributed Training with Computation, Communication and Position Awareness for Deep Learning Model.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022
2021
Why Dataset Properties Bound the Scalability of Parallel Machine Learning Training Algorithms.
IEEE Trans. Parallel Distributed Syst., 2021
Efficient parallel linear scaling method to get the response density matrix in all-electron real-space density-functional perturbation theory.
Comput. Phys. Commun., 2021
Many-core acceleration of the first-principles all-electron quantum perturbation calculations.
Comput. Phys. Commun., 2021
Enhanced AGCM3D: A Highly Scalable Dynamical Core of Atmospheric General Circulation Model Based on Leap-Format.
CoRR, 2021
Reducing Redundancy in Data Organization and Arithmetic Calculation for Stencil Computations.
CoRR, 2021
CoRR, 2021
Proceedings of the International Conference for High Performance Computing, 2021
Extreme-scale ab initio quantum raman spectra simulations on the leadership HPC system in China.
Proceedings of the International Conference for High Performance Computing, 2021
Accelerating all-electron ab initio simulation of raman spectra for biological systems.
Proceedings of the International Conference for High Performance Computing, 2021
TensorKMC: kinetic Monte Carlo simulation of 50 trillion atoms driven by deep learning on a new generation of Sunway supercomputer.
Proceedings of the International Conference for High Performance Computing, 2021
Reducing redundancy in data organization and arithmetic calculation for stencil computations.
Proceedings of the International Conference for High Performance Computing, 2021
AutoTSMM: An Auto-tuning Framework for Building High-Performance Tall-and-Skinny Matrix-Matrix Multiplication on CPUs.
Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021
Proceedings of the 27th IEEE International Conference on Parallel and Distributed Systems, 2021
Proceedings of the Algorithms and Architectures for Parallel Processing, 2021
Proceedings of the 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, 2021
2020
IEEE Trans. Parallel Distributed Syst., 2020
FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations.
J. Supercomput., 2020
并行程序设计语言中局部性机制的研究 (Research on Locality-aware Design Mechanism of State-of-the-art Parallel Programming Languages).
计算机科学, 2020
J. Parallel Distributed Comput., 2020
The static parallel distribution algorithms for hybrid density-functional calculations in HONPAS package.
Int. J. High Perform. Comput. Appl., 2020
Int. J. High Perform. Comput. Appl., 2020
Accelerated LiDAR data processing algorithm for self-driving cars on the heterogeneous computing platform.
IET Comput. Digit. Tech., 2020
The dynamic parallel distribution algorithm for hybrid density-functional calculations in HONPAS package.
Comput. Phys. Commun., 2020
A Highly Efficient Dynamical Core of Atmospheric General Circulation Model based on Leap-Format.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020
Proceedings of the Algorithms and Architectures for Parallel Processing, 2020
2019
Correction to: FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations.
J. Supercomput., 2019
2018年中国高性能计算机发展现状分析与展望 (State-of-the-art Analysis and Perspectives of 2018 China HPC Development).
计算机科学, 2019
J. Parallel Distributed Comput., 2019
CoRR, 2019
OpenKMC: a KMC design for hundred-billion-atom simulation using millions of cores on Sunway Taihulight.
Proceedings of the International Conference for High Performance Computing, 2019
Proceedings of the International Conference for High Performance Computing, 2019
swMD: Performance Optimizations for Molecular Dynamics Simulation on Sunway Taihulight.
Proceedings of the 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2019
Using Gradient Based Multikernel Gaussian Process and Meta-Acquisition Function to Accelerate SMBO.
Proceedings of the 31st IEEE International Conference on Tools with Artificial Intelligence, 2019
Proceedings of the 48th International Conference on Parallel Processing, 2019
2018
IEEE Trans. Parallel Distributed Syst., 2018
CoRR, 2018
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2018
Proceedings of the Languages and Compilers for Parallel Computing, 2018
Proceedings of the 47th International Conference on Parallel Processing, 2018
Massively Scaling the Metal Microscopic Damage Simulation on Sunway TaihuLight Supercomputer.
Proceedings of the 47th International Conference on Parallel Processing, 2018
AGCM3D: A Highly Scalable Finite-Difference Dynamical Core of Atmospheric General Circulation Model Based on 3D Decomposition.
Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems, 2018
Proceedings of the Algorithms and Architectures for Parallel Processing, 2018
Proceedings of the Benchmarking, Measuring, and Optimizing, 2018
2017
Hybrid-optimization strategy for the communication of large-scale Kinetic Monte Carlo simulation.
Comput. Phys. Commun., 2017
Asynchronous COMID: the theoretic basis for transmitted data sparsification tricks on Parameter Server.
CoRR, 2017
CoRR, 2017
Proceedings of the International Conference for High Performance Computing, 2017
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017
Proceedings of the 23rd IEEE International Conference on Parallel and Distributed Systems, 2017
2016
ACM Trans. Archit. Code Optim., 2016
Int. J. Parallel Emergent Distributed Syst., 2016
边缘海静力数值预报模式并行算法研究 (Parallelization of Hydrostatic Numerical Forecasting Model of Marginal Sea).
计算机科学, 2016
Concurr. Comput. Pract. Exp., 2016
Proceedings of the Network and Parallel Computing, 2016
2015
基于Pthreads的并行DSRC压缩算法设计与实现 (Design and Implementation of Parallel DSRC Compression Algorithm Based on Pthreads).
计算机科学, 2015
基于Julia语言的并行计算方法初探 (Primary Investigation into Parallel Computing in Julia Language).
计算机科学, 2015
基于OpenCL的直方图生成算法优化方法研究 (Research on Histogram Generation Algorithm Optimization Based on OpenCL).
计算机科学, 2015
Sci. China Inf. Sci., 2015
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015
Proceedings of the 44th International Conference on Parallel Processing, 2015
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015
Analyzing MPI-3.0 Process-Level Shared Memory: A Case Study with Stencil Computations.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015
2014
J. Softw., 2014
J. Comput. Sci. Technol., 2014
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014
Physically based parallel ray tracer for the Metropolis light transport algorithm on the Tianhe-2 supercomputer.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014
2013
AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs.
Proceedings of the International Conference for High Performance Computing, 2013
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013
Proceedings of the IEEE 33rd International Conference on Distributed Computing Systems, 2013
Proceedings of the Algorithms and Architectures for Parallel Processing, 2013
Large Scale Satellite Imagery Simulations with Physically Based Ray Tracing on Tianhe-1A Supercomputer.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013
2012
Implementing High-performance Intensity Model with Blur Effect on GPUs for Large-scale Star Image Simulation.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012
Proceedings of the 41st International Conference on Parallel Processing, 2012
Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012
Proceedings of the Algorithms and Architectures for Parallel Processing, 2012
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012
2011
Proceedings of the International Conference on Parallel Processing, 2011
Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011
2010
Perspectives of China's HPC system development: a view from the 2009 China HPC TOP100 list.
Frontiers Comput. Sci. China, 2010
Heterogeneous Multi-core Parallel SGEMM Performance Testing and Analysis on Cell/B.E Processor.
Proceedings of the Fifth International Conference on Networking, Architecture, and Storage, 2010
Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, 2010
Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, 2010
Proceedings of the 13th IEEE International Conference on Computational Science and Engineering, 2010
QuantWiz: A scalable parallel software package for label-free protein quantification.
Proceedings of the Fifth International Conference on Bio-Inspired Computing: Theories and Applications, 2010
Accelerating Linpack Performance with Mixed Precision Algorithm on CPU+GPGPU Heterogeneous Cluster.
Proceedings of the 10th IEEE International Conference on Computer and Information Technology, 2010
2009
A parallel shortest path algorithm based on graph-partitioning and iterative correcting.
Comput. Syst. Sci. Eng., 2009
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009
QuantWiz: A Parallel Software Package for LC-MS-based Label-Free Protein Quantification.
Proceedings of the 11th IEEE International Conference on High Performance Computing and Communications, 2009
Performance Evaluation of Multithreaded Sparse Matrix-Vector Multiplication Using OpenMP.
Proceedings of the 11th IEEE International Conference on High Performance Computing and Communications, 2009
Proceedings of the High Performance Computing and Applications, 2009
2008
Frontiers Comput. Sci. China, 2008
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008
Utilizing the Multi-threading Techniques to Improve the Two-Level Checkpoint/Rollback System for MPI Applications.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008
2007
Frontiers Comput. Sci. China, 2007
Proceedings of the CHINA HPC 2007, 2007
Block size selection of parallel LU and QR on PVP-based and RISC-based supercomputers.
Proceedings of the CHINA HPC 2007, 2007
Efficient Construction of FM-index Using Overlapping Block Processing for Large Scale Texts.
Proceedings of the Advances in Information Retrieval, 2007
2006
2003
Hardware Impact on Communication Performance of Beowulf LINUX Cluster.
Proceedings of the 21st IASTED International Multi-Conference on Applied Informatics (AI 2003), 2003