Guangming Tan
ORCID: 0000-0002-6361-5948
According to our database, Guangming Tan authored at least 154 papers between 2005 and 2024.
Bibliography
2024
J. Opt. Commun. Netw., March, 2024
J. Comput. Sci. Technol., March, 2024
FILL: a heterogeneous resource scheduling system addressing the low throughput problem in GROMACS.
CCF Trans. High Perform. Comput., February, 2024
CoRR, 2024
Overcoming Memory Constraints in Quantum Circuit Simulation with a High-Fidelity Compression Framework.
CoRR, 2024
CoRR, 2024
POSTER: FineCo: Fine-grained Heterogeneous Resource Management for Concurrent DNN Inferences.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
Proceedings of the Optical Fiber Communications Conference and Exhibition, 2024
A Coordinated Strategy for GNN Combining Computational Graph and Operator Optimizations.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024
Proceedings of the 53rd International Conference on Parallel Processing, 2024
ElasticRoom: Multi-Tenant DNN Inference Engine via Co-design with Resource-constrained Compilation and Strong Priority Scheduling.
Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, 2024
Proceedings of the Euro-Par 2024: Parallel Processing, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
Editorial for the special issue on architecture, algorithms and applications of high performance sparse matrix computations.
CCF Trans. High Perform. Comput., June, 2023
Adaptive Workload-Balanced Scheduling Strategy for Global Ocean Data Assimilation on Massive GPUs.
Proceedings of the International Conference for High Performance Computing, 2023
Proceedings of the International Conference for High Performance Computing, 2023
Proceedings of the 37th International Conference on Supercomputing, 2023
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023
Proceedings of the 43rd IEEE International Conference on Distributed Computing Systems, 2023
Proceedings of the Algorithms and Architectures for Parallel Processing, 2023
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
Integrative Drug Discovery Platform: A Modular Approach for Efficient and Automated Virtual Screening.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
IEEE Trans. Parallel Distributed Syst., 2022
Double precision is not necessary for LSQR for solving discrete linear ill-posed problems.
CoRR, 2022
CoRR, 2022
Extending the limit of molecular dynamics with ab initio accuracy to 10 billion atoms.
CoRR, 2022
Fast and accurate variable batch size convolution neural network training on large scale distributed systems.
Concurr. Comput. Pract. Exp., 2022
Improvement of AI forecast of gridded PM2.5 forecast in China through ConvLSTM and Attention.
CCF Trans. High Perform. Comput., 2022
Proceedings of the SC22: International Conference for High Performance Computing, 2022
2.5 Million-Atom Ab Initio Electronic-Structure Simulation of Complex Metallic Heterostructures with DGDFT.
Proceedings of the SC22: International Conference for High Performance Computing, 2022
Proceedings of the SC22: International Conference for High Performance Computing, 2022
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
Extending the limit of molecular dynamics with ab initio accuracy to 10 billion atoms.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
CSAM: A Channel and Spatial Attention Mechanism for Impervious Surface Extraction in Difficult Areas.
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2022
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022
TileSpMSpV: A Tiled Algorithm for Sparse Matrix-Sparse Vector Multiplication on GPUs.
Proceedings of the 51st International Conference on Parallel Processing, 2022
Proceedings of the IEEE 40th International Conference on Computer Design, 2022
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022
2021
Optimizing the LINPACK Algorithm for Large-Scale PCIe-Based CPU-GPU Heterogeneous Systems.
IEEE Trans. Parallel Distributed Syst., 2021
A New Optoelectronic Hybrid Network Based on Scheduling Optimization of Optical Links.
IEEE Trans. Computers, 2021
J. Comput. Sci. Technol., 2021
Guest Editorial: Special issue on Network and Parallel Computing for Emerging Architectures and Applications.
Int. J. Parallel Program., 2021
Int. J. Parallel Program., 2021
Editorial for the special issue on large-scale AI in classical HPC environment and AI for science.
CCF Trans. High Perform. Comput., 2021
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021
Spatio-Temporal Features Processing Network for Change Detection in Remote Sensing Images.
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2021
Deep Reinforcement Agent for Failure-aware Job scheduling in High-Performance Computing.
Proceedings of the 27th IEEE International Conference on Parallel and Distributed Systems, 2021
Building Agile Workflow Microservice System for HPC Applications Based on Fast-start OSv.
Proceedings of the 27th IEEE International Conference on Parallel and Distributed Systems, 2021
WidePipe: High-Throughput Deep Learning Inference System on a Cluster of Neural Processing Units.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021
2020
Int. J. Parallel Program., 2020
Towards a heterogeneous architecture solver for the incompressible Navier-Stokes equations.
CCF Trans. High Perform. Comput., 2020
CCF Trans. High Perform. Comput., 2020
Proceedings of the SPAA '20: 32nd ACM Symposium on Parallelism in Algorithms and Architectures, 2020
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020
FEB3D: An Efficient FPGA-Accelerated Compression Framework for Microscopy Images.
Proceedings of the Network and Parallel Computing, 2020
2019
Int. J. Parallel Program., 2019
CCF Trans. High Perform. Comput., 2019
Wormhole optical network: a new architecture to solve long diameter problem in exascale computer.
CCF Trans. High Perform. Comput., 2019
Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019
Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019
Proceedings of the 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2019
Proceedings of the 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2019
IA-SpGEMM: an input-aware auto-tuning framework for parallel sparse matrix-matrix multiplication.
Proceedings of the ACM International Conference on Supercomputing, 2019
Proceedings of the 25th IEEE International Conference on Parallel and Distributed Systems, 2019
A New Traffic Offloading Method with Slow Switching Optical Device in Exascale Computer.
Proceedings of the 37th IEEE International Conference on Computer Design, 2019
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019
2018
IEEE Trans. Parallel Distributed Syst., 2018
ACM Trans. Parallel Comput., 2018
Design and Implementation of Adaptive SpMV Library for Multicore and Many-Core Architecture.
ACM Trans. Math. Softw., 2018
Automated and precise event detection method for big data in biomedical imaging with support vector machine.
Comput. Syst. Sci. Eng., 2018
Register-based implementation of the sparse general matrix-matrix multiplication on GPUs.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018
Routing and Spectrum Allocation for Time Varying Traffic by Artificial Bee Colony Algorithm in Elastic Optical Networks.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2018
Proceedings of the 47th International Conference on Parallel Processing, 2018
Proceedings of the 47th International Conference on Parallel Processing, 2018
2017
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017
Proceedings of the International Conference on Supercomputing, 2017
Proceedings of the 23rd IEEE International Conference on Parallel and Distributed Systems, 2017
Proceedings of the 19th IEEE International Conference on High Performance Computing and Communications; 15th IEEE International Conference on Smart City; 3rd IEEE International Conference on Data Science and Systems, 2017
2016
Graphine: Programming Graph-Parallel Computation of Large Natural Graphs for Multicore Clusters.
IEEE Trans. Parallel Distributed Syst., 2016
Accelerating Irregular Computation in Massive Short Reads Mapping on FPGA Co-Processor.
IEEE Trans. Parallel Distributed Syst., 2016
Parallelization of Hydrostatic Numerical Forecasting Model of Marginal Sea (边缘海静力数值预报模式并行算法研究).
计算机科学 (Computer Science), 2016
Proceedings of the Algorithms and Architectures for Parallel Processing, 2016
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2016
2015
SuperDragon: A Heterogeneous Parallel System for Accelerating 3D Reconstruction of Cryo-Electron Microscopy Images.
ACM Trans. Reconfigurable Technol. Syst., 2015
Detection of soft errors in LU decomposition with partial pivoting using algorithm-based fault tolerance.
Int. J. High Perform. Comput. Appl., 2015
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015
Proceedings of the 44th International Conference on Parallel Processing, 2015
Proceedings of the 44th International Conference on Parallel Processing, 2015
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015
Application Taxonomy via Algorithmic Commonality for Domain-Specific Architecture Design.
Proceedings of the 22nd IEEE International Conference on High Performance Computing, 2015
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015
2014
Exploiting fine-grained parallelism in graph traversal algorithms via lock virtualization on multi-core architecture.
J. Supercomput., 2014
Accelerating massive short reads mapping for next generation sequencing (abstract only).
Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014
Reducing Communication in Parallel Breadth-First Search on Distributed Memory Systems.
Proceedings of the 17th IEEE International Conference on Computational Science and Engineering, 2014
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014
2013
Scalability study of molecular dynamics simulation on Godson-T many-core architecture.
J. Parallel Distributed Comput., 2013
J. Comput. Sci. Technol., 2013
Comput. Sci. Res. Dev., 2013
CoRR, 2013
A Study of Leveraging Memory Level Parallelism for DRAM System on Multi-core/Many-Core Architecture.
Proceedings of the 12th IEEE International Conference on Trust, 2013
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2013
ParaInsight: An Assistant for Quantitatively Analyzing Multi-granularity Parallel Region.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013
Vlock: Lock virtualization mechanism for exploiting fine-grained parallelism in graph traversal algorithms.
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013
2012
Compression and Sieve: Reducing Communication in Parallel Breadth First Search on Distributed Memory Systems
CoRR, 2012
A lightweight hybrid hardware/software approach for object-relative memory profiling.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2012
A Case Study of Designing Efficient Algorithm-based Fault Tolerant Application for Exascale Parallelism.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012
Investigating Memory Optimization of Hash-index for Next Generation Sequencing on Multi-core Architecture.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012
Proceedings of the International Conference on Supercomputing, 2012
A coarse-grained stream architecture for cryo-electron microscopy images 3D reconstruction.
Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012
Accelerating Millions of Short Reads Mapping on a Heterogeneous Architecture with FPGA Accelerator.
Proceedings of the 2012 IEEE 20th Annual International Symposium on Field-Programmable Custom Computing Machines, 2012
2011
Analysis and performance results of computing betweenness centrality on IBM Cyclops64.
J. Supercomput., 2011
J. Comput. Sci. Technol., 2011
J. Comput. Sci. Technol., 2011
Environ. Model. Softw., 2011
A New and Efficient Algorithm-Based Fault Tolerance Scheme for A Million Way Parallelism
CoRR, 2011
Proceedings of the Conference on High Performance Computing Networking, 2011
Poster: revisiting virtual channel memory for performance and fairness on multi-core architecture.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011
Experience of parallelizing cryo-EM 3D reconstruction on a CPU-GPU heterogeneous system.
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011
Proceedings of the 18th International Conference on High Performance Computing, 2011
Performance analysis and optimization of molecular dynamics simulation on Godson-T many-core processor.
Proceedings of the 8th Conference on Computing Frontiers, 2011
2010
Proceedings of the 16th IEEE International Conference on Parallel and Distributed Systems, 2010
Preliminary Investigation of Accelerating Molecular Dynamics Simulation on Godson-T Many-Core Processor.
Proceedings of the Euro-Par 2010 Parallel Processing Workshops, 2010
2009
Improving Performance of Dynamic Programming via Parallelism and Locality on Multicore Architectures.
IEEE Trans. Parallel Distributed Syst., 2009
SIGMETRICS Perform. Evaluation Rev., 2009
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2009
Proceedings of the 23rd international conference on Supercomputing, 2009
Proceedings of the ICPP 2009, 2009
Proceedings of the Euro-Par 2009 Parallel Processing, 2009
2008
Experience on optimizing irregular computation for memory hierarchy in manycore architecture.
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008
Just-In-Time Locality and Percolation for Optimizing Irregular Applications on a Manycore Architecture.
Proceedings of the Languages and Compilers for Parallel Computing, 2008
2007
Regular Paper: A Study of Architectural Optimization Methods in Bioinformatics Applications.
Int. J. High Perform. Comput. Appl., 2007
Proceedings of the SPAA 2007: Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2007
Implementation of the Smith-Waterman algorithm on a reconfigurable supercomputing platform.
Proceedings of the 1st international workshop on High-performance reconfigurable computing technology and applications, 2007
2006
J. Comput. Sci. Technol., 2006
Biology - Locality and parallelism optimization for dynamic programming algorithm in bioinformatics.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006
2005
Proceedings of the 4th International Symposium on Parallel and Distributed Computing (ISPDC 2005), 2005
Proceedings of the 11th International Conference on Parallel and Distributed Systems, 2005
An Efficient Dynamic Programming Algorithm and Implementation for RNA Secondary Structure Prediction.
Proceedings of the Computational Science, 2005
Proceedings of the Computational Science, 2005