Hong An
ORCID: 0000-0002-3900-3722
According to our database, Hong An authored at least 119 papers between 1999 and 2024.
Bibliography
2024
SWattention: designing fast and memory-efficient attention for a new Sunway Supercomputer.
J. Supercomput., July, 2024
Uncovering the performance bottleneck of modern HPC processor with static code analyzer: a case study on Kunpeng 920.
CCF Trans. High Perform. Comput., June, 2024
An N-Shaped Lightweight Network with a Feature Pyramid and Hybrid Attention for Brain Tumor Segmentation.
Entropy, February, 2024
Gene expression bias between the subgenomes of allopolyploid hybrids is an emergent property of the kinetics of expression.
PLoS Comput. Biol., January, 2024
Extending the limit of LR-TDDFT on two different approaches: Numerical algorithms and new Sunway heterogeneous supercomputer.
Parallel Comput., 2024
PWDFT-SW: Extending the Limit of Plane-Wave DFT Calculations to 16K Atoms on the New Sunway Supercomputer.
CoRR, 2024
Rethinking automatic segmentation of gross target volume from a decoupling perspective.
Comput. Medical Imaging Graph., 2024
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
DB-SpGEMM: A Massively Distributed Block-Sparse Matrix-Matrix Multiplication for Linear-Scaling DFT Calculations.
Proceedings of the 53rd International Conference on Parallel Processing, 2024
Multi-level Load Balancing Strategies for Massively Parallel Smoothed Particle Hydrodynamics Simulation.
Proceedings of the 53rd International Conference on Parallel Processing, 2024
A³PIM: An Automated, Analytic and Accurate Processing-in-Memory Offloader.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024
2023
Deep learning representations for quantum many-body systems on heterogeneous hardware.
Mach. Learn. Sci. Technol., March, 2023
High performance computing for first-principles Kohn-Sham density functional theory towards exascale supercomputers.
CCF Trans. High Perform. Comput., March, 2023
swMPAS-A: Scaling MPAS-A to 39 Million Heterogeneous Cores on the New Generation Sunway Supercomputer.
IEEE Trans. Parallel Distributed Syst., 2023
Establishing a Modeling System in 3-km Horizontal Resolution for Global Atmospheric Circulation triggered by Submarine Volcanic Eruptions with 400 Billion Smoothed Particle Hydrodynamics.
Proceedings of the International Conference for High Performance Computing, 2023
Contrast Learning Based Robust Framework for Weakly Supervised Medical Image Segmentation with Coarse Bounding Box Annotations.
Proceedings of the Computational Mathematics Modeling in Cancer Analysis, 2023
H-DenseFormer: An Efficient Hybrid Densely Connected Transformer for Multimodal Tumor Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023
FRNET: An Effective Hybrid Structure for Automatic Segmentation of Head and Neck Primary Tumors from Multimodal Images.
Proceedings of the 20th IEEE International Symposium on Biomedical Imaging, 2023
SWSPH: A Massively Parallel SPH Implementation for Hundred-Billion-Particle Simulation on New Sunway Supercomputer.
Proceedings of the Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28, 2023
2022
Bridging the Gap between Deep Learning and Frustrated Quantum Spin System for Extreme-Scale Simulations on New Generation of Sunway Supercomputer.
IEEE Trans. Parallel Distributed Syst., 2022
AI for Quantum Mechanics: High Performance Quantum Many-Body Simulations via Deep Learning.
Proceedings of the SC22: International Conference for High Performance Computing, 2022
2.5 Million-Atom Ab Initio Electronic-Structure Simulation of Complex Metallic Heterostructures with DGDFT.
Proceedings of the SC22: International Conference for High Performance Computing, 2022
A Systematic Methodology for performance characterizing of Heterogeneous Systems with a dataflow runtime simulator.
Proceedings of the 4th International Conference on Robotics, 2022
Proceedings of the 3rd International Symposium on Artificial Intelligence for Medicine Sciences, 2022
Accelerating Parallel First-Principles Excited-State Calculation by Low-Rank Approximation with K-Means Clustering.
Proceedings of the 51st International Conference on Parallel Processing, 2022
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022
Quantifying Throughput of Basic Blocks on ARM Microarchitectures by Static Code Analyzers: A Case Study on Kunpeng 920.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022
Whole Slide Image Multi-Classification of Cervical Epithelial Lesions Based on Unsupervised Pre-training.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022
Automatic Segmentation of Target Structures for Total Marrow and Lymphoid Irradiation in Bone Marrow Transplantation.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022
FcTC-UNet: Fine-grained Combination of Transformer and CNN for Thoracic Organs Segmentation.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022
Proceedings of the 6th International Conference on Computer Science and Application Engineering, 2022
2021
J. Comput. Sci. Technol., 2021
swFLOW: A large-scale distributed framework for deep learning on Sunway TaihuLight supercomputer.
Inf. Sci., 2021
Int. J. Parallel Program., 2021
Symplectic structure-preserving particle-in-cell whole-volume simulation of tokamak plasmas to 111.3 trillion particles and 25.7 billion grids.
Proceedings of the International Conference for High Performance Computing, 2021
Global Multi-Level Attention Network for the Segmentation of Clinical Target Volume In The Planning CT For Cervical Cancer.
Proceedings of the 18th IEEE International Symposium on Biomedical Imaging, 2021
Proceedings of the 18th IEEE International Symposium on Biomedical Imaging, 2021
Proceedings of the IPMV 2021: 3rd International Conference on Image Processing and Machine Vision, Hong Kong, SAR, China, May 22, 2021
Proceedings of the ICCPR '21: 10th International Conference on Computing and Pattern Recognition, Shanghai, China, October 15, 2021
Simultaneous Right Ventricle End-diastolic and End-systolic Frame Identification and Landmark Detection on Echocardiography.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021
DARNet: Dual-Attention Residual Network for Automatic Diagnosis of COVID-19 via CT Images.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2021
2020
Distributed deep learning system for cancerous region detection on Sunway TaihuLight.
CCF Trans. High Perform. Comput., 2020
IEEE Access, 2020
Proceedings of the Network and Parallel Computing, 2020
Proceedings of the 2020 IEEE 17th International Symposium on Biomedical Imaging Workshops (ISBI Workshops), 2020
An Efficient Multi-GPU Implementation for Linear-Response Time-Dependent Density Functional Theory.
Proceedings of the 22nd IEEE International Conference on High Performance Computing and Communications; 18th IEEE International Conference on Smart City; 6th IEEE International Conference on Data Science and Systems, 2020
Optimizing Astrophysical Simulation Software on Sunway Heterogeneous Manycore Architecture.
Proceedings of the 22nd IEEE International Conference on High Performance Computing and Communications; 18th IEEE International Conference on Smart City; 6th IEEE International Conference on Data Science and Systems, 2020
2019
CARS: A contention-aware scheduler for efficient resource management of HPC storage systems.
Parallel Comput., 2019
Optimization of Breadth-first Search Algorithm Based on Many-core Platform.
计算机科学 (Computer Science), 2019
Degree-of-Node Task Scheduling of Fine-Grained Parallel Programs on Heterogeneous Systems.
J. Comput. Sci. Technol., 2019
Int. J. Parallel Program., 2019
Proceedings of the Network and Parallel Computing, 2019
Gdarts: A GPU-Based Runtime System for Dataflow Task Programming on Dependency Applications.
Proceedings of the 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2019
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019
Interference-Aware I/O Scheduling for Data-Intensive Applications on Hierarchical HPC Storage Systems.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019
Redesign NAMD Molecular Dynamics Non-Bonded Force-Field on Sunway Manycore Processor.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019
Proceedings of the 3rd International Conference on High Performance Compilation, 2019
2018
PEPS++: Towards Extreme-Scale Simulations of Strongly Correlated Quantum Many-Particle Models on Sunway TaihuLight.
IEEE Trans. Parallel Distributed Syst., 2018
Combining Hadoop with MPI to Solve Metagenomics Problems that are both Data- and Compute-intensive.
Int. J. Parallel Program., 2018
Int. J. Parallel Program., 2018
Proceedings of the 47th International Conference on Parallel Processing, 2018
2017
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017
Refactoring the Molecular Docking Simulation for Heterogeneous, Manycore Processors Systems.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017
A hierarchical grid algorithm for accelerating high-performance conjugate gradient benchmark on sunway many-core processor.
Proceedings of the 3rd International Conference on Communication and Information Processing, 2017
Pipelining Computation and Optimization Strategies for Scaling GROMACS on the Sunway Many-Core Processor.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2017
2016
Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, 2016
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016
2015
Speculative Parallelism Characterization Profiling in General Purpose Computing Applications.
J. Comput. Sci. Eng., 2015
Optimization and Analysis of Parallel Back Propagation Neural Network on GPU Using CUDA.
Proceedings of the Neural Information Processing - 22nd International Conference, 2015
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015
2014
Int. J. High Perform. Syst. Archit., 2014
Efficient execution of speculative threads and transactions with hardware transactional memory.
Future Gener. Comput. Syst., 2014
A Criticality-Aware DVFS Runtime Utility for Optimizing Power Efficiency of Multithreaded Applications.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014
2013
Proceedings of the 12th IEEE International Conference on Trust, 2013
2012
Int. J. Inf. Technol. Commun. Convergence, 2012
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012
Proceedings of the International Conference on Supercomputing, 2012
Proceedings of the International Conference on Supercomputing, 2012
Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012
VSCP: A Cache Controlling Method for Improving Single Thread Performance in Multicore System.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012
Proceedings of the 2012 IEEE/ACIS 11th International Conference on Computer and Information Science, Shanghai, China, May 30, 2012
Value Predicted LogSPoTM: Improve the Parallesim of Thread Level System by Using a Value Predictor.
Proceedings of the 2012 IEEE/ACIS 11th International Conference on Computer and Information Science, Shanghai, China, May 30, 2012
2011
CHMasters: A Scalable and Speed-Efficient Metadata Service in Distributed File System.
Proceedings of the 12th International Conference on Parallel and Distributed Computing, 2011
A Non-blocking Programming Framework for Pipeline Application on Multi-core Platform.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2011
A Priority-Aware NoC to Reduce Squashes in Thread Level Speculation for Chip Multiprocessors.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2011
Proceedings of the Third International Conference on Communications and Mobile Computing, 2011
Proceedings of the Seventh International Conference on Computational Intelligence and Security, 2011
2010
Proceedings of the 2010 International Conference on Parallel and Distributed Computing, 2010
Proceedings of the 2010 International Conference on High Performance Computing & Simulation, 2010
Proceedings of the Computational Science and Its Applications, 2010
Proceedings of the Algorithms and Architectures for Parallel Processing, 2010
The optimization of parallel Smith-Waterman sequence alignment using on-chip memory of GPGPU.
Proceedings of the Fifth International Conference on Bio-Inspired Computing: Theories and Applications, 2010
2009
The Mapping Framework and Optimizing Strategy for Block Cryptography Algorithms on Cell Broadband Engine.
Proceedings of the 2009 International Conference on Parallel and Distributed Computing, 2009
Performance and Power Efficiency Analysis of the Symmetric Cryptograph on Two Stream Processor Architectures.
Proceedings of the Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2009), 2009
Investigation of Factors Impacting Thread-Level Parallelism from Desktop, Multimedia and HPC Applications.
Proceedings of the Fourth International Conference on Frontier of Computer Science and Technology, 2009
Proceedings of the Fourth International Conference on Frontier of Computer Science and Technology, 2009
Scaling the Performance of Tiled Processor Architectures with On-Chip-Network Topology.
Proceedings of the Second International Joint Conference on Computational Sciences and Optimization, 2009
2008
Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, 2008
Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, 2008
Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, 2008
2007
Balancing Thread Partition for Efficiently Exploiting Speculative Thread-Level Parallelism.
Proceedings of the Advanced Parallel Processing Technologies, 7th International Symposium, 2007
Proceedings of the Advances in Computer Systems Architecture, 2007
2005
Improving Latency Tolerance of Network Processors Through Simultaneous Multithreading.
Proceedings of the Advanced Parallel Processing Technologies, 6th International Workshop, 2005
2000
Proceedings of the Applied Computing 2000, 2000
1999
Proceedings of the TOOLS 1999: 31st International Conference on Technology of Object-Oriented Languages and Systems, 1999
A Java/CORBA Based Universal Framework for Super Server User-End Integrated Environments.
Proceedings of the TOOLS 1999: 31st International Conference on Technology of Object-Oriented Languages and Systems, 1999