Establishing a Modeling System in 3-km Horizontal Resolution for Global Atmospheric Circulation triggered by Submarine Volcanic Eruptions with 400 Billion Smoothed Particle Hydrodynamics.

Shenghong Huang

Junshi Chen

Proceedings of the International Conference for High Performance Computing, 2023

Contrast Learning Based Robust Framework for Weakly Supervised Medical Image Segmentation with Coarse Bounding Box Annotations.

Proceedings of the Computational Mathematics Modeling in Cancer Analysis, 2023

H-DenseFormer: An Efficient Hybrid Densely Connected Transformer for Multimodal Tumor Segmentation.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

FRNET: An Effective Hybrid Structure for Automatic Segmentation of Head and Neck Primary Tumors from Multimodal Images.

Proceedings of the 20th IEEE International Symposium on Biomedical Imaging, 2023

SWSPH: A Massively Parallel SPH Implementation for Hundred-Billion-Particle Simulation on New Sunway Supercomputer.

Proceedings of the Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28, 2023

2022

Bridging the Gap between Deep Learning and Frustrated Quantum Spin System for Extreme-Scale Simulations on New Generation of Sunway Supercomputer.

IEEE Trans. Parallel Distributed Syst., 2022

AI for Quantum Mechanics: High Performance Quantum Many-Body Simulations via Deep Learning.

Proceedings of the SC22: International Conference for High Performance Computing, 2022

2.5 Million-Atom Ab Initio Electronic-Structure Simulation of Complex Metallic Heterostructures with DGDFT.

Proceedings of the SC22: International Conference for High Performance Computing, 2022

A Systematic Methodology for performance characterizing of Heterogeneous Systems with a dataflow runtime simulator.

Proceedings of the 4th International Conference on Robotics, 2022

SCAR U-Net: A 3D Spatial-Channel Attention ResU-Net for Brain Tumor Segmentation.

Proceedings of the 3rd International Symposium on Artificial Intelligence for Medicine Sciences, 2022

Accelerating Parallel First-Principles Excited-State Calculation by Low-Rank Approximation with K-Means Clustering.

Proceedings of the 51st International Conference on Parallel Processing, 2022

High-Performance Matrix Multiplication on the New Generation Shenwei Processor.

Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

Machine Learning-enabled Performance Model for DNN Applications and AI Accelerator.

Quantifying Throughput of Basic Blocks on ARM Microarchitectures by Static Code Analyzers: A Case Study on Kunpeng 920.

Whole Slide Image Multi-Classification of Cervical Epithelial Lesions Based on Unsupervised Pre-training.

Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

Automatic Segmentation of Target Structures for Total Marrow and Lymphoid Irradiation in Bone Marrow Transplantation.

Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

FcTC-UNet: Fine-grained Combination of Transformer and CNN for Thoracic Organs Segmentation.

Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

ITUnet: Integration Of Transformers And Unet For Organs-At-Risk Segmentation.

Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

An Accelerated First Principle Method Implemented on IntelGPU.

Le Xu

Hong An

Proceedings of the 6th International Conference on Computer Science and Application Engineering, 2022

2021

Towards Efficient Short-Range Pair Interaction on Sunway Many-Core Architecture.

J. Comput. Sci. Technol., 2021

swFLOW: A large-scale distributed framework for deep learning on Sunway TaihuLight supercomputer.

Inf. Sci., 2021

RDMA-Based Apache Storm for High-Performance Stream Data Processing.

Int. J. Parallel Program., 2021

Dual-Attention Residual Network for Automatic Diagnosis of COVID-19.

CoRR, 2021

Symplectic structure-preserving particle-in-cell whole-volume simulation of tokamak plasmas to 111.3 trillion particles and 25.7 billion grids.

Proceedings of the International Conference for High Performance Computing, 2021

Global Multi-Level Attention Network for the Segmentation of Clinical Target Volume In The Planning CT For Cervical Cancer.

Proceedings of the 18th IEEE International Symposium on Biomedical Imaging, 2021

Fast Whole Slide Image Analysis Of Cervical Cancer Using Weak Annotation.

Proceedings of the 18th IEEE International Symposium on Biomedical Imaging, 2021

Reducing the Annotation Cost of Whole Slide Histology Images using Active Learning.

Proceedings of the IPMV 2021: 3rd International Conference on Image Processing and Machine Vision, Hong Kong, SAR, China, May 22, 2021

Rethinking Logits-Level Knowledge Distillation.

Teng Gao

Hong An

Proceedings of the ICCPR '21: 10th International Conference on Computing and Pattern Recognition, Shanghai, China, October 15, 2021

Simultaneous Right Ventricle End-diastolic and End-systolic Frame Identification and Landmark Detection on Echocardiography.

Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

DARNet: Dual-Attention Residual Network for Automatic Diagnosis of COVID-19 via CT Images.

Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2021

2020

Distributed deep learning system for cancerous region detection on Sunway TaihuLight.

CCF Trans. High Perform. Comput., 2020

Runtime Adaptive Matrix Multiplication for the SW26010 Many-Core Processor.

IEEE Access, 2020

RDMA-Based Apache Storm for High-Performance Stream Data Processing.

Proceedings of the Network and Parallel Computing, 2020

A Novel U-Like Network For The Segmentation Of Thoracic Organs.

Proceedings of the 2020 IEEE 17th International Symposium on Biomedical Imaging Workshops (ISBI Workshops), 2020

An Efficient Multi-GPU Implementation for Linear-Response Time-Dependent Density Functional Theory.

Proceedings of the 22nd IEEE International Conference on High Performance Computing and Communications; 18th IEEE International Conference on Smart City; 6th IEEE International Conference on Data Science and Systems, 2020

Optimizing Astrophysical Simulation Software on Sunway Heterogeneous Manycore Architecture.

2019

CARS: A contention-aware scheduler for efficient resource management of HPC storage systems.

Parallel Comput., 2019

众核平台上广度优先搜索算法的优化 (Optimization of Breadth-first Search Algorithm Based on Many-core Platform).

Computer Science (计算机科学), 2019

Degree-of-Node Task Scheduling of Fine-Grained Parallel Programs on Heterogeneous Systems.

J. Comput. Sci. Technol., 2019

Improving the Performance of Distributed MXNet with RDMA.

Int. J. Parallel Program., 2019

DDP-B: A Distributed Dynamic Parallel Framework for Meta-genomics Binary Similarity.

Proceedings of the Network and Parallel Computing, 2019

Gdarts: A GPU-Based Runtime System for Dataflow Task Programming on Dependency Applications.

Proceedings of the 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2019

Improving the Performance of MongoDB with RDMA.

Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

swFLOW: A Dataflow Deep Learning Framework on Sunway TaihuLight Supercomputer.

TripletRun: A Dataflow Runtime Simulator and Its Performance Model.

Interference-Aware I/O Scheduling for Data-Intensive Applications on Hierarchical HPC Storage Systems.

Weihao Liang

Yong Chen

Hong An

Redesign NAMD Molecular Dynamics Non-Bonded Force-Field on Sunway Manycore Processor.

An effective method for operations placement in Tensor Flow.

Proceedings of the 3rd International Conference on High Performance Compilation, 2019

2018

PEPS++: Towards Extreme-Scale Simulations of Strongly Correlated Quantum Many-Particle Models on Sunway TaihuLight.

IEEE Trans. Parallel Distributed Syst., 2018

Combining Hadoop with MPI to Solve Metagenomics Problems that are both Data- and Compute-intensive.

Int. J. Parallel Program., 2018

Improving the Performance of Distributed TensorFlow with RDMA.

Int. J. Parallel Program., 2018

Contention-Aware Resource Scheduling for Burst Buffer Systems.

Proceedings of the 47th International Conference on Parallel Processing, 2018

2017

A Dataflow-Based Runtime Support on a 100P Actual System.

Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Refactoring the Molecular Docking Simulation for Heterogeneous, Manycore Processors Systems.

A hierarchical grid algorithm for accelerating high-performance conjugate gradient benchmark on sunway many-core processor.

Proceedings of the 3rd International Conference on Communication and Information Processing, 2017

Pipelining Computation and Optimization Strategies for Scaling GROMACS on the Sunway Many-Core Processor.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2017

2016

A Flexible Chip Multiprocessor Simulator Dedicated for Thread Level Speculation.

Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, 2016

Parallelizing Back Propagation Neural Network on Speculative Multicores.

Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

2015

程序阶段性分析和阶段检测技术 (Program Phase Analysis and Phase Detection Techniques).

Computer Science (计算机科学), 2015

Speculative Parallelism Characterization Profiling in General Purpose Computing Applications.

J. Comput. Sci. Eng., 2015

Optimization and Analysis of Parallel Back Propagation Neural Network on GPU Using CUDA.

Proceedings of the Neural Information Processing - 22nd International Conference, 2015

Local State Reusing for Efficient Model Checking of Multithreaded Programs.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

Parallelizing Block Cryptography Algorithms on Speculative Multicores.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

Optimization of Binomial Option Pricing on Intel MIC Heterogeneous System.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

2014

Exploring speculative procedure and loop level parallelism in SPLASH2.

Int. J. High Perform. Syst. Archit., 2014

Efficient execution of speculative threads and transactions with hardware transactional memory.

Future Gener. Comput. Syst., 2014

A Criticality-Aware DVFS Runtime Utility for Optimizing Power Efficiency of Multithreaded Applications.

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Understanding the SIMD Efficiency of Graph Traversal on GPU.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

A Compiler Translate Directive-Based Language to Optimized CUDA.

Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

2013

Phase-Priority based Directory Coherence for Multicore Processor.

Gongming Li

Hong An

CoRR, 2013

Quantitative Analysis of Inter-block Dependence in Speculative Execution.

Proceedings of the 12th IEEE International Conference on Trust, 2013

2012

Priority-based squash reducing methods in thread level speculation.

Int. J. Inf. Technol. Commun. Convergence, 2012

FlexBFS: a parallelism-aware implementation of breadth-first search on GPU.

Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

A Speculative HMMER Search Implementation on GPU.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

CRQ-based fair scheduling on composable multicore architectures.

Proceedings of the International Conference on Supercomputing, 2012

Distributed replay protocol for distributed uniprocessors.

Proceedings of the International Conference on Supercomputing, 2012

SeTM: Efficient Execution of Speculative Threads with Hardware Transactional Memory.

Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

VSCP: A Cache Controlling Method for Improving Single Thread Performance in Multicore System.

Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

Distributed Control Independence for Composable Multi-processors.

Proceedings of the 2012 IEEE/ACIS 11th International Conference on Computer and Information Science, Shanghai, China, May 30, 2012

Value Predicted LogSPoTM: Improve the Parallesim of Thread Level System by Using a Value Predictor.

Proceedings of the 2012 IEEE/ACIS 11th International Conference on Computer and Information Science, Shanghai, China, May 30, 2012

2011

CHMasters: A Scalable and Speed-Efficient Metadata Service in Distributed File System.

Proceedings of the 12th International Conference on Parallel and Distributed Computing, 2011

A Non-blocking Programming Framework for Pipeline Application on Multi-core Platform.

Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2011

A Priority-Aware NoC to Reduce Squashes in Thread Level Speculation for Chip Multiprocessors.

Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2011

Exploiting Speculative Thread-Level Parallelism Based on Transactional Memory.

Proceedings of the Third International Conference on Communications and Mobile Computing, 2011

Accelerating Block Cryptography Algorithms in Procedure Level Speculation.

Proceedings of the Seventh International Conference on Computational Intelligence and Security, 2011

2010

FACRA: Flexible-Core Architecture Chip Resource Abstractor.

Proceedings of the 2010 International Conference on Parallel and Distributed Computing, 2010

CuHMMer: A load-balanced CPU-GPU cooperative bioinformatics application.

Proceedings of the 2010 International Conference on High Performance Computing & Simulation, 2010

Pattern-Unit Based Regular Expression Matching with Reconfigurable Function Unit.

Proceedings of the Computational Science and Its Applications, 2010

Dynamic Resource Tuning for Flexible Core Chip Multiprocessors.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2010

The optimization of parallel Smith-Waterman sequence alignment using on-chip memory of GPGPU.

Proceedings of the Fifth International Conference on Bio-Inspired Computing: Theories and Applications, 2010

2009

The Mapping Framework and Optimizing Strategy for Block Cryptography Algorithms on Cell Broadband Engine.

Proceedings of the 2009 International Conference on Parallel and Distributed Computing, 2009

Performance and Power Efficiency Analysis of the Symmetric Cryptograph on Two Stream Processor Architectures.

Proceedings of the Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2009), 2009

Investigation of Factors Impacting Thread-Level Parallelism from Desktop, Multimedia and HPC Applications.

Proceedings of the Fourth International Conference on Frontier of Computer Science and Technology, 2009

A Program Behavior Study of Block Cryptography Algorithms on GPGPU.

Proceedings of the Fourth International Conference on Frontier of Computer Science and Technology, 2009

Scaling the Performance of Tiled Processor Architectures with On-Chip-Network Topology.

Proceedings of the Second International Joint Conference on Computational Sciences and Optimization, 2009

2008

A wire delay scalable stream processor architecture.

Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, 2008

Profile guided optimization for dataflow predication.

Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, 2008

LogSPoTM: a scalable thread level speculation model based on transactional memory.

Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, 2008

2007

Balancing Thread Partition for Efficiently Exploiting Speculative Thread-Level Parallelism.

Proceedings of the Advanced Parallel Processing Technologies, 7th International Symposium, 2007

An Online Profile Guided Optimization Approach for Speculative Parallel Threading.

Proceedings of the Advances in Computer Systems Architecture, 2007

2005

Improving Latency Tolerance of Network Processors Through Simultaneous Multithreading.

Proceedings of the Advanced Parallel Processing Technologies, 6th International Workshop, 2005

2000

Broadcasting Under Network Ignorance Scenario.

Xiangchuan Chen

Hong An

Shirong Zheng

Proceedings of the Applied Computing 2000, 2000

1999

A Parallel and Distributed Debugger Implemented with Java.

Proceedings of the TOOLS 1999: 31st International Conference on Technology of Object-Oriented Languages and Systems, 1999

A Java/CORBA Based Universal Framework for Super Server User-End Integrated Environments.

Proceedings of the TOOLS 1999: 31st International Conference on Technology of Object-Oriented Languages and Systems, 1999

Hong An
