Xian-He Sun

Orcid: 0000-0002-1093-0792

Affiliations:
  • Illinois Institute of Technology, Department of Computer Science, Chicago, IL, USA
  • NASA Langley Research Center, ICASE, Hampton, VA, USA (former)
  • Louisiana State University, Department of Computer Science, Baton Rouge, LA, USA (former)
  • Michigan State University, Department of Computer Science, East Lansing, MI, USA (PhD 1990)


According to our database, Xian-He Sun authored at least 282 papers between 1989 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Awards

IEEE Fellow

IEEE Fellow 2012, "For contributions to memory-bounded performance metrics and scalable parallel computing".

Bibliography

2024
AMBEA: Aggressive Maximal Biclique Enumeration in Large Bipartite Graph Computing.
IEEE Trans. Computers, December, 2024

Skyway: Accelerate Graph Applications with a Dual-Path Architecture and Fine-Grained Data Management.
J. Comput. Sci. Technol., July, 2024

An Evaluation of DAOS for Simulation and Deep Learning HPC Workloads.
ACM SIGOPS Oper. Syst. Rev., June, 2024

Enumeration of Billions of Maximal Bicliques in Bipartite Graphs without Using GPUs.
Proceedings of the International Conference for High Performance Computing, 2024

MegaMmap: Blurring the Boundary Between Memory and Storage for Data-Intensive Workloads.
Proceedings of the International Conference for High Performance Computing, 2024

DFTracer: An Analysis-Friendly Data Flow Tracer for AI-Driven Workflows.
Proceedings of the International Conference for High Performance Computing, 2024

TunIO: An AI-powered Framework for Optimizing HPC I/O.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Viper: A High-Performance I/O Framework for Transparently Updating, Storing, and Transferring Deep Neural Network Models.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

AUTOHET: An Automated Heterogeneous ReRAM-Based Accelerator for DNN Inference.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

HStream: A hierarchical data streaming engine for high-throughput scientific applications.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

CHROME: Concurrency-Aware Holistic Cache Management Framework with Online Reinforcement Learning.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

DaYu: Optimizing Distributed Scientific Workflows by Decoding Dataflow Semantics and Dynamics.
Proceedings of the IEEE International Conference on Cluster Computing, 2024

Hades: A Context-Aware Active Storage Framework for Accelerating Large-Scale Data Analysis.
Proceedings of the 24th IEEE International Symposium on Cluster, 2024

ACES: Accelerating Sparse Matrix Multiplication with Adaptive Execution Flow and Concurrency-Aware Cache Optimizations.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
The Memory-Bounded Speedup Model and Its Impacts in Computing.
J. Comput. Sci. Technol., February, 2023

IOMax: Maximizing Out-of-Core I/O Analysis Performance on HPC Systems.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Data Flow Lifecycles for Optimizing Workflow Coordination.
Proceedings of the International Conference for High Performance Computing, 2023

Meltrix: A RRAM-Based Polymorphic Architecture Enhanced by Function Synthesis.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

CARE: A Concurrency-Aware Enhanced Lightweight Cache Management Framework.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

iCache: An Importance-Sampling-Informed Cache for Accelerating I/O-Bound DNN Model Training.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

An Evaluation of DAOS for Simulation and Deep Learning HPC Workloads.
Proceedings of the 3rd Workshop on Challenges and Opportunities of Efficient and Performant Storage Systems, 2023

2022
Accelerating Tensor Swapping in GPUs With Self-Tuning Compression.
IEEE Trans. Parallel Distributed Syst., 2022

Accelerating Graph Processing With Lightweight Learning-Based Data Reordering.
IEEE Comput. Archit. Lett., 2022

A Generalized Model for Modern Hierarchical Memory System.
Proceedings of the Winter Simulation Conference, 2022

LabStor: A Modular and Extensible Platform for Developing High-Performance, Customized I/O Stacks in Userspace.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

LuxIO: Intelligent Resource Provisioning and Auto-Configuration for Storage Services.
Proceedings of the 29th IEEE International Conference on High Performance Computing, 2022

Stimulus: Accelerate Data Management for Scientific AI applications in HPC.
Proceedings of the 22nd IEEE International Symposium on Cluster, 2022

NVAlloc: rethinking heap metadata management in persistent memory allocators.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021
Sova: A Software-Defined Autonomic Framework for Virtual Network Allocations.
IEEE Trans. Parallel Distributed Syst., 2021

Preface.
J. Comput. Sci. Technol., 2021

A Study on Modeling and Optimization of Memory Systems.
J. Comput. Sci. Technol., 2021

Survey the storage systems used in HPC and BDA ecosystems.
CoRR, 2021

HCDA: from computational thinking to a generalized thinking paradigm.
Commun. ACM, 2021

CoPIM: A Concurrency-aware PIM Workload Offloading Architecture for Graph Applications.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2021

AUTO-PRUNE: automated DNN pruning and mapping for ReRAM-based accelerator.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

Premier: A Concurrency-Aware Pseudo-Partitioning Framework for Shared Last-Level Cache.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Apollo: An ML-assisted Real-Time Storage Resource Observer.
Proceedings of the HPDC '21: The 30th International Symposium on High-Performance Parallel and Distributed Computing, 2021

pMEMCPY: a simple, lightweight, and portable I/O library for storing data in persistent memory.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

HFlow: A Dynamic and Elastic Multi-Layered I/O Forwarder.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

CSWAP: A Self-Tuning Compression Framework for Accelerating Tensor Swapping in GPUs.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

DLIO: A Data-Centric Benchmark for Scientific Deep Learning Applications.
Proceedings of the 21st IEEE/ACM International Symposium on Cluster, 2021

2020
A Holistic Heterogeneity-Aware Data Placement Scheme for Hybrid Parallel I/O Systems.
IEEE Trans. Parallel Distributed Syst., 2020

Bridging Storage Semantics Using Data Labels and Asynchronous I/O.
ACM Trans. Storage, 2020

Optimizing Parallel I/O Accesses through Pattern-Directed and Layout-Aware Replication.
IEEE Trans. Computers, 2020

I/O Acceleration via Multi-Tiered Data Buffering and Prefetching.
J. Comput. Sci. Technol., 2020

Performance Modeling and Evaluation of a Production Disaggregated Memory System.
Proceedings of the MEMSYS 2020: The International Symposium on Memory Systems, 2020

HFetch: Hierarchical Data Prefetching for Scientific Workflows in Multi-Tiered Storage Environments.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

HCompress: Hierarchical Data Compression for Multi-Tiered Storage Environments.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

APAC: An Accurate and Adaptive Prefetch Framework with Concurrent Memory Access Analysis.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020

HCL: Distributing Parallel Data Structures in Extreme Scales.
Proceedings of the IEEE International Conference on Cluster Computing, 2020

HReplica: A Dynamic Data Replication Engine with Adaptive Compression for Multi-Tiered Storage.
Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

2019
On Cost-Driven Collaborative Data Caching: A New Model Approach.
IEEE Trans. Parallel Distributed Syst., 2019

LPM: A Systematic Methodology for Concurrent Data Access Pattern Optimization from a Matching Perspective.
IEEE Trans. Parallel Distributed Syst., 2019

CADS: Core-Aware Dynamic Scheduler for Multicore Memory Controllers.
CoRR, 2019

LABIOS: A Distributed Label-Based I/O System.
Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing, 2019

An Intelligent, Adaptive, and Flexible Data Compression Framework.
Proceedings of the 19th IEEE/ACM International Symposium on Cluster, 2019

NIOBE: An Intelligent I/O Bridging Engine for Complex and Distributed Workflows.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

2018
A Cost-Effective Distribution-Aware Data Replication Scheme for Parallel I/O Systems.
IEEE Trans. Computers, 2018

CaL: Extending Data Locality to Consider Concurrency for Performance Optimization.
IEEE Trans. Big Data, 2018

A Migratory Heterogeneity-Aware Data Layout Scheme for Parallel File Systems.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

IRIS: I/O Redirection via Integrated Storage.
Proceedings of the 32nd International Conference on Supercomputing, 2018

Hermes: a heterogeneous-aware multi-tiered distributed I/O buffering system.
Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, 2018

Vidya: Performing Code-Block I/O Characterization for Data Access Optimization.
Proceedings of the 25th IEEE International Conference on High Performance Computing, 2018

Harmonia: An Interference-Aware Dynamic I/O Scheduler for Shared Non-volatile Burst Buffers.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

SciDP: Support HPC and Big Data Applications via Integrated Scientific Data Processing.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

Horizon: a multi-abstraction framework for graph analytics.
Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

2017
Using MinMax-Memory Claims to Improve In-Memory Workflow Computations in the Cloud.
IEEE Trans. Parallel Distributed Syst., 2017

Cost-Aware Region-Level Data Placement in Multi-Tiered Parallel I/O Systems.
IEEE Trans. Parallel Distributed Syst., 2017

Evaluating the Combined Effect of Memory Capacity and Concurrency for Many-Core Chip Design.
ACM Trans. Model. Perform. Evaluation Comput. Syst., 2017

Modeling and Simulation of Extreme-Scale Fat-Tree Networks for HPC Systems and Data Centers.
ACM Trans. Model. Comput. Simul., 2017

HARL: Optimizing Parallel File Systems with Heterogeneity-Aware Region-Level Data Layout.
IEEE Trans. Computers, 2017

Heterogeneity-Aware Collective I/O for Parallel I/O Systems with Hybrid HDD/SSD Servers.
IEEE Trans. Computers, 2017

Special Issue on Scalable Computing Systems for Big Data Applications.
J. Parallel Distributed Comput., 2017

Rethinking key-value store for parallel I/O optimization.
Int. J. High Perform. Comput. Appl., 2017

Evaluating GPGPU Memory Performance Through the C-AMAT Model.
Proceedings of the Workshop on Memory Centric Programming for HPC, 2017

Principles of Memory-Centric Programming for High Performance Computing.
Proceedings of the Workshop on Memory Centric Programming for HPC, 2017

2016
Towards Exploring Data-Intensive Scientific Applications at Extreme Scales through Systems and Simulations.
IEEE Trans. Parallel Distributed Syst., 2016

Improving Performance of Parallel I/O Systems through Selective and Layout-Aware SSD Cache.
IEEE Trans. Parallel Distributed Syst., 2016

Boosting Parallel File System Performance via Heterogeneity-Aware Selective Data Layout.
IEEE Trans. Parallel Distributed Syst., 2016

Enhancing hybrid parallel file system through performance and space-aware data layout.
Int. J. High Perform. Comput. Appl., 2016

A memory-driven scheduling scheme and optimization for concurrent execution in GPU.
Clust. Comput., 2016

Rethinking High Performance Computing System Architecture for Scientific Big Data Applications.
Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, 2016

Towards Energy Efficient Data Management in HPC: The Open Ethernet Drive Approach.
Proceedings of the 1st Joint International Workshop on Parallel Data Storage and data Intensive Scalable Computing Systems, 2016

Utilizing Concurrency: A New Theory for Memory Wall.
Proceedings of the Languages and Compilers for Parallel Computing, 2016

On MinMax-Memory Claims for Scientific Workflows in the In-memory Cloud Computing.
Proceedings of the 36th IEEE International Conference on Distributed Computing Systems, 2016

Leveraging burst buffer coordination to prevent I/O interference.
Proceedings of the 12th IEEE International Conference on e-Science, 2016

Efficient design space exploration via statistical sampling and AdaBoost learning.
Proceedings of the 53rd Annual Design Automation Conference, 2016

Efficient design space exploration by knowledge transfer.
Proceedings of the Eleventh IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2016

Towards optimizing large-scale data transfers with end-to-end integrity verification.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

Visualization and Adaptive Subsetting of Earth Science Data in HDFS: A Novel Data Analysis Strategy with Hadoop and Spark.
Proceedings of the 2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), 2016

2015
I/O and File Systems for Data-Intensive Applications.
Proceedings of the Handbook on Data Centers, 2015

Recent advances in autonomic provisioning of big data applications on clouds.
IEEE Trans. Cloud Comput., 2015

Reevaluating Data Stall Time with the Consideration of Data Access Concurrency.
J. Comput. Sci. Technol., 2015

C²-bound: a capacity and concurrency driven analytical model for many-core design.
Proceedings of the International Conference for High Performance Computing, 2015

Efficient disk-to-disk sorting: a case study in the decoupled execution paradigm.
Proceedings of the 2015 International Workshop on Data-Intensive Scalable Computing Systems, 2015

FatTreeSim: Modeling Large-scale Fat-Tree Networks for HPC Systems and Data Centers Using Parallel and Discrete Event Simulation.
Proceedings of the 3rd ACM Conference on SIGSIM-Principles of Advanced Discrete Simulation, London, United Kingdom, June 10, 2015

HAS: Heterogeneity-Aware Selective Data Layout Scheme for Parallel File Systems on Hybrid Servers.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

DaCache: Memory Divergence-Aware GPU Cache Management.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

LPM: Concurrency-Driven Layered Performance Matching.
Proceedings of the 44th International Conference on Parallel Processing, 2015

A Heterogeneity-Aware Region-Level Data Layout for Hybrid Parallel File Systems.
Proceedings of the 44th International Conference on Parallel Processing, 2015

LCIndex: A Local and Clustering Index on Distributed Ordered Tables for Flexible Multi-dimensional Range Queries.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Dominoes: Speculative Repair in Erasure-Coded Hadoop System.
Proceedings of the 22nd IEEE International Conference on High Performance Computing, 2015

IC-Data: Improving Compressed Data Processing in Hadoop.
Proceedings of the 22nd IEEE International Conference on High Performance Computing, 2015

Overcoming Hadoop Scaling Limitations through Distributed Task Execution.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

IOSIG+: On the Role of I/O Tracing and Analysis for Hadoop Systems.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Fast Fault Injection and Sensitivity Analysis for Collective Communications.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

YARNsim: Simulating Hadoop YARN.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

A Hadoop-based visualization and diagnosis framework for earth science data.
Proceedings of the 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA, October 29, 2015

PortHadoop: Support direct HPC data processing in Hadoop.
Proceedings of the 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA, October 29, 2015

2014
APC: A Novel Memory Metric and Measurement Methodology for Modern Memory Systems.
IEEE Trans. Computers, 2014

Concurrent Average Memory Access Time.
Computer, 2014

Rethinking key-value store for parallel I/O optimization.
Proceedings of the 2014 International Workshop on Data Intensive Scalable Computing Systems, 2014

PSA: a performance and space-aware data layout scheme for hybrid parallel file systems.
Proceedings of the 2014 International Workshop on Data Intensive Scalable Computing Systems, 2014

HPIS3: towards a high-performance simulator for hybrid parallel I/O and storage systems.
Proceedings of the 9th Parallel Data Storage Workshop, 2014

Decoupled I/O for Data-Intensive High Performance Computing.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

S4D-Cache: Smart Selective SSD Cache for Parallel I/O Systems.
Proceedings of the IEEE 34th International Conference on Distributed Computing Systems, 2014

Performance-Aware Data Placement in Hybrid Parallel File Systems.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

SCALER: Scalable parallel file write in HDFS.
Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

2013
Layout-aware scientific computing: A case study using the MILC code.
J. Comput. Sci., 2013

Performance comparison under failures of MPI and MapReduce: An analytical approach.
Future Gener. Comput. Syst., 2013

Cost-intelligent application-specific data layout optimization for parallel file systems.
Clust. Comput., 2013

Pattern-Direct and Layout-Aware Replication Scheme for Parallel I/O Systems.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

BPS: A Performance Metric of I/O System.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

I/O acceleration with pattern detection.
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

A cost-aware region-level data placement scheme for hybrid parallel I/O systems.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

Runtime system design of decoupled execution paradigm for data-intensive high-end computing.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

2012
APC: a performance metric of memory systems.
SIGMETRICS Perform. Evaluation Rev., 2012

Algorithm-level Feedback-controlled Adaptive data prefetcher: Accelerating data access for high-performance processors.
Parallel Comput., 2012

Discovering Structure in Unstructured I/O.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

A Source-aware Interrupt Scheduling for Modern Parallel I/O Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

A Server-Level Adaptive Data Layout Strategy for Parallel File Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

CHAIO: Enabling HPC Applications on Data-Intensive File Systems.
Proceedings of the 41st International Conference on Parallel Processing, 2012

ADAPT: Availability-Aware MapReduce Data Placement for Non-dedicated Distributed Computing.
Proceedings of the 2012 IEEE 32nd International Conference on Distributed Computing Systems, 2012

KNOWAC: I/O Prefetch via Accumulated Knowledge.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

A Decoupled Execution Paradigm for Data-Intensive High-End Computing.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

Boosting Application-Specific Parallel I/O Optimization Using IOSIG.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

Checkpointing Orchestration: Toward a Scalable HPC Fault-Tolerant Environment.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

SERA-IO: Integrating Energy Consciousness into Parallel I/O Middleware.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

2011
Special issue on Data Intensive Computing.
J. Parallel Distributed Comput., 2011

Global-aware and multi-order context-based prefetching for high-performance processors.
Int. J. High Perform. Comput. Appl., 2011

Server-side I/O coordination for parallel file systems.
Proceedings of the Conference on High Performance Computing Networking, 2011

Layout-aware scientific computing: a case study using MILC.
Proceedings of the second workshop on Scalable algorithms for large-scale systems, 2011

EthSpeeder: A High-performance Scalable Fault-Tolerant Ethernet Network Architecture for Data Center.
Proceedings of the Sixth International Conference on Networking, Architecture, and Storage, 2011

A Hybrid Shared-Nothing/Shared-Data Storage Scheme for Large-Scale Data Processing.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2011

LACIO: A New Collective I/O Strategy for Parallel I/O Systems.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

A cost-intelligent application-specific data layout scheme for parallel file systems.
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011

PAC-PLRU: A Cache Replacement Policy to Salvage Discarded Predictions from Hardware Prefetchers.
Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011

A Segment-Level Adaptive Data Layout Scheme for Improved Load Balance in Parallel File Systems.
Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011

A Hybrid Shared-Nothing/Shared-Data Storage Architecture for Large Scale Databases.
Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011

Performance under Failures of MapReduce Applications.
Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011

2010
Reevaluating Amdahl's law in the multicore era.
J. Parallel Distributed Comput., 2010

An evaluation of parallel optimization for OpenSolaris® network stack.
Proceedings of the 35th Annual IEEE Conference on Local Computer Networks, 2010

Characterizing energy efficiency of I/O intensive parallel applications on power-aware clusters.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Timing local streams: improving timeliness in data prefetching.
Proceedings of the 24th International Conference on Supercomputing, 2010

Improving the Effectiveness of Context-Based Prefetching with Multi-order Analysis.
Proceedings of the 39th International Conference on Parallel Processing, 2010

Optimizing HPC Fault-Tolerant Environment: An Analytical Approach.
Proceedings of the 39th International Conference on Parallel Processing, 2010

A layout-aware optimization strategy for collective I/O.
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010

Improving Parallel I/O Performance with Data Layout Awareness.
Proceedings of the 2010 IEEE International Conference on Cluster Computing, 2010

REMEM: REmote MEMory as Checkpointing Storage.
Proceedings of the Cloud Computing, Second International Conference, 2010

An Adaptive Data Prefetcher for High-Performance Processors.
Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010

2009
Fault-Aware Runtime Strategies for High-Performance Computing.
IEEE Trans. Parallel Distributed Syst., 2009

Special Issue of the Journal of Parallel and Distributed Computing: Data-Intensive Computing.
J. Parallel Distributed Comput., 2009

Taxonomy of Data Prefetching for Multicore Processors.
J. Comput. Sci. Technol., 2009

Core-aware memory access scheduling schemes.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Performance under Failure of Multi-tier Web Services.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

Modeling Data Access Contention in Multicore Architectures.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

Introduction.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

V-MCS: A configuration system for virtual machines.
Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

Performance under Failures of DAG-based Parallel Computing.
Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009

2008
Dual-mode transmission networks for DTV.
IEEE Trans. Consumer Electron., 2008

Algorithm-system scalability of heterogeneous computing.
J. Parallel Distributed Comput., 2008

Hiding I/O latency with pre-execution prefetching for parallel applications.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Parallel I/O prefetching using MPI file caching and I/O signatures.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

A Parallel Algorithm for Block Tridiagonal Systems.
Proceedings of the Ninth International Conference on Parallel and Distributed Computing, 2008

A Taxonomy of Data Prefetching Mechanisms.
Proceedings of the 9th International Symposium on Parallel Architectures, 2008

Exploring Parallel I/O Concurrency with Speculative Prefetching.
Proceedings of the 2008 International Conference on Parallel Processing, 2008

Lattice QCD Workflows: A Case Study.
Proceedings of the Fourth International Conference on e-Science, 2008

2007
logₙP and log₃P: Accurate Analytical Models of Point-to-Point Communication in Distributed Systems.
IEEE Trans. Computers, 2007

Server-Based Data Push Architecture for Multi-Processor Environments.
J. Comput. Sci. Technol., 2007

Performance under failures of high-end computing.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Data access history cache and associated data prefetching mechanisms.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Improving Data Access Performance with Server Push Architecture.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Fault-Driven Re-Scheduling For Improving System-level Fault Resilience.
Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Quality of Service of Grid Computing: Resource Sharing.
Proceedings of the Grid and Cooperative Computing, 2007

Dynamic Scheduling with Process Migration.
Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), 2007

2006
Grid harvest service: A performance system of grid computing.
J. Parallel Distributed Comput., 2006

Performance analysis and optimization - International workshop on performance analysis and optimization of high-end computing systems.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Automatic Memory Optimizations for Improving MPI Derived Datatype Performance.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

The GHS grid scheduling system: implementation and performance comparison.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Remove the memory wall: from performance modeling to architecture optimization.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

STAS: A Scalability Testing and Analysis System.
Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006

QoS Oriented Resource Reservation in Shared Environments.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

Network Bandwidth Predictor (NBP): A System for Online Network performance Forecasting.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

MPI-Mitten: Enabling Migration Technology in MPI.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

Memory Servers: A Scope of SOA for High-End Computing.
Proceedings of the 2006 IEEE International Conference on Services Computing (SCC 2006), 2006

2005
Isolating Costs in Shared Memory Communication Buffering.
Parallel Process. Lett., 2005

A systematic approach for closer integration of cellular and Internet services.
IEEE Netw., 2005

Viewpoints on Grid Standards.
J. Comput. Sci. Technol., 2005

Inhibitors for ubiquitous deployment of services in the next-generation network.
IEEE Commun. Mag., 2005

A Highly Parallel Algorithm for the Numerical Simulation of Unsteady Diffusion Processes.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

GHS: A Performance System of Grid Computing.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

A Neural Network Based Predictive Mechanism for Available Bandwidth.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Scalability of Heterogeneous Computing.
Proceedings of the 34th International Conference on Parallel Processing (ICPP 2005), 2005

Incorporating Data Movement into Grid Task Scheduling.
Proceedings of the Grid and Cooperative Computing - GCC 2005, 4th International Conference, Beijing, China, November 30, 2005

2004
A Parallel Two-Level Hybrid Method for Tridiagonal Systems and Its Application to Fast Poisson Solvers.
IEEE Trans. Parallel Distributed Syst., 2004

Terminating telephony services on the internet.
IEEE/ACM Trans. Netw., 2004

Communication State Transfer for the Mobility of Concurrent Heterogeneous Computing.
IEEE Trans. Computers, 2004

Middleware: the key to next generation computing.
J. Parallel Distributed Comput., 2004

Self-adaptive task allocation and scheduling of meta-tasks in non-dedicated heterogeneous computing.
Int. J. High Perform. Comput. Netw., 2004

Preface.
J. Grid Comput., 2004

A Runtime System for Autonomic Rescheduling of MPI Programs.
Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

Memory Conscious Task Partition and Scheduling in Grid Environments.
Proceedings of the 5th International Workshop on Grid Computing (GRID 2004), 2004

Extensions to an Internet signaling protocol to support telecommunication services.
Proceedings of the Global Telecommunications Conference, 2004. GLOBECOM '04, Dallas, Texas, USA, 29 November, 2004

Predicting memory-access cost based on data-access patterns.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004

2003
QoS Guided Min-Min Heuristic for Grid Task Scheduling.
J. Comput. Sci. Technol., 2003

A File Transfer Component for Grids.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2003

Grid Harvest Service: A System for Long-Term, Application-Level Task Scheduling.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Quantifying Locality Effect in Data Access Delay: Memory logP.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Accessing telephony services from the Internet.
Proceedings of the 12th International Conference on Computer Communications and Networks, 2003

Services spanning heterogeneous networks.
Proceedings of IEEE International Conference on Communications, 2003

A General Self-Adaptive Task Scheduling System for Non-Dedicated Heterogeneous Computing.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003

HPCM: A Pre-Compiler Aided Middleware for the Mobility of Legacy Code.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003

Improving the Performance of MPI Derived Datatypes by Optimizing Memory-Access Cost.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003

Flow-Based Multistage Co-Allocation Service.
Proceedings of the International Conference on Communications in Computing, 2003

2002
Performance Modeling and Prediction of Nondedicated Network Computing.
IEEE Trans. Computers, 2002

Data collection and restoration for heterogeneous process migration.
Softw. Pract. Exp., 2002

Stabilized Explicit-Implicit Domain Decomposition Methods for the Numerical Solution of Parabolic Equations.
SIAM J. Sci. Comput., 2002

Scalability versus Execution Time in Scalable Systems.
J. Parallel Distributed Comput., 2002

SCALA: A Performance System For Scalable Computing.
Int. J. High Perform. Comput. Appl., 2002

Design and Development of a Scalable Distributed Debugger for Cluster Computing.
Clust. Comput., 2002

A Parallel Two-Level Hybrid Method for Diagonal Dominant Tridiagonal Systems.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

SNOW: Software Systems for Process Migration in High-Performance, Heterogeneous Distributed Environments.
Proceedings of the 31st International Conference on Parallel Processing Workshops (ICPP 2002 Workshops), 2002

2001
Adaptive multivariate regression for advanced memory system evaluation: application and experience.
Perform. Evaluation, 2001

Stable, globally non-iterative, non-overlapping domain decomposition parallel solvers for parabolic problems.
Proceedings of the 2001 ACM/IEEE conference on Supercomputing, 2001

A Protocol Design of Communication State Transfer for Distributed Computing.
Proceedings of the 21st International Conference on Distributed Computing Systems (ICDCS 2001), 2001

2000
Average-Case Analysis of Isospeed Scalability of Parallel Computations on Multiprocessors.
Int. J. High Speed Comput., 2000

Execution-driven performance analysis for distributed and parallel systems.
Proceedings of the Second International Workshop on Software and Performance, 2000

PDRS: A Performance Data Representation System.
Proceedings of the Parallel and Distributed Processing, 2000

Adaptive Wavelet ADI Method: Application and Parallelization.
Proceedings of the 2000 International Workshop on Parallel Processing, 2000

A Statistical-Empirical Hybrid Approach to Hierarchical Memory Analysis.
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

1999
Integrated Range Comparison for Data-Parallel Compilation Systems.
IEEE Trans. Parallel Distributed Syst., 1999

Editorial.
J. Supercomput., 1999

Computer simulation of PEC network.
Simul. Pract. Theory, 1999

A Java-based Distributed Debugger Supporting MPI and PVM.
Parallel Distributed Comput. Pract., 1999

A Domain Decomposition Based Parallel Solver for Time Dependent Differential Equations.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

A Coordinated Approach for Process Migration in Heterogeneous Environments.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

A Memory-Centric Characterization of ASCI Applications Via a Combined Approach of Statistical and Empirical Analysis.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

SCALA: A Framework for Performance Evaluation of Scalable Computing.
Proceedings of the Parallel and Distributed Processing, 1999

A Factorial Performance Evaluation for Hierarchical Memory Systems.
Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

1998
Performance Range Comparison via Crossing Point Analysis.
Proceedings of the Parallel and Distributed Processing, 10 IPPS/SPDP'98 Workshops Held in Conjunction with the 12th International Parallel Processing Symposium and 9th Symposium on Parallel and Distributed Processing, Orlando, Florida, USA, March 30, 1998

Memory Space Representation for Heterogeneous Network Process Migration.
Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998

Performance Range Comparison for Restructuring Compilation.
Proceedings of the 1998 International Conference on Parallel Processing (ICPP '98), 1998

1997
Limitations of Cycle Stealing for Parallel Processing on a Network of Homogeneous Workstations.
J. Parallel Distributed Comput., 1997

Performance comparison of a set of periodic and non-periodic tridiagonal solvers on SP2 and Paragon parallel computers.
Concurr. Pract. Exp., 1997

Parallel Implementation of a Data-Transpose Technique for the Solution of Poisson's Equation in Cylindrical Coordinates.
Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

A Highly Accurate Fast Solver for Helmholtz Equations.
Proceedings of the 11th international conference on Supercomputing, 1997

1996
Performance Prediction: A Case Study Using a Scalable Shared-Virtual-Memory Machine.
IEEE Parallel Distributed Technol. Syst. Appl., 1996

Performance measurement and comparison of a set of parallel periodic and non-periodic tridiagonal solvers.
Proceedings of the 1996 International Symposium on Parallel Architectures, 1996

The Relation of Scalability and Execution Time.
Proceedings of IPPS '96, 1996

MpPVM: A Software System for Non-Dedicated Heterogeneous Computing.
Proceedings of the 1996 International Conference on Parallel Processing, 1996

1995
Performance Considerations of Shared Virtual Memory Machines.
IEEE Trans. Parallel Distributed Syst., 1995

Application and Accuracy of the Parallel Diagonal Dominant Algorithm.
Parallel Comput., 1995

A Parallel Prefix Algorithm for Almost Toeplitz Tridiagonal Systems.
Int. J. High Speed Comput., 1995

Performance prediction of scalable computing: a case study.
Proceedings of the 28th Annual Hawaii International Conference on System Sciences (HICSS-28), 1995

1994
Scalability of Parallel Algorithm-Machine Combinations.
IEEE Trans. Parallel Distributed Syst., 1994

Special Issue on Scalability of Parallel Algorithms and Architectures - Guest Editors' Introduction.
J. Parallel Distributed Comput., 1994

Shared Virtual Memory and Generalized Speedup.
Proceedings of the 8th International Symposium on Parallel Processing, 1994

A Massively Parallel Algorithm for Compact Finite Difference Schemes.
Proceedings of the 1994 International Conference on Parallel Processing, 1994

1993
Scalable Problems and Memory-Bounded Speedup.
J. Parallel Distributed Comput., 1993

Distributed computing feasibility in a non-dedicated homogeneous distributed system.
Proceedings of Supercomputing '93, 1993

On the Parallel Diagonal Dominant Algorithm.
Proceedings of the 1993 International Conference on Parallel Processing, 1993

1992
Efficient Tridiagonal Solvers on Multicomputers.
IEEE Trans. Computers, 1992

Preprocessing predicates and queries.
Inf. Syst., 1992

1991
Parallel Homotopy Algorithm for the Symmetric Tridiagonal Eigenvalue Problem.
SIAM J. Sci. Comput., 1991

Toward a better parallel performance metric.
Parallel Comput., 1991

SIZEUP: A New Parallel Performance Metric.
Proceedings of the International Conference on Parallel Processing, 1991

1990
Another view on parallel speedup.
Proceedings of Supercomputing '90, New York, NY, USA, November 12-16, 1990

Dynamic query range for multikey searching.
Proceedings of the Fourteenth Annual International Computer Software and Applications Conference, 1990

1989
Processing Implication on Queries.
IEEE Trans. Software Eng., 1989

Solving Implication Problems in Database Applications.
Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data, Portland, Oregon, USA, May 31, 1989

Compute-Exchange Computation for Solving Power Flow Problems: The Model and Application.
Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, 1989

Parallel algorithms for solution of tridiagonal systems on multicomputers.
Proceedings of the 3rd international conference on Supercomputing, 1989

