Xiaodong Zhang

ORCID: 0000-0003-3411-3612

Affiliations:
  • Ohio State University, Columbus, OH, USA (since 2006)
  • College of William and Mary, Williamsburg, VA, USA (former)
  • University of Texas at San Antonio, TX, USA (former)
  • University of Colorado at Boulder, CO, USA (PhD 1989)


According to our database, Xiaodong Zhang authored at least 237 papers between 1990 and 2024.

Bibliography

2024
RR-Compound: RDMA-Fused gRPC for Low Latency, High Throughput, and Easy Interface.
IEEE Trans. Parallel Distributed Syst., August, 2024

High-Performance Spatial Data Analytics: Systematic R&D for Scale-Out and Scale-Up Solutions from the Past to Now.
Proc. VLDB Endow., August, 2024

RTScan: Efficient Scan with Ray Tracing Cores.
Proc. VLDB Endow., February, 2024

RayJoin: Fast and Precise Spatial Join.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024

UltraPrecise: A GPU-Based Framework for Arbitrary-Precision Arithmetic in Database Systems.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

2022
An RDMA-enabled In-memory Computing Platform for R-tree on Clusters.
ACM Trans. Spatial Algorithms Syst., June, 2022

NeutronStar: Distributed GNN Training with Hybrid Dependency Management.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Maze: A Cost-Efficient Video Deduplication System at Web-scale.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

2021
Mixer: Efficiently Understanding and Retrieving Visual Content at Web-Scale.
Proc. VLDB Endow., 2021

The Art of Balance: A RateupDB Experience of Building a CPU/GPU Hybrid Database Product.
Proc. VLDB Endow., 2021

NestGPU: Nested Query Processing on GPU.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

DBSpinner: Making a Case for Iterative Processing in Databases.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

2020
Automating Incremental and Asynchronous Evaluation for Recursive Aggregate Data Processing.
Proceedings of the 2020 International Conference on Management of Data, 2020

2019
Software system research in post-Moore's Law era: a historical perspective for the future.
Sci. China Inf. Sci., 2019

SEP-graph: finding shortest execution paths for graph processing under a hybrid framework on GPU.
Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019

HYPHA: a framework based on separation of parallelisms to accelerate persistent homology matrix reduction.
Proceedings of the ACM International Conference on Supercomputing, 2019

DirectLoad: A Fast Web-Scale Index System Across Large Regional Centers.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

Catfish: Adaptive RDMA-enabled R-Tree for Low Latency and High Throughput.
Proceedings of the 39th IEEE International Conference on Distributed Computing Systems, 2019

2018
A Low-cost Disk Solution Enabling LSM-tree to Achieve High Performance for Mixed Read/Write Workloads.
ACM Trans. Storage, 2018

Software-Defined Software: A Perspective of Machine Learning-Based Software Production.
Proceedings of the 38th IEEE International Conference on Distributed Computing Systems, 2018

SQLoop: High Performance Iterative Processing in Data Management.
Proceedings of the 38th IEEE International Conference on Distributed Computing Systems, 2018

2017
A distributed in-memory key-value store system on heterogeneous CPU-GPU cluster.
VLDB J., 2017

Software Support Inside and Outside Solid-State Devices for High Performance and High Efficiency.
Proc. IEEE, 2017

The high efficient dynamics modeling method for modular manipulator based on Space Operator Algebra.
Proceedings of the 2017 IEEE International Conference on Robotics and Biomimetics, 2017

Enabling Effective Utilization of GPUs for Data Management Systems.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

Feisu: Fast Query Execution over Heterogeneous Data Sources on Large-Scale Clusters.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

LSbM-tree: Re-Enabling Buffer Caching in Data Management for Mixed Reads and Writes.
Proceedings of the 37th IEEE International Conference on Distributed Computing Systems, 2017

2016
BCC: Reducing False Aborts in Optimistic Concurrency Control with Low Cost for In-Memory Databases.
Proc. VLDB Endow., 2016

Re-enabling high-speed caching for LSM-trees.
CoRR, 2016

Spark-GPU: An accelerated in-memory data processing engine on clusters.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

2015
Mega-KV: A Case for GPUs to Maximize the Throughput of In-Memory Key-Value Stores.
Proc. VLDB Endow., 2015

Hetero-DB: Next Generation High-Performance Database Systems by Best Utilizing Heterogeneous Computing and Storage Resources.
J. Comput. Sci. Technol., 2015

SideWalk: A Facility of Lightweight Out-of-Band Communications for Augmenting Distributed Data Processing Flows.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

2014
Concurrent Analytical Query Processing with GPUs.
Proc. VLDB Endow., 2014

Major technical advancements in Apache Hive.
Proceedings of the International Conference on Management of Data, 2014

GDM: device memory management for GPGPU computing.
Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2014

2013
On distributed computation rate optimization for deploying cloud computing programming frameworks.
SIGMETRICS Perform. Evaluation Rev., 2013

The Yin and Yang of Processing Data Warehousing Queries on GPU Devices.
Proc. VLDB Endow., 2013

Understanding Insights into the Basic Structure and Essential Issues of Table Placement Methods in Clusters.
Proc. VLDB Endow., 2013

Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce.
Proc. VLDB Endow., 2013

Demonstration of Hadoop-GIS: a spatial data warehousing system over MapReduce.
Proceedings of the 21st SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2013

LDPC-in-SSD: making advanced error correction codes work effectively in solid state drives.
Proceedings of the 11th USENIX conference on File and Storage Technologies, 2013

UNIK: unsupervised social network spam detection.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

S-CAVE: Effective SSD caching to improve virtual machine storage performance.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
Accelerating Pathology Image Data Cross-Comparison on CPU-GPU Hybrid Systems.
Proc. VLDB Endow., 2012

hStorage-DB: Heterogeneity-aware Data Management to Exploit the Full Capability of Hybrid Storage Systems.
Proc. VLDB Endow., 2012

Spammer Behavior Analysis and Detection in User Generated Content on Social Networks.
Proceedings of the 2012 IEEE 32nd International Conference on Distributed Computing Systems, 2012

BWS: balanced work stealing for time-sharing multicores.
Proceedings of the European Conference on Computer Systems, 2012

2011
ULCC: a user-level facility for optimizing shared cache performance on multicores.
Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011

Allocation and scheduling of relief materials based on GIS.
Proceedings of the IEEE International Conference on Spatial Data Mining and Geographical Knowledge Services, 2011

Hystor: making the best use of solid state drives in high performance storage systems.
Proceedings of the 25th International Conference on Supercomputing, Tucson, AZ, USA, May 31, 2011

RCFile: A fast and space-efficient data placement structure in MapReduce-based warehouse systems.
Proceedings of the 27th International Conference on Data Engineering, 2011

YSmart: Yet Another SQL-to-MapReduce Translator.
Proceedings of the 2011 International Conference on Distributed Computing Systems, 2011

Essential roles of exploiting internal parallelism of flash memory based solid state drives in high-speed data processing.
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

CAFTL: A Content-Aware Flash Translation Layer Enhancing the Lifespan of Flash Memory based Solid State Drives.
Proceedings of the 9th USENIX Conference on File and Storage Technologies, 2011

SRM-buffer: an OS buffer management technique to prevent last level cache from thrashing in multicores.
Proceedings of the European Conference on Computer Systems, 2011

Onboard controlling system design of unmanned airship.
Proceedings of the International Conference on Electronic and Mechanical Engineering and Information Technology, 2011

DOT: a matrix model for analyzing, optimizing and deploying software for big data analytics in distributed systems.
Proceedings of the ACM Symposium on Cloud Computing in conjunction with SOSP 2011, 2011

2010
Service Encapsulation for Middleware Management Interfaces.
Proceedings of the Fifth IEEE International Symposium on Service-Oriented System Engineering, 2010

A comparative analysis of tanker risks based on Port State Control.
Proceedings of the IEEE International Conference on Systems, 2010

Building a Domain-Knowledge Guided System Software Environment to Achieve High-Performance of Multi-core Processors.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2010

PS-BC: power-saving considerations in design of buffer caches serving heterogeneous storage devices.
Proceedings of the 2010 International Symposium on Low Power Electronics and Design, 2010

TopBT: A Topology-Aware and Infrastructure-Independent BitTorrent Client.
Proceedings of the INFOCOM 2010. 29th IEEE International Conference on Computer Communications, 2010

Research on Dynamic Game between Owner and Contractor in the Bidding Process of the Construction Project.
Proceedings of the International Conference on E-Business and E-Government, 2010

Management as a Service: An Empirical Case Study in the Internetware Cloud.
Proceedings of the IEEE 7th International Conference on e-Business Engineering, 2010

Splitter: a proxy-based approach for post-migration testing of web applications.
Proceedings of the European Conference on Computer Systems, 2010

2009
MCC-DB: Minimizing Cache Conflicts in Multi-core Processors for Databases.
Proc. VLDB Endow., 2009

Understanding intrinsic characteristics and system implications of flash memory based solid state drives.
Proceedings of the Eleventh International Joint Conference on Measurement and Modeling of Computer Systems, 2009

Enabling software management for multicore caches with a lightweight hardware support.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Analyzing patterns of user content generation in online social networks.
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28, 2009

Research on the Next-Generation Internet Transition Technology.
Proceedings of the 2009 Second International Symposium on Computational Intelligence and Design, 2009

The Research and Realization of IEEE 802.1X and Account.
Proceedings of the 2009 Second International Symposium on Computational Intelligence and Design, 2009

CUBS: Coordinated Upload Bandwidth Sharing in Residential Networks.
Proceedings of the 17th annual IEEE International Conference on Network Protocols, 2009

BP-Wrapper: A System Framework Making Any Replacement Algorithms (Almost) Lock Contention Free.
Proceedings of the 25th International Conference on Data Engineering, 2009

Task-allocation algorithm for collaborative design based on negotiation mechanism.
Proceedings of the 13th International Conference on Computers Supported Cooperative Work in Design, 2009

Soft-OLP: Improving Hardware Cache Performance through Software-Controlled Object-Level Partitioning.
Proceedings of the PACT 2009, 2009

2008
Peer-to-Peer Communication.
Wiley Encyclopedia of Computer Science and Engineering, 2008

LightFlood: Minimizing Redundant Messages and Maximizing Scope of Peer-to-Peer Search.
IEEE Trans. Parallel Distributed Syst., 2008

The stretched exponential distribution of internet media access patterns.
Proceedings of the Twenty-Seventh Annual ACM Symposium on Principles of Distributed Computing, 2008

Automatic Software Fault Diagnosis by Exploiting Application Signatures.
Proceedings of the 22nd Large Installation System Administration Conference, 2008

Caching for bursts (C-Burst): let hard disks sleep well and work energetically.
Proceedings of the 2008 International Symposium on Low Power Electronics and Design, 2008

Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems.
Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008

Research Issues and Challenges to Advance System Software for Multicore Processors and Data-Intensive Applications.
Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2008), 2008

2007
Design and Analysis of Sensing Scheduling Algorithms under Partial Coverage for Object Detection in Sensor Networks.
IEEE Trans. Parallel Distributed Syst., 2007

SProxy: A Caching Infrastructure to Support Internet Streaming.
IEEE Trans. Multim., 2007

Maintaining Strong Cache Consistency for the Domain Name System.
IEEE Trans. Knowl. Data Eng., 2007

Coordinated Multilevel Buffer Cache Management with Consistent Access Locality Quantification.
IEEE Trans. Computers, 2007

Cooperative Relay Service in a Wireless LAN.
IEEE J. Sel. Areas Commun., 2007

A performance study of BitTorrent-like peer-to-peer systems.
IEEE J. Sel. Areas Commun., 2007

Realization of a development platform for Web-based product customization systems.
Int. J. Comput. Integr. Manuf., 2007

Cost-Aware Caching Algorithms for Distributed Storage Servers.
Proceedings of the Distributed Computing, 21st International Symposium, 2007

Novel Bi-Orthogonal Filter Design Methodology for Filter-Bank Based Transmission.
Proceedings of the IEEE Wireless Communications and Networking Conference, 2007

DiskSeen: Exploiting Disk Layout and Access History to Enhance I/O Prefetch.
Proceedings of the 2007 USENIX Annual Technical Conference, 2007

Does internet media traffic really follow Zipf-like distribution?
Proceedings of the 2007 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2007

Locality-aware Buffer Management: Algorithms Design and Systems Implementation for Data Intensive Applications (A Brief Progress Report).
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

PSM-throttling: Minimizing Energy Consumption for Bulk Data Communications in WLANs.
Proceedings of the IEEE International Conference on Network Protocols, 2007

SCAP: Smart Caching in Wireless Access Points to Improve P2P Streaming.
Proceedings of the 27th IEEE International Conference on Distributed Computing Systems (ICDCS 2007), 2007

STEP: Sequentiality and Thrashing Detection Based Prefetching to Improve Performance of Networked Storage Servers.
Proceedings of the 27th IEEE International Conference on Distributed Computing Systems (ICDCS 2007), 2007

Organization-Oriented Simulation of Collaborative Product Development Process Based on Designer's Agent Model.
Proceedings of the 11th International Conference on Computer Supported Cooperative Work in Design, 2007

2006
Segment-based streaming media proxy: modeling and optimization.
IEEE Trans. Multim., 2006

Design and Evaluation of a Scalable and Reliable P2P Assisted Proxy for On-Demand Streaming Media Delivery.
IEEE Trans. Knowl. Data Eng., 2006

Auto-CFD-NOW: A pre-compiler for effectively parallelizing CFD applications on networks of workstations.
J. Supercomput., 2006

MESA: reducing cache conflicts by integrating static and run-time methods.
Proceedings of the 2006 IEEE International Symposium on Performance Analysis of Systems and Software, 2006

SmartSaver: turning flash drive into a disk energy saver for mobile computers.
Proceedings of the 2006 International Symposium on Low Power Electronics and Design, 2006

Exploiting Idle Communication Power to Improve Wireless Network Performance and Energy Efficiency.
Proceedings of the INFOCOM 2006. 25th IEEE International Conference on Computer Communications, 2006

Delving into internet streaming media delivery: a quality and resource utilization perspective.
Proceedings of the 6th ACM SIGCOMM Internet Measurement Conference, 2006

A Case for Internet Streaming via Web Servers.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

ASAP: an AS-Aware Peer-Relay Protocol for High Quality VoIP.
Proceedings of the 26th IEEE International Conference on Distributed Computing Systems (ICDCS 2006), 2006

A Locality-Aware Cooperative Cache Management Protocol to Improve Network File System Performance.
Proceedings of the 26th IEEE International Conference on Distributed Computing Systems (ICDCS 2006), 2006

2005
Location Awareness in Unstructured Peer-to-Peer Systems.
IEEE Trans. Parallel Distributed Syst., 2005

Fast proxy delivery of multiple streaming sessions in shared running buffers.
IEEE Trans. Multim., 2005

Making LRU Friendly to Weak Locality Workloads: A Novel Replacement Algorithm to Improve Buffer Cache Performance.
IEEE Trans. Computers, 2005

A study on object tracking quality under probabilistic coverage in sensor networks.
ACM SIGMOBILE Mob. Comput. Commun. Rev., 2005

Token-ordered LRU: an effective page replacement policy and its implementation in Linux systems.
Perform. Evaluation, 2005

Look-Ahead Architecture Adaptation to Reduce Processor Power Consumption.
IEEE Micro, 2005

Fast and low-cost search schemes by exploiting localities in P2P networks.
J. Parallel Distributed Comput., 2005

Segment-Based Proxy Caching for Internet Streaming Media Delivery.
IEEE Multim., 2005

Coordinated data prefetching for web contents.
Comput. Commun., 2005

Analysis of multimedia workloads with implications for internet streaming.
Proceedings of the 14th international conference on World Wide Web, 2005

CLOCK-Pro: An Effective Improvement of the CLOCK Replacement.
Proceedings of the 2005 USENIX Annual Technical Conference, 2005

A Concatenated ML Decoder for SFBC-OFDM Systems in Frequency Selective Fading Channels.
Proceedings of the IEEE 16th International Symposium on Personal, 2005

Analyzing Object Detection Quality Under Probabilistic Coverage in Sensor Networks.
Proceedings of the Quality of Service - IWQoS 2005: 13th International Workshop, 2005

System Support to Balance the Resource Supply and Demand in High-end Computing.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

SCOPE: scalable consistency maintenance in structured P2P systems.
Proceedings of the INFOCOM 2005. 24th Annual Joint Conference of the IEEE Computer and Communications Societies, 2005

Measurements, Analysis, and Modeling of BitTorrent-like Systems.
Proceedings of the 5th Internet Measurement Conference, 2005

The Power of P2P beyond File Sharing.
Proceedings of the 25th International Conference on Distributed Computing Systems Workshops (ICDCS 2005 Workshops), 2005

DISC: Dynamic Interleaved Segment Caching for Interactive Streaming.
Proceedings of the 25th International Conference on Distributed Computing Systems (ICDCS 2005), 2005

DULO: An Effective Buffer Cache Management Scheme to Exploit Both Temporal and Spatial Localities.
Proceedings of the FAST '05 Conference on File and Storage Technologies, 2005

Design and Analysis of Wave Sensing Scheduling Protocols for Object-Tracking Applications.
Proceedings of the Distributed Computing in Sensor Systems, 2005

2004
Adaptive Memory Allocations in Clusters to Handle Unexpectedly Large Data-Intensive Jobs.
IEEE Trans. Parallel Distributed Syst., 2004

Building a Large and Efficient Hybrid Peer-to-Peer Internet Caching System.
IEEE Trans. Knowl. Data Eng., 2004

Design and Optimization of Large Size and Low Overhead Off-Chip Caches.
IEEE Trans. Computers, 2004

Enforcing direct communications between clients and Web servers to improve proxy performance and security.
Softw. Pract. Exp., 2004

Exploiting Content Localities for Efficient Search in P2P Systems.
Proceedings of the Distributed Computing, 18th International Conference, 2004

An Empirical Study of a Segment-Based Streaming Proxy in an Enterprise Environment.
Proceedings of the Web Content Caching and Distribution: 9th International Workshop, 2004

SAT-Match: A Self-Adaptive Topology Matching Method to Achieve Low Lookup Latency in Structured P2P Overlay Networks.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Location-Aware Topology Matching in P2P Systems.
Proceedings of IEEE INFOCOM 2004, 2004

Designs of High Quality Streaming Proxy Systems.
Proceedings of IEEE INFOCOM 2004, 2004

ULC: A File Block Placement and Replacement Protocol to Effectively Exploit Hierarchical Locality in Multi-Level Buffer Caches.
Proceedings of the 24th International Conference on Distributed Computing Systems (ICDCS 2004), 2004

SRB: Shared Running Buffers in Proxy to Exploit Memory Locality of Multiple Streaming Media Sessions.
Proceedings of the 24th International Conference on Distributed Computing Systems (ICDCS 2004), 2004

2003
Low-Cost and Reliable Mutual Anonymity Protocols in Peer-to-Peer Networks.
IEEE Trans. Parallel Distributed Syst., 2003

On scalable and locality-aware web document sharing.
J. Parallel Distributed Comput., 2003

A Popularity-Based Prediction Model for Web Prefetching.
Computer, 2003

Buffer Sharing for Proxy Caching of Streaming Sessions.
Proceedings of the Twelfth International World Wide Web Conference - Posters, 2003

Streaming Flow Analyses for Prefetching in Segment-Based Proxy Caching to Improve Delivery Quality.
Proceedings of the Web Content Caching and Distribution, 8th International Workshop, 2003

Adaptive and lazy segmentation based proxy caching for streaming media delivery.
Proceedings of the Network and Operating System Support for Digital Audio and Video, 2003

LightFlood: an Efficient Flooding Scheme for File Search in Unstructured Peer-to-Peer Systems.
Proceedings of the 32nd International Conference on Parallel Processing (ICPP 2003), 2003

Accurately Modeling Workload Interactions for Deploying Prefetching in Web Servers.
Proceedings of the 32nd International Conference on Parallel Processing (ICPP 2003), 2003

Mutual Anonymity Protocols for Hybrid Peer-to-Peer Systems.
Proceedings of the 23rd International Conference on Distributed Computing Systems (ICDCS 2003), 2003

FloodTrail: an efficient file search technique in unstructured peer-to-peer systems.
Proceedings of the Global Telecommunications Conference, 2003

Auto-CFD: Efficiently Parallelizing CFD Applications on Clusters.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003

Efficient Distributed Disk Caching in Data Grid Management.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003

2002
Dynamic Cluster Resource Allocations for Jobs with Known and Unknown Memory Demands.
IEEE Trans. Parallel Distributed Syst., 2002

TPF: a dynamic system thrashing protection facility.
Softw. Pract. Exp., 2002

Access-Mode Predictions for Low-Power Cache Design.
IEEE Micro, 2002

LIRS: an efficient low inter-reference recency set replacement policy to improve buffer cache performance.
Proceedings of the International Conference on Measurements and Modeling of Computer Systems, 2002

OFDM scheme based on complex orthogonal wavelet packet.
Proceedings of the 12th IEEE International Symposium on Personal, 2002

On Reliable and Scalable Peer-to-Peer Web Document Sharing.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Popularity-Based PPM: An Effective Web Prefetching Technique for High Accuracy and Low Storage.
Proceedings of the 31st International Conference on Parallel Processing (ICPP 2002), 2002

Adaptive and Virtual Reconfigurations for Effective Dynamic Job Scheduling in Cluster Systems.
Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCS'02), 2002

Fine-Grain Priority Scheduling on Multi-Channel Memory Systems.
Proceedings of the Eighth International Symposium on High-Performance Computer Architecture (HPCA'02), 2002

2001
Coordinated data prefetching by utilizing reference information at both proxy and web servers.
SIGMETRICS Perform. Evaluation Rev., 2001

Fast Bit-Reversals on Uniprocessors and Shared-Memory Multiprocessors.
SIAM J. Sci. Comput., 2001

Cached DRAM for ILP Processor Memory Access Latency Reduction.
IEEE Micro, 2001

Architectural Effects of Symmetric Multiprocessors on TPC-C Commercial Workload.
J. Parallel Distributed Comput., 2001

Breaking Address Mapping Symmetry at Multi-levels of Memory Hierarchy to Reduce DRAM Row-buffer Conflicts.
J. Instr. Level Parallelism, 2001

Exploiting Neglected Data Locality in Browsers.
Poster Proceedings of the Tenth International World Wide Web Conference, 2001

Dynamic Load Sharing with Unknown Memory Demands in Clusters.
Proceedings of the 21st International Conference on Distributed Computing Systems (ICDCS 2001), 2001

Adaptive Page Replacement to Protect Thrashing in Linux.
Proceedings of the 5th Annual Linux Showcase & Conference 2001, 2001

2000
Cacheminer: A Runtime Approach to Exploit Cache Locality on SMP.
IEEE Trans. Parallel Distributed Syst., 2000

Memory Hierarchy Considerations for Cost-Effective Cluster Computing.
IEEE Trans. Computers, 2000

Improving Memory Performance of Sorting Algorithms.
ACM J. Exp. Algorithmics, 2000

A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality.
Proceedings of the 33rd Annual IEEE/ACM International Symposium on Microarchitecture, 2000

Effective Load Sharing on Heterogeneous Networks of Workstations.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

Improving Distributed Workload Performance by Sharing both CPU and Memory Resources.
Proceedings of the 20th International Conference on Distributed Computing Systems, 2000

Incorporating Job Migration and Network RAM to Share Cluster Memory Resources.
Proceedings of the Ninth IEEE International Symposium on High Performance Distributed Computing, 2000

1999
Comparative evaluation and case studies of shared-memory and data-parallel execution patterns.
Sci. Program., 1999

Profit-effective parallel computing.
IEEE Concurr., 1999

Cache-Optimal Methods for Bit-Reversals.
Proceedings of the ACM/IEEE Conference on Supercomputing, 1999

The Impact of Memory Hierarchies on Cluster Computing.
Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

1998
Lock Bypassing: An Efficient Algorithm for Concurrently Accessing Priority Heaps.
ACM J. Exp. Algorithmics, 1998

Characterizing and scheduling communication interactions of parallel and local jobs on networks of workstations.
Comput. Commun., 1998

A memory-layout oriented run-time technique for locality optimization.
Proceedings of the 1998 International Conference on Parallel Processing (ICPP '98), 1998

1997
Software Support for Multiprocessor Latency Measurement and Evaluation.
IEEE Trans. Software Eng., 1997

Adaptively Scheduling Parallel Loops in Distributed Shared-Memory Systems.
IEEE Trans. Parallel Distributed Syst., 1997

Two fast and high-associativity cache schemes.
IEEE Micro, 1997

Erratum: "An Effective and Practical Performance Prediction Model for Parallel Computing on Nondedicated Heterogeneous NOW".
J. Parallel Distributed Comput., 1997

Coordinating Parallel Processes on Networks of Workstations.
J. Parallel Distributed Comput., 1997

Performance bottlenecks and potentials of parallel computing on networks of workstations.
Int. J. Syst. Sci., 1997

Multi-Column Implementations for Cache Associativity.
Proceedings of the 1997 International Conference on Computer Design: VLSI in Computers & Processors, 1997

A Comparative Evaluation of Hierarchical Network Architecture of the HP-Convex Exemplar.
Proceedings of the 1997 International Conference on Computer Design: VLSI in Computers & Processors, 1997

Characterizing Communication Interactions of Parallel and Sequential Jobs on Networks of Workstations.
Proceedings of the 1997 IEEE International Conference on Communications: Towards the Knowledge Millennium, 1997

Nova Visualization for Optimization of Data-Parallel Programs.
Proceedings of the Euro-Par '97 Parallel Processing, 1997

Effectively Scheduling Parallel Tasks and Communications on Networks of Workstations.
Proceedings of the Euro-Par '97 Parallel Processing, 1997

1996
A Fast Token-Chasing Mutual Exclusion Algorithm in Arbitrary Network Topologies.
J. Parallel Distributed Comput., 1996

An Effective and Practical Performance Prediction Model for Parallel Computing on Nondedicated Heterogeneous NOW.
J. Parallel Distributed Comput., 1996

Semi-Empirical Multiprocessor Performance Predictions.
J. Parallel Distributed Comput., 1996

Evaluating and designing software mutual exclusion algorithms on shared-memory multiprocessors.
IEEE Parallel Distributed Technol. Syst. Appl., 1996

An adaptive loop scheduling algorithm on shared-memory systems.
Proceedings of the Eighth IEEE Symposium on Parallel and Distributed Processing, 1996

Parallelizing FDTD Methods for Solving Electromagnetic Scattering Problems.
Proceedings of the Applications on Advanced Architecture Computers, 1996

1995
Comparative Performance Evaluation of Hot Spot Contention Between MIN-Based and Ring-Based Shared-Memory Architectures.
IEEE Trans. Parallel Distributed Syst., 1995

Comparative Modeling and Evaluation of CC-NUMA and COMA on Hierarchical Ring Architectures.
IEEE Trans. Parallel Distributed Syst., 1995

Parallelizing an Oil Refining Simulation: Numerical Methods, Implementations and Experience.
Parallel Comput., 1995

Modeling and characterizing parallel computing performance on heterogeneous networks of workstations.
Proceedings of the Seventh IEEE Symposium on Parallel and Distributed Processing, 1995

A Semi-Empirical Approach to Scalability Study.
Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems, 1995

Asynchronous PVM Network Computing.
Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995

Folding spatial image filters on the CM-5.
Proceedings of IPPS '95, 1995

Multiprocessor Scalability Predictions Through Detailed Program Execution Analysis.
Proceedings of the 9th international conference on Supercomputing, 1995

A Framework of Performance Prediction of Parallel Computing on Nondedicated Heterogeneous NOW.
Proceedings of the 1995 International Conference on Parallel Processing, 1995

GRAPH: A Tool for Visualizing Communication and Optimizing Layout in Data-Parallel Programs.
Proceedings of the 1995 International Conference on Parallel Processing, 1995

Software Support for Asynchronous Computing Across Networks.
Proceedings of the 19th International Computer Software and Applications Conference (COMPSAC'95), 1995

1994
Triangular Decomposition Methods for Solving Reducible Nonlinear Systems of Equations.
SIAM J. Optim., 1994

Latency Metric: An Experimental Method for Measuring and Evaluating Parallel Program and Architecture Scalability.
J. Parallel Distributed Comput., 1994

Spin-lock synchronization on the Butterfly and KSR1.
IEEE Parallel Distributed Technol. Syst. Appl., 1994

Performance predictions on implicit communication systems.
Proceedings of the Sixth IEEE Symposium on Parallel and Distributed Processing, 1994

Distributed image edge detection methods and performance.
Proceedings of the Sixth IEEE Symposium on Parallel and Distributed Processing, 1994

Modeling Data Migration on CC-NUMA and CC-COMA Hierarchical Ring Architectures.
Proceedings of the MASCOTS '94, Proceedings of the Second International Workshop on Modeling, Analysis, and Simulation On Computer and Telecommunication Systems, January 31, 1994

Evaluation and Measurement of Multiprocessor Latency Patterns.
Proceedings of the 8th International Symposium on Parallel Processing, 1994

Communication and Computation Patterns of Large Scale Image Convolutions on Parallel Architectures.
Proceedings of the 8th International Symposium on Parallel Processing, 1994

Measuring and Analyzing Parallel Computing Scalability.
Proceedings of the 1994 International Conference on Parallel Processing, 1994

Latency Analysis of CC-NUMA and CC-COMA Rings.
Proceedings of the 1994 International Conference on Parallel Processing, 1994

Distributed Computation of Electromagnetic Scattering Problems Using Finite-Difference Time-Domain Decompositions.
Proceedings of the Third International Symposium on High Performance Distributed Computing, 1994

1993
MIN-Graph: A Tool for Monitoring and Visualizing MIN-Based Multiprocessor Performance.
J. Parallel Distributed Comput., 1993

Modeling and Measuring of Hot Spots on MIN-Based and HR-Based Shared-Memory Architectures.
Proceedings of the Fifth IEEE Symposium on Parallel and Distributed Processing, 1993

Execution Behavior Analysis and Performance Improvement in Shared-memory Multiprocessors.
Proceedings of the Fifth IEEE Symposium on Parallel and Distributed Processing, 1993

Parallel Implementations of an Oil Refining Simulation.
Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

Evaluation Synchronization Effects to Scientific Computations on Large Scale Shared-Memory Multiprocessors.
Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

Experimental Performance Evaluation on Network-Based Shared-Memory Architectures.
Proceedings of the Parallel Computing: Trends and Applications, 1993

Parallel Triangular Decompositions of an Oil Refining Simulation.
Proceedings of the 7th international conference on Supercomputing, 1993

1992
Dynamic and static load balancing for solving block bordered circuit equations on multiprocessors.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1992

Parallel Methods for Solving Nonlinear Block Bordered Systems of Equations.
SIAM J. Sci. Comput., 1992

1991
Performance Prediction and Evaluation of Parallel Processing on a NUMA Multiprocessor.
IEEE Trans. Software Eng., 1991

Performance Measurement and Modeling to Evaluate Various Effects on a Shared Memory Multiprocessor.
IEEE Trans. Software Eng., 1991

System effects of interprocessor communication latency in multicomputers.
IEEE Micro, 1991

Parallel Block Triangular Decompositions for Solving Sparse Nonlinear Systems of Equations.
Proceedings of the Fifth SIAM Conference on Parallel Processing for Scientific Computing, 1991

Dynamic and static load scheduling performance on a NUMA shared memory multiprocessor.
Proceedings of the 5th international conference on Supercomputing, 1991

1990
Distributed task processing performance on a NUMA shared memory multiprocessor.
Proceedings of the Second IEEE Symposium on Parallel and Distributed Processing, 1990

