Xiaodong Yu

ORCID: 0000-0001-6244-1264

Affiliations:
  • Stevens Institute of Technology, Hoboken, NJ, USA
  • Argonne National Laboratory, Lemont, IL, USA
  • Virginia Tech, Blacksburg, VA, USA (PhD 2019)


According to our database, Xiaodong Yu authored at least 39 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
A Survey on Error-Bounded Lossy Compression for Scientific Datasets.
CoRR, 2024

Centimani: Enabling Fast AI Accelerator Selection for DNN Training with a Novel Performance Predictor.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024

POSTER: Optimizing Collective Communications with Error-bounded Lossy Compression for GPU Clusters.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

An Optimized Error-controlled MPI Collective Framework Integrated with Lossy Compression.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

gZCCL: Compression-Accelerated Collective Communication Framework for GPU Clusters.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024

CereSZ: Enabling and Scaling Error-bounded Lossy Compression on Cerebras CS-2.
Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, 2024

A Portable, Fast, DCT-based Compressor for AI Accelerators.
Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, 2024

2023
gZCCL: Compression-Accelerated Collective Communication Framework for GPU Clusters.
CoRR, 2023

C-Coll: Introducing Error-bounded Lossy Compression into MPI Collectives.
CoRR, 2023

cuSZp: An Ultra-fast GPU Error-bounded Lossy Compression Framework with Optimized End-to-End Performance.
Proceedings of the International Conference for High Performance Computing, 2023

Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

GPU-Accelerated Error-Bounded Compression Framework for Quantum Circuit Simulations.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs.
Proceedings of the 37th International Conference on Supercomputing, 2023

Lightweight Huffman Coding for Efficient GPU Compression.
Proceedings of the 37th International Conference on Supercomputing, 2023

HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs.
Proceedings of the 37th International Conference on Supercomputing, 2023

FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Computing Applications on GPUs.
Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 2023

2022
SOLAR: A Highly Optimized Data Loading Framework for Distributed Training of CNN-based Scientific Surrogates.
CoRR, 2022

SZx: an Ultra-fast Error-bounded Lossy Compressor for Scientific Datasets.
CoRR, 2022

Optimizing Huffman Decoding for Error-Bounded Lossy Compression on GPUs.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Ultrafast Error-bounded Lossy Compression for Scientific Datasets.
Proceedings of the HPDC '22: The 31st International Symposium on High-Performance Parallel and Distributed Computing, Minneapolis, MN, USA, 27 June 2022, 2022

2021
Scalable and accurate multi-GPU based image reconstruction of large-scale ptychography data.
CoRR, 2021

cuSZ(x): Optimizing Error-Bounded Lossy Compression for Scientific Data on GPUs.
CoRR, 2021

High-Performance Ptychographic Reconstruction with Federated Facilities.
Proceedings of the Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation, 2021

Topology-aware optimizations for multi-GPU ptychographic image reconstruction.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

cuZ-Checker: A GPU-Based Ultra-Fast Assessment System for Lossy Compressions.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

Optimizing Error-Bounded Lossy Compression for Scientific Data on GPUs.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

2020
GPU-Based Static Data-Flow Analysis for Fast and Scalable Android App Vetting.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

2019
GPU-Based Iterative Medical CT Image Reconstructions.
J. Signal Process. Syst., 2019

Comparative Measurement of Cache Configurations' Impacts on Cache Timing Side-Channel Attacks.
Proceedings of the 12th USENIX Workshop on Cyber Security Experimentation and Test, 2019

2018
Novel meshes for multivariate interpolation and approximation.
Proceedings of the ACMSE 2018 Conference, Richmond, KY, USA, March 29-31, 2018, 2018

2017
A framework for fast and fair evaluation of automata processing hardware.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

Demystifying automata processing: GPUs, FPGAs or Micron's AP?
Proceedings of the International Conference on Supercomputing, 2017

An Enhanced Image Reconstruction Tool for Computed Tomography on CPUs.
Proceedings of the Computing Frontiers Conference, 2017

Robotomata: A framework for approximate pattern matching of big data on an automata processor.
Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017

2016
cuART: Fine-Grained Algebraic Reconstruction Technique for Computed Tomography Images on GPUs.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

O3FA: A Scalable Finite Automata-based Pattern-Matching Engine for Out-of-Order Deep Packet Inspection.
Proceedings of the 2016 Symposium on Architectures for Networking and Communications Systems, 2016

2014
Revisiting State Blow-Up: Automatically Building Augmented-FA While Preserving Functional Equivalence.
IEEE J. Sel. Areas Commun., 2014

2013
Exploring different automata representations for efficient regular expression matching on GPUs.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

GPU acceleration of regular expression matching for large datasets: exploring the implementation space.
Proceedings of the Computing Frontiers Conference, 2013

