Xiaosong Ma

ORCID: 0000-0003-1261-2496

Affiliations:
  • North Carolina State University, Raleigh, NC, USA


According to our database, Xiaosong Ma authored at least 118 papers between 1999 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Model Decomposition and Reassembly for Purified Knowledge Transfer in Personalized Federated Learning.
IEEE Trans. Mob. Comput., January, 2025

2024
Amend to Alignment: Decoupled Prompt Tuning for Mitigating Spurious Correlation in Vision-Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
End-to-end I/O Monitoring on Leading Supercomputers.
ACM Trans. Storage, February, 2023

Understand Data Preprocessing for Effective End-to-End Training of Deep Neural Networks.
CoRR, 2023

SwapPrompt: Test-Time Prompt Adaptation for Vision-Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Towards Unbiased Training in Federated Open-world Semi-supervised Learning.
Proceedings of the International Conference on Machine Learning, 2023

FrozenHot Cache: Rethinking Cache Management for Modern Hardware.
Proceedings of the Eighteenth European Conference on Computer Systems, 2023

Persistent Memory Disaggregation for Cloud-Native Relational Databases.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

GZKP: A GPU Accelerated Zero-Knowledge Proof System.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
Ten Years after ImageNet: A 360° Perspective on AI.
CoRR, 2022

Shared, High-Performance Software Components for Shared, High-Performance Hardware.
Proceedings of the 1st International Workshop on Composable Data Management Systems, 2022

Fast optical frequency detection techniques for coherent distributed sensing and communication systems.
Proceedings of the Optical Fiber Communications Conference and Exhibition, 2022

Layer-wised Model Aggregation for Personalized Federated Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Leveraging NVMe SSDs for Building a Fast, Cost-effective, LSM-tree-based KV Store.
ACM Trans. Storage, 2021

Random Walks on Huge Graphs at Cache Efficiency.
Proceedings of the SOSP '21: ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021

Parameterized Knowledge Transfer for Personalized Federated Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

You Can Hear But You Cannot Record: Privacy Protection by Jamming Audio Recording.
Proceedings of the ICC 2021, 2021

FusionRAID: Achieving Consistent Low Latency for Commodity SSD Arrays.
Proceedings of the 19th USENIX Conference on File and Storage Technologies, 2021

SpanDB: A Fast, Cost-Effective LSM-tree Based KV Store on Hybrid Storage.
Proceedings of the 19th USENIX Conference on File and Storage Technologies, 2021

2020
Determining Data Distribution for Large Disk Enclosures with 3-D Data Templates.
ACM Trans. Storage, 2020

LiveGraph: A Transactional Graph Storage System with Purely Sequential Adjacency List Scans.
Proc. VLDB Endow., 2020

QarSUMO: A Parallel, Congestion-optimized Traffic Simulator.
Proceedings of the SIGSPATIAL '20: 28th International Conference on Advances in Geographic Information Systems, 2020

2019
LiveGraph: A Transactional Graph Storage System with Purely Sequential Adjacency List Scans.
CoRR, 2019

KnightKing: a fast distributed graph random walk engine.
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019

Spread-n-share: improving application performance and cluster throughput with resource-aware job placement.
Proceedings of the International Conference for High Performance Computing, 2019

End-to-end I/O Monitoring on a Leading Supercomputer.
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, 2019

Automatic, Application-Aware I/O Forwarding Resource Allocation.
Proceedings of the 17th USENIX Conference on File and Storage Technologies, 2019

2018
Spindle: Informed Memory Access Monitoring.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

ShenTu: processing multi-trillion edge graphs on millions of cores in seconds.
Proceedings of the International Conference for High Performance Computing, 2018

Exploiting Locality in Graph Analytics through Hardware-Accelerated Traversal Scheduling.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

KPart: A Hybrid Cache Partitioning-Sharing Technique for Commodity Multicores.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

RAID+: Deterministic and Balanced Data Distribution for Large Disk Enclosures.
Proceedings of the 16th USENIX Conference on File and Storage Technologies, 2018

2017
Understanding object-level memory access patterns across the spectrum.
Proceedings of the International Conference for High Performance Computing, 2017

POSTER: Improving Datacenter Efficiency Through Partitioning-Aware Scheduling.
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016
Building Semi-Elastic Virtual Clusters for Cost-Effective HPC Cloud Resource Provisioning.
IEEE Trans. Parallel Distributed Syst., 2016

MPI-ACC: Accelerator-Aware MPI for Scientific Applications.
IEEE Trans. Parallel Distributed Syst., 2016

S-RAC: SSD Friendly Caching for Data Center Workloads.
Proceedings of the 9th ACM International on Systems and Storage Conference, 2016

Server-side log data analytics for I/O workload characterization and coordination on large shared storage systems.
Proceedings of the International Conference for High Performance Computing, 2016

Gemini: A Computation-Centric Distributed Graph Processing System.
Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, 2016

2015
Automatic Cloud I/O Configurator for I/O Intensive Parallel Applications.
IEEE Trans. Parallel Distributed Syst., 2015

Combining Phase Identification and Statistic Modeling for Automated Parallel Benchmark Generation.
Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2015

AsHES Introduction and Committees.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Cost-Effective Resource Configuration for Cloud Video Streaming Services.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

2014
A Wireless MEMS Inertial Switch for Measuring Both Threshold Triggering Acceleration and Response Time.
IEEE Trans. Instrum. Meas., 2014

vCacheShare: Automated Server Flash Cache Space Management in a Virtualization Environment.
Proceedings of the 2014 USENIX Annual Technical Conference, 2014

CYPRESS: Combining Static and Dynamic Analysis for Top-Down Communication Trace Compression.
Proceedings of the International Conference for High Performance Computing, 2014

On the Feasibility of Data Loss Insurance for Personal Cloud Storage.
Proceedings of the 6th USENIX Workshop on Hot Topics in Cloud Computing, 2014

Automatic identification of application I/O signatures from noisy server-side traces.
Proceedings of the 12th USENIX conference on File and Storage Technologies, 2014

2013
Cost-effective cloud HPC resource provisioning by building semi-elastic virtual clusters.
Proceedings of the International Conference for High Performance Computing, 2013

ACIC: automatic cloud I/O configurator for HPC applications.
Proceedings of the International Conference for High Performance Computing, 2013

Accelerating Batch Analytics with Residual Resources from Interactive Clouds.
Proceedings of the 2013 IEEE 21st International Symposium on Modelling, 2013

ACIC: automatic cloud I/O configurator for parallel applications.
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

Building and scaling virtual clusters with residual resources from interactive clouds.
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

On the efficacy of GPU-integrated MPI for scientific applications.
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

uCache: A Utility-Aware Multilevel SSD Cache Management Policy.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

Active flash: towards energy-efficient, in-situ data analytics on extreme-scale machines.
Proceedings of the 11th USENIX conference on File and Storage Technologies, 2013

PARLO: PArallel Run-Time Layout Optimization for Scientific Data Explorations with Heterogeneous Access Patterns.
Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

RSVM: A Region-based Software Virtual Memory for GPU.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
Quantum teleportation over 143 kilometres using active feed-forward.
Nat., 2012

Reliable MapReduce computing on opportunistic resources.
Clust. Comput., 2012

Reducing Data Movement Costs Using Energy-Efficient, Active Computation on SSD.
Proceedings of the 2012 Workshop on Power-Aware Computing Systems, HotPower'12, 2012

Employing Checkpoint to Improve Job Scheduling in Large-Scale Systems.
Proceedings of the Job Scheduling Strategies for Parallel Processing, 2012

NVMalloc: Exposing an Aggregate SSD Store as a Memory Partition in Extreme-Scale Machines.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Efficient Intranode Communication in GPU-Accelerated Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

DMA-Assisted, Intranode Communication in GPU Accelerated Systems.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

2011
Proceedings of the Encyclopedia of Parallel Computing, 2011

Coordinating Computation and I/O in Massively Parallel Sequence Search.
IEEE Trans. Parallel Distributed Syst., 2011

Transparent runtime parallelization of the R scripting language.
J. Parallel Distributed Comput., 2011

Cloud versus in-house cluster: evaluating Amazon cluster compute instances for running MPI applications.
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

EMFS: Email-based Personal Cloud Storage.
Proceedings of the Sixth International Conference on Networking, Architecture, and Storage, 2011

Using Shared Memory to Accelerate MapReduce on Graphics Processing Units.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Probabilistic Communication and I/O Tracing with Deterministic Replay at Scale.
Proceedings of the International Conference on Parallel Processing, 2011

Skel: Generative Software for Producing Skeletal I/O Applications.
Proceedings of the IEEE 7th International Conference on E-Science, 2011

One optimized I/O configuration per HPC application: leveraging the configurability of cloud.
Proceedings of the APSys '11 Asia Pacific Workshop on Systems, 2011

2010
A fast moisture sensitivity level qualification method.
Microelectron. Reliab., 2010

Functional Partitioning to Optimize End-to-End Performance on Many-core Architectures.
Proceedings of the Conference on High Performance Computing Networking, 2010

MOON: MapReduce On Opportunistic eNvironments.
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010

2009
Fast reliability qualification of SiP products.
Microelectron. Reliab., 2009

Improving the availability of supercomputer job input data using temporal replication.
Comput. Sci. Res. Dev., 2009

Improving Data Availability for Better Access Performance: A Study on Caching Scientific Data on Distributed Desktop Workstations.
J. Grid Comput., 2009

Energy and performance impact of aggressive volunteer computing with multi-core computers.
Proceedings of the 17th Annual Meeting of the IEEE/ACM International Symposium on Modelling, 2009

SigLM: Signature-driven load management for cloud computing infrastructures.
Proceedings of the 17th International Workshop on Quality of Service, 2009

Machine learning based online performance prediction for runtime parallelization and task scheduling.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2009

Memory resource allocation for file system prefetching: from a supply chain management perspective.
Proceedings of the 2009 EuroSys Conference, Nuremberg, Germany, April 1-3, 2009, 2009

2008
Adaptive Request Scheduling for Parallel Scientific Web Services.
Proceedings of the Scientific and Statistical Database Management, 2008

Massively parallel genomic sequence search on the Blue Gene/P architecture.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Semantics-based distributed I/O for mpiBLAST.
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008

On-the-Fly Recovery of Job Input Data in Supercomputers.
Proceedings of the 2008 International Conference on Parallel Processing, 2008

PFC: Transparent Optimization of Existing Prefetching Strategies for Multi-Level Storage Systems.
Proceedings of the 28th IEEE International Conference on Distributed Computing Systems (ICDCS 2008), 2008

2007
Recovering transient data: automated on-demand data reconstruction and offloading for supercomputers.
ACM SIGOPS Oper. Syst. Rev., 2007

Characterization of moisture properties of polymers for IC packaging.
Microelectron. Reliab., 2007

Optimizing center performance through coordinated data staging, scheduling and recovery.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Automatic Parallelization of Scripting Languages: Toward Transparent Desktop Parallel Computing.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Cyberinfrastructure for Contamination Source Characterization in Water Distribution Systems.
Proceedings of the Computational Science, 2007

2006
High-Level Buffering for Hiding Periodic Output Cost in Scientific Simulations.
IEEE Trans. Parallel Distributed Syst., 2006

Constructing collaborative desktop storage caches for large scientific datasets.
ACM Trans. Storage, 2006

Grid applications - Parallel genomic sequence-searching on an ad-hoc grid: experiences, lessons learned, and implications.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Coupling prefix caching and collective downloads for remote dataset access.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Exploring I/O Strategies for Parallel Sequence-Search Tools with S3aSim.
Proceedings of the 15th IEEE International Symposium on High Performance Distributed Computing, 2006

Positioning Dynamic Storage Caches for Transient Data.
Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006

2005
Cross-Platform Performance Prediction of Parallel Applications Using Partial Execution.
Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

FreeLoader: Scavenging Desktop Storage Resources for Scientific Data.
Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Efficient Data Access for Parallel BLAST.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Governor: Autonomic Throttling for Aggressive Idle Resource Scavenging.
Proceedings of the Second International Conference on Autonomic Computing (ICAC 2005), 2005

2004
GODIVA: Lightweight Data Management for Scientific Visualization Applications.
Proceedings of the 20th International Conference on Data Engineering, 2004

RFS: efficient and flexible remote file access for MPI-IO.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004

2003
Hiding Periodic I/O Costs in Parallel Applications
PhD thesis, 2003

Two-Body Job Searches.
SIGMOD Rec., 2003

Declustering Large Multidimensional Data Sets for Range Queries over Heterogeneous Disks.
Proceedings of the 15th International Conference on Scientific and Statistical Database Management (SSDBM 2003), 2003

Improving MPI-IO Output Performance with Active Buffering Plus Threads.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Flexible and Efficient Parallel I/O for Large-Scale Multi-Component Simulations.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

2002
Automatic and portable performance modeling for parallel I/O: a machine-learning approach.
SIGMETRICS Perform. Evaluation Rev., 2002

Faster Collective Output through Active Buffering.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Enhancing Data Migration Performance via Parallel Data Compression.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Active buffering plus compressed migration: an integrated solution to parallel simulations' data transport needs.
Proceedings of the 16th international conference on Supercomputing, 2002

2001
Tuning high-performance scientific codes: the use of performance models to control resource usage during data migration and I/O.
Proceedings of the 15th international conference on Supercomputing, 2001

2000
PRUNES: an efficient and complete strategy for automated trust negotiation over the Internet.
Proceedings of the CCS 2000, 2000

1999
SIFFEA: Scalable Integrated Framework for Finite Element Analysis.
Proceedings of the Computing in Object-Oriented Parallel Environments, 1999

