2024
Understanding and Alleviating Memory Consumption in RLHF for LLMs.
CoRR, 2024
Scaler: Efficient and Effective Cross Flow Analysis.
CoRR, 2024
ProTrain: Efficient LLM Training via Memory-Aware Techniques.
CoRR, 2024
AdapMTL: Adaptive Pruning Framework for Multitask Learning Model.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024
Scaler: Efficient and Effective Cross Flow Analysis.
Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 2024
Exploring Performance and Cost Optimization with ASIC-Based CXL Memory.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024
Improving Resource and Energy Efficiency for Cloud 3D through Excessive Rendering Reduction.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024
2023
MemPerf: Profiling Allocator-Induced Performance Slowdowns.
Proc. ACM Program. Lang., October, 2023
NUMAlloc: A Faster NUMA Memory Allocator.
Proceedings of the 2023 ACM SIGPLAN International Symposium on Memory Management, 2023
Profile Dynamic Memory Allocation in Autonomous Driving Software.
Proceedings of the 10th International Conference on Dependable Systems and Their Applications, 2023
2022
CachePerf: A Unified Cache Miss Classifier via Hybrid Hardware Sampling.
Proceedings of the SIGMETRICS/PERFORMANCE '22: ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, Mumbai, India, June 6, 2022
Deadlock prediction via generalized dependency.
Proceedings of the ISSTA '22: 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, Virtual Event, South Korea, July 18, 2022
2021
GraphZero: A High-Performance Subgraph Matching System.
ACM SIGOPS Oper. Syst. Rev., 2021
NumaPerf: Predictive and Full NUMA Profiling.
CoRR, 2021
FreeLunch: Compression-based GPU Memory Management for Convolutional Neural Networks.
Proceedings of the IEEE/ACM Workshop on Memory Centric High Performance Computing, 2021
NumaPerf: predictive NUMA profiling.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021
Dryadic: Flexible and Fast Graph Pattern Matching at Scale.
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021
2020
WATCHER: in-situ failure diagnosis.
Proc. ACM Program. Lang., 2020
Prober: Practically Defending Overflows with Page Protection.
Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020
2019
GraphZero: Breaking Symmetry for Efficient Graph Mining.
CoRR, 2019
CSOD: Context-Sensitive Overflow Detection.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019
2018
Guarder: A Tunable Secure Allocator.
Proceedings of the 27th USENIX Security Symposium, 2018
A User Space-based Project for Practicing Core Memory Management Concepts.
Proceedings of the 49th ACM Technical Symposium on Computer Science Education, 2018
iReplayer: in-situ and identical record-and-replay for multithreaded applications.
Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2018
Sampler: PMU-Based Sampling to Detect Memory Errors Latent in Production Software.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
2017
UNDEAD: detecting and preventing deadlocks in production software.
Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, 2017
SyncPerf: Categorizing, Detecting, and Diagnosing Synchronization Performance Bugs.
Proceedings of the Twelfth European Conference on Computer Systems, 2017
FreeGuard: A Faster Secure Heap Allocator.
Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 2017
2016
DoubleTake: fast and precise error detection via evidence-based dynamic analysis.
Proceedings of the 38th International Conference on Software Engineering, 2016
Cheetah: detecting false sharing efficiently and effectively.
Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016
2015
Foreseer: Workload-Aware Data Storage for MapReduce.
Proceedings of the 35th IEEE International Conference on Distributed Computing Systems, 2015
2014
PREDATOR: predictive false sharing detection.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014
2011
Dthreads: efficient deterministic multithreading.
Proceedings of the 23rd ACM Symposium on Operating Systems Principles, 2011
SHERIFF: precise detection and automatic mitigation of false sharing.
Proceedings of the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2011
2009
Grace: safe multithreaded programming for C/C++.
Proceedings of the 24th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2009
2008
Redline: First Class Support for Interactivity in Commodity Operating Systems.
Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation, 2008