2025
Towards Workload-aware Cloud Efficiency: A Large-scale Empirical Study of Cloud Workload Characteristics.
Proceedings of the 16th ACM/SPEC International Conference on Performance Engineering, 2025
2024
Towards Cloud Efficiency with Large-scale Workload Characterization.
CoRR, 2024
Workload Intelligence: Punching Holes Through the Cloud Abstraction.
CoRR, 2024
Fast and Accurate DNN Performance Estimation across Diverse Hardware Platforms.
Proceedings of the 32nd International Conference on Modeling, 2024
TraceUpscaler: Upscaling Traces to Evaluate Systems at High Load.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024
AutoBurst: Autoscaling Burstable Instances for Cost-effective Latency SLOs.
Proceedings of the 2024 ACM Symposium on Cloud Computing, 2024
2023
SplitRPC: A {Control + Data} Path Splitting RPC Stack for ML Inference Serving.
Proc. ACM Meas. Anal. Comput. Syst., 2023
Kerveros: Efficient and Scalable Cloud Admission Control.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023
2022
Overflowing emerging neural network inference tasks from the GPU to the CPU on heterogeneous servers.
Proceedings of the SYSTOR '22: The 15th ACM International Systems and Storage Conference, Haifa, Israel, June 13, 2022
Metastable Failures in the Wild.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022
2021
Metastable failures in distributed systems.
Proceedings of the HotOS '21: Workshop on Hot Topics in Operating Systems, 2021
TraceSplitter: a new paradigm for downscaling traces.
Proceedings of the EuroSys '21: Sixteenth European Conference on Computer Systems, 2021
tprof: Performance profiling via structural aggregation and automated analysis of distributed systems traces.
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021
2020
The Fast and The Frugal: Tail Latency Aware Provisioning for Coping with Load Variations.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020
Peafowl: in-application CPU scheduling to reduce power consumption of in-memory key-value stores.
Proceedings of the SoCC '20: ACM Symposium on Cloud Computing, 2020
2019
BurScale: Using Burstable Instances for Cost-Effective Autoscaling in the Public Cloud.
Proceedings of the ACM Symposium on Cloud Computing, SoCC 2019, 2019
2018
RobinHood: Tail Latency Aware Caching - Dynamic Reallocation from Cache-Rich to Cache-Poor.
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018
2017
WorkloadCompactor: reducing datacenter cost while providing tail latency SLO guarantees.
Proceedings of the 2017 Symposium on Cloud Computing, SoCC 2017, Santa Clara, CA, USA, 2017
2016
TetriSched: global rescheduling with adaptive plan-ahead in dynamic heterogeneous clusters.
Proceedings of the Eleventh European Conference on Computer Systems, 2016
SNC-Meister: Admitting More Tenants with Tail Latency SLOs.
Proceedings of the Seventh ACM Symposium on Cloud Computing, 2016
2014
PriorityMeister: Tail Latency QoS for Shared Networked Storage.
Proceedings of the ACM Symposium on Cloud Computing, 2014
2013
IOFlow: a software-defined storage architecture.
Proceedings of the ACM SIGOPS 24th Symposium on Operating Systems Principles, 2013
2012
SOFTScale: Stealing Opportunistically for Transient Scaling.
Proceedings of the Middleware 2012, 2012
Saving Cash by Using Less Cache.
Proceedings of the 4th USENIX Workshop on Hot Topics in Cloud Computing, 2012