Chao Li
ORCID: 0000-0001-6218-4659
Affiliations:
- Shanghai Jiao Tong University, Department of Computer Science and Engineering, China
- University of Florida, Gainesville, FL, USA (PhD 2014)
According to our database, Chao Li authored at least 127 papers between 2011 and 2024.
Online presence:
- on orcid.org
- on dl.acm.org
Bibliography
2024
IEEE Trans. Parallel Distributed Syst., August, 2024
IEEE Trans. Parallel Distributed Syst., July, 2024
FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework.
Proc. VLDB Endow., April, 2024
Weighted doubly robust learning: An uplift modeling technique for estimating mixed treatments' effect.
Decis. Support Syst., January, 2024
CoRR, 2024
SOFA: A Compute-Memory Optimized Sparsity Accelerator via Cross-Stage Coordinated Tiling.
CoRR, 2024
CoRR, 2024
Proceedings of the 32nd IEEE/ACM International Symposium on Quality of Service, 2024
Exploiting Similarity Opportunities of Emerging Vision AI Models on Hybrid Bonding Architecture.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
A Tale of Two Domains: Exploring Efficient Architecture Design for Truly Autonomous Things.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
M²SN: Adaptive and Dynamic Multi-modal Shortcut Network Architecture for Latency-Aware Applications.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
LoRAExit: Empowering Dynamic Modulation of LLMs in Resource-limited Settings using Low-rank Adapters.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Proceedings of the 24th IEEE International Symposium on Cluster, 2024
FaaSGraph: Enabling Scalable, Efficient, and Cost-Effective Graph Processing with Serverless Computing.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
JUNO: Optimizing High-Dimensional Approximate Nearest Neighbour Search with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Fractal: Joint Multi-Level Sparse Pattern Tuning of Accuracy and Performance for DNN Pruning.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
Frontiers Comput. Sci., October, 2023
IEEE Trans. Computers, September, 2023
Fargraph+: Excavating the parallelism of graph processing workload on RDMA-based far memory system.
J. Parallel Distributed Comput., July, 2023
ACM Trans. Design Autom. Electr. Syst., January, 2023
DFlow: Efficient Dataflow-based Invocation Workflow Execution for Function-as-a-Service.
CoRR, 2023
BLAD: Adaptive Load Balanced Scheduling and Operator Overlap Pipeline For Accelerating The Dynamic GNN Training.
Proceedings of the International Conference for High Performance Computing, 2023
SMG: A System-Level Modality Gating Facility for Fast and Energy-Efficient Multimodal Computing.
Proceedings of the IEEE Real-Time Systems Symposium, 2023
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
FIRST: Exploiting the Multi-Dimensional Attributes of Functions for Power-Aware Serverless Computing.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
MMBench: Benchmarking End-to-End Multi-modal DNNs and Understanding Their Hardware-Software Implications.
Proceedings of the IEEE International Symposium on Workload Characterization, 2023
MMExit: Enabling Fast and Efficient Multi-modal DNN Inference with Adaptive Network Exits.
Proceedings of the Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28, 2023
Not All Resources are Visible: Exploiting Fragmented Shadow Resources in Shared-State Scheduler Architecture.
Proceedings of the 2023 ACM Symposium on Cloud Computing, SoCC 2023, 2023
uGrapher: High-Performance Graph Operator Computation via Unified Abstraction for Graph Neural Networks.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
2022
IEEE Trans. Cloud Comput., 2022
IEEE Trans. Computers, 2022
Performance optimization for cloud computing systems in the microservice era: state-of-the-art and research opportunities.
Frontiers Comput. Sci., 2022
Analyzing the Hardware-Software Implications of Multi-modal DNN Workloads using MMBench.
CoRR, 2022
IEEE Comput. Archit. Lett., 2022
Proceedings of the ICPE '22: ACM/SPEC International Conference on Performance Engineering, Beijing, China, April 9, 2022
Help Rather Than Recycle: Alleviating Cold Startup in Serverless Computing Through Inter-Function Container Sharing.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022
DVABatch: Diversity-aware Multi-Entry Multi-Exit Batching for Efficient Processing of DNN Services on GPUs.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022
Proceedings of the Network and Parallel Computing, 2022
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022
PAME: precision-aware multi-exit DNN serving for reducing latencies of batched inferences.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022
Proceedings of the IEEE 40th International Conference on Computer Design, 2022
Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022
SALO: an efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022
VELTAIR: towards high-performance multi-tenant deep learning services via adaptive compilation and scheduling.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022
2021
Exploring Highly Dependable and Efficient Datacenter Power System Using Hybrid and Hierarchical Energy Buffers.
IEEE Trans. Sustain. Comput., 2021
ACM Trans. Reconfigurable Technol. Syst., 2021
ACM Trans. Archit. Code Optim., 2021
Fangorn: Adaptive Execution Framework for Heterogeneous Workloads on Shared Clusters.
Proc. VLDB Endow., 2021
ZIPPER: Exploiting Tile- and Operator-level Parallelism for General and Scalable Graph Neural Network Acceleration.
CoRR, 2021
Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction.
Proceedings of the International Conference for High Performance Computing, 2021
AuTraScale: An Automated and Transfer Learning Solution for Streaming System Auto-Scaling.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021
AlphaR: Learning-Powered Resource Management for Irregular, Dynamic Microservice Graph.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021
Proceedings of the 39th IEEE International Conference on Computer Design, 2021
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021
2020
J. Parallel Distributed Comput., 2020
Towards QoS-Aware and Resource-Efficient GPU Microservices Based on Spatial Multitasking GPUs In Datacenters.
CoRR, 2020
IEEE Comput. Archit. Lett., 2020
Proceedings of the International Conference for High Performance Computing, 2020
DLFusion: An Auto-Tuning Compiler for Layer Fusion on Deep Neural Network Accelerator.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2020
Sturgeon: Preference-aware Co-location for Improving Utilization of Power Constrained Computers.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020
Proceedings of the 40th IEEE International Conference on Distributed Computing Systems, 2020
Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020
How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
2019
CongraPlus: Towards Efficient Processing of Concurrent Graph Queries on NUMA Machines.
IEEE Trans. Parallel Distributed Syst., 2019
IEEE Trans. Computers, 2019
IEEE Trans. Computers, 2019
Bandwidth and Locality Aware Task-stealing for Manycore Architectures with Bandwidth-Asymmetric Memory.
ACM Trans. Archit. Code Optim., 2019
A Comprehensive Rearranging Priority Based Method To Accelerate the Reconstruction of RAID Arrays.
Proceedings of the 38th International Symposium on Reliable Distributed Systems Workshops, 2019
Characterizing and orchestrating NFV-ready servers for efficient edge data processing.
Proceedings of the International Symposium on Quality of Service, 2019
SprintCon: Controllable and Efficient Computational Sprinting for Data Center Servers.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019
Themis: Predicting and Reining in Application-Level Slowdown on Spatial Multitasking GPUs.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019
Laius: Towards latency awareness and improved utilization of spatial multitasking accelerators in datacenters.
Proceedings of the ACM International Conference on Supercomputing, 2019
Avalon: towards QoS awareness and improved utilization through multi-resource management in datacenters.
Proceedings of the ACM International Conference on Supercomputing, 2019
When Power Oversubscription Meets Traffic Flood Attack: Re-Thinking Data Center Peak Load Management.
Proceedings of the 48th International Conference on Parallel Processing, 2019
Unleashing the Scalability Potential of Power-Constrained Data Center in the Microservice Era.
Proceedings of the 48th International Conference on Parallel Processing, 2019
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
PSL: Exploiting Parallelism, Sparsity and Locality to Accelerate Matrix Factorization on x86 Platforms.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2019
2018
IEEE Trans. Parallel Distributed Syst., 2018
Exploring Customizable Heterogeneous Power Distribution and Management for Datacenter.
IEEE Trans. Parallel Distributed Syst., 2018
Edge-Oriented Computing Paradigms: A Survey on Architecture Design and System Management.
ACM Comput. Surv., 2018
Dynamic allocation of power delivery paths in consolidated data centers based on adaptive UPS switching.
Comput. Networks, 2018
Power Grab in Aggressively Provisioned Data Centers: What is the Risk and What Can Be Done About It.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018
Proceedings of the 36th IEEE International Conference on Computer Design, 2018
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018
Proceedings of the IEEE Global Communications Conference, 2018
2017
IEEE Trans. Very Large Scale Integr. Syst., 2017
IEEE Trans. Parallel Distributed Syst., 2017
IEEE Trans. Parallel Distributed Syst., 2017
Congra: Towards Efficient Processing of Concurrent Graph Queries on Shared-Memory Machines.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017
2016
RE-UPS: an adaptive distributed energy storage system for dynamically managing solar energy in green datacenters.
J. Supercomput., 2016
IEEE Trans. Computers, 2016
ACM Trans. Auton. Adapt. Syst., 2016
Cache-emulated register file: An integrated on-chip memory architecture for high performance GPGPUs.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016
Proceedings of the 2016 International Conference on Supercomputing, 2016
Proceedings of the 2016 International Conference on Supercomputing, 2016
Bridging the Semantic Gaps of GPU Acceleration for Scale-out CNN-based Big Data Processing: Think Big, See Small.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016
2015
IEEE Comput. Archit. Lett., 2015
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015
HEB: deploying and managing hybrid energy buffers for improving datacenter efficiency and economy.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
Proceedings of the 2015 IEEE International Symposium on Workload Characterization, 2015
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015
Proceedings of the 44th International Conference on Parallel Processing, 2015
Proceedings of the 33rd IEEE International Conference on Computer Design, 2015
Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015
2014
Towards Automated Provisioning and Emergency Handling in Renewable Energy Powered Datacenters.
J. Comput. Sci. Technol., 2014
Understanding the Impact of vCPU Scheduling on DVFS-Based Power Management in Virtualized Cloud Environment.
Proceedings of the IEEE 22nd International Symposium on Modelling, 2014
Proceedings of the 11th International Conference on Autonomic Computing, 2014
2013
Optimizing virtual machine live storage migration in heterogeneous storage environment.
Proceedings of the ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (co-located with ASPLOS 2013), 2013
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013
Chameleon: Adapting throughput server to time-varying green power budget using online learning.
Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013
2012
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012
2011
Proceedings of the SIGMETRICS 2011, 2011
Proceedings of the 2011 IEEE International Symposium on Workload Characterization, 2011
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011