Chao Li

Orcid: 0000-0001-6218-4659

Affiliations:
  • Shanghai Jiao Tong University, Department of Computer Science and Engineering, China
  • University of Florida, Gainesville, FL, USA (PhD 2014)


According to our database, Chao Li authored at least 127 papers between 2011 and 2024.

Bibliography

2024
WASP: Efficient Power Management Enabling Workload-Aware, Self-Powered AIoT Devices.
IEEE Trans. Parallel Distributed Syst., August, 2024

Bayesian-Driven Automated Scaling in Stream Computing With Multiple QoS Targets.
IEEE Trans. Parallel Distributed Syst., July, 2024

FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework.
Proc. VLDB Endow., April, 2024

Weighted doubly robust learning: An uplift modeling technique for estimating mixed treatments' effect.
Decis. Support Syst., January, 2024

AutoVCoder: A Systematic Framework for Automated Verilog Code Generation using LLMs.
CoRR, 2024

SOFA: A Compute-Memory Optimized Sparsity Accelerator via Cross-Stage Coordinated Tiling.
CoRR, 2024

Efficient Orchestrated AI Workflows Execution on Scale-out Spatial Architecture.
CoRR, 2024

Towards Fast Setup and High Throughput of GPU Serverless Computing.
CoRR, 2024

CPM: A Cross-layer Power Management Facility to Enable QoS-Aware AIoT Systems.
Proceedings of the 32nd IEEE/ACM International Symposium on Quality of Service, 2024

Exploiting Similarity Opportunities of Emerging Vision AI Models on Hybrid Bonding Architecture.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

A Tale of Two Domains: Exploring Efficient Architecture Design for Truly Autonomous Things.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

CoCG: Fine-grained Cloud Game Co-location on Heterogeneous Platform.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

M²SN: Adaptive and Dynamic Multi-modal Shortcut Network Architecture for Latency-Aware Applications.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

LoRAExit: Empowering Dynamic Modulation of LLMs in Resource-limited Settings using Low-rank Adapters.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Improving the Efficiency of Serverless Computing via Core-Level Power Management.
Proceedings of the 24th IEEE International Symposium on Cluster, 2024

FaaSGraph: Enabling Scalable, Efficient, and Cost-Effective Graph Processing with Serverless Computing.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

JUNO: Optimizing High-Dimensional Approximate Nearest Neighbour Search with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

Fractal: Joint Multi-Level Sparse Pattern Tuning of Accuracy and Performance for DNN Pruning.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

Amanda: Unified Instrumentation Framework for Deep Neural Networks.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
FPGA sharing in the cloud: a comprehensive analysis.
Frontiers Comput. Sci., October, 2023

Optimizing GPU-Based Graph Sampling and Random Walk for Efficiency and Scalability.
IEEE Trans. Computers, September, 2023

Fargraph+: Excavating the parallelism of graph processing workload on RDMA-based far memory system.
J. Parallel Distributed Comput., July, 2023

DRAGON: Dynamic Recurrent Accelerator for Graph Online Convolution.
ACM Trans. Design Autom. Electr. Syst., January, 2023

DFlow: Efficient Dataflow-based Invocation Workflow Execution for Function-as-a-Service.
CoRR, 2023

BLAD: Adaptive Load Balanced Scheduling and Operator Overlap Pipeline For Accelerating The Dynamic GNN Training.
Proceedings of the International Conference for High Performance Computing, 2023

SMG: A System-Level Modality Gating Facility for Fast and Energy-Efficient Multimodal Computing.
Proceedings of the IEEE Real-Time Systems Symposium, 2023

High-Throughput GPU Random Walk with Fine-Tuned Concurrent Query Processing.
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

Architecting Efficient Multi-modal AIoT Systems.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

FIRST: Exploiting the Multi-Dimensional Attributes of Functions for Power-Aware Serverless Computing.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

MMBench: Benchmarking End-to-End Multi-modal DNNs and Understanding Their Hardware-Software Implications.
Proceedings of the IEEE International Symposium on Workload Characterization, 2023

MMExit: Enabling Fast and Efficient Multi-modal DNN Inference with Adaptive Network Exits.
Proceedings of the Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28, 2023

Not All Resources are Visible: Exploiting Fragmented Shadow Resources in Shared-State Scheduler Architecture.
Proceedings of the 2023 ACM Symposium on Cloud Computing, SoCC 2023, 2023

uGrapher: High-Performance Graph Operator Computation via Unified Abstraction for Graph Neural Networks.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
Integrated Power Anomaly Defense: Towards Oversubscription-Safe Data Centers.
IEEE Trans. Cloud Comput., 2022

Tapping into NFV Environment for Opportunistic Serverless Edge Function Deployment.
IEEE Trans. Computers, 2022

Performance optimization for cloud computing systems in the microservice era: state-of-the-art and research opportunities.
Frontiers Comput. Sci., 2022

Analyzing the Hardware-Software Implications of Multi-modal DNN Workloads using MMBench.
CoRR, 2022

Characterizing and Understanding End-to-End Multi-Modal Neural Networks on GPUs.
IEEE Comput. Archit. Lett., 2022

Oversubscribing GPU Unified Virtual Memory: Implications and Suggestions.
Proceedings of the ICPE '22: ACM/SPEC International Conference on Performance Engineering, Beijing, China, April 9, 2022

Help Rather Than Recycle: Alleviating Cold Startup in Serverless Computing Through Inter-Function Container Sharing.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022

DVABatch: Diversity-aware Multi-Entry Multi-Exit Batching for Efficient Processing of DNN Services on GPUs.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022

Cloud-Native Server Consolidation for Energy-Efficient FaaS Deployment.
Proceedings of the Network and Parallel Computing, 2022

Excavating the Potential of Graph Workload on RDMA-based Far Memory Architecture.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Exploring Efficient Microservice Level Parallelism.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

PAME: precision-aware multi-exit DNN serving for reducing latencies of batched inferences.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

HyFarM: Task Orchestration on Hybrid Far Memory for High Performance Per Bit.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

SALO: an efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

VELTAIR: towards high-performance multi-tenant deep learning services via adaptive compilation and scheduling.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021
Exploring Highly Dependable and Efficient Datacenter Power System Using Hybrid and Hierarchical Energy Buffers.
IEEE Trans. Sustain. Comput., 2021

ACE-GCN: A Fast Data-driven FPGA Accelerator for GCN Embedding.
ACM Trans. Reconfigurable Technol. Syst., 2021

Grus: Toward Unified-memory-efficient High-performance Graph Processing on GPU.
ACM Trans. Archit. Code Optim., 2021

Fangorn: Adaptive Execution Framework for Heterogeneous Workloads on Shared Clusters.
Proc. VLDB Endow., 2021

Preface.
J. Comput. Sci. Technol., 2021

ZIPPER: Exploiting Tile- and Operator-level Parallelism for General and Scalable Graph Neural Network Acceleration.
CoRR, 2021

Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction.
Proceedings of the International Conference for High Performance Computing, 2021

AuTraScale: An Automated and Transfer Learning Solution for Streaming System Auto-Scaling.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

AlphaR: Learning-Powered Resource Management for Irregular, Dynamic Microservice Graph.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

CHARM: Collaborative Host and Accelerator Resource Management for GPU Datacenters.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs.
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

2020
Predicting and reining in application-level slowdown on spatial multitasking GPUs.
J. Parallel Distributed Comput., 2020

Towards QoS-Aware and Resource-Efficient GPU Microservices Based on Spatial Multitasking GPUs In Datacenters.
CoRR, 2020

Architectural Implications of Graph Neural Networks.
IEEE Comput. Archit. Lett., 2020

ANT-man: towards agile power management in the microservice era.
Proceedings of the International Conference for High Performance Computing, 2020

DLFusion: An Auto-Tuning Compiler for Layer Fusion on Deep Neural Network Accelerator.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2020

Sturgeon: Preference-aware Co-location for Improving Utilization of Power Constrained Computers.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

CODA: Improving Resource Utilization by Slimming and Co-locating DNN and CPU Jobs.
Proceedings of the 40th IEEE International Conference on Distributed Computing Systems, 2020

Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

2019
CongraPlus: Towards Efficient Processing of Concurrent Graph Queries on NUMA Machines.
IEEE Trans. Parallel Distributed Syst., 2019

Dapper: An Adaptive Manager for Large-Capacity Persistent Memory.
IEEE Trans. Computers, 2019

DR Refresh: Releasing DRAM Potential by Enabling Read Accesses Under Refresh.
IEEE Trans. Computers, 2019

Bandwidth and Locality Aware Task-stealing for Manycore Architectures with Bandwidth-Asymmetric Memory.
ACM Trans. Archit. Code Optim., 2019

Yugong: Geo-Distributed Data and Job Placement at Scale.
Proc. VLDB Endow., 2019

A Comprehensive Rearranging Priority Based Method To Accelerate the Reconstruction of RAID Arrays.
Proceedings of the 38th International Symposium on Reliable Distributed Systems Workshops, 2019

Characterizing and orchestrating NFV-ready servers for efficient edge data processing.
Proceedings of the International Symposium on Quality of Service, 2019

SprintCon: Controllable and Efficient Computational Sprinting for Data Center Servers.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Themis: Predicting and Reining in Application-Level Slowdown on Spatial Multitasking GPUs.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Excavating the Potential of GPU for Accelerating Graph Traversal.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Laius: Towards latency awareness and improved utilization of spatial multitasking accelerators in datacenters.
Proceedings of the ACM International Conference on Supercomputing, 2019

Avalon: towards QoS awareness and improved utilization through multi-resource management in datacenters.
Proceedings of the ACM International Conference on Supercomputing, 2019

When Power Oversubscription Meets Traffic Flood Attack: Re-Thinking Data Center Peak Load Management.
Proceedings of the 48th International Conference on Parallel Processing, 2019

Unleashing the Scalability Potential of Power-Constrained Data Center in the Microservice Era.
Proceedings of the 48th International Conference on Parallel Processing, 2019

Performance of Training Sparse Deep Neural Networks on GPUs.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

Adversarial Defense Through Network Profiling Based Path Extraction.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

PSL: Exploiting Parallelism, Sparsity and Locality to Accelerate Matrix Factorization on x86 Platforms.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2019

2018
IBOM: An Integrated and Balanced On-Chip Memory for High Performance GPGPUs.
IEEE Trans. Parallel Distributed Syst., 2018

Exploring Customizable Heterogeneous Power Distribution and Management for Datacenter.
IEEE Trans. Parallel Distributed Syst., 2018

Edge-Oriented Computing Paradigms: A Survey on Architecture Design and System Management.
ACM Comput. Surv., 2018

Dynamic allocation of power delivery paths in consolidated data centers based on adaptive UPS switching.
Comput. Networks, 2018

Power Grab in Aggressively Provisioned Data Centers: What is the Risk and What Can Be Done About It.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018

DR DRAM: Accelerating Memory-Read-Intensive Applications.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018

Adaptive Memory Fusion: Towards Transparent, Agile Integration of Persistent Memory.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

Your WiFi Knows How You Behave: Leveraging WiFi Channel Data for Behavior Analysis.
Proceedings of the IEEE Global Communications Conference, 2018

2017
Bank Stealing for a Compact and Efficient Register File Architecture in GPGPU.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Managing Battery Aging for High Energy Availability in Green Datacenters.
IEEE Trans. Parallel Distributed Syst., 2017

Oasis: Scaling Out Datacenter Sustainably and Economically.
IEEE Trans. Parallel Distributed Syst., 2017

Congra: Towards Efficient Processing of Concurrent Graph Queries on Shared-Memory Machines.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

2016
RE-UPS: an adaptive distributed energy storage system for dynamically managing solar energy in green datacenters.
J. Supercomput., 2016

Energy-Efficient eDRAM-Based On-Chip Storage Architecture for GPGPUs.
IEEE Trans. Computers, 2016

Managing Server Clusters on Renewable Energy Mix.
ACM Trans. Auton. Adapt. Syst., 2016

Cache-emulated register file: An integrated on-chip memory architecture for high performance GPGPUs.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Power Attack Defense: Securing Battery-Backed Data Centers.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Towards an Adaptive Multi-Power-Source Datacenter.
Proceedings of the 2016 International Conference on Supercomputing, 2016

HOPE: Enabling Efficient Service Orchestration in Software-Defined Data Centers.
Proceedings of the 2016 International Conference on Supercomputing, 2016

Bridging the Semantic Gaps of GPU Acceleration for Scale-out CNN-based Big Data Processing: Think Big, See Small.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
Leveraging Heterogeneous Power for Improving Datacenter Efficiency and Resiliency.
IEEE Comput. Archit. Lett., 2015

Bank stealing for conflict mitigation in GPGPU Register File.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

HEB: deploying and managing hybrid energy buffers for improving datacenter efficiency and economy.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Towards sustainable in-situ server systems in the big data era.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

On Power-Performance Characterization of Concurrent Throughput Kernels.
Proceedings of the 2015 IEEE International Symposium on Workload Characterization, 2015

Building Fuel Powered Supercomputing Data Center at Low Cost.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Exploring Hardware Profile-Guided Green Datacenter Scheduling.
Proceedings of the 44th International Conference on Parallel Processing, 2015

A novel TSV probing technique with adhesive test interposer.
Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

BAAT: Towards Dynamically Managing Battery Aging in Green Datacenters.
Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015

2014
Towards Automated Provisioning and Emergency Handling in Renewable Energy Powered Datacenters.
J. Comput. Sci. Technol., 2014

Understanding the Impact of vCPU Scheduling on DVFS-Based Power Management in Virtualized Cloud Environment.
Proceedings of the IEEE 22nd International Symposium on Modelling, 2014

Managing Green Datacenters Powered by Hybrid Renewable Energy Systems.
Proceedings of the 11th International Conference on Autonomic Computing, 2014

2013
Optimizing virtual machine live storage migration in heterogeneous storage environment.
Proceedings of the ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (co-located with ASPLOS 2013), 2013

Enabling datacenter servers to scale out economically and sustainably.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

Chameleon: Adapting throughput server to time-varying green power budget using online learning.
Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

Enabling distributed generation powered sustainable high-performance data center.
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

2012
iSwitch: Coordinating and optimizing renewable energy powered server clusters.
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012

2011
Characterizing and analyzing renewable energy driven data centers.
Proceedings of the SIGMETRICS 2011, 2011

A quantitative analysis of cooling power in container-based data centers.
Proceedings of the 2011 IEEE International Symposium on Workload Characterization, 2011

SolarCore: Solar energy driven multi-core architecture power management.
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

