Quan Chen

ORCID: 0000-0001-5832-0347

Affiliations:
  • Shanghai Jiao Tong University, Department of Computer Science and Engineering, Shanghai Institute for Advanced Communication and Data Science, China
  • University of Michigan, Ann Arbor, MI, USA (2014 - 2016)
  • Shanghai Jiao Tong University, School of Software, China (PhD 2014)


According to our database, Quan Chen authored at least 143 papers between 2010 and 2024.

Bibliography

2024
Versatile Low-Frequency Magnetoelectric Antenna With Memory in Computing Ability and Internet of Underground Things Application.
IEEE Internet Things J., October, 2024

Adaptive QoS-Aware Microservice Deployment With Excessive Loads via Intra- and Inter-Datacenter Scheduling.
IEEE Trans. Parallel Distributed Syst., September, 2024

Hardware-Software Co-Design Enabling Static and Dynamic Sparse Attention Mechanisms.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., September, 2024

Accelerating Sparse DNNs Based on Tiled GEMM.
IEEE Trans. Computers, May, 2024

SHA: QoS-Aware Software and Hardware Auto-Tuning for Database Systems.
J. Comput. Sci. Technol., March, 2024

Towards Fast Setup and High Throughput of GPU Serverless Computing.
CoRR, 2024

A Codesign of Scheduling and Parallelization for Large Model Training in Heterogeneous Clusters.
CoRR, 2024

POSTER: FineCo: Fine-grained Heterogeneous Resource Management for Concurrent DNN Inferences.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

PAS: Towards Accurate and Efficient Federated Learning with Parameter-Adaptive Synchronization.
Proceedings of the 32nd IEEE/ACM International Symposium on Quality of Service, 2024

FedCA: Efficient Federated Learning with Client Autonomy.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

ElasticRoom: Multi-Tenant DNN Inference Engine via Co-design with Resource-constrained Compilation and Strong Priority Scheduling.
Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, 2024

An Optimizing Framework on MLIR for Efficient FPGA-based Accelerator Generation.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

FaaSMem: Improving Memory Efficiency of Serverless Computing with Memory Pool Architecture.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

FaaSGraph: Enabling Scalable, Efficient, and Cost-Effective Graph Processing with Serverless Computing.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

JUNO: Optimizing High-Dimensional Approximate Nearest Neighbour Search with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
Improving Cluster Utilization Through Adaptive Resource Management for Deep Neural Network and CPU Jobs Colocation.
IEEE Trans. Computers, December, 2023

Enabling Efficient Spatio-Temporal GPU Sharing for Network Function Virtualization.
IEEE Trans. Computers, October, 2023

PASTO: Enabling Secure and Efficient Task Offloading in TrustZone-Enabled Edge Clouds.
IEEE Trans. Veh. Technol., June, 2023

Blockchain-Aided Edge Computing Market: Smart Contract and Consensus Mechanisms.
IEEE Trans. Mob. Comput., June, 2023

ISPA: Exploiting Intra-SM Parallelism in GPUs via Fine-Grained Resource Management.
IEEE Trans. Computers, May, 2023

A Robust Calibration and Adaptive Multipair of Magnetic Gradient Tensors Localization Method for Magnetic Anomaly Detection.
IEEE Trans. Geosci. Remote. Sens., 2023

Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level.
Proc. VLDB Endow., 2023

Kronos: towards bus contention-aware job scheduling in warehouse scale computers.
Frontiers Comput. Sci., 2023

Adaptive CPU Resource Allocation for Emulator in Kernel-based Virtual Machine.
CoRR, 2023

DFlow: Efficient Dataflow-based Invocation Workflow Execution for Function-as-a-Service.
CoRR, 2023

Nodens: Enabling Resource Efficient and Fast QoS Recovery of Dynamic Microservice Applications in Datacenters.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023

BLAD: Adaptive Load Balanced Scheduling and Operator Overlap Pipeline For Accelerating The Dynamic GNN Training.
Proceedings of the International Conference for High Performance Computing, 2023

Optimizing Dynamic Neural Networks with Brainstorm.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

On Efficient Packet Batching and Resource Allocation for GPU based NFV Acceleration.
Proceedings of the 31st IEEE/ACM International Symposium on Quality of Service, 2023

CONTC: A Traffic Control System for Container Overlay Networks.
Proceedings of the 31st IEEE/ACM International Symposium on Quality of Service, 2023

FIRST: Exploiting the Multi-Dimensional Attributes of Functions for Power-Aware Serverless Computing.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

On Efficient Zygote Container Planning toward Fast Function Startup in Serverless Edge Cloud.
Proceedings of the IEEE INFOCOM 2023, 2023

PAC: Preference-Aware Co-location Scheduling on Heterogeneous NUMA Architectures To Improve Resource Utilization.
Proceedings of the 37th International Conference on Supercomputing, 2023

PMR: Priority Memory Reclaim to Improve the Performance of Latency-Critical Services.
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

Microless: Cost-Efficient Hybrid Deployment of Microservices on IaaS VMs and Serverless.
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

STAG: Enabling Low Latency and Low Staleness of GNN-based Services with Dynamic Graphs.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Maximizing the Utilization of GPUs Used by Cloud Gaming through Adaptive Co-location with Combo.
Proceedings of the 2023 ACM Symposium on Cloud Computing, SoCC 2023, 2023

Not All Resources are Visible: Exploiting Fragmented Shadow Resources in Shared-State Scheduler Architecture.
Proceedings of the 2023 ACM Symposium on Cloud Computing, SoCC 2023, 2023

AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs.
Proceedings of the 20th ACM International Conference on Computing Frontiers, 2023

Efficient Scheduler Live Update for Linux Kernel with Modularization.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

DataFlower: Exploiting the Data-flow Paradigm for Serverless Workflow Orchestration.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
The Serverless Computing Survey: A Technical Primer for Design Architecture.
ACM Comput. Surv., January, 2022

Online Thread Auto-Tuning for Performance Improvement and Resource Saving.
IEEE Trans. Parallel Distributed Syst., 2022

Efficient and Secure Deep Learning Inference in Trusted Processor Enabled Edge Clouds.
IEEE Trans. Parallel Distributed Syst., 2022

Adaptive Resource Efficient Microservice Deployment in Cloud-Edge Continuum.
IEEE Trans. Parallel Distributed Syst., 2022

Toward QoS-Awareness and Improved Utilization of Spatial Multitasking GPUs.
IEEE Trans. Computers, 2022

Reliability and Incentive of Performance Assessment for Decentralized Clouds.
J. Comput. Sci. Technol., 2022

Special Issue on Programming Models and Applications for Multicores and Manycores 2020.
Concurr. Comput. Pract. Exp., 2022

Special issue on programming models and applications for multicores and manycores 2019-2020.
Concurr. Comput. Pract. Exp., 2022

Help Rather Than Recycle: Alleviating Cold Startup in Serverless Computing Through Inter-Function Container Sharing.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022

RunD: A Lightweight Secure Container Runtime for High-density Deployment and High-concurrency Startup in Serverless Computing.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022

DVABatch: Diversity-aware Multi-Entry Multi-Exit Batching for Efficient Processing of DNN Services on GPUs.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022

PilotFish: Harvesting Free Cycles of Cloud Gaming with Deep Learning Training.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022

QoS-Aware Irregular Collaborative Inference for Improving Throughput of DNN Services.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

Exploring Efficient Microservice Level Parallelism.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

QoS-awareness of Microservices with Excessive Loads via Inter-Datacenter Scheduling.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

CSC: Collaborative System Configuration for I/O-Intensive Applications in Multi-Tenant Clouds.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

PAME: precision-aware multi-exit DNN serving for reducing latencies of batched inferences.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

Federated Learning on Non-IID Data Silos: An Experimental Study.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

SALO: an efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Characterizing and orchestrating VM reservation in geo-distributed clouds to improve the resource efficiency.
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022

Astraea: towards QoS-aware and resource-efficient multi-stage GPU services.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

VELTAIR: towards high-performance multi-tenant deep learning services via adaptive compilation and scheduling.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

FaaSFlow: enable efficient workflow execution for function-as-a-service.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021
Adaptive Preference-Aware Co-Location for Improving Resource Utilization of Power Constrained Datacenters.
IEEE Trans. Parallel Distributed Syst., 2021

E²bird: Enhanced Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services.
IEEE Trans. Parallel Distributed Syst., 2021

Pagurus: Eliminating Cold Startup in Serverless Computing with Inter-Action Container Sharing.
CoRR, 2021

Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction.
Proceedings of the International Conference for High Performance Computing, 2021

Gost: Enabling Efficient Spatio-Temporal GPU Sharing for Network Function Virtualization.
Proceedings of the 29th IEEE/ACM International Symposium on Quality of Service, 2021

BiPS: Hotness-aware Bi-tier Parameter Synchronization for Recommendation Models.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

AlphaR: Learning-Powered Resource Management for Irregular, Dynamic Microservice Graph.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

QoS-Aware and Resource Efficient Microservice Deployment in Cloud-Edge Continuum.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

Characterizing and Demystifying the Implicit Convolution Algorithm on Commercial Matrix-Multiplication Accelerators.
Proceedings of the IEEE International Symposium on Workload Characterization, 2021

Dubhe: Towards Data Unbiasedness with Homomorphic Encryption in Federated Learning Client Selection.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

CHARM: Collaborative Host and Accelerator Resource Management for GPU Datacenters.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Exploiting Intra-SM Parallelism in GPUs via Persistent and Elastic Blocks.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Lasagna: Accelerating Secure Deep Learning Inference in SGX-enabled Edge Cloud.
Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021

Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs.
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

2020
Predicting and reining in application-level slowdown on spatial multitasking GPUs.
J. Parallel Distributed Comput., 2020

Probabilistic robust regression with adaptive weights - a case study on face recognition.
Frontiers Comput. Sci., 2020

Towards QoS-Aware and Resource-Efficient GPU Microservices Based on Spatial Multitasking GPUs In Datacenters.
CoRR, 2020

Survey and design of paleozoic: a high-performance compiler tool chain for deep learning inference accelerator.
CCF Trans. High Perform. Comput., 2020

Spool: Reliable Virtualized NVMe Storage Pool in Public Cloud Infrastructure.
Proceedings of the 2020 USENIX Annual Technical Conference, 2020

Alita: comprehensive performance isolation through bias resource management for public clouds.
Proceedings of the International Conference for High Performance Computing, 2020

DLFusion: An Auto-Tuning Compiler for Layer Fusion on Deep Neural Network Accelerator.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2020

Sturgeon: Preference-aware Co-location for Improving Utilization of Power Constrained Computers.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

Amoeba: QoS-Awareness and Reduced Resource Usage of Microservices with Serverless Computing.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

Task Offloading in Trusted Execution Environment empowered Edge Computing.
Proceedings of the 26th IEEE International Conference on Parallel and Distributed Systems, 2020

CODA: Improving Resource Utilization by Slimming and Co-locating DNN and CPU Jobs.
Proceedings of the 40th IEEE International Conference on Distributed Computing Systems, 2020

Asymmetric Resilience: Exploiting Task-Level Idempotency for Transient Error Recovery in Accelerator-Based Systems.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

2019
DR Refresh: Releasing DRAM Potential by Enabling Read Accesses Under Refresh.
IEEE Trans. Computers, 2019

Bandwidth and Locality Aware Task-stealing for Manycore Architectures with Bandwidth-Asymmetric Memory.
ACM Trans. Archit. Code Optim., 2019

URSA: Precise Capacity Planning and Contention-aware Scheduling for Public Clouds.
CoRR, 2019

Characterizing Perception Module Performance and Robustness in Production-Scale Autonomous Driving System.
Proceedings of the Network and Parallel Computing, 2019

Characterizing and orchestrating NFV-ready servers for efficient edge data processing.
Proceedings of the International Symposium on Quality of Service, 2019

Themis: Predicting and Reining in Application-Level Slowdown on Spatial Multitasking GPUs.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Laius: Towards latency awareness and improved utilization of spatial multitasking accelerators in datacenters.
Proceedings of the ACM International Conference on Supercomputing, 2019

Avalon: towards QoS awareness and improved utilization through multi-resource management in datacenters.
Proceedings of the ACM International Conference on Supercomputing, 2019

When Power Oversubscription Meets Traffic Flood Attack: Re-Thinking Data Center Peak Load Management.
Proceedings of the 48th International Conference on Parallel Processing, 2019

Characterizing and Balancing the Workloads of Semi-Containerized Clouds.
Proceedings of the 25th IEEE International Conference on Parallel and Distributed Systems, 2019

Optimizing the Aggregated Throughput of GPUs in Public Clouds Based on Adaptive Kernel Reordering.
Proceedings of the 25th IEEE International Conference on Parallel and Distributed Systems, 2019

Ebird: Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services.
Proceedings of the 37th IEEE International Conference on Computer Design, 2019

Adversarial Defense Through Network Profiling Based Path Extraction.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

POSTER: Precise Capacity Planning for Database Public Clouds.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
Contention and Locality-Aware Work-Stealing for Iterative Applications in Multi-Socket Computers.
IEEE Trans. Computers, 2018

DCF: A Dataflow-Based Collaborative Filtering Training Algorithm.
Int. J. Parallel Program., 2018

KSM: Online Application-Level Performance Slowdown Prediction for Spatial Multitasking GPGPU.
IEEE Comput. Archit. Lett., 2018

DLFuzz: differential fuzzing testing of deep learning systems.
Proceedings of the 2018 ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018

Deep learning based classification for paddy pests & diseases recognition.
Proceedings of 2018 International Conference on Mathematics and Artificial Intelligence, 2018

Power Grab in Aggressively Provisioned Data Centers: What is the Risk and What Can Be Done About It.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018

DR DRAM: Accelerating Memory-Read-Intensive Applications.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018

CLIBE: Precise Cluster-Level I/O Bandwidth Enforcement in Distributed File System.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

In-growth test for monolithic 3D integrated SRAM.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

2017
Electro: Toward QoS-Aware Power Management for Latency-Critical Applications.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Preemption-Aware Kernel Scheduling for GPUs.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

PowerChief: Intelligent Power Allocation for Multi-Stage Applications to Improve Responsiveness on Power Constrained CMP.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Prophet: Precise QoS Prediction on Non-Preemptive Accelerators to Improve Utilization in Warehouse-Scale Computers.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

Task Scheduling for Multi-core and Parallel Architectures - Challenges, Solutions and Perspectives.
Springer, ISBN: 978-981-10-6237-7, 2017

2016
Adaptive demand-aware work-stealing in multi-programmed multi-core architectures.
Concurr. Comput. Pract. Exp., 2016

SAWS: Selective Asymmetry-Aware Work-Stealing for Asymmetric Multi-core Architectures.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

Baymax: QoS Awareness and Increased Utilization for Non-Preemptive Accelerators in Warehouse Scale Computers.
Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, 2016

2015
Locality-Aware Work Stealing Based on Online Profiling and Auto-Tuning for Multisocket Multicore Architectures.
ACM Trans. Archit. Code Optim., 2015

DjiNN and Tonic: DNN as a service and its implications for future warehouse scale computers.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

2014
Cold-Start Recommendation Using Bi-Clustering and Fusion for Large-Scale Social Recommender Systems.
IEEE Trans. Emerg. Top. Comput., 2014

Adaptive workload-aware task scheduling for single-ISA asymmetric multicore architectures.
ACM Trans. Archit. Code Optim., 2014

CPU + GPU scheduling with asymptotic profiling.
Parallel Comput., 2014

DWS: Demand-aware Work-Stealing in Multi-programmed Multi-core Architectures.
Proceedings of the 2014 PPOPP International Workshop on Programming Models and Applications for Multicores and Manycores, 2014

EEWA: Energy-Efficient Workload-Aware Task Scheduling in Multi-core Architectures.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

LAWS: locality-aware work-stealing for multi-socket multi-core architectures.
Proceedings of the 2014 International Conference on Supercomputing, 2014

2013
Adaptive Cache Aware Bitier Work-Stealing in Multisocket Multicore Architectures.
IEEE Trans. Parallel Distributed Syst., 2013

HAT: history-based auto-tuning MapReduce in heterogeneous environments.
J. Supercomput., 2013

CAP: co-scheduling based on asymptotic profiling in CPU+GPU hybrid systems.
Proceedings of the 2013 PPOPP International Workshop on Programming Models and Applications for Multicores and Manycores, 2013

HMHS: Hybrid Multistage Heuristic Scheduling Algorithm for Heterogeneous MapReduce System.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2013

2012
WATS: Workload-Aware Task Scheduling in Asymmetric Multi-core Architectures.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

CATS: cache aware task-stealing based on online profiling in multi-socket multi-core architectures.
Proceedings of the International Conference on Supercomputing, 2012

2011
CAB: Cache Aware Bi-tier Task-Stealing in Multi-socket Multi-core Architecture.
Proceedings of the International Conference on Parallel Processing, 2011

2010
SAMR: A Self-adaptive MapReduce Scheduling Algorithm in Heterogeneous Environment.
Proceedings of the 10th IEEE International Conference on Computer and Information Technology, 2010

