Quan Chen

Wen Chen

Zhu Han

IEEE Trans. Mob. Comput., June, 2023

ISPA: Exploiting Intra-SM Parallelism in GPUs via Fine-Grained Resource Management.

IEEE Trans. Computers, May, 2023

A Robust Calibration and Adaptive Multipair of Magnetic Gradient Tensors Localization Method for Magnetic Anomaly Detection.

IEEE Trans. Geosci. Remote. Sens., 2023

Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level.

Proc. VLDB Endow., 2023

Kronos: towards bus contention-aware job scheduling in warehouse scale computers.

Frontiers Comput. Sci., 2023

Adaptive CPU Resource Allocation for Emulator in Kernel-based Virtual Machine.

CoRR, 2023

DFlow: Efficient Dataflow-based Invocation Workflow Execution for Function-as-a-Service.

CoRR, 2023

Nodens: Enabling Resource Efficient and Fast QoS Recovery of Dynamic Microservice Applications in Datacenters.

Proceedings of the 2023 USENIX Annual Technical Conference, 2023

BLAD: Adaptive Load Balanced Scheduling and Operator Overlap Pipeline For Accelerating The Dynamic GNN Training.

Proceedings of the International Conference for High Performance Computing, 2023

Optimizing Dynamic Neural Networks with Brainstorm.

Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

On Efficient Packet Batching and Resource Allocation for GPU based NFV Acceleration.

Proceedings of the 31st IEEE/ACM International Symposium on Quality of Service, 2023

CONTC: A Traffic Control System for Container Overlay Networks.

Proceedings of the 31st IEEE/ACM International Symposium on Quality of Service, 2023

FIRST: Exploiting the Multi-Dimensional Attributes of Functions for Power-Aware Serverless Computing.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

On Efficient Zygote Container Planning toward Fast Function Startup in Serverless Edge Cloud.

Proceedings of the IEEE INFOCOM 2023, 2023

PAC: Preference-Aware Co-location Scheduling on Heterogeneous NUMA Architectures To Improve Resource Utilization.

Proceedings of the 37th International Conference on Supercomputing, 2023

PMR: Priority Memory Reclaim to Improve the Performance of Latency-Critical Services.

Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

Microless: Cost-Efficient Hybrid Deployment of Microservices on IaaS VMs and Serverless.

Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

STAG: Enabling Low Latency and Low Staleness of GNN-based Services with Dynamic Graphs.

Proceedings of the 41st IEEE International Conference on Computer Design, 2023

MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Maximizing the Utilization of GPUs Used by Cloud Gaming through Adaptive Co-location with Combo.

Proceedings of the 2023 ACM Symposium on Cloud Computing, SoCC 2023, 2023

Not All Resources are Visible: Exploiting Fragmented Shadow Resources in Shared-State Scheduler Architecture.

Proceedings of the 2023 ACM Symposium on Cloud Computing, SoCC 2023, 2023

AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs.

Proceedings of the 20th ACM International Conference on Computing Frontiers, 2023

Efficient Scheduler Live Update for Linux Kernel with Modularization.

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

DataFlower: Exploiting the Data-flow Paradigm for Serverless Workflow Orchestration.

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022

The Serverless Computing Survey: A Technical Primer for Design Architecture.

ACM Comput. Surv., January, 2022

Online Thread Auto-Tuning for Performance Improvement and Resource Saving.

IEEE Trans. Parallel Distributed Syst., 2022

Efficient and Secure Deep Learning Inference in Trusted Processor Enabled Edge Clouds.

IEEE Trans. Parallel Distributed Syst., 2022

Adaptive Resource Efficient Microservice Deployment in Cloud-Edge Continuum.

IEEE Trans. Parallel Distributed Syst., 2022

Toward QoS-Awareness and Improved Utilization of Spatial Multitasking GPUs.

IEEE Trans. Computers, 2022

Reliability and Incentive of Performance Assessment for Decentralized Clouds.

J. Comput. Sci. Technol., 2022

Special Issue on Programming Models and Applications for Multicores and Manycores 2020.

Min Si

Concurr. Comput. Pract. Exp., 2022

Special issue on programming models and applications for multicores and manycores 2019-2020.

Min Si

Concurr. Comput. Pract. Exp., 2022

Help Rather Than Recycle: Alleviating Cold Startup in Serverless Computing Through Inter-Function Container Sharing.

Proceedings of the 2022 USENIX Annual Technical Conference, 2022

RunD: A Lightweight Secure Container Runtime for High-density Deployment and High-concurrency Startup in Serverless Computing.

Proceedings of the 2022 USENIX Annual Technical Conference, 2022

DVABatch: Diversity-aware Multi-Entry Multi-Exit Batching for Efficient Processing of DNN Services on GPUs.

Proceedings of the 2022 USENIX Annual Technical Conference, 2022

PilotFish: Harvesting Free Cycles of Cloud Gaming with Deep Learning Training.

Proceedings of the 2022 USENIX Annual Technical Conference, 2022

QoS-Aware Irregular Collaborative Inference for Improving Throughput of DNN Services.

Proceedings of the SC22: International Conference for High Performance Computing, 2022

Exploring Efficient Microservice Level Parallelism.

Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

QoS-awareness of Microservices with Excessive Loads via Inter-Datacenter Scheduling.

Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

CSC: Collaborative System Configuration for I/O-Intensive Applications in Multi-Tenant Clouds.

Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

PAME: precision-aware multi-exit DNN serving for reducing latencies of batched inferences.

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

Federated Learning on Non-IID Data Silos: An Experimental Study.

Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

SALO: an efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Characterizing and orchestrating VM reservation in geo-distributed clouds to improve the resource efficiency.

Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022

Astraea: towards QoS-aware and resource-efficient multi-stage GPU services.

Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

VELTAIR: towards high-performance multi-tenant deep learning services via adaptive compilation and scheduling.

Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

FaaSFlow: enable efficient workflow execution for function-as-a-service.

Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021

Adaptive Preference-Aware Co-Location for Improving Resource Utilization of Power Constrained Datacenters.

IEEE Trans. Parallel Distributed Syst., 2021

E²bird: Enhanced Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services.

IEEE Trans. Parallel Distributed Syst., 2021

Pagurus: Eliminating Cold Startup in Serverless Computing with Inter-Action Container Sharing.

Zijun Li

CoRR, 2021

Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction.

Proceedings of the International Conference for High Performance Computing, 2021

Gost: Enabling Efficient Spatio-Temporal GPU Sharing for Network Function Virtualization.

Proceedings of the 29th IEEE/ACM International Symposium on Quality of Service, 2021

BiPS: Hotness-aware Bi-tier Parameter Synchronization for Recommendation Models.

Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

AlphaR: Learning-Powered Resource Management for Irregular, Dynamic Microservice Graph.

Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

QoS-Aware and Resource Efficient Microservice Deployment in Cloud-Edge Continuum.

Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

Characterizing and Demystifying the Implicit Convolution Algorithm on Commercial Matrix-Multiplication Accelerators.

Proceedings of the IEEE International Symposium on Workload Characterization, 2021

Dubhe: Towards Data Unbiasedness with Homomorphic Encryption in Federated Learning Client Selection.

Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

CHARM: Collaborative Host and Accelerator Resource Management for GPU Datacenters.

Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Exploiting Intra-SM Parallelism in GPUs via Persistent and Elastic Blocks.

Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Lasagna: Accelerating Secure Deep Learning Inference in SGX-enabled Edge Cloud.

Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021

Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs.

Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

2020

Predicting and reining in application-level slowdown on spatial multitasking GPUs.

J. Parallel Distributed Comput., 2020

Probabilistic robust regression with adaptive weights - a case study on face recognition.

Frontiers Comput. Sci., 2020

Towards QoS-Aware and Resource-Efficient GPU Microservices Based on Spatial Multitasking GPUs In Datacenters.

CoRR, 2020

Survey and design of paleozoic: a high-performance compiler tool chain for deep learning inference accelerator.

CCF Trans. High Perform. Comput., 2020

Spool: Reliable Virtualized NVMe Storage Pool in Public Cloud Infrastructure.

Proceedings of the 2020 USENIX Annual Technical Conference, 2020

Alita: comprehensive performance isolation through bias resource management for public clouds.

Proceedings of the International Conference for High Performance Computing, 2020

DLFusion: An Auto-Tuning Compiler for Layer Fusion on Deep Neural Network Accelerator.

Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2020

Sturgeon: Preference-aware Co-location for Improving Utilization of Power Constrained Computers.

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

Amoeba: QoS-Awareness and Reduced Resource Usage of Microservices with Serverless Computing.

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

OVERSEE: Outsourcing Verification to Enable Resource Sharing in Edge Environment.

Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds.

Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

Task Offloading in Trusted Execution Environment empowered Edge Computing.

Proceedings of the 26th IEEE International Conference on Parallel and Distributed Systems, 2020

CODA: Improving Resource Utilization by Slimming and Co-locating DNN and CPU Jobs.

Proceedings of the 40th IEEE International Conference on Distributed Computing Systems, 2020

Asymmetric Resilience: Exploiting Task-Level Idempotency for Transient Error Recovery in Accelerator-Based Systems.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration.

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention.

Proceedings of the 28th International Conference on Computational Linguistics, 2020

2019

DR Refresh: Releasing DRAM Potential by Enabling Read Accesses Under Refresh.

IEEE Trans. Computers, 2019

Bandwidth and Locality Aware Task-stealing for Manycore Architectures with Bandwidth-Asymmetric Memory.

ACM Trans. Archit. Code Optim., 2019

URSA: Precise Capacity Planning and Contention-aware Scheduling for Public Clouds.

CoRR, 2019

Characterizing Perception Module Performance and Robustness in Production-Scale Autonomous Driving System.

Proceedings of the Network and Parallel Computing, 2019

Characterizing and orchestrating NFV-ready servers for efficient edge data processing.

Proceedings of the International Symposium on Quality of Service, 2019

Themis: Predicting and Reining in Application-Level Slowdown on Spatial Multitasking GPUs.

Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

Laius: Towards latency awareness and improved utilization of spatial multitasking accelerators in datacenters.

Daniel Edward Mawhirter

Bo Wu

Chao Li

Proceedings of the ACM International Conference on Supercomputing, 2019

Avalon: towards QoS awareness and improved utilization through multi-resource management in datacenters.

Proceedings of the ACM International Conference on Supercomputing, 2019

When Power Oversubscription Meets Traffic Flood Attack: Re-Thinking Data Center Peak Load Management.

Proceedings of the 48th International Conference on Parallel Processing, 2019

Characterizing and Balancing the Workloads of Semi-Containerized Clouds.

Proceedings of the 25th IEEE International Conference on Parallel and Distributed Systems, 2019

Optimizing the Aggregated Throughput of GPUs in Public Clouds Based on Adaptive Kernel Reordering.

Jingjin Du

Proceedings of the 25th IEEE International Conference on Parallel and Distributed Systems, 2019

Ebird: Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services.

Proceedings of the 37th IEEE International Conference on Computer Design, 2019

Adversarial Defense Through Network Profiling Based Path Extraction.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

POSTER: Precise Capacity Planning for Database Public Clouds.

Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018

Contention and Locality-Aware Work-Stealing for Iterative Applications in Multi-Socket Computers.

IEEE Trans. Computers, 2018

DCF: A Dataflow-Based Collaborative Filtering Training Algorithm.

Int. J. Parallel Program., 2018

KSM: Online Application-Level Performance Slowdown Prediction for Spatial Multitasking GPGPU.

Wenyi Zhao

IEEE Comput. Archit. Lett., 2018

DLFuzz: differential fuzzing testing of deep learning systems.

Proceedings of the 2018 ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018

Deep learning based classification for paddy pests & diseases recognition.

Ahmad Arib Alfarisy

Proceedings of 2018 International Conference on Mathematics and Artificial Intelligence, 2018

Power Grab in Aggressively Provisioned Data Centers: What is the Risk and What Can Be Done About It.

Proceedings of the 36th IEEE International Conference on Computer Design, 2018

DR DRAM: Accelerating Memory-Read-Intensive Applications.

Proceedings of the 36th IEEE International Conference on Computer Design, 2018

CLIBE: Precise Cluster-Level I/O Bandwidth Enforcement in Distributed File System.

Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

In-growth test for monolithic 3D integrated SRAM.

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

2017

Electro: Toward QoS-Aware Power Management for Latency-Critical Applications.

Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Preemption-Aware Kernel Scheduling for GPUs.

PowerChief: Intelligent Power Allocation for Multi-Stage Applications to Improve Responsiveness on Power Constrained CMP.

Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Prophet: Precise QoS Prediction on Non-Preemptive Accelerators to Improve Utilization in Warehouse-Scale Computers.

Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

Task Scheduling for Multi-core and Parallel Architectures - Challenges, Solutions and Perspectives

Springer, ISBN: 978-981-10-6237-7, 2017

2016

Adaptive demand-aware work-stealing in multi-programmed multi-core architectures.

Long Zheng

Concurr. Comput. Pract. Exp., 2016

SAWS: Selective Asymmetry-Aware Work-Stealing for Asymmetric Multi-core Architectures.

Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

Baymax: QoS Awareness and Increased Utilization for Non-Preemptive Accelerators in Warehouse Scale Computers.

Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, 2016

2015

Locality-Aware Work Stealing Based on Online Profiling and Auto-Tuning for Multisocket Multicore Architectures.

ACM Trans. Archit. Code Optim., 2015

DjiNN and Tonic: DNN as a service and its implications for future warehouse scale computers.

Johann Hauswald

Yiping Kang

Michael A. Laurenzano

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

2014

Cold-Start Recommendation Using Bi-Clustering and Fusion for Large-Scale Social Recommender Systems.

IEEE Trans. Emerg. Top. Comput., 2014

Adaptive workload-aware task scheduling for single-ISA asymmetric multicore architectures.

ACM Trans. Archit. Code Optim., 2014

CPU + GPU scheduling with asymptotic profiling.

Parallel Comput., 2014

DWS: Demand-aware Work-Stealing in Multi-programmed Multi-core Architectures.

Long Zheng

Proceedings of the 2014 PPOPP International Workshop on Programming Models and Applications for Multicores and Manycores, 2014

EEWA: Energy-Efficient Workload-Aware Task Scheduling in Multi-core Architectures.

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

LAWS: locality-aware work-stealing for multi-socket multi-core architectures.

Haibing Guan

Proceedings of the 2014 International Conference on Supercomputing, 2014

2013

Adaptive Cache Aware Bitier Work-Stealing in Multisocket Multicore Architectures.

IEEE Trans. Parallel Distributed Syst., 2013

HAT: history-based auto-tuning MapReduce in heterogeneous environments.

J. Supercomput., 2013

CAP: co-scheduling based on asymptotic profiling in CPU+GPU hybrid systems.

Proceedings of the 2013 PPOPP International Workshop on Programming Models and Applications for Multicores and Manycores, 2013

HMHS: Hybrid Multistage Heuristic Scheduling Algorithm for Heterogeneous MapReduce System.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2013

2012

WATS: Workload-Aware Task Scheduling in Asymmetric Multi-core Architectures.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

CATS: cache aware task-stealing based on online profiling in multi-socket multi-core architectures.
