M. Mustafa Rafique

Orcid: 0000-0002-5034-2880

According to our database, M. Mustafa Rafique authored at least 62 papers between 2008 and 2024.

Towards Affordable Reproducibility Using Scalable Capture and Comparison of Intermediate Multi-Run Results.
Proceedings of the 25th International Middleware Conference, 2024

Deep Optimizer States: Towards Scalable Training of Transformer Models using Interleaved Offloading.
Proceedings of the 25th International Middleware Conference, 2024

Application-Attuned Memory Management for Containerized HPC Workflows.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

DataStates-LLM: Lazy Asynchronous Checkpointing for Large Language Models.
Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, 2024

Breaking the Memory Wall: A Study of I/O Patterns and GPU Memory Utilization for Hybrid CPU-GPU Offloaded Optimizers.
Proceedings of the 14th Workshop on AI and Scientific Computing at Scale using Flexible Computing Infrastructures, 2024

Asynchronous Multi-Level Checkpointing: An Enabler of Reproducibility using Checkpoint History Analytics.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

COLTI: Towards Concurrent and Co-located DNN Training and Inference.
Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 2023

GPU-Enabled Asynchronous Multi-level Checkpoint Caching and Prefetching.
Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 2023

Towards Efficient I/O Pipelines Using Accumulated Compression.
Proceedings of the 30th IEEE International Conference on High Performance Computing, 2023

Optimizing the Training of Co-Located Deep Learning Models Using Cache-Aware Staggering.
Proceedings of the 30th IEEE International Conference on High Performance Computing, 2023

Accelerating Performance of GPU-based Workloads Using CXL.
Proceedings of the 13th Workshop on AI and Scientific Computing at Scale using Flexible Computing, 2023

PredictDDL: Reusable Workload Performance Prediction for Distributed Deep Learning.
Proceedings of the IEEE International Conference on Cluster Computing, 2023

Product weighted taxonomy extraction using Twitter.
Int. J. Bus. Inf. Syst., 2022

Canary: Fault-Tolerant FaaS for Stateful Time-Sensitive Applications.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

Exploiting CXL-based Memory for Distributed Deep Learning.
Proceedings of the 51st International Conference on Parallel Processing, 2022

Towards Efficient Cache Allocation for High-Frequency Checkpointing.
Proceedings of the 29th IEEE International Conference on High Performance Computing, 2022

Towards Data Gravity and Compliance Aware Distributed Deep Learning on Hybrid Clouds.
Proceedings of the 29th IEEE International Conference on High Performance Computing, 2022

On Realizing Efficient Deep Learning Using Serverless Computing.
Proceedings of the 22nd IEEE International Symposium on Cluster, 2022

Clustering-cum-Handover Management Scheme for improved Internet access in high-density mobile wireless environments.
Sustain. Comput. Informatics Syst., 2021

Profit maximisation in long-term e-service agreements.
Int. J. Bus. Inf. Syst., 2021

Energy-makespan optimization of workflow scheduling in fog-cloud computing.
Computing, 2021

Towards Efficient I/O Scheduling for Collaborative Multi-Level Checkpointing.
Proceedings of the 29th International Symposium on Modeling, 2021

Extending the Control Plane of Container Orchestrators for I/O Virtualization.
Proceedings of the 2nd International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC, 2020

Infrastructure-Aware TensorFlow for Heterogeneous Datacenters.
Proceedings of the 28th International Symposium on Modeling, 2020

CoSim: A Simulator for Co-Scheduling of Batch and On-Demand Jobs in HPC Datacenters.
Proceedings of the 24th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, 2020

CuVPP: Filter-based Longest Prefix Matching in Software Data Planes.
Proceedings of the IEEE International Conference on Cluster Computing, 2020

MARBLE: A Multi-GPU Aware Job Scheduler for Deep Learning on HPC Systems.
Proceedings of the 20th IEEE/ACM International Symposium on Cluster, 2020

Container Orchestration by Kubernetes for RDMA Networking.
Proceedings of the 27th IEEE International Conference on Network Protocols, 2019

A Quantitative Study of Deep Learning Training on Heterogeneous Supercomputers.
Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

Optimization of data-intensive workflows in stream-based data processing models.
J. Supercomput., 2017

Evaluation of Data Locality Strategies for Hybrid Cloud Bursting of Iterative MapReduce.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

Adaptive energy-efficient clustering path planning routing protocols for heterogeneous wireless sensor networks.
Sustain. Comput. Informatics Syst., 2016

An accelerated framework for the classification of biological targets from solid-state micropore data.
Comput. Methods Programs Biomed., 2016

List-Based Task Scheduling for Cloud Computing.
Proceedings of the 2016 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, 2016

CHOPPER: Optimizing Data Partitioning for In-memory Data Analytics Frameworks.
Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

On Efficient Hierarchical Storage for Big Data Processing.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

On exploiting data locality for iterative mapreduce applications in hybrid clouds.
Proceedings of the 3rd IEEE/ACM International Conference on Big Data Computing, 2016

Realizing Accelerated Cost-Effective Distributed RAID.
Handbook on Data Centers, 2015

Eliminating the State of Ping-Pong for Mobile IP Optimization.
Wirel. Pers. Commun., 2015

Enabling Big Data Analytics in the Hybrid Cloud Using Iterative MapReduce.
Proceedings of the 8th IEEE/ACM International Conference on Utility and Cloud Computing, 2015

Heterogeneous cloud systems monitoring using semantic and linked data technologies.
Proceedings of the IFIP/IEEE International Symposium on Integrated Network Management, 2015

HETS: Heterogeneous Edge and Task Scheduling Algorithm for Heterogeneous Computing Systems.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Towards energy awareness in Hadoop.
Proceedings of the Fourth International Workshop on Network-Aware Data Management, 2014

An advanced heterogeneity-aware centralized energy efficient clustering routing protocol for wireless sensor networks.
Proceedings of the International Green Computing Conference, 2014

Distributed Detection of Cancer Cells in High-Throughput Cellular Spike Streams.
Proceedings of the 14th IEEE/ACM International Symposium on Cluster, 2014

Data-Intensive Workflow Optimization Based on Application Task Graph Partitioning in Heterogeneous Computing Systems.
Proceedings of the 2014 IEEE Fourth International Conference on Big Data and Cloud Computing, 2014

A Capacity Allocation Approach for Volunteer Cloud Federations Using Poisson-Gamma Gibbs Sampling.
Proceedings of the 2014 IEEE 7th International Conference on Cloud Computing, Anchorage, AK, USA, June 27, 2014

Coordinating Social Care and Healthcare using Semantic Web Technologies.
Proceedings of the ISWC 2013 Posters & Demonstrations Track, 2013

Dynamic Sharing of GPUs in Cloud Systems.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Leveraging Collaborative Content Exchange for On-Demand VM Multi-deployments in IaaS Clouds.
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

GPU-based real-time detection and analysis of biological targets using solid-state nanopores.
Medical Biol. Eng. Comput., 2012

On the Use of GPUs in Realizing Cost-Effective Distributed RAID.
Proceedings of the 20th IEEE International Symposium on Modeling, 2012

An Adaptive Framework for Managing Heterogeneous Many-Core Clusters.
PhD thesis, 2011

Reusable software components for accelerator-based clusters.
J. Syst. Softw., 2011

A capabilities-aware framework for using computational accelerators in data-intensive computing.
J. Parallel Distributed Comput., 2011

Power management for heterogeneous clusters: An experimental study.
Proceedings of the 2011 International Green Computing Conference and Workshops, 2011

Symphony: A Scheduler for Client-Server Applications on Coprocessor-Based Heterogeneous Clusters.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

Designing Accelerator-Based Distributed Systems for High Performance.
Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010

A Capabilities-Aware Programming Model for Asymmetric High-End Systems.
Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010

Supporting MapReduce on large-scale asymmetric multi-core clusters.
ACM SIGOPS Oper. Syst. Rev., 2009

CellMR: A framework for supporting mapreduce on asymmetric cell-based clusters.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

DMA-based prefetching for i/o-intensive workloads on the cell architecture.
Proceedings of the 5th Conference on Computing Frontiers, 2008
