Rafael Mayo

ORCID: 0000-0003-1552-3069

  • Jaume I University, Department of Computer Science and Engineering, Castelló de la Plana, Spain
  • Polytechnic University of Valencia, Spain (PhD 2001)

According to our database, Rafael Mayo authored at least 104 papers between 1993 and 2024.

Dynamic spawning of MPI processes applied to malleability.
Int. J. High Perform. Comput. Appl., 2024

DMRlib: Easy-Coding and Efficient Resource Management for Job Malleability.
IEEE Trans. Computers, 2021

Malleability Implementation in a MPI Iterative Method.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

Analysis of Threading Libraries for High Performance Computing.
IEEE Trans. Computers, 2020

Noise estimation for hyperspectral subspace identification on FPGAs.
J. Supercomput., 2019

Dynamic reconfiguration of noniterative scientific applications: A case study with HPG aligner.
Int. J. High Perform. Comput. Appl., 2019

Performance Model of MapReduce Iterative Applications for Hybrid Cloud Bursting.
IEEE Trans. Parallel Distributed Syst., 2018

Exploring the interoperability of remote GPGPU virtualization using rCUDA and directive-based programming models.
J. Supercomput., 2018

DMR API: Improving cluster productivity by turning applications into malleable.
Parallel Comput., 2018

On the adequacy of lightweight thread approaches for high-level parallel programming models.
Future Gener. Comput. Syst., 2018

GSaaS: A Service to Cloudify and Schedule GPUs.
IEEE Access, 2018

Time and energy modeling of a high-performance multi-threaded Cholesky factorization.
J. Supercomput., 2017

Efficient Scalable Computing through Flexible Applications and Adaptive Workloads.
Proceedings of the 46th International Conference on Parallel Processing Workshops, 2017

GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations.
Proceedings of the 46th International Conference on Parallel Processing, 2017

GLT: A Unified API for Lightweight Thread Libraries.
Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

Cost Model and Analysis of Iterative MapReduce Applications for Hybrid Cloud Bursting.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

Evaluation of Data Locality Strategies for Hybrid Cloud Bursting of Iterative MapReduce.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

Architecture-aware configuration and scheduling of matrix multiplication on asymmetric multicore processors.
Clust. Comput., 2016

A Review of Lightweight Thread Approaches for High Performance Computing.
Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

Enabling GPU Virtualization in Cloud Environments.
Proceedings of the CLOSER 2016, 2016

On exploiting data locality for iterative mapreduce applications in hybrid clouds.
Proceedings of the 3rd IEEE/ACM International Conference on Big Data Computing, 2016

Time and energy modeling of high-performance Level-3 BLAS on x86 architectures.
Simul. Model. Pract. Theory, 2015

Reducing the cost of power monitoring with DC wattmeters.
Comput. Sci. Res. Dev., 2015

Performance and Energy Optimization of Matrix Multiplication on Asymmetric big.LITTLE Processors.
CoRR, 2015

Improving the user experience of the rCUDA remote GPU virtualization framework.
Concurr. Comput. Pract. Exp., 2015

Out-of-core macromolecular simulations on multithreaded architectures.
Concurr. Comput. Pract. Exp., 2015

Time and energy modeling of an INTRA-ONLY HEVC encoder.
Proceedings of the 2015 Visual Communications and Image Processing, 2015

Enabling Big Data Analytics in the Hybrid Cloud Using Iterative MapReduce.
Proceedings of the 8th IEEE/ACM International Conference on Utility and Cloud Computing, 2015

Exploiting Task-Parallelism on GPU Clusters via OmpSs and rCUDA Virtualization.
Proceedings of the 2015 IEEE TrustCom/BigDataSE/ISPA, 2015

Evaluating the Potential of Low Power Systems for Headphone-based Spatial Audio Applications.
Proceedings of the International Conference on Computational Science, 2015

Vectorization of binaural sound virtualization on the ARM Cortex-A15 architecture.
Proceedings of the 23rd European Signal Processing Conference, 2015

Exploring the Suitability of Remote GPGPU Virtualization for the OpenACC Programming Model Using rCUDA.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Assessing Power Monitoring Approaches for Energy and Power Analysis of Computers.
Sustain. Comput. Informatics Syst., 2014

A complete and efficient CUDA-sharing solution for HPC clusters.
Parallel Comput., 2014

Automatic detection of power bottlenecks in parallel scientific applications.
Comput. Sci. Res. Dev., 2014

Modeling power and energy of the task-parallel Cholesky factorization on multicore processors.
Comput. Sci. Res. Dev., 2014

Modeling power and energy consumption of dense matrix factorizations on multicore processors.
Concurr. Comput. Pract. Exp., 2014

Enhancing performance and energy consumption of runtime schedulers for dense linear algebra.
Concurr. Comput. Pract. Exp., 2014

Assessing the impact of the CPU power-saving modes on the task-parallel solution of sparse linear systems.
Clust. Comput., 2014

Adaptive Downtime for Live Migration of Virtual Machines.
Proceedings of the 7th IEEE/ACM International Conference on Utility and Cloud Computing, 2014

SLURM Support for Remote GPU Virtualization: Implementation and Performance Study.
Proceedings of the 26th IEEE International Symposium on Computer Architecture and High Performance Computing, 2014

Analyzing the Energy Efficiency of the Memory Subsystem in Multicore Processors.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2014

Author's retrospective for biomedical image analysis on a cooperative cluster of gpus and multicores.
Proceedings of the ACM International Conference on Supercomputing 25th Anniversary Volume, 2014

Parallel performance and energy efficiency of modern video encoders on multithreaded architectures.
Proceedings of the 22nd European Signal Processing Conference, 2014

Evaluating the Impact of Virtualization on Performance and Power Dissipation.
Proceedings of the CLOSER 2014, 2014

Energy-efficient execution of dense linear algebra algorithms on multi-core processors.
Clust. Comput., 2013

Solving Some Mysteries in Power Monitoring of Servers: Take Care of Your Wattmeters!
Proceedings of the Energy Efficiency in Large Scale Distributed Systems, 2013

Runtime Scheduling of the LU Factorization: Performance and Energy.
Proceedings of the Energy Efficiency in Large Scale Distributed Systems, 2013

Influence of InfiniBand FDR on the performance of remote GPU virtualization.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

A simulator to assess energy saving strategies and policies in HPC workloads.
ACM SIGOPS Oper. Syst. Rev., 2012

Color and texture analysis on emerging parallel architectures.
Int. J. High Perform. Comput. Appl., 2012

Optimization of power consumption in the iterative solution of sparse linear systems on graphics processors.
Comput. Sci. Res. Dev., 2012

DVFS-control techniques for dense linear algebra operations on multi-core processors.
Comput. Sci. Res. Dev., 2012

Analysis of Strategies to Save Energy for Message-Passing Dense Linear Algebra Kernels.
Proceedings of the 20th Euromicro International Conference on Parallel, 2012

Saving Energy in the LU Factorization with Partial Pivoting on Multi-core Processors.
Proceedings of the 20th Euromicro International Conference on Parallel, 2012

Binding Performance and Power of Dense Linear Algebra Operations.
Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012

Reducing Energy Consumption of Dense Linear Algebra Operations on Hybrid CPU-GPU Platforms.
Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012

Leveraging Task-Parallelism in Energy-Efficient ILU Preconditioners.
Proceedings of the ICT as Key Technology against Global Warming, 2012

Tools for Power-Energy Modelling and Analysis of Parallel Scientific Applications.
Proceedings of the 41st International Conference on Parallel Processing, 2012

CU2rCU: Towards the complete rCUDA remote GPU virtualization and sharing solution.
Proceedings of the 19th International Conference on High Performance Computing, 2012

Color and texture analysis using emerging parallel architectures.
Int. J. High Perform. Comput. Appl., 2011

A parallel solver for huge dense linear systems.
Comput. Phys. Commun., 2011

Large-scale linear system solver using secondary storage: Self-energy in hybrid nanostructures.
Comput. Phys. Commun., 2011

Symmetric Rank-k Update on Clusters of Multicore Processors with SMPSs.
Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

Power-aware Dense Linear Algebra Implementations on Multi-core and Many-core Processors.
Proceedings of the 3rd Many-core Applications Research Community (MARC) Symposium, 2011

Evaluation of the Energy Performance of Dense Linear Algebra Kernels on Multi-core and Many-Core Processors.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Power Consumption of Mixed Precision in the Iterative Solution of Sparse Linear Systems.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Improving power efficiency of dense linear algebra algorithms on multi-core processors via slack control.
Proceedings of the 2011 International Conference on High Performance Computing & Simulation, 2011

Performance of CUDA Virtualized Remote GPUs in High Performance Clusters.
Proceedings of the International Conference on Parallel Processing, 2011

Enabling CUDA acceleration within virtual machines using rCUDA.
Proceedings of the 18th International Conference on High Performance Computing, 2011

Analysis and optimization of power consumption in the iterative solution of sparse linear systems on multi-core and many-core platforms.
Proceedings of the 2011 International Green Computing Conference and Workshops, 2011

rCUDA: Reducing the number of GPU-based accelerators in high performance clusters.
Proceedings of the 2010 International Conference on High Performance Computing & Simulation, 2010

EnergySaving Cluster Roll: Power Saving System for Clusters.
Proceedings of the Architecture of Computing Systems, 2010

Toward the parallelization of GSL.
J. Supercomput., 2009

Out-of-core solution of linear systems on graphics processors.
Int. J. Parallel Emergent Distributed Syst., 2009

Exploiting the capabilities of modern GPUs for dense matrix computations.
Concurr. Comput. Pract. Exp., 2009

Exploring the GPU for Enhancing Parallelism on Color and Texture Analysis.
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

A Proposal to Extend the OpenMP Tasking Model for Heterogeneous Architectures.
Proceedings of the Evolving OpenMP in an Age of Extreme Parallelism, 2009

An Efficient Implementation of GPU Virtualization in High Performance Clusters.
Proceedings of the Euro-Par 2009, 2009

An Extension of the StarSs Programming Model for Platforms with Multiple GPUs.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

Attaining High Performance in General-Purpose Computations on Current Graphics Processors.
Proceedings of the High Performance Computing for Computational Science, 2008

Evaluation and tuning of the Level 3 CUBLAS for graphics processors.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Biomedical image analysis on a cooperative cluster of GPUs and multicores.
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

Solving Dense Linear Systems on Graphics Processors.
Proceedings of the Euro-Par 2008, 2008

Stabilizing large-scale generalized systems on parallel computers using multithreading and message-passing.
Concurr. Comput. Pract. Exp., 2007

Strategies for Parallelizing the Solution of Rational Matrix Equations.
Proceedings of the Parallel Computing: Architectures, 2007

Parallel Implementation of LQG Balanced Truncation for Large-Scale Systems.
Proceedings of the Large-Scale Scientific Computing, 6th International Conference, 2007

Parallelization of GSL: The Web Service Interface.
Proceedings of the 14th Euromicro International Conference on Parallel, 2006

Parallel Solution of Large-Scale and Sparse Generalized Algebraic Riccati Equations.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

Parallelization of GSL on Clusters of Symmetric Multiprocessors.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Parallel Order Reduction via Balanced Truncation for Optimal Cooling of Steel Profiles.
Proceedings of the Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30, 2005

Improving Instruction Set Architecture learning results.
Proceedings of the 2004 workshop on Computer architecture education, 2004

Parallelization of GSL: Architecture, Interfaces, and Programming Models.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Parallel Algorithms for Balanced Truncation Model Reduction of Sparse Systems.
Proceedings of the Applied Parallel Computing, 2004

Parallelization of the GNU Scientific Library on Heterogeneous Systems.
Proceedings of the 3rd International Symposium on Parallel and Distributed Computing (ISPDC 2004), 2004

Remote Model Reduction of Very Large Linear Systems.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Parallel Algorithms for LQ Optimal Control of Discrete-Time Periodic Linear Systems.
J. Parallel Distributed Comput., 2002

Remote Parallel Model Reduction of Linear Time-Invariant Systems Made Easy.
Proceedings of the High Performance Computing for Computational Science, 2002

Enhanced Services for Remote Model Reduction of Large-Scale Dense Linear Systems.
Proceedings of the Applied Parallel Computing Advanced Scientific Computing, 2002

Solving Large Sparse Lyapunov Equations on Parallel Computers (Research Note).
Proceedings of the Euro-Par 2002, 2002

Parallel solvers for discrete-time algebraic Riccati equations.
Concurr. Comput. Pract. Exp., 2001

Solving Discrete-Time Periodic Riccati Equations on a Cluster (Research Note).
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

A tool-kit for the design and simulation of systolic algorithms.
Proceedings of the 1993 Euromicro Workshop on Parallel and Distributed Processing, 1993
