Diego R. Llanos Ferraris

Orcid: 0000-0001-6240-9109

According to our database, Diego R. Llanos Ferraris authored at least 68 papers between 1999 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.



Bibliography

2024
Performance improvement of the triangular matrix product in commodity clusters.
J. Supercomput., July, 2024

Challenging Portability Paradigms: FPGA Acceleration Using SYCL and OpenCL.
CoRR, 2024

Finite-Time Lyapunov Exponent Calculation on FPGA using High-Level Synthesis Tools.
CoRR, 2024

2023
Supporting efficient overlapping of host-device operations for heterogeneous programming with CtrlEvents.
J. Parallel Distributed Comput., September, 2023

EPSILOD: efficient parallel skeleton for generic iterative stencil computations in distributed GPUs.
J. Supercomput., June, 2023

Implementation of a motion estimation algorithm for Intel FPGAs using OpenCL.
J. Supercomput., June, 2023

UVaFTLE: Lagrangian finite time Lyapunov exponent extraction for fluid dynamic applications.
J. Supercomput., June, 2023

Correction to: Parallel and distributed Processing: advances on architectures and applications of parallel systems.
Computing, May, 2023

Parallel and distributed Processing: advances on architectures and applications of parallel systems.
Computing, May, 2023

Open SYCL on heterogeneous GPU systems: A case of study.
CoRR, 2023

Mappings and patterns to improve the triangular matrix product on distributed systems.
Proceedings of the IEEE International Conference on Cluster Computing, 2023

2021
Distributed programming of a hyperspectral image registration algorithm for heterogeneous GPU clusters.
J. Parallel Distributed Comput., 2021

Operators for Data Redistribution: Applications to the STL Library and RayTracing Algorithm.
IEEE Access, 2021

2020
Versatile, Low-cost Indoor Positioning Combining Bluetooth and Ultra Wideband Technologies.
Proceedings of the International Conference on Localization and GNSS (ICL-GNSS 2020), Tampere, Finland, June 2nd to 4th, 2020, 2020

2019
Toward a BLAS library truly portable across different accelerator types.
J. Supercomput., 2019

Computational and mathematical models meet heterogeneous computing.
J. Supercomput., 2019

Multi-device Controllers: A Library to Simplify Parallel Heterogeneous Programming.
Int. J. Parallel Program., 2019

HitFlow: A Dataflow Programming Model for Hybrid Distributed- and Shared-Memory Systems.
Int. J. Parallel Program., 2019

High-level parallel programming in a heterogeneous world.
Concurr. Comput. Pract. Exp., 2019

Simplifying the multi-GPU programming of a hyperspectral image registration algorithm.
Proceedings of the 17th International Conference on High Performance Computing & Simulation, 2019

2017
BFCA+: automatic synthesis of parallel code with TLS capabilities.
J. Supercomput., 2017

A technique to automatically determine Ad-hoc communication patterns at runtime.
Parallel Comput., 2017

Using the Xeon Phi Platform to Run Speculatively-Parallelized Codes.
Int. J. Parallel Program., 2017

TORMENT OpenACC2016: A Benchmarking Tool for OpenACC Compilers.
Proceedings of the 25th Euromicro International Conference on Parallel, 2017

Supporting the Xeon Phi Coprocessor in a Heterogeneous Programming Model.
Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

2016
An OpenMP Extension that Supports Thread-Level Speculation.
IEEE Trans. Parallel Distributed Syst., 2016

New Data Structures to Handle Speculative Parallelization at Runtime.
Int. J. Parallel Program., 2016

A Survey on Thread-Level Speculation Techniques.
ACM Comput. Surv., 2016

Comparative Analysis of OpenACC Compilers.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2016

2015
TuCCompi: A Multi-layer Model for Distributed Heterogeneous Computing with Tuning Capabilities.
Int. J. Parallel Program., 2015

Comprehensive Evaluation of a New GPU-based Approach to the Shortest Path Problem.
Int. J. Parallel Program., 2015

On the run-time cost of distributed-memory communications generated using the polyhedral model.
Proceedings of the 2015 International Conference on High Performance Computing & Simulation, 2015

Moody Scheduling for Speculative Parallelization.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015

2014
The Shortest-Path Problem: Analysis and Comparison of Methods.
Synthesis Lectures on Theoretical Computer Science, Morgan & Claypool Publishers, ISBN: 978-3-031-02574-7, 2014

An Extensible System for Multilevel Automatic Data Partition and Mapping.
IEEE Trans. Parallel Distributed Syst., 2014

Blending Extensibility and Performance in Dense and Sparse Parallel Data Management.
IEEE Trans. Parallel Distributed Syst., 2014

Optimizing an APSP implementation for NVIDIA GPUs using kernel characterization criteria.
J. Supercomput., 2014

The BonaFide C Analyzer: automatic loop-level characterization and coverage measurement.
J. Supercomput., 2014

Squashing Alternatives for Software-Based Speculative Parallelization.
IEEE Trans. Computers, 2014

Exploiting distributed and shared memory hierarchies with Hitmap.
Proceedings of the International Conference on High Performance Computing & Simulation, 2014

A New GCC Plugin-Based Compiler Pass to Add Support for Thread-Level Speculation into OpenMP.
Proceedings of the Euro-Par 2014 Parallel Processing, 2014

2013
uBench: exposing the impact of CUDA block geometry in terms of performance.
J. Supercomput., 2013

Extending a hierarchical tiling arrays library to support sparse data partitioning.
J. Supercomput., 2013

A new GPU-based approach to the Shortest Path problem.
Proceedings of the International Conference on High Performance Computing & Simulation, 2013

2012
Using SPEC CPU2006 to evaluate the sequential and parallel code generated by commercial and open-source compilers.
J. Supercomput., 2012

Support for Thread-Level Speculation into OpenMP.
Proceedings of the OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, 2012

Using Fermi Architecture Knowledge to Speed up CUDA and OpenCL Programs.
Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012

Encapsulated Synchronization and Load-Balance in Heterogeneous Programming.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

2011
Trasgo: a nested-parallel programming system.
J. Supercomput., 2011

Automatic Data Partitioning Applied to Multigrid PDE Solvers.
Proceedings of the 19th International Euromicro Conference on Parallel, 2011

Towards a Compiler Framework for Thread-Level Speculation.
Proceedings of the 19th International Euromicro Conference on Parallel, 2011

Understanding the impact of CUDA tuning techniques for Fermi.
Proceedings of the 2011 International Conference on High Performance Computing & Simulation, 2011

Exclusive squashing for thread-level speculation.
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011

Robust thread-level speculation.
Proceedings of the 18th International Conference on High Performance Computing, 2011

2010
Effortless and Efficient Distributed Data-Partitioning in Linear Algebra.
Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, 2010

2008
Just-In-Time Scheduling for Loop-based Speculative Parallelization.
Proceedings of the 16th Euromicro International Conference on Parallel, 2008

2007
New Scheduling Strategies for Randomized Incremental Algorithms in the Context of Speculative Parallelization.
IEEE Trans. Computers, 2007

Review of "Grid Computing Security by Anirban Chakrabarti", Springer, 2007, ISBN 3540444920.
ACM Queue, 2007

2006
TPCC-UVa: an open-source TPC-C implementation for global performance measurement of computer systems.
SIGMOD Rec., 2006

Speculative Parallelization.
Computer, 2006

TPCC-UVa: an open-source TPC-C implementation for parallel and distributed systems.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

2005
Design Space Exploration of a Software Speculative Parallelization Scheme.
IEEE Trans. Parallel Distributed Syst., 2005

MESETA: A New Scheduling Strategy for Speculative Parallelization of Randomized Incremental Algorithms.
Proceedings of the 34th International Conference on Parallel Processing Workshops (ICPP 2005 Workshops), 2005

2004
Speculative Parallelization of a Randomized Incremental Convex Hull Algorithm.
Proceedings of the Computational Science and Its Applications, 2004

2003
Toward efficient and robust software speculative parallelization on multiprocessors.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2003

2000
Exploiting parallelism in a network of workstations using COMA-BC.
SIGARCH Comput. Archit. News, 2000

Reducing the Replacement Overhead on COMA Protocols for Workstation-Based Architectures.
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

1999
A Configurable ACSL-Based Interface Generator for Simulated Systems.
Simul., 1999

