José Ignacio Aliaga

Stef Graillat

Int. J. High Perform. Comput. Appl., January, 2024

Dynamic spawning of MPI processes applied to malleability.

Int. J. High Perform. Comput. Appl., 2024

Robustness and Accuracy in Pipelined Bi-conjugate Gradient Stabilized Methods.

Mykhailo Havdiak

Proceedings of the Computational Science - ICCS 2024, 2024

2023

Compressed basis GMRES on high-performance graphics processing units.

Thomas Grützmacher

Int. J. High Perform. Comput. Appl., March, 2023

General framework for re-assuring numerical reliability in parallel Krylov solvers: A case of BiCGStab methods.

CoRR, 2023

Sparse matrix-vector and matrix-multivector products for the truncated SVD on graphics processors.

Concurr. Comput. Pract. Exp., 2023

Efficient data redistribution for malleable applications.

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Configurable synthetic application for studying malleability in HPC.

Proceedings of the 31st Euromicro International Conference on Parallel, 2023

2022

Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units.

Thomas Grützmacher

Concurr. Comput. Pract. Exp., 2022

General Framework for Deriving Reproducible Krylov Subspace Algorithms: BiCGStab Case.

Stef Graillat

Proceedings of the Parallel Processing and Applied Mathematics, 2022

2021

Malleability Implementation in a MPI Iterative Method.

Iker Martín-Álvarez

Sergio Iserte

Proceedings of the IEEE International Conference on Cluster Computing, 2021

2020

Iteration-fusing conjugate gradient for sparse linear systems with MPI + OmpSs.

J. Supercomput., 2020

Reproducibility strategies for parallel Preconditioned Conjugate Gradient.

Matthias Wiesenberger

J. Comput. Appl. Math., 2020

Reproducibility of parallel preconditioned conjugate gradient in hybrid programming environments.

Int. J. High Perform. Comput. Appl., 2020

Compressed Basis GMRES on High Performance GPUs.

Thomas Grützmacher

CoRR, 2020

Reproducibility of Parallel Preconditioned Conjugate Gradient in Hybrid Programming Environments.

CoRR, 2020

Balanced and Compressed Coordinate Layout for the Sparse Matrix-Vector Product on GPUs.

Yuhsiang M. Tsai

Proceedings of the Euro-Par 2020: Parallel Processing Workshops, 2020

2019

An efficient GPU version of the preconditioned GMRES method.

J. Supercomput., 2019

Accelerating the task/data-parallel version of ILUPACK's BiCG in multi-CPU/GPU configurations.

Parallel Comput., 2019

Erratum to "Exploiting nested task-parallelism in the H-LU factorization" [J. Comput. Sci. 33 (2019) 20-33].

J. Comput. Sci., 2019

Exploiting nested task-parallelism in the H-LU factorization.

J. Comput. Sci., 2019

Energy-aware strategies for task-parallel sparse linear system solvers.

M. Asunción Castaño

Concurr. Comput. Pract. Exp., 2019

2018

Extending ILUPACK with a Task-Parallel Version of BiCG for Dual-GPU Servers.

Proceedings of the 9th International Workshop on Programming Models and Applications for Multicores and Manycores, 2018

Extending ILUPACK with a GPU Version of the BiCGStab Method.

Proceedings of the XLIV Latin American Computer Conference, 2018

2017

Adapting concurrency throttling and voltage-frequency scaling for dense eigensolvers.

J. Supercomput., 2017

Communication in task-parallel ILU-preconditioned CG solvers using MPI + OmpSs.

Concurr. Comput. Pract. Exp., 2017

Overcoming Memory-Capacity Constraints in the Use of ILUPACK on Graphics Processors.

Proceedings of the 29th International Symposium on Computer Architecture and High Performance Computing, 2017

SYCL-BLAS: Combining Expression Trees and Kernel Fusion on Heterogeneous Systems.

Ruymán Reyes

Mehdi Goli

Proceedings of the Parallel Computing is Everywhere, 2017

SYCL-BLAS: Leveraging Expression Trees for Linear Algebra.

Ruymán Reyes

Mehdi Goli

Proceedings of the 5th International Workshop on OpenCL, 2017

Task-Parallel LU Factorization of Hierarchical Matrices Using OmpSs.

Rocío Carratalá-Sáez

Ronald Kriemann

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Evaluating the NVIDIA Tegra Processor as a Low-Power Alternative for Sparse GPU Computations.

Proceedings of the High Performance Computing - 4th Latin American Conference, 2017

2016

Exploiting task and data parallelism in ILUPACK's preconditioned CG solver on NUMA architectures and many-core accelerators.

Parallel Comput., 2016

A fast band-Krylov eigensolver for macromolecular functional motion simulation on multicore architectures and graphics processors.

José Ramón López-Blanco

J. Comput. Phys., 2016

Characterizing the efficiency of multicore and manycore processors for the solution of sparse linear systems.

Comput. Sci. Res. Dev., 2016

A Data-Parallel ILUPACK for Sparse General and Symmetric Indefinite Linear Systems.

Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

Exploiting Task-Parallelism in Message-Passing Sparse Linear System Solvers Using OmpSs.

Proceedings of the Euro-Par 2016: Parallel Processing, 2016

Design of a Task-Parallel Version of ILUPACK for Graphics Processors.

Proceedings of the High Performance Computing - Third Latin American Conference, 2016

2015

Are our dense linear algebra libraries energy-friendly?

Manuel F. Dolz

Comput. Sci. Res. Dev., 2015

Out-of-core macromolecular simulations on multithreaded architectures.

Concurr. Comput. Pract. Exp., 2015

Unveiling the performance-energy trade-off in iterative linear system solvers for multithreaded processors.

Maribel Castillo

Juan Carlos Fernández

Germán León

Concurr. Comput. Pract. Exp., 2015

Harnessing CUDA Dynamic Parallelism for the Solution of Sparse Linear Systems.

Davor Davidovic

Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Performance and Fault Tolerance of Preconditioned Iterative Solvers on Low-Power ARM Architectures.

Dimitrios S. Nikolopoulos

Sandra Catalán

Charalampos Chalios

Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Systematic Fusion of CUDA Kernels for Iterative Sparse Linear System Solvers.

Proceedings of the Euro-Par 2015: Parallel Processing, 2015

2014

iMODS: internal coordinates normal mode analysis server.

José Ramón López-Blanco

Pablo Chacón

Nucleic Acids Res., 2014

Assessing the impact of the CPU power-saving modes on the task-parallel solution of sparse linear systems.

Clust. Comput., 2014

Leveraging Task-Parallelism with OmpSs in ILUPACK's Preconditioned CG Method.

Proceedings of the 26th IEEE International Symposium on Computer Architecture and High Performance Computing, 2014

Leveraging Data-Parallelism in ILUPACK using Graphics Processors.

Proceedings of the IEEE 13th International Symposium on Parallel and Distributed Computing, 2014

2013

Exploring large macromolecular functional motions on clusters of multicore processors.

José Ramón López-Blanco

J. Comput. Phys., 2013

Out-of-Core Solution of Eigenproblems for Macromolecular Simulations.

Davor Davidovic

Proceedings of the Parallel Processing and Applied Mathematics, 2013

Performance and Energy Analysis of the Iterative Solution of Sparse Linear Systems on Multicore and Manycore Architectures.

Maribel Castillo

Juan Carlos Fernández

Germán León

Proceedings of the Parallel Processing and Applied Mathematics, 2013

Reformulated Conjugate Gradient for the Energy-Aware Solution of Linear Systems on GPUs.

Proceedings of the 42nd International Conference on Parallel Processing, 2013

2012

Solving dense generalized eigenproblems on multi-threaded architectures.

Appl. Math. Comput., 2012

Leveraging Task-Parallelism in Energy-Efficient ILU Preconditioners.

Proceedings of the ICT as Key Technology against Global Warming, 2012

2011

ILUPACK.

Proceedings of the Encyclopedia of Parallel Computing, 2011

Exploiting thread-level parallelism in the iterative solution of sparse linear systems.

Parallel Comput., 2011

Analysis and optimization of power consumption in the iterative solution of sparse linear systems on multi-core and many-core platforms.

Juan Carlos Fernández

Proceedings of the 2011 International Green Computing Conference and Workshops, 2011

2010

Parallelization of Multilevel ILU Preconditioners on Distributed-Memory Multiprocessors.

Proceedings of the Applied Parallel and Scientific Computing, 2010

2009

Toward the parallelization of GSL.

J. Supercomput., 2009

Evaluation of Parallel Sparse Matrix Partitioning Software for Parallel Multilevel ILU Preconditioning on Shared-Memory Multiprocessors.

Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

2008

Design, Tuning and Evaluation of Parallel Multilevel ILU Preconditioners.

Proceedings of the High Performance Computing for Computational Science, 2008

2007

Parallelization of Multilevel Preconditioners Constructed from Inverse-Based ILUs on Shared-Memory Multiprocessors.

Proceedings of the Parallel Computing: Architectures, 2007

2006

Parallelization of GSL: The Web Service Interface.

Proceedings of the 14th Euromicro International Conference on Parallel, 2006

2005

Parallelization of GSL on Clusters of Symmetric Multiprocessors.

Adrián Santos

Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

2004

Parallelization of GSL: Architecture, Interfaces, and Programming Models.

Francisco Almeida

José M. Badía

Sergio Barrachina-Mir

Vicente Blanco Pérez

U. Dorta

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Parallelization of GSL: Performance of Case Studies.

U. Dorta

Proceedings of the Applied Parallel Computing, 2004

Parallelization of the GNU Scientific Library on Heterogeneous Systems.

José Manuel Badía-Contelles

Francisco Almeida

Sergio Barrachina-Mir

Vicente Blanco Pérez

U. Dorta

Proceedings of the 3rd International Symposium on Parallel and Distributed Computing (ISPDC 2004), 2004

2000

A Lanczos-type method for multiple starting vectors.

Math. Comput., 2000

1996

A Parallel Implementation of the General Lanczos Method on the CRAY T3D.
