Sandra Catalán
Orcid: 0000-0002-9321-2728
According to our database, Sandra Catalán authored at least 50 papers between 2013 and 2025.
Bibliography
2025
J. Supercomput., January, 2025
2024
Parallel GEMM-based convolutions for deep learning on multicore ARM and RISC-V architectures.
J. Syst. Archit., 2024
Experiences with nested parallelism in task-parallel applications using malleable BLAS on multicore processors.
Int. J. High Perform. Comput. Appl., 2024
Proceedings of the Euro-Par 2024: Parallel Processing, 2024
2023
Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures.
J. Parallel Distributed Comput., May, 2023
CoRR, 2023
Fine-grain task-parallel algorithms for matrix factorizations and inversion on many-threaded CPUs.
Concurr. Comput. Pract. Exp., 2023
Automatic Generation of Micro-kernels for Performance Portability of Matrix Multiplication on RISC-V Vector Processors.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023
2022
Proceedings of the High Performance Computing. ISC High Performance 2022 International Workshops - Hamburg, Germany, May 29, 2022
NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors.
Proceedings of the 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2022
2021
J. Parallel Distributed Comput., 2021
A New Generation of Task-Parallel Algorithms for Matrix Inversion in Many-Threaded CPUs.
Proceedings of the PMAM@PPoPP 2021: Proceedings of the Twelfth International Workshop on Programming Models and Applications for Multicores and Manycores, 2021
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021
2020
sLASs: A fully automatic auto-tuned linear algebra library based on OpenMP extensions implemented in OmpSs (LASs Library).
J. Parallel Distributed Comput., 2020
Clust. Comput., 2020
Proceedings of the OpenMP: Portable Multi-Level Parallelism on Modern Systems, 2020
2019
Dynamic look-ahead in the reduction to band form for the singular value decomposition.
Parallel Comput., 2019
Look-ahead in the two-sided reduction to compact band forms for symmetric eigenvalue problems and the SVD.
Numer. Algorithms, 2019
A Case for Malleable Thread-Level Linear Algebra Libraries: The LU Factorization With Partial Pivoting.
IEEE Access, 2019
Proceedings of the 2019 IEEE/ACM Workshop on Education for High-Performance Computing, 2019
Proceedings of the 27th Euromicro International Conference on Parallel, 2019
Proceedings of the 20th International Conference on Parallel and Distributed Computing, 2019
Proceedings of the 20th International Conference on Parallel and Distributed Computing, 2019
2018
PhD thesis, 2018
Static scheduling of the LU factorization with look-ahead on asymmetric multicore processors.
Parallel Comput., 2018
Energy balance between voltage-frequency scaling and resilience for linear algebra routines on low-power multicore architectures.
Parallel Comput., 2018
Two-sided orthogonal reductions to condensed forms on asymmetric multicore processors.
Parallel Comput., 2018
Multi-threaded dense linear algebra libraries for low-power asymmetric multicore processors.
J. Comput. Sci., 2018
Reduction to Band Form for the Singular Value Decomposition on Graphics Accelerators.
Proceedings of the 9th International Workshop on Programming Models and Applications for Multicores and Manycores, 2018
2017
Time and energy modeling of a high-performance multi-threaded Cholesky factorization.
J. Supercomput., 2017
Revisiting conventional task schedulers to exploit asymmetry in multi-core architectures for dense linear algebra operations.
Parallel Comput., 2017
Reduction to Tridiagonal Form for Symmetric Eigenproblems on Asymmetric Multicore Processors.
Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, 2017
Static Versus Dynamic Task Scheduling of the LU Factorization on ARM big.LITTLE Architectures.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017
2016
An analytical methodology to derive power models based on hardware and software metrics.
Comput. Sci. Res. Dev., 2016
Evaluating fault tolerance on asymmetric multicore systems-on-chip using iso-metrics.
IET Comput. Digit. Tech., 2016
Architecture-aware configuration and scheduling of matrix multiplication on asymmetric multicore processors.
Clust. Comput., 2016
Refactoring Conventional Task Schedulers to Exploit Asymmetric ARM big.LITTLE Architectures in Dense Linear Algebra.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016
The Impact of Panel Factorization on the Gauss-Huard Algorithm for the Solution of Linear Systems on Modern Architectures.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2016
The Impact of Voltage-Frequency Scaling for the Matrix-Vector Product on the IBM POWER8.
Proceedings of the Euro-Par 2016: Parallel Processing, 2016
2015
Simul. Model. Pract. Theory, 2015
Comput. Sci. Res. Dev., 2015
Comput. Sci. Res. Dev., 2015
Performance and Energy Optimization of Matrix Multiplication on Asymmetric big.LITTLE Processors.
CoRR, 2015
Multi-Threaded Dense Linear Algebra Libraries for Low-Power Asymmetric Multicore Processors.
CoRR, 2015
Performance and Fault Tolerance of Preconditioned Iterative Solvers on Low-Power ARM Architectures.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015
2014
Sustain. Comput. Informatics Syst., 2014
Comput. Sci. Res. Dev., 2014
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2014
2013
Proceedings of the Energy Efficiency in Large Scale Distributed Systems, 2013