Francisco D. Igual
Orcid: 0000-0003-4480-9517
According to our database, Francisco D. Igual authored at least 103 papers between 2008 and 2025.
Timeline: publications per year, 2008 to 2024, by type (book, in proceedings, article, PhD thesis, dataset, other).
Online presence:
- on zbmath.org
- on orcid.org
- on dl.acm.org
Bibliography
2025
J. Supercomput., January, 2025
Experience-guided, mixed-precision matrix multiplication with Apache TVM for ARM processors.
J. Supercomput., January, 2025
2024
J. Supercomput., July, 2024
Algorithm 1039: Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM.
ACM Trans. Math. Softw., March, 2024
Experiences with nested parallelism in task-parallel applications using malleable BLAS on multicore processors.
Int. J. High Perform. Comput. Appl., 2024
Acceleration and energy consumption optimization in cascading classifiers for face detection on low-cost ARM big.LITTLE asymmetric architectures.
CoRR, 2024
Proceedings of the High Performance Computing. ISC High Performance 2024 International Workshops, 2024
Proceedings of the Euro-Par 2024: Parallel Processing, 2024
2023
J. Supercomput., May, 2023
Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures.
J. Parallel Distributed Comput., May, 2023
Dynamic power budget redistribution under a power cap on multi-application environments.
Sustain. Comput. Informatics Syst., April, 2023
Algorithm 1033: Parallel Implementations for Computing the Minimum Distance of a Random Linear Code on Distributed-memory Architectures.
ACM Trans. Math. Softw., March, 2023
CoRR, 2023
CoRR, 2023
Fine-grain task-parallel algorithms for matrix factorizations and inversion on many-threaded CPUs.
Concurr. Comput. Pract. Exp., 2023
Automatic Generation of Micro-kernels for Performance Portability of Matrix Multiplication on RISC-V Vector Processors.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023
Proceedings of the 31st Euromicro International Conference on Parallel, 2023
2022
Algorithm 1022: Efficient Algorithms for Computing a Rank-Revealing UTV Factorization on Parallel Computing Architectures.
ACM Trans. Math. Softw., 2022
Proceedings of the High Performance Computing. ISC High Performance 2022 International Workshops - Hamburg, Germany, May 29, 2022
NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors.
Proceedings of the 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2022
Proceedings of the 30th Euromicro International Conference on Parallel, 2022
Applying Game-Learning Environments to Power Capping Scenarios via Reinforcement Learning.
Proceedings of the Cloud Computing, Big Data & Emerging Topics - 10th Conference, 2022
2021
Low precision matrix multiplication for efficient deep learning in NVIDIA Carmel processors.
J. Supercomput., 2021
Efficient algorithms for computing a rank-revealing UTV factorization on parallel computing architectures.
CoRR, 2021
A New Generation of Task-Parallel Algorithms for Matrix Inversion in Many-Threaded CPUs.
Proceedings of PMAM@PPoPP 2021: the Twelfth International Workshop on Programming Models and Applications for Multicores and Manycores, 2021
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021
2020
Resource Management for Power-Constrained HEVC Transcoding Using Reinforcement Learning.
IEEE Trans. Parallel Distributed Syst., 2020
J. Supercomput., 2020
STEEL-RT: combining single task-single executor model and expanded scheduling to ease heterogeneity exploitation.
J. Supercomput., 2020
Leveraging knowledge-as-a-service (KaaS) for QoS-aware resource management in multi-user video transcoding.
J. Supercomput., 2020
Clust. Comput., 2020
Proceedings of the Cloud Computing, Big Data & Emerging Topics - 8th Conference, 2020
2019
Algorithm 994: Fast Implementations of the Brouwer-Zimmermann Algorithm for the Computation of the Minimum Distance of a Random Linear Code.
ACM Trans. Math. Softw., 2019
Variable intra-task threading for power-constrained performance and energy optimization in DAG scheduling.
J. Supercomput., 2019
J. Supercomput., 2019
Portability Study of an OpenCL Algorithm for Automatic Target Detection in Hyperspectral Images.
IEEE Trans. Geosci. Remote. Sens., 2019
Practical Considerations for Acoustic Source Localization in the IoT Era: Platforms, Energy Efficiency, and Performance.
IEEE Internet Things J., 2019
Parallel Implementations for Computing the Minimum Distance of a Random Linear Code on Multicomputers.
CoRR, 2019
Detecting Time-Fragmented Cache Attacks Against AES Using Performance Monitoring Counters.
Proceedings of the 7th Conference on Cloud Computing & Big Data, 2019
MAMUT: Multi-Agent Reinforcement Learning for Efficient Real-Time Multi-User Video Transcoding.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019
2018
Optimized Fundamental Signal Processing Operations For Energy Minimization on Heterogeneous Mobile Devices.
IEEE Trans. Circuits Syst. I Regul. Pap., 2018
Multi-threaded dense linear algebra libraries for low-power asymmetric multicore processors.
J. Comput. Sci., 2018
Acceleration and energy consumption optimization in cascading classifiers for face detection on low-cost ARM big.LITTLE asymmetric architectures.
Int. J. Circuit Theory Appl., 2018
2017
Time and energy modeling of a high-performance multi-threaded Cholesky factorization.
J. Supercomput., 2017
J. Supercomput., 2017
Revisiting conventional task schedulers to exploit asymmetry in multi-core architectures for dense linear algebra operations.
Parallel Comput., 2017
Performance-Power Evaluation of an OpenCL Implementation of the Simplex Growing Algorithm for Hyperspectral Unmixing.
IEEE Geosci. Remote. Sens. Lett., 2017
Proceedings of the 2017 International Conference on High Performance Computing & Simulation, 2017
Performance and Scalability Study of FMM Kernels on Novel Multi- and Many-core Architectures.
Proceedings of the International Conference on Computational Science, 2017
On the Use of a GPU-Accelerated Mobile Device Processor for Sound Source Localization.
Proceedings of the International Conference on Computational Science, 2017
2016
ACM Trans. Math. Softw., 2016
CoRR, 2016
Architecture-aware configuration and scheduling of matrix multiplication on asymmetric multicore processors.
Clust. Comput., 2016
Refactoring Conventional Task Schedulers to Exploit Asymmetric ARM big.LITTLE Architectures in Dense Linear Algebra.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016
HeSP: A Simulation Framework for Solving the Task Scheduling-Partitioning Problem on Heterogeneous Architectures.
Proceedings of the Euro-Par 2016: Parallel Processing, 2016
2015
Simul. Model. Pract. Theory, 2015
Speeding up the log-polar transform with inexpensive parallel hardware: graphics units and multi-core architectures.
J. Real Time Image Process., 2015
Accelerating fluid-solid simulations (Lattice-Boltzmann & Immersed-Boundary) on heterogeneous architectures.
J. Comput. Sci., 2015
Revisiting Conventional Task Schedulers to Exploit Asymmetry in ARM big.LITTLE Architectures for Dense Linear Algebra.
CoRR, 2015
Performance and Energy Optimization of Matrix Multiplication on Asymmetric big.LITTLE Processors.
CoRR, 2015
Multi-Threaded Dense Linear Algebra Libraries for Low-Power Asymmetric Multicore Processors.
CoRR, 2015
Non-negative Matrix Factorization on Low-Power Architectures and Accelerators: A Comparative Study.
Comput. Electr. Eng., 2015
Balancing task- and data-level parallelism to improve performance and energy consumption of matrix computations on the Intel Xeon Phi.
Comput. Electr. Eng., 2015
Proceedings of the 23rd European Signal Processing Conference, 2015
2014
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2014
Enhancing performance and energy consumption of runtime schedulers for dense linear algebra.
Concurr. Comput. Pract. Exp., 2014
Author's retrospective for biomedical image analysis on a cooperative cluster of GPUs and multicores.
Proceedings of the ACM International Conference on Supercomputing 25th Anniversary Volume, 2014
Parallel performance and energy efficiency of modern video encoders on multithreaded architectures.
Proceedings of the 22nd European Signal Processing Conference, 2014
2013
EURASIP J. Adv. Signal Process., 2013
Concurr. Comput. Pract. Exp., 2013
Proceedings of the 20th European MPI Users' Group Meeting, 2013
Proceedings of the Energy Efficiency in Large Scale Distributed Systems, 2013
2012
A Runtime System for Programming Out-of-Core Matrix Algorithms-by-Tiles on Multithreaded Architectures.
ACM Trans. Math. Softw., 2012
The FLAME approach: From dense linear algebra algorithms to high-performance multi-accelerator implementations.
J. Parallel Distributed Comput., 2012
Int. J. High Perform. Comput. Appl., 2012
Optimization of power consumption in the iterative solution of sparse linear systems on graphics processors.
Comput. Sci. Res. Dev., 2012
DVFS-control techniques for dense linear algebra operations on multi-core processors.
Comput. Sci. Res. Dev., 2012
Appl. Math. Comput., 2012
Unleashing the high-performance and low-power of multi-core DSPs for general-purpose HPC.
Proceedings of the SC Conference on High Performance Computing Networking, 2012
Proceedings of the IEEE 24th International Symposium on Computer Architecture and High Performance Computing, 2012
Saving Energy in the LU Factorization with Partial Pivoting on Multi-core Processors.
Proceedings of the 20th Euromicro International Conference on Parallel, 2012
Reducing Energy Consumption of Dense Linear Algebra Operations on Hybrid CPU-GPU Platforms.
Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012
2011
Int. J. High Perform. Comput. Appl., 2011
Condensed forms for the symmetric eigenvalue problem on multi-threaded architectures.
Concurr. Comput. Pract. Exp., 2011
Power-aware Dense Linear Algebra Implementations on Multi-core and Many-core Processors.
Proceedings of the 3rd Many-core Applications Research Community (MARC) Symposium, 2011
2010
Int. J. Parallel Program., 2010
Proceedings of the 2010 International Conference on High Performance Computing & Simulation, 2010
2009
Int. J. Parallel Emergent Distributed Syst., 2009
Concurr. Comput. Pract. Exp., 2009
Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2009
Reduction to Condensed Forms for Symmetric Eigenvalue Problems on Multi-core Architectures.
Proceedings of the Parallel Processing and Applied Mathematics, 2009
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009
Proceedings of the Evolving OpenMP in an Age of Extreme Parallelism, 2009
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009
Proceedings of the Euro-Par 2009, 2009
Proceedings of the Euro-Par 2009 Parallel Processing, 2009
2008
Attaining High Performance in General-Purpose Computations on Current Graphics Processors.
Proceedings of the High Performance Computing for Computational Science, 2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008
Proceedings of the Euro-Par 2008, 2008