Juan M. Cebrian

Eduardo José Gómez-Hernández

Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

Hardware Cache Locking for All Memory Updates.

Ashkan Asgharzadeh

Proceedings of the 42nd IEEE International Conference on Computer Design, 2024

2023

Near-optimal multi-accelerator architectures for predictive maintenance at the edge.

Mostafa Koraei

Eduardo José Gómez-Hernández

Future Gener. Comput. Syst., 2023

2022

Compiler-Assisted Compaction/Restoration of SIMD Instructions.

IEEE Trans. Parallel Distributed Syst., 2022

Free atomics: hardware atomic operations without fences.

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Splash-4: A Modern Benchmark Suite with Lock-Free Constructs.

Eduardo José Gómez-Hernández

Proceedings of the IEEE International Symposium on Workload Characterization, 2022

2021

Efficient, Distributed, and Non-Speculative Multi-Address Atomic Operations.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

2020

Scalability analysis of AVX-512 extensions.

Rekai González-Alberquilla

J. Supercomput., 2020

Efficiency analysis of modern vector architectures: vector ALU sizes, core counts and clock frequencies.

J. Supercomput., 2020

Using Arm's scalable vector extension on stencil codes.

J. Supercomput., 2020

Offloading strategies for Stencil kernels on the KNC Xeon Phi architecture: Accuracy versus performance.

Int. J. High Perform. Comput. Appl., 2020

High-throughput fuzzy clustering on heterogeneous architectures.

Future Gener. Comput. Syst., 2020

Semi-automatic validation of cycle-accurate simulation infrastructures: The case for gem5-x86.

Future Gener. Comput. Syst., 2020

Boosting Store Buffer Efficiency with Store-Prefetch Bursts.

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

Improving Predication Efficiency through Compaction/Restoration of SIMD Instructions.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

2019

POSTER: An Optimized Predication Execution for SIMD Extensions.

Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018

A vectorized k-means algorithm for compressed datasets: design and experimental analysis.

Abdullah Al Hasib

J. Supercomput., 2018

Performance and energy effects on task-based parallelized applications - User-directed versus manual vectorization.

J. Supercomput., 2018

Stencil codes on a vector length agnostic architecture.

Adrià Armejach

Helena Caminal

Rekai González-Alberquilla

Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2017

A dedicated private-shared cache design for scalable multiprocessors.

Ricardo Fernández Pascual

Alexandra Jimborean

Manuel E. Acacio

Concurr. Comput. Pract. Exp., 2017

Code modernization strategies to 3-D Stencil-based applications on Intel Xeon Phi: KNC and KNL.

Comput. Math. Appl., 2017

2016

Transient Temperature Prediction for Aging Thermal Sensors Using Artificial Neural Network.

Kameswar Rao Vaddina

Proceedings of the 24th Euromicro International Conference on Parallel, 2016

2015

Soft-error mitigation by means of decoupled transactional memory threads.

Distributed Comput., 2015

ParVec: vectorizing the PARSEC benchmark suite.

Computing, 2015

Evaluation of the 3-D finite difference implementation of the acoustic diffusion equation model on massively parallel architectures.

Comput. Electr. Eng., 2015

Evaluation of 3-D Stencil Codes on the Intel Xeon Phi Coprocessor.

Proceedings of the Parallel Computing: On the Road to Exascale, 2015

V-PFORDelta: Data Compression for Energy Efficient Computation of Time Series.

Abdullah Al Hasib

Proceedings of the 22nd IEEE International Conference on High Performance Computing, 2015

Early Experiences with Separate Caches for Private and Shared Data.

Ricardo Fernández Pascual

Manuel E. Acacio

Proceedings of the 11th IEEE International Conference on e-Science, 2015

2014

Managing power constraints in a single-core scenario through power tokens.

J. Supercomput., 2014

Toward energy efficiency in heterogeneous processors: findings on virtual screening methods.

Ginés D. Guerrero

Horacio Emilio Pérez Sánchez

José M. García

Manuel Ujaldon

José M. Cecilia

Concurr. Comput. Pract. Exp., 2014

Performance and energy impact of parallelization and vectorization techniques in modern microprocessors.

Jan Christian Meyer

Computing, 2014

Optimized hardware for suboptimal software: The case for SIMD-aware benchmarks.

Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software, 2014

2013

Modeling the impact of permanent faults in caches.

ACM Trans. Archit. Code Optim., 2013

Efficient inter-core power and thermal balancing for multicore processors.

Computing, 2013

Energy-Efficient Sparse Matrix Autotuning with CSX - A Trade-off Study.

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Temperature effects on on-chip energy measurements.

Proceedings of the International Green Computing Conference, 2013

2012

Improving Energy Efficiency through Parallelization and Vectorization on Intel Core i5 and i7 Processors.

Jan Christian Meyer

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Energy Efficiency Analysis of GPUs.

Ginés D. Guerrero

José M. García

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

2011

Leakage-efficient design of value predictors through state and non-state preserving techniques.

J. Supercomput., 2011

Power Token Balancing: Adapting CMPs to Power Constraints for Parallel Multithreaded Workloads.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Token3D: Reducing Temperature in 3D Die-Stacked CMPs through Cycle-Level Power Control Mechanisms.

Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

2010

MLP-Aware Instruction Queue Resizing: The Key to Power-Efficient Performance.

Pavlos Petoumenos

Georgia Psychou

Juan Manuel Cebrian Gonzalez

Proceedings of the Architecture of Computing Systems, 2010

2009

Efficient microarchitecture policies for accurately adapting to power constraints.

Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

2007

Leakage Energy Reduction in Value Predictors through Static Decay.
