Adrià Armejach

Jesús Alastruey-Benedé

Pablo Ibáñez

Future Gener. Comput. Syst., 2024

A Mess of Memory System Benchmarking, Simulation and Application Profiling.

Pouya Esmaili-Dokht

Francesco Sgherzi

Valéria Soldera Girelli

Emanuele Confalonieri

Rishabh Dubey

Jason Adlard

CoRR, 2024

A Mess of Memory System Benchmarking, Simulation and Application Profiling.

Pouya Esmaili-Dokht

Francesco Sgherzi

Valéria Soldera Girelli

Emanuele Confalonieri

Rishabh Dubey

Jason Adlard

Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

Exploiting Vector Code Semantics for Efficient Data Cache Prefetching.

Francesc Martínez Palau

Martí Torrents

Marc Casas

Proceedings of the 38th ACM International Conference on Supercomputing, 2024

2023

Porting and Optimizing BWA-MEM2 Using the Fujitsu A64FX Processor.

Rubén Langarita

Pablo Ibáñez

Jesús Alastruey-Benedé

IEEE/ACM Trans. Comput. Biol. Bioinform., 2023

Characterization of a Coherent Hardware Accelerator Framework for SoCs.

Balaji Venu

Alexandre de Limas Santana

Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2023

Efficient Direct Convolution Using Long SIMD Instructions.

Marc Casas

Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

A Tensor Marshaling Unit for Sparse Tensor Algebra on General-Purpose Processors.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

DynAMO: Improving Parallelism Through Dynamic Placement of Atomic Memory Operations.

Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

Fast Behavioural RTL Simulation of 10B Transistor SoC Designs with Metro-MPI.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

2022

A BF16 FMA is All You Need for DNN Training.

IEEE Trans. Emerg. Top. Comput., 2022

Compressed Sparse FM-Index: Fast Sequence Alignment Using Large K-Steps.

Jesús Alastruey-Benedé

IEEE/ACM Trans. Comput. Biol. Bioinform., 2022

FASE: A Fast, Accurate and Seamless Emulator for Custom Numerical Formats.

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2022

2021

On the use of many-core Marvell ThunderX2 processor for HPC workloads.

J. Supercomput., 2021

Multilevel simulation-based co-design of next generation HPC microprocessors.

Proceedings of the 2021 International Workshop on Performance Modeling, 2021

PLANAR: a programmable accelerator for near-memory data rearrangement.

Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

gem5 + rtl: A Framework to Enable RTL Models Inside a Full-System Simulator.

Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

Dynamically Adapting Floating-Point Precision to Accelerate Deep Neural Network Training.

Proceedings of the 20th IEEE International Conference on Machine Learning and Applications, 2021

Mont-Blanc 2020: Towards Scalable and Power Efficient European HPC Processors.

Rekai González-Alberquilla

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

2020

Using Arm's scalable vector extension on stencil codes.

J. Supercomput., 2020

The gem5 Simulator: Version 20.0+.

Daniel Rodrigues Carvalho

Amin Farmahini Farahani

Hamidreza Khaleghzadeh

CoRR, 2020

Evaluating Mixed-Precision Arithmetic for 3D Generative Adversarial Networks to Simulate High Energy Physics Detectors.

Proceedings of the 19th IEEE International Conference on Machine Learning and Applications, 2020

2019

Design trade-offs for emerging HPC processors based on mobile market technology.

Marc Casas

J. Supercomput., 2019

Design Space Exploration of Next-Generation HPC Machines.

Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

2018

Stencil codes on a vector length agnostic architecture.

Rekai González-Alberquilla

Helena Caminal

Juan M. Cebrian

Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2016

Hardware Acceleration for Query Processing: Leveraging FPGAs, CPUs, and Memory.

Comput. Sci. Eng., 2016

MUSA: a multi-level simulation approach for next-generation HPC machines.

Proceedings of the International Conference for High Performance Computing, 2016

Implications of non-volatile memory as primary storage for database management systems.

Proceedings of the International Conference on Embedded Computer Systems: Architectures, 2016

2015

Tidy Cache: Improving Data Placement in Die-Stacked DRAM Caches.

Adrián Cristal

Osman S. Unsal

Proceedings of the 27th International Symposium on Computer Architecture and High Performance Computing, 2015

2014

Techniques to improve concurrency in hardware transactional memory.
