Adrià Armejach

Orcid: 0000-0003-2869-668X

According to our database, Adrià Armejach authored at least 40 papers between 2009 and 2024.

Bibliography

2024
SpChar: Characterizing the sparse puzzle via decision trees.
J. Parallel Distributed Comput., 2024

GenArchBench: A genomics benchmark suite for arm HPC processors.
Future Gener. Comput. Syst., 2024

A Mess of Memory System Benchmarking, Simulation and Application Profiling.
CoRR, 2024


Exploiting Vector Code Semantics for Efficient Data Cache Prefetching.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024

2023
Porting and Optimizing BWA-MEM2 Using the Fujitsu A64FX Processor.
IEEE ACM Trans. Comput. Biol. Bioinform., 2023

Characterization of a Coherent Hardware Accelerator Framework for SoCs.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2023

Efficient Direct Convolution Using Long SIMD Instructions.
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

A Tensor Marshaling Unit for Sparse Tensor Algebra on General-Purpose Processors.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

DynAMO: Improving Parallelism Through Dynamic Placement of Atomic Memory Operations.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

Fast Behavioural RTL Simulation of 10B Transistor SoC Designs with Metro-MPI.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

2022
A BF16 FMA is All You Need for DNN Training.
IEEE Trans. Emerg. Top. Comput., 2022

Compressed Sparse FM-Index: Fast Sequence Alignment Using Large K-Steps.
IEEE ACM Trans. Comput. Biol. Bioinform., 2022

FASE: A Fast, Accurate and Seamless Emulator for Custom Numerical Formats.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2022

2021
On the use of many-core Marvell ThunderX2 processor for HPC workloads.
J. Supercomput., 2021


PLANAR: a programmable accelerator for near-memory data rearrangement.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

gem5 + rtl: A Framework to Enable RTL Models Inside a Full-System Simulator.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

Dynamically Adapting Floating-Point Precision to Accelerate Deep Neural Network Training.
Proceedings of the 20th IEEE International Conference on Machine Learning and Applications, 2021

Mont-Blanc 2020: Towards Scalable and Power Efficient European HPC Processors.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

2020
Using Arm's scalable vector extension on stencil codes.
J. Supercomput., 2020

The gem5 Simulator: Version 20.0+.
CoRR, 2020

Evaluating Mixed-Precision Arithmetic for 3D Generative Adversarial Networks to Simulate High Energy Physics Detectors.
Proceedings of the 19th IEEE International Conference on Machine Learning and Applications, 2020

2019
Design trade-offs for emerging HPC processors based on mobile market technology.
J. Supercomput., 2019

Design Space Exploration of Next-Generation HPC Machines.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

2018
Stencil codes on a vector length agnostic architecture.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2016
Hardware Acceleration for Query Processing: Leveraging FPGAs, CPUs, and Memory.
Comput. Sci. Eng., 2016

MUSA: a multi-level simulation approach for next-generation HPC machines.
Proceedings of the International Conference for High Performance Computing, 2016

Implications of non-volatile memory as primary storage for database management systems.
Proceedings of the International Conference on Embedded Computer Systems: Architectures, 2016

2015
Tidy Cache: Improving Data Placement in Die-Stacked DRAM Caches.
Proceedings of the 27th International Symposium on Computer Architecture and High Performance Computing, 2015

2014
Techniques to improve concurrency in hardware transactional memory.
PhD thesis, 2014

An empirical evaluation of High-Level Synthesis languages and tools for database acceleration.
Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014

2013
Techniques to improve performance in requester-wins hardware transactional memory.
ACM Trans. Archit. Code Optim., 2013

HARP: Adaptive abort recurrence prediction for Hardware Transactional Memory.
Proceedings of the 20th Annual International Conference on High Performance Computing, 2013

2012
Circuit design of a dual-versioning L1 data cache.
Integr., 2012

Novel SRAM bias control circuits for a low power L1 data cache.
Proceedings of NORCHIP 2012, Copenhagen, Denmark, November 12-13, 2012

Transactional prefetching: narrowing the window of contention in hardware transactional memory.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011
Circuit design of a dual-versioning L1 data cache for optimistic concurrency.
Proceedings of the 21st ACM Great Lakes Symposium on VLSI, 2011

Using a Reconfigurable L1 Data Cache for Efficient Version Management in Hardware Transactional Memory.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2009
EazyHTM: eager-lazy hardware transactional memory.
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

