2022

Enabling Homomorphically Encrypted Inference for Large DNN Models.

Guillermo Lloret-Talavera

Marc Jordà

IEEE Trans. Computers, 2022

cuConv: CUDA implementation of convolution for CNN inference.

Marc Jordà

Pedro Valero-Lara

Antonio J. Peña

Clust. Comput., 2022

ecoHMEM: Improving Object Placement Methodology for Hybrid Memory Systems in HPC.

Proceedings of the IEEE International Conference on Cluster Computing, 2022

2019

Performance Evaluation of cuDNN Convolution Algorithms on NVIDIA Volta GPUs.

Marc Jordà

Pedro Valero-Lara

Antonio J. Peña

IEEE Access, 2019

2017

Efficient exception handling support for GPUs.

Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

Direct Inter-Process Communication (dIPC): Repurposing the CODOMs Architecture to Accelerate IPC.

Proceedings of the Twelfth European Conference on Computer Systems, 2017

2015

GPU-SM: shared memory multi-GPU programming.

Proceedings of the 8th Workshop on General Purpose Processing using GPUs, 2015

2013

Comparison based sorting for systems with multiple GPUs.

Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units, 2013