Juan Fernández Peinador

Affiliations:
  • Intel Labs Barcelona, Spain
  • Los Alamos National Laboratory, NM, USA (former)
  • University of Murcia, Spain (PhD 2005)


According to our database, Juan Fernández Peinador authored at least 43 papers between 2001 and 2015.

Bibliography

2015
Efficient Hardware-Supported Synchronization Mechanisms for Manycores.
Proceedings of the Handbook on Data Centers, 2015

2014
Selective dynamic serialization for reducing energy consumption in hardware transactional memory systems.
J. Supercomput., 2014

2013
Design of an efficient communication infrastructure for highly contended locks in many-core CMPs.
J. Parallel Distributed Comput., 2013

On the design of energy-efficient hardware transactional memory systems.
Concurr. Comput. Pract. Exp., 2013

ECONO: Express coherence notifications for efficient cache coherency in many-core CMPs.
Proceedings of the 2013 International Conference on Embedded Computer Systems: Architectures, 2013

Efficient Dir0B Cache Coherency for Many-Core CMPs.
Proceedings of the International Conference on Computational Science, 2013

Deploying Hardware Locks to Improve Performance and Energy Efficiency of Hardware Transactional Memory.
Proceedings of the Architecture of Computing Systems - ARCS 2013, 2013

2012
Efficient Hardware Barrier Synchronization in Many-Core CMPs.
IEEE Trans. Parallel Distributed Syst., 2012

Stencil computations on heterogeneous platforms for the Jacobi method: GPUs versus Cell BE.
J. Supercomput., 2012

The 2D wavelet transform on emerging architectures: GPUs and multicores.
J. Real Time Image Process., 2012

Dynamic Serialization: Improving Energy Consumption in Eager-Eager Hardware Transactional Memory Systems.
Proceedings of the 20th Euromicro International Conference on Parallel, 2012

Design of a collective communication infrastructure for barrier synchronization in cluster-based nanoscale MPSoCs.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

2011
GLocks: Efficient Support for Highly-Contended Locks in Many-Core CMPs.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

2010
Characterizing the basic synchronization and communication operations in Dual Cell-based Blades through CellStats.
J. Supercomput., 2010

Parallel 3D fast wavelet transform on manycore GPUs and multicore CPUs.
Proceedings of the International Conference on Computational Science, 2010

Characterizing Energy Consumption in Hardware Transactional Memory Systems.
Proceedings of the 22nd International Symposium on Computer Architecture and High Performance Computing, 2010

A G-Line-Based Network for Fast and Efficient Barrier Synchronization in Many-Core CMPs.
Proceedings of the 39th International Conference on Parallel Processing, 2010

Efficient and scalable barrier synchronization for many-core CMPs.
Proceedings of the 7th Conference on Computing Frontiers, 2010

2009
A Parallel Implementation of the 2D Wavelet Transform Using CUDA.
Proceedings of the 17th Euromicro International Conference on Parallel, 2009

Fast and Efficient Synchronization and Communication Collective Primitives for Dual Cell-Based Blades.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

2008
CellStats: A Tool to Evaluate the Basic Synchronization and Communication Operations of the Cell BE.
Proceedings of the 16th Euromicro International Conference on Parallel, 2008

Characterizing the Basic Synchronization and Communication Operations in Dual Cell-Based Blades.
Proceedings of the Computational Science, 2008

Multicore Platforms for Scientific Computing: Cell BE and NVIDIA Tesla.
Proceedings of the 2008 International Conference on Scientific Computing, 2008

2007
Challenges in Mapping Graph Exploration Algorithms on Advanced Multi-core Processors.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Multicore Surprises: Lessons Learned from Optimizing Sweep3D on the Cell Broadband Engine.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

2006
STORM: Scalable Resource Management for Large-Scale Parallel Computers.
IEEE Trans. Computers, 2006

NIC-based reduction algorithms for large-scale clusters.
Int. J. High Perform. Comput. Netw., 2006

An Abstract Interface for System Software on Large-Scale Clusters.
Comput. J., 2006

2005
Adaptive Parallel Job Scheduling with Flexible Coscheduling.
IEEE Trans. Parallel Distributed Syst., 2005

Assessing MPI Performance on QsNetII.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Monitoring and Debugging Parallel Software with BCS-MPI on Large-Scale Clusters.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

2004
On the Feasibility of Incremental Checkpointing for Scientific Computing.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Architectural Support for System Software on Large-Scale Clusters.
Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

Designing Parallel Operating Systems via Parallel Programming.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004

2003
Scalable NIC-based Reduction on Large-scale Clusters.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

BCS-MPI: A New Approach in the System Software Design for Large-Scale Parallel Computers.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

Parallel Job Scheduling under Dynamic Workloads.
Proceedings of the Job Scheduling Strategies for Parallel Processing, 2003

Flexible CoScheduling: Mitigating Load Imbalance and Improving Utilization of Heterogeneous Resources.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Scalable collective communication on the ASCI Q machine.
Proceedings of the 11th Annual IEEE Symposium on High Performance Interconnects, 2003

2002
STORM: lightning-fast resource management.
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

Improving the Performance of Real-Time Communication Services on High-Speed LANs under Topology Changes.
Proceedings of the 27th Annual IEEE Conference on Local Computer Networks (LCN 2002), 2002

Scalable Resource Management in High Performance Computers.
Proceedings of the 2002 IEEE International Conference on Cluster Computing (CLUSTER 2002), 2002

2001
Performance Evaluation of Real-Time Communication Services on High-Speed LANs under Topology Changes.
Proceedings of the High Performance Computing - HiPC 2001, 8th International Conference, 2001

