Jesús Labarta

Orcid: 0000-0002-7489-4727

According to our database, Jesús Labarta authored at least 367 papers between 1983 and 2024.

Bibliography

2024
$\mathcal{O}(n)$ Key-Value Sort With Active Compute Memory.
IEEE Trans. Computers, May, 2024

RAVE: RISC-V Analyzer of Vector Executions, a QEMU tracing plugin.
CoRR, 2024

A Mess of Memory System Benchmarking, Simulation and Application Profiling.
CoRR, 2024

Graph Computing on Long Vector Architectures (Yes, It Works!).
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

2023
Compressed Real Numbers for AI: a case-study using a RISC-V CPU.
CoRR, 2023

Acceleration with long vector architectures: Implementation and evaluation of the FFT kernel on NEC SX-Aurora and RISC-V vector extension.
Concurr. Comput. Pract. Exp., 2023

Software Development Vehicles to Enable Extended and Early Co-design: A RISC-V and HPC Case of Study.
Proceedings of the High Performance Computing, 2023

Short Reasons for Long Vectors in HPC CPUs: A Study Based on RISC-V.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

2022
Asymmetric HMMs for Online Ball-Bearing Health Assessments.
IEEE Internet Things J., 2022

The MAMe dataset: on the relevance of high resolution and variable shape image properties.
Appl. Intell., 2022

Automatic aggregation of subtask accesses for nested OpenMP-style tasks.
Proceedings of the 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2022

OmpSs@cloudFPGA: An FPGA Task-Based Programming Model with Message Passing.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Feature Space Curvature Map: A Method To Homogenize Cluster Densities.
Proceedings of the International Joint Conference on Neural Networks, 2022

Transparent load balancing of MPI programs using OmpSs-2@Cluster and DLB.
Proceedings of the 51st International Conference on Parallel Processing, 2022

OmpSs-2@Cluster: Distributed Memory Execution of Nested OpenMP-style Tasks.
Proceedings of the Euro-Par 2022: Parallel Processing, 2022

Towards Reconfigurable Accelerators in HPC: Designing a Multipurpose eFPGA Tile for Heterogeneous SoCs.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

ecoHMEM: Improving Object Placement Methodology for Hybrid Memory Systems in HPC.
Proceedings of the IEEE International Conference on Cluster Computing, 2022

2021
OmpSs@FPGA Framework for High Performance FPGA Computing.
IEEE Trans. Computers, 2021

Size & Shape Matters: The Need of HPC Benchmarks of High Resolution Image Training for Deep Learning.
Supercomput. Front. Innov., 2021

Organization Component Analysis: The method for extracting insights from the shape of cluster.
Proceedings of the International Joint Conference on Neural Networks, 2021

Accelerating FFT Using NEC SX-Aurora Vector Engine.
Proceedings of the Euro-Par 2021: Parallel Processing Workshops, 2021

2020
sLASs: A fully automatic auto-tuned linear algebra library based on OpenMP extensions implemented in OmpSs (LASs Library).
J. Parallel Distributed Comput., 2020

A Closer Look at Art Mediums: The MAMe Image Classification Dataset.
CoRR, 2020

Towards an Auto-Tuned and Task-Based SpMV (LASs Library).
Proceedings of the OpenMP: Portable Multi-Level Parallelism on Modern Systems, 2020

2019
Studying the impact of the Full-Network embedding on multimodal pipelines.
Semantic Web, 2019

MPI+OpenMP tasking scalability for multi-morphology simulations of the human brain.
Parallel Comput., 2019

Integrating blocking and non-blocking MPI primitives with task-based programming models.
Parallel Comput., 2019

A Fast Solver for Large Tridiagonal Systems on Multi-Core Processors (LASs Library).
IEEE Access, 2019

BLAS-3 Optimized by OmpSs Regions (LASs Library).
Proceedings of the 27th Euromicro International Conference on Parallel, 2019

Accelerating Conjugate Gradient using OmpSs.
Proceedings of the 20th International Conference on Parallel and Distributed Computing, 2019

The Cooperative Parallel: A Discussion About Run-Time Schedulers for Nested Parallelism.
Proceedings of the OpenMP: Conquering the Full Hardware Spectrum, 2019

Optimization of Condensed Matter Physics Application with OpenMP Tasking Model.
Proceedings of the OpenMP: Conquering the Full Hardware Spectrum, 2019

Random Forest as a Tumour Genetic Marker Extractor.
Proceedings of the Artificial Intelligence Research and Development, 2019

Feature Discriminativity Estimation in CNNs for Transfer Learning.
Proceedings of the Artificial Intelligence Research and Development, 2019

The OTree: Multidimensional Indexing with efficient data Sampling for HPC.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

Unsupervised Feature Selection for Noisy Data.
Proceedings of the Advanced Data Mining and Applications - 15th International Conference, 2019

2018
Asynchronous and Exact Forward Recovery for Detected Errors in Iterative Solvers.
IEEE Trans. Parallel Distributed Syst., 2018

Exploring the capabilities of support vector machines in detecting silent data corruptions.
Sustain. Comput. Informatics Syst., 2018

Understanding memory access patterns using the BSC performance tools.
Parallel Comput., 2018

On the Behavior of Convolutional Nets for Feature Extraction.
J. Artif. Intell. Res., 2018

Unified fault-tolerance framework for hybrid task-parallel message-passing applications.
Int. J. High Perform. Comput. Appl., 2018

Task-based programming in COMPSs to converge from HPC to big data.
Int. J. High Perform. Comput. Appl., 2018

MPI+X: task-based parallelization and dynamic load balance of finite element assembly.
CoRR, 2018

MPI+OpenMP Tasking Scalability for the Simulation of the Human Brain: Human Brain Project.
Proceedings of the 25th European MPI Users' Group Meeting, 2018

Improving the Interoperability between MPI and Task-Based Programming Models.
Proceedings of the 25th European MPI Users' Group Meeting, 2018

Graph partitioning applied to DAG scheduling to reduce NUMA effects.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

Variable Batched DGEMM.
Proceedings of the 26th Euromicro International Conference on Parallel, 2018

Identifying the Temporal Structure of Parallel Application Computation Phases.
Proceedings of the 2018 International Conference on High Performance Computing & Simulation, 2018

Reducing Data Movement on Large Shared Memory Systems by Exploiting Computation Dependencies.
Proceedings of the 32nd International Conference on Supercomputing, 2018

Runtime-Guided Management of Stacked DRAM Memories in Task Parallel Programs.
Proceedings of the 32nd International Conference on Supercomputing, 2018

An Out-of-the-box Full-Network Embedding for Convolutional Neural Networks.
Proceedings of the 2018 IEEE International Conference on Big Knowledge, 2018

Application Acceleration on FPGAs with OmpSs@FPGA.
Proceedings of the International Conference on Field-Programmable Technology, 2018

A Visual Distance for WordNet.
Proceedings of the Artificial Intelligence Research and Development, 2018

2017
Task Scheduling Techniques for Asymmetric Multi-Core Systems.
IEEE Trans. Parallel Distributed Syst., 2017

Workflows for Science: a Challenge when Facing the Convergence of HPC and Big Data.
Supercomput. Front. Innov., 2017

PyCOMPSs: Parallel computational workflows in Python.
Int. J. High Perform. Comput. Appl., 2017

Full-Network Embedding in a Multimodal Embedding Pipeline.
CoRR, 2017

Fluid Communities: A Community Detection Algorithm.
CoRR, 2017

An Out-of-the-box Full-network Embedding for Convolutional Neural Networks.
CoRR, 2017

A visual embedding for the unsupervised extraction of abstract semantics.
Cogn. Syst. Res., 2017

Full-Network Embedding in a Multimodal Embedding Pipeline.
Proceedings of the 2nd Workshop on Semantic Deep Learning, 2017

Building Graph Representations of Deep Vector Embeddings.
Proceedings of the 2nd Workshop on Semantic Deep Learning, 2017

Noise Inspector Tool.
Proceedings of the 25th Euromicro International Conference on Parallel, 2017

Improving the Integration of Task Nesting and Dependencies in OpenMP.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Performance Analysis and Optimization of the FFTXlib on the Intel Knights Landing Architecture.
Proceedings of the 46th International Conference on Parallel Processing Workshops, 2017

Integrating Memory Perspective into the BSC Performance Tools.
Proceedings of the 46th International Conference on Parallel Processing Workshops, 2017

Performance Analysis of Parallel Python Applications.
Proceedings of the International Conference on Computational Science, 2017

cuHinesBatch: Solving Multiple Hines systems on GPUs Human Brain Project.
Proceedings of the International Conference on Computational Science, 2017

ParaView + Alya + D8tree: Integrating High Performance Computing and High Performance Data Analytics.
Proceedings of the International Conference on Computational Science, 2017

Fluid Communities: A Competitive, Scalable and Diverse Community Detection Algorithm.
Proceedings of the Complex Networks & Their Applications VI, 2017

MACORD: Online Adaptive Machine Learning Framework for Silent Error Detection.
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

Automating the Application Data Placement in Hybrid Memory Systems.
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

Designing and Modelling Selective Replication for Fault-tolerant HPC Applications.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

2016
PARSECSs: Evaluating the Impact of Task Parallelism in the PARSEC Benchmark Suite.
ACM Trans. Archit. Code Optim., 2016

Hierarchical Hyperlink Prediction for the WWW.
CoRR, 2016

Limitations and Alternatives for the Evaluation of Large-scale Link Prediction.
CoRR, 2016

Detailed and simultaneous power and performance analysis.
Concurr. Comput. Pract. Exp., 2016


MUSA: a multi-level simulation approach for next-generation HPC machines.
Proceedings of the International Conference for High Performance Computing, 2016

Bio-Inspired Call-Stack Reconstruction for Performance Analysis.
Proceedings of the 24th Euromicro International Conference on Parallel, 2016

Multiple Target Task Sharing Support for the OpenMP Accelerator Model.
Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

The Secrets of the Accelerators Unveiled: Tracing Heterogeneous Executions Through OMPT.
Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

Supporting Adaptive Privatization Techniques for Irregular Array Reductions in Task-Parallel Programming Models.
Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

CRC-Based Memory Reliability for Task-Parallel HPC Applications.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Heterogeneous Streaming.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

CATA: Criticality Aware Task Acceleration for Multicore Processors.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Runtime-Guided Mitigation of Manufacturing Variability in Power-Constrained Multi-Socket NUMA Nodes.
Proceedings of the 2016 International Conference on Supercomputing, 2016

A Runtime Heuristic to Selectively Replicate Tasks for Application-Specific Reliability Targets.
Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

On the Representativeness of Convolutional Neural Networks Layers.
Proceedings of the Artificial Intelligence Research and Development, 2016

Spatial Support Vector Regression to Detect Silent Errors in the Exascale Era.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

POSTER: Collective Dynamic Parallelism for Directive Based GPU Programming Languages and Compilers.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

POSTER: Exploiting Asymmetric Multi-Core Processors with Flexible System Software.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

Reducing Cache Coherence Traffic with Hierarchical Directory Cache and NUMA-Aware Runtime Scheduling.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
Dense Matrix Computations on NUMA Architectures with Distance-Aware Work Stealing.
Supercomput. Front. Innov., 2015

Extracting Visual Patterns from Deep Learning Representations.
CoRR, 2015

Tareador: a tool to unveil parallelization strategies at undergraduate level.
Proceedings of the Workshop on Computer Architecture Education, 2015

SSMART: smart scheduling of multi-architecture tasks on heterogeneous systems.
Proceedings of the Second Workshop on Accelerator Programming using Directives, 2015

Exploring dynamic parallelism in OpenMP.
Proceedings of the Second Workshop on Accelerator Programming using Directives, 2015

Exploiting asynchrony from exact forward recovery for DUE in iterative solvers.
Proceedings of the International Conference for High Performance Computing, 2015

NanoCheckpoints: A Task-Based Asynchronous Dataflow Framework for Efficient and Scalable Checkpoint/Restart.
Proceedings of the 23rd Euromicro International Conference on Parallel, 2015

Evaluating the Impact of OpenMP 4.0 Extensions on Relevant Parallel Workloads.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Towards Task-Parallel Reductions in OpenMP.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Quiet Neighborhoods: Key to Protect Job Performance Predictability.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Criticality-Aware Dynamic Task Scheduling for Heterogeneous Architectures.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

AMA: Asynchronous Management of Accelerators for Task-based Programming Models.
Proceedings of the International Conference on Computational Science, 2015

Boosting irregular array Reductions through In-lined Block-ordering on fast processors.
Proceedings of the 2015 IEEE High Performance Extreme Computing Conference, 2015

Marriage Between Coordinated and Uncoordinated Checkpointing for the Exascale Era.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Collective Offload for Heterogeneous Clusters.
Proceedings of the 22nd IEEE International Conference on High Performance Computing, 2015

Low-Overhead Detection of Memory Access Patterns and Their Time Evolution.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015


Fault-Tolerant Protocol for Hybrid Task-Parallel Message-Passing Applications.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Programmer-directed partial redundancy for resilient HPC.
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

Evaluating Link Prediction on Large Graphs.
Proceedings of the Artificial Intelligence Research and Development, 2015

Spark deployment and performance evaluation on the MareNostrum supercomputer.
Proceedings of the 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA, October 29, 2015

Runtime-Guided Management of Scratchpad Memories in Multicore Architectures.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

2014
Scheduling parallel jobs on multicore clusters using CPU oversubscription.
J. Supercomput., 2014

Runtime-Aware Architectures: A First Approach.
Supercomput. Front. Innov., 2014

Scalability prediction for fundamental performance factors.
Supercomput. Front. Innov., 2014

Hints to improve automatic load balancing with LeWI for hybrid applications.
J. Parallel Distributed Comput., 2014

Automatic Exploration of Potential Parallelism in Sequential Applications.
Proceedings of the Supercomputing - 29th International Conference, 2014

On the Roles of the Programmer, the Compiler and the Runtime System When Programming Accelerators in OpenMP.
Proceedings of the Using and Improving OpenMP for Devices, Tasks, and More, 2014

Task-Parallel Reductions in OpenMP and OmpSs.
Proceedings of the Using and Improving OpenMP for Devices, Tasks, and More, 2014

Identifying Code Phases Using Piece-Wise Linear Regressions.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Task-Based Programming with OmpSs and Its Application.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

ALOJA: A systematic study of Hadoop deployment variables to enable automated characterization of cost-effectiveness.
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014

2013
Framework for a productive performance optimization.
Parallel Comput., 2013

Programmability and portability for exascale: Top down programming methodology and tools with StarSs.
J. Comput. Sci., 2013

On the trade-off of mixing scientific applications on capacity high-performance computing systems.
IET Comput. Digit. Tech., 2013

On the usefulness of object tracking techniques in performance analysis.
Proceedings of the International Conference for High Performance Computing, 2013

Performance Analytics: Understanding Parallel Applications Using Cluster and Sequence Analysis.
Proceedings of the Tools for High Performance Computing 2013, 2013

Identifying Critical Code Sections in Dataflow Programming Models.
Proceedings of the 21st Euromicro International Conference on Parallel, 2013

Experience with the MPI/STARSS programming model on a large production code.
Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

Self-Adaptive OmpSs Tasks in Heterogeneous Environments.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Programmable and Scalable Reductions on Clusters.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Implementing OmpSs support for regions of data in architectures with multiple address spaces.
Proceedings of the International Conference on Supercomputing, 2013

Global misrouting policies in two-level hierarchical networks.
Proceedings of the 2013 Interconnection Network Architecture: On-Chip, Multi-Chip, 2013

Topic 1: Support Tools and Environments - (Introduction).
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

Performance Analysis and Parallelization Strategies in Neuron Simulation Codes.
Proceedings of the Brain-Inspired Computing - International Workshop, 2013

2012
Power-Aware Parallel Job Scheduling.
Proceedings of the Handbook of Energy-Aware and Green Computing - Two Volume Set., 2012

Parallel job scheduling for power constrained HPC systems.
Parallel Comput., 2012

Understanding the future of energy-performance trade-off via DVFS in HPC environments.
J. Parallel Distributed Comput., 2012

A high-productivity task-based programming model for clusters.
Concurr. Comput. Pract. Exp., 2012

The Network Adapter: The Missing Link between MPI Applications and Network Performance.
Proceedings of the IEEE 24th International Symposium on Computer Architecture and High Performance Computing, 2012

Automatic Refinement of Parallel Applications Structure Detection.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Productive Programming of GPU Clusters with OmpSs.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

HPCS 2012 panels: Panel I: Energy efficient systems in next generation high performance data and compute centers.
Proceedings of the 2012 International Conference on High Performance Computing & Simulation, 2012

HPCS 2012 keynotes: Tuesday keynote: Europe back in the HPC race: Building a European ecosystem to recover and maintain the capacity of designing and building large computers.
Proceedings of the 2012 International Conference on High Performance Computing & Simulation, 2012

On-the-Fly Adaptive Routing in High-Radix Hierarchical Networks.
Proceedings of the 41st International Conference on Parallel Processing, 2012

Tools for Power-Energy Modelling and Analysis of Parallel Scientific Applications.
Proceedings of the 41st International Conference on Parallel Processing, 2012

Effective Quality-of-Service Policy for Capacity High-Performance Computing Systems.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

Contention-aware node allocation policy for high-performance capacity systems.
Proceedings of the 2012 Interconnection Network Architecture, 2012

A Job Scheduling Approach for Multi-core Clusters Based on Virtual Malleability.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

On the Instrumentation of OpenMP and OmpSs Tasking Constructs.
Proceedings of the Euro-Par 2012: Parallel Processing Workshops, 2012

2011
OmpSs: a Proposal for Programming Heterogeneous Multi-Core Architectures.
Parallel Process. Lett., 2011

Simulating Whole Supercomputer Applications.
IEEE Micro, 2011

The International Exascale Software Project roadmap.
Int. J. High Perform. Comput. Appl., 2011

Making the Best of Temporal Locality: Just-in-Time Renaming and Lazy Write-Back on the Cell/B.E.
Int. J. High Perform. Comput. Appl., 2011

Extracting the optimal sampling frequency of applications using spectral analysis.
Concurr. Comput. Pract. Exp., 2011

ACM SRC poster: a portable implementation of the integral histogram in starss.
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

Folding: Detailed Analysis with Coarse Sampling.
Proceedings of the Tools for High Performance Computing 2011, 2011

The Impact of Application's Micro-Imbalance on the Communication-Computation Overlap.
Proceedings of the 19th International Euromicro Conference on Parallel, 2011

Hybrid Parallel Programming with MPI/StarSs.
Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

Symmetric Rank-k Update on Clusters of Multicore Processors with SMPSs.
Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

A Study of Speculative Distributed Scheduling on the Cell/B.E.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Linear programming based parallel job scheduling for power constrained systems.
Proceedings of the 2011 International Conference on High Performance Computing & Simulation, 2011

Poster: programming clusters of GPUs with OMPSs.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

Unveiling Internal Evolution of Parallel Application Computation Phases.
Proceedings of the International Conference on Parallel Processing, 2011

Trace Spectral Analysis toward Dynamic Levels of Detail.
Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

ClusterSs: a task-based programming model for clusters.
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011

A Dynamic Load Balancing Approach with SMPSuperscalar and MPI.
Proceedings of the Facing the Multicore - Challenge II, 2011

Quantifying the Potential Task-Based Dataflow Parallelism in MPI Applications.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

Reducing the Impact of Soft Errors on Fabric-Based Collective Communications.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011

Productive Cluster Programming with OmpSs.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

Parallel Implementation of the Integral Histogram.
Proceedings of the Advances Concepts for Intelligent Vision Systems, 2011

2010
Extending OpenMP to Survive the Heterogeneous Multi-Core Era.
Int. J. Parallel Program., 2010

Automatic Phase Detection and Structure Extraction of MPI Applications.
Int. J. High Perform. Comput. Appl., 2010

Utilization driven power-aware parallel job scheduling.
Comput. Sci. Res. Dev., 2010

Effective communication and computation overlap with hybrid MPI/SMPSs.
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

Task Superscalar: An Out-of-Order Task Pipeline.
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010

Optimizing the Exploitation of Multicore Processors and GPUs with OpenMP and OpenCL.
Proceedings of the Languages and Compilers for Parallel Computing, 2010

Simulation environment for studying overlap of communication and computation.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2010

On-line detection of large-scale parallel application's structure.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

BSLD threshold driven power management policy for HPC centers.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Handling task dependencies under strided and aliased references.
Proceedings of the 24th International Conference on Supercomputing, 2010

Overlapping communication and computation by using a hybrid MPI/SMPSs approach.
Proceedings of the 24th International Conference on Supercomputing, 2010

Detailed Load Balance Analysis of Large Scale Parallel Applications.
Proceedings of the 39th International Conference on Parallel Processing, 2010

Performance Data Extrapolation in Parallel Codes.
Proceedings of the 16th IEEE International Conference on Parallel and Distributed Systems, 2010

Impact of Inter-application Contention in Current and Future HPC Systems.
Proceedings of the IEEE 18th Annual Symposium on High Performance Interconnects, 2010

MareIncognito: A Perspective towards Exascale.
Proceedings of the Facing the Multicore-Challenge, 2010

Optimizing job performance under a given power constraint in HPC centers.
Proceedings of the International Green Computing Conference 2010, 2010

Guided Performance Analysis Combining Profile and Trace Tools.
Proceedings of the Euro-Par 2010 Parallel Processing Workshops, 2010

10181 Executive Summary - Program Development for Extreme-Scale Computing.
Proceedings of the Program Development for Extreme-Scale Computing, 02.05. - 07.05.2010, 2010

10181 Abstracts Collection - Program Development for Extreme-Scale Computing.
Proceedings of the Program Development for Extreme-Scale Computing, 02.05. - 07.05.2010, 2010

A Simulation Framework to Automatically Analyze the Communication-Computation Overlap in Scientific Applications.
Proceedings of the 2010 IEEE International Conference on Cluster Computing, 2010

2009
CellSs: Scheduling techniques to better exploit memory hierarchy.
Sci. Program., 2009

A Proposal to Extend the OpenMP Tasking Model with Dependent Tasks.
Int. J. Parallel Program., 2009

Hierarchical Task-Based Programming With StarSs.
Int. J. High Perform. Comput. Appl., 2009

BSC Vision Towards Exascale.
Int. J. High Perform. Comput. Appl., 2009

Programmability Issues.
Int. J. High Perform. Comput. Appl., 2009

Parallelizing dense and banded linear algebra libraries using SMPSs.
Concurr. Comput. Pract. Exp., 2009

Exploiting Locality on the Cell/B.E. through Bypassing.
Proceedings of the Embedded Computer Systems: Architectures, 2009

New Analysis Techniques in the CEPBA-Tools Environment.
Proceedings of the Tools for High Performance Computing 2009, 2009

Impact of the Memory Hierarchy on Shared Memory Architectures in Multicore Programming Models.
Proceedings of the 17th Euromicro International Conference on Parallel, 2009

Automatic Evaluation of the Computation Structure of Parallel Applications.
Proceedings of the 2009 International Conference on Parallel and Distributed Computing, 2009

A Proposal to Extend the OpenMP Tasking Model for Heterogeneous Architectures.
Proceedings of the Evolving OpenMP in an Age of Extreme Parallelism, 2009

Automatic detection of parallel applications computation phases.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Power-aware load balancing of large scale MPI applications.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Tools for scalable performance analysis on Petascale systems.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Exploring pattern-aware routing in generalized fat tree networks.
Proceedings of the 23rd international conference on Supercomputing, 2009

Just-in-Time Renaming and Lazy Write-Back on the Cell/B.E.
Proceedings of the ICPPW 2009, 2009

LeWI: A Runtime Balancing Algorithm for Nested Parallelism.
Proceedings of the ICPP 2009, 2009

Graph-Based Task Replication for Workflow Applications.
Proceedings of the 11th IEEE International Conference on High Performance Computing and Communications, 2009

Detailed Performance Analysis Using Coarse Grain Sampling.
Proceedings of the Euro-Par 2009, 2009

An Extension of the StarSs Programming Model for Platforms with Multiple GPUs.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

Oblivious routing schemes in extended generalized Fat Tree networks.
Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

2008
An Evaluation of Marenostrum Performance.
Int. J. High Perform. Comput. Appl., 2008

A Simulation of Seismic Wave Propagation at High Resolution in the Inner Core of the Earth on 2166 Processors of MareNostrum.
Proceedings of the High Performance Computing for Computational Science, 2008

Extending the OpenMP Tasking Model to Allow Dependent Tasks.
Proceedings of the OpenMP in a New Era of Parallelism, 4th International Workshop, 2008

Balancing HPC applications through smart allocation of resources in MT processors.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Automatic analysis of speedup of MPI applications.
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

Supercomputing for the Future, Supercomputing from the Past (Keynote).
Proceedings of the High Performance Embedded Architectures and Compilers, 2008

Performance Visualization Of Grid Applications Based On OCM-G And Paraver.
Proceedings of the Grid Computing, 2008

A dependency-aware task-based programming environment for multi-core architectures.
Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

Prediction of behavior of MPI applications.
Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

2007
Uniform job monitoring in the HPC-Europa project: data model, API and services.
Int. J. Web Grid Serv., 2007

A Proposal for Error Handling in OpenMP.
Int. J. Parallel Program., 2007

CellSs: Making it easier to program the Cell Broadband Engine processor.
IBM J. Res. Dev., 2007

Monitoring and Analysis Framework for Grid Middleware.
Proceedings of the 15th Euromicro International Conference on Parallel, 2007

Prediction f Based Models for Evaluating Backfilling Scheduling Policies.
Proceedings of the Eighth International Conference on Parallel and Distributed Computing, 2007

Design and Implementation of a General-Purpose API of Progress and Performance Indicators.
Proceedings of the Parallel Computing: Architectures, 2007

Automatic Phase Detection of MPI Applications.
Proceedings of the Parallel Computing: Architectures, 2007

Modeling the Impact of Resource Sharing in Backfilling Policies using the Alvio Simulator.
Proceedings of the 15th International Symposium on Modeling, 2007

Transactional Memory and OpenMP.
Proceedings of the A Practical Programming Model for the Multi-Core Era, 2007

Automatic Structure Extraction from MPI Applications Tracefiles.
Proceedings of the Euro-Par 2007, 2007

2006
Running OpenMP applications efficiently on an everything-shared SDSM.
J. Parallel Distributed Comput., 2006

Exploiting multilevel parallelism using OpenMP on a massive multithreaded architecture.
J. Embed. Comput., 2006

Automatic Grid workflow based on imperative programming languages.
Concurr. Comput. Pract. Exp., 2006

CellSs: a programming model for the cell BE architecture.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Runtime Address Space Computation for SDSM Systems.
Proceedings of the Languages and Compilers for Parallel Computing, 2006

Techniques supporting threadprivate in OpenMP.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Scaling MPI to short-memory MPPs such as BG/L.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Monitoring and analysing a Grid Middleware Node.
Proceedings of the 7th IEEE/ACM International Conference on Grid Computing (GRID 2006), 2006

The Palantir Grid Meta-Information System.
Proceedings of the 7th IEEE/ACM International Conference on Grid Computing (GRID 2006), 2006

Topic 2: Performance Prediction and Evaluation.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

Including SMP in Grids as Execution Platform and Other Extensions in GRID Superscalar.
Proceedings of the Second International Conference on e-Science and Grid Technologies (e-Science 2006), 2006

Integration of the Enanos Execution Framework with GRMS.
Proceedings of the Achievements in European Research on Grid Systems: CoreGRID Integration Workshop 2006, 2006

How the JSDL can Exploit the Parallelism?
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

Uniform Job Monitoring using the HPC-Europa Single Point of Access.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

Performance Analysis: From Art to Science.
Proceedings of the Parallel Processing for Scientific Computing, 2006

2005
Performance-Driven Processor Allocation.
IEEE Trans. Parallel Distributed Syst., 2005

Blue Gene/L performance tools.
IBM J. Res. Dev., 2005

Tuning Dynamic Web Applications using Fine-Grain Analysis.
Proceedings of the 13th Euromicro Workshop on Parallel, 2005

WAS Control Center: An Autonomic Performance-Triggered Tracing Environment for WebSphere.
Proceedings of the 13th Euromicro Workshop on Parallel, 2005

eNANOS: Coordinated Scheduling in Grid Environments.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Scalability of Tracing and Visualization Tools.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Experiences Parallelizing a Web Server with OpenMP.
Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2005

Dynamic Load Balancing in MPI Jobs.
Proceedings of the High-Performance Computing - 6th International Symposium, 2005

Optimizing NANOS OpenMP for the IBM Cyclops Multithreaded Architecture.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Another approach to backfilled jobs: applying virtual malleability to expired windows.
Proceedings of the 19th Annual International Conference on Supercomputing, 2005

Data Distribution Strategies for Domain Decomposition Applications in Grid Environments.
Proceedings of the Distributed and Parallel Computing, 2005

Performance Analysis of Domain Decomposition Applications Using Unbalanced Strategies in Grid Environments.
Proceedings of the Grid and Cooperative Computing - GCC 2005, 4th International Conference, Beijing, China, November 30, 2005

eNANOS Grid Resource Broker.
Proceedings of the Advances in Grid Computing, 2005

Implementing phylogenetic inference with GRID superscalar.
Proceedings of the 5th International Symposium on Cluster Computing and the Grid (CCGrid 2005), 2005

2004
Page Migration with Dynamic Space-Sharing Scheduling Policies: The Case of the SGI O2000.
Int. J. Parallel Program., 2004

What Multilevel Parallel Programs Do When You Are Not Watching: A Performance Analysis Case Study Comparing MPI/OpenMP, MLP, and Nested OpenMP.
Proceedings of the Shared Memory Parallel Programming with OpenMP, 2004

Runtime Adjustment of Parallel Nested Loops.
Proceedings of the Shared Memory Parallel Programming with OpenMP, 2004

A Domain Decomposition Strategy for GRID Environments.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

An Expert Assistant for Computer Aided Parallelization.
Proceedings of the Applied Parallel Computing, 2004

Dynamic Load Balancing of MPI+OpenMP Applications.
Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

Paramedir: A Tool for Programmable Performance Analysis.
Proceedings of the Computational Science, 2004

Predicting MPI Buffer Addresses.
Proceedings of the Computational Science, 2004

Scheduling of MPI Applications: Self-co-scheduling.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004

Generation of Simple Analytical Models for Message Passing Applications.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004

Implementing Malleability on MPI Jobs.
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT 2004), 29 September, 2004

2003
Scaling non-regular shared-memory codes by reusing custom loop schedules.
Sci. Program., 2003

Taking advantage of heterogeneity in disk arrays.
J. Parallel Distributed Comput., 2003

Programming Grid Applications with GRID Superscalar.
J. Grid Comput., 2003

Is the Schedule Clause Really Necessary in OpenMP?
Proceedings of the OpenMP Shared Memory Parallel Programming, 2003

Evaluation of OpenMP for the Cyclops Multithreaded Architecture.
Proceedings of the OpenMP Shared Memory Parallel Programming, 2003

Performance Modeling of HPC Applications.
Proceedings of the Parallel Computing: Software Technology, 2003

Deriving analytical models from a limited number of runs.
Proceedings of the Parallel Computing: Software Technology, 2003

Complete instrumentation requirements for performance analysis of Web based technologies.
Proceedings of the 2003 IEEE International Symposium on Performance Analysis of Systems and Software, 2003

Performance Analysis of Multilevel Parallel Applications on Shared Memory Architectures.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Exploring the Predictability of MPI Messages.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Dual Priority Algorithm to Schedule Real-Time Tasks in a Shared Memory Multiprocessor.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Evaluation of the memory page migration influence in the system performance: the case of the SGI O2000.
Proceedings of the 17th Annual International Conference on Supercomputing, 2003

Interfacing Computer Aided Parallelization and Performance Analysis.
Proceedings of the Computational Science - ICCS 2003, 2003

Performance Evaluation and Prediction.
Proceedings of the Euro-Par 2003. Parallel Processing, 2003

Performance Prediction in a Grid Environment.
Proceedings of the Grid Computing, 2003

2002
Scheduler-Activated Dynamic Page Migration for Multiprogrammed DSM Multiprocessors.
J. Parallel Distributed Comput., 2002

A framework for performance modeling and prediction.
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

An Efficient Scheme to Allocate Soft-Aperiodic Tasks in Multiprocessor Hard Real-Time Systems.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2002

Dual-Level Parallelism Exploitation with OpenMP in Coastal Ocean Circulation Modeling.
Proceedings of the High Performance Computing, 4th International Symposium, 2002

A Trace-Scaling Agent for Parallel Application Tracing.
Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2002), 2002

Performance Evaluation, Analysis and Optimization.
Proceedings of the Euro-Par 2002, 2002

On the Scalability of Tracing Mechanisms.
Proceedings of the Euro-Par 2002, 2002

2001
A Framework for Integrating Data Alignment, Distribution, and Redistribution in Distributed Memory Multiprocessors.
IEEE Trans. Parallel Distributed Syst., 2001

New OpenMP directives for irregular data access loops.
Sci. Program., 2001

Exploiting memory affinity in OpenMP through schedule reuse.
SIGARCH Comput. Archit. News, 2001

Improving energy saving in hard real time systems via a modified dual priority scheduling.
SIGARCH Comput. Archit. News, 2001

Defining and Supporting Pipelined Executions in OpenMP.
Proceedings of the OpenMP Shared Memory Parallel Programming, 2001

A Dynamic Tracing Mechanism for Performance Analysis of OpenMP Applications.
Proceedings of the OpenMP Shared Memory Parallel Programming, 2001

Extending Heterogeneity to RAID Level 5.
Proceedings of the General Track: 2001 USENIX Annual Technical Conference, 2001

A Dynamic Periodicity Detector: Application to Speedup Computation.
Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

Improving Processor Allocation through Run-Time Measured Efficiency.
Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

The trade-off between implicit and explicit data distribution in shared-memory programming paradigms.
Proceedings of the 15th international conference on Supercomputing, 2001

Improving Gang Scheduling through job performance analysis and malleability.
Proceedings of the 15th international conference on Supercomputing, 2001

Complex Pipelined Executions in OpenMP Parallel Applications.
Proceedings of the 2001 International Conference on Parallel Processing, 2001

2000
Sensitivity of Performance Prediction of Message Passing Programs.
J. Supercomput., 2000

NanosCompiler: supporting flexible multilevel parallelism exploitation in OpenMP.
Concurr. Pract. Exp., 2000

Is Data Distribution Necessary in OpenMP?
Proceedings of Supercomputing 2000, 2000

Validation of Dimemas Communication Model for MPI Collective Operations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2000

UPMLIB: A Runtime System for Tuning the Memory Performance of OpenMP Programs on Scalable Shared-Memory Multiprocessors.
Proceedings of the Languages, 2000

OpenMP Extensions for Thread Groups and Their Run-Time Support.
Proceedings of the Languages and Compilers for Parallel Computing, 2000

A Tool to Schedule Parallel Applications on Multiprocessors: The NANOS CPU MANAGER.
Proceedings of the Job Scheduling Strategies for Parallel Processing, IPDPS 2000 Workshop, 2000

Leveraging Transparent Data Distribution in OpenMP via User-Level Dynamic Page Migration.
Proceedings of the High Performance Computing, Third International Symposium, 2000

Applying Interposition Techniques for Performance Analysis of OpenMP Parallel Applications.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

A case for user-level dynamic page migration.
Proceedings of the 14th international conference on Supercomputing, 2000

User-Level Dynamic Page Migration for Multiprogrammed Shared-Memory Multiprocessors.
Proceedings of the 2000 International Conference on Parallel Processing, 2000

Sparse Matrix Structure for Dynamic Parallelisation Efficiency.
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

A Case for Heterogeneous Disk Arrays.
Proceedings of the 2000 IEEE International Conference on Cluster Computing (CLUSTER 2000), November 28th, 2000

1999
HRaid: A Flexible Storage-system Simulator.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1999

An Introduction to NANOS Project.
Proceedings of the High Performance Computing, Second International Symposium, 1999

Linear Aggressive Prefetching: A Way to Increase the Performance of Cooperative Caches.
Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

Thread fork/join techniques for multi-level parallelism exploitation in NUMA multiprocessors.
Proceedings of the 13th international conference on Supercomputing, 1999

Exploiting Multiple Levels of Parallelism in OpenMP: A Case Study.
Proceedings of the International Conference on Parallel Processing 1999, 1999

The Queue System within PHASE.
Proceedings of the High-Performance Computing and Networking, 7th International Conference, 1999

Influence of Variable Time Operations in Static Instruction Scheduling.
Proceedings of the Euro-Par '99 Parallel Processing, 5th International Euro-Par Conference, Toulouse, France, August 31, 1999

1998
Dynamic task scheduling in distributed real time systems using fuzzy rules.
Microprocess. Microsystems, 1998

Kernel-level Scheduling for the Nano-threads Programming Model.
Proceedings of the 12th international conference on Supercomputing, 1998

1997
DDT: A Research Tool for Automatic Data Distribution in High Performance Fortran.
Sci. Program., 1997

Analyzing Scheduling Policies Using Dimemas.
Parallel Comput., 1997

Runtime Parallelization of the Finite Element Code Permas.
Int. J. High Perform. Comput. Appl., 1997

Exploiting Parallelism Through Directives on the Nano-Threads Programming Model.
Proceedings of the Languages and Compilers for Parallel Computing, 1997

Analysis of Several Scheduling Algorithms under the Nano-Thread Programming Model.
Proceedings of the 11th International Parallel Processing Symposium (IPPS '97), 1997

Design Issues of a Cooperative Cache with no Coherence Problems.
Proceedings of the Fifth Workshop on I/O in Parallel and Distributed Systems, 1997

Avoiding the Cache-Coherence Problem in a Parallel/Distributed File System.
Proceedings of the High-Performance Computing and Networking, 1997

Hamiltonian Recurrence for ILP.
Proceedings of the Euro-Par '97 Parallel Processing, 1997

1996
Using a 0-1 Integer Programming Model for Automatic Static Data Distribution.
Parallel Process. Lett., 1996

A framework for automatic dynamic data mapping.
Proceedings of the Eighth IEEE Symposium on Parallel and Distributed Processing, 1996

PLS: A Parallel Linear Solvers Library for Domain Decomposition Methods.
Proceedings of the Parallel Virtual Machine, 1996

Loop Parallelization: Revisiting Framework of Unimodular Transformations.
Proceedings of the 4th Euromicro Workshop on Parallel and Distributed Processing (PDP '96), 1996

Data Distribution and Loop Parallelization for Shared-Memory Multiprocessors.
Proceedings of the Languages and Compilers for Parallel Computing, 1996

Manufacturing Progressive Addition Lenses using Distributed Parallel Processing.
Proceedings of the Parallel Algorithms for Irregularly Structured Problems, 1996

Experiences and Achievements with the Parallelization of a Large Finite Element System.
Proceedings of the High-Performance Computing and Networking, 1996

A Library Implementation of the Nano-Threads Programming Model.
Proceedings of the Euro-Par '96 Parallel Processing, 1996

DiP: A Parallel Program Development Environment.
Proceedings of the Euro-Par '96 Parallel Processing, 1996

PACA: A Cooperative File System Cache for Parallel Machines.
Proceedings of the Euro-Par '96 Parallel Processing, 1996

1995
Analyzing reference patterns in automatic data distribution tools.
Int. J. Parallel Program., 1995

A Novel Approach Towards Automatic Data Distribution.
Proceedings of Supercomputing '95, San Diego, CA, USA, December 4-8, 1995

Performance on Distributed Memory Multicomputers of Domain Decomposition Solvers.
Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995

Load Balancing in a Network Flow Optimization Code.
Proceedings of the Applied Parallel Computing, 1995

Data Redistribution in an Automatic Data Distribution Tool.
Proceedings of the Languages and Compilers for Parallel Computing, 1995

A general approach for an automatic parallelization applied to the finite element code PERMAS.
Proceedings of the High-Performance Computing and Networking, 1995

Automatic generation of loop scheduling for VLIW.
Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques, 1995

1994
Implementation of GTS.
Proceedings of the PARLE '94: Parallel Architectures and Languages Europe, 1994

Detecting and Using Affinity in an Automatic Data Distribution Tool.
Proceedings of the Languages and Compilers for Parallel Computing, 1994

1993
Multiprogrammation of parallel applications on the PAROS operating system kernel.
Proceedings of the 1993 Euromicro Workshop on Parallel and Distributed Processing, 1993

Measures of parallelism at compile time.
Proceedings of the 1993 Euromicro Workshop on Parallel and Distributed Processing, 1993

Align and Distribute-based Linear Loop Transformations.
Proceedings of the Languages and Compilers for Parallel Computing, 1993

1991
Balanced Loop Partitioning Using GTS.
Proceedings of the Languages and Compilers for Parallel Computing, 1991

On Automatic Loop Data-Mapping for Distributed-Memory Multiprocessors.
Proceedings of the Distributed Memory Computing, 2nd European Conference, 1991

1989
GTS: parallelization and vectorization of tight recurrences.
Proceedings of Supercomputing '89, Reno, NV, USA, November 12-17, 1989

GTS: Extracting Full Parallelism Out of DO Loops.
Proceedings of the PARLE '89: Parallel Architectures and Languages Europe, 1989

1987
Optimized Mesh-Connected Networks for SIMD and MIMD Architectures.
Proceedings of the 14th Annual International Symposium on Computer Architecture. Pittsburgh, 1987

1985
Analysis and Simulation of Multiplexed Single-Bus Networks With and Without Buffering.
Proceedings of the 12th Annual Symposium on Computer Architecture, 1985

1983
A performance evaluation of the multiple bus network for multiprocessor systems.
Proceedings of the International Conference on Measurements and Modeling of Computer Systems, 1983

