Real-Time Scalable Cortical Computing at 46 Giga-Synaptic OPS/Watt with ~100× Speedup in Time-to-Solution and ~100,000× Reduction in Energy-to-Solution.

Proceedings of the International Conference for High Performance Computing, 2014

2013

A Systematic Methodology to Generate Decomposable and Responsive Power Models for CMPs.

IEEE Trans. Computers, 2013

Counter-Based Power Modeling Methods: Top-Down vs. Bottom-Up.

Comput. J., 2013

2012

DMA++: On the Fly Data Realignment for On-Chip Memories.

Nikola Vujic

Felipe Cabarcas

Marc González Tallada

Alex Ramírez

Xavier Martorell

Eduard Ayguadé

IEEE Trans. Computers, 2012

Energy accounting for shared virtualized environments under DVFS using PMC-based power models.

Future Gener. Comput. Syst., 2012

POTRA: a framework for building power models for next generation multicore architectures.

Proceedings of the ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, 2012

Systematic Energy Characterization of CMP/SMT Processor Systems via Automated Micro-Benchmarks.

Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012

DMA-circular: an enhanced high level programmable DMA controller for optimized management of on-chip local memories.

Proceedings of the Computing Frontiers Conference, CF'12, 2012

2011

Local Memory Design Space Exploration for High-Performance Computing.

Comput. J., 2011

Design space exploration for aggressive core replication schemes in CMPs.

Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011

2010

Automatic Prefetch and Modulo Scheduling Transformations for the Cell BE Architecture.

IEEE Trans. Parallel Distributed Syst., 2010

Extending OpenMP to Survive the Heterogeneous Multi-Core Era.

Daniel Jiménez-González

Jesús Labarta

Int. J. Parallel Program., 2010

Optimizing the Exploitation of Multicore Processors and GPUs with OpenMP and OpenCL.

Proceedings of the Languages and Compilers for Parallel Computing, 2010

Decomposable and responsive power models for multicore processors using performance counters.

Proceedings of the 24th International Conference on Supercomputing, 2010

Analysis of Task Offloading for Accelerators.

Proceedings of the High Performance Embedded Architectures and Compilers, 2010

Accurate energy accounting for shared virtualized environments using PMC-based power modeling techniques.

Proceedings of the 2010 11th IEEE/ACM International Conference on Grid Computing, 2010

2009

Achieving high memory performance from heterogeneous architectures with the SARC programming model.

Proceedings of the 10th workshop on MEmory performance, 2009

Adaptive and Speculative Memory Consistency Support for Multi-core Architectures with On-Chip Local Memories.

Nikola Vujic

Lluc Alvarez

Marc González Tallada

Xavier Martorell

Eduard Ayguadé

Proceedings of the Languages and Compilers for Parallel Computing, 2009

A Proposal to Extend the OpenMP Tasking Model for Heterogeneous Architectures.

Daniel Jiménez-González

Enrique S. Quintana-Ortí

Proceedings of the Evolving OpenMP in an Age of Extreme Parallelism, 2009

Speeding Up Distributed MapReduce Applications Using Hardware Accelerators.

Proceedings of the ICPP 2009, 2009

2008

Evaluation of memory performance on the cell BE with the SARC programming model.

Proceedings of the 9th workshop on MEmory performance, 2008

Automatic Pre-Fetch and Modulo Scheduling Transformations for the Cell BE Architecture.

Proceedings of the Languages and Compilers for Parallel Computing, 2008

Prefetching irregular references for software cache on cell.

Tong Chen

Tao Zhang

Zehra Sura

Marc González Tallada

Proceedings of the Sixth International Symposium on Code Generation and Optimization (CGO 2008), 2008

Hybrid access-specific software cache techniques for the cell BE architecture.

Alexandre E. Eichenberger

Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008

2007

A Proposal for Error Handling in OpenMP.

Int. J. Parallel Program., 2007

A Novel Asynchronous Software Cache Implementation for the Cell-BE Processor.

Proceedings of the Languages and Compilers for Parallel Computing, 2007

2006

Employing nested OpenMP for the parallelization of multi-zone computational fluid dynamics applications.

J. Parallel Distributed Comput., 2006

Runtime Address Space Computation for SDSM Systems.

Proceedings of the Languages and Compilers for Parallel Computing, 2006

Techniques supporting threadprivate in OpenMP.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

2005

Experiences Parallelizing a Web Server with OpenMP.

Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2005

Automatic thread distribution for nested parallelism in OpenMP.

Alejandro Duran

Marc González

Julita Corbalán

Proceedings of the 19th Annual International Conference on Supercomputing, 2005

2003

Automatic multilevel parallelization using OpenMP.

Sci. Program., 2003

2002

Dual-Level Parallelism Exploitation with OpenMP in Coastal Ocean Circulation Modeling.

Proceedings of the High Performance Computing, 4th International Symposium, 2002

2001

Defining and Supporting Pipelined Executions in OpenMP.

Proceedings of the OpenMP Shared Memory Parallel Programming, 2001

Complex Pipelined Executions in OpenMP Parallel Applications.

Proceedings of the 2001 International Conference on Parallel Processing, 2001

2000

NanosCompiler: supporting flexible multilevel parallelism exploitation in OpenMP.

Concurr. Pract. Exp., 2000

OpenMP Extensions for Thread Groups and Their Run-Time Support.

Proceedings of the Languages and Compilers for Parallel Computing, 2000

Applying Interposition Techniques for Performance Analysis of OpenMP Parallel Applications.

Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

1999

Thread fork/join techniques for multi-level parallelism exploitation in NUMA multiprocessors.

Proceedings of the 13th international conference on Supercomputing, 1999

Exploiting Multiple Levels of Parallelism in OpenMP: A Case Study.

Proceedings of the International Conference on Parallel Processing 1999, 1999

1997

Exploiting Parallelism Through Directives on the Nano-Threads Programming Model.

Proceedings of the Languages and Compilers for Parallel Computing, 1997

Marc González
