Marc González
ORCID: 0000-0002-3780-1106
Affiliations:
- Oak Ridge National Laboratory, TN, USA
- Polytechnic University of Catalonia (UPC), Computer Architecture Department, Barcelona, Spain
According to our database, Marc González authored at least 55 papers between 1997 and 2024.
Bibliography
2024
Concurr. Comput. Pract. Exp., 2024
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
sKokkos: Enabling Kokkos with Transparent Device Selection on Heterogeneous Systems using OpenACC.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2024
2023
Heterogeneous programming using OpenMP and CUDA/HIP for hybrid CPU-GPU scientific applications.
Int. J. High Perform. Comput. Appl., September, 2023
Abisko: Deep codesign of an architecture for spiking neural networks using novel neuromorphic materials.
Int. J. High Perform. Comput. Appl., July, 2023
Evaluating performance and portability of high-level programming models: Julia, Python/Numba, and Kokkos on exascale nodes.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
2022
Proceedings of the 9th Workshop on Accelerator Programming Using Directives, 2022
2021
IEEE Trans. Parallel Distributed Syst., 2021
Multi-GPU systems and Unified Virtual Memory for scientific applications: The case of the NAS multi-zone parallel benchmarks.
J. Parallel Distributed Comput., 2021
2016
Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016
2015
Hardware-Software Coherence Protocol for the Coexistence of Caches and Local Memories.
IEEE Trans. Computers, 2015
Coherence protocol for transparent management of scratchpad memories in shared memory manycore architectures.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
Proceedings of the 44th International Conference on Parallel Processing Workshops, 2015
2014
Proceedings of the IEEE International Symposium on a World of Wireless, 2014
Real-Time Scalable Cortical Computing at 46 Giga-Synaptic OPS/Watt with ~100× Speedup in Time-to-Solution and ~100,000× Reduction in Energy-to-Solution.
Proceedings of the International Conference for High Performance Computing, 2014
2013
A Systematic Methodology to Generate Decomposable and Responsive Power Models for CMPs.
IEEE Trans. Computers, 2013
2012
Energy accounting for shared virtualized environments under DVFS using PMC-based power models.
Future Gener. Comput. Syst., 2012
POTRA: a framework for building power models for next generation multicore architectures.
Proceedings of the ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, 2012
Systematic Energy Characterization of CMP/SMT Processor Systems via Automated Micro-Benchmarks.
Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012
DMA-circular: an enhanced high level programmable DMA controller for optimized management of on-chip local memories.
Proceedings of the Computing Frontiers Conference, CF'12, 2012
2011
Comput. J., 2011
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011
2010
Automatic Prefetch and Modulo Scheduling Transformations for the Cell BE Architecture.
IEEE Trans. Parallel Distributed Syst., 2010
Int. J. Parallel Program., 2010
Proceedings of the Languages and Compilers for Parallel Computing, 2010
Decomposable and responsive power models for multicore processors using performance counters.
Proceedings of the 24th International Conference on Supercomputing, 2010
Proceedings of the High Performance Embedded Architectures and Compilers, 2010
Accurate energy accounting for shared virtualized environments using PMC-based power modeling techniques.
Proceedings of the 2010 11th IEEE/ACM International Conference on Grid Computing, 2010
2009
Achieving high memory performance from heterogeneous architectures with the SARC programming model.
Proceedings of the 10th workshop on MEmory performance, 2009
Adaptive and Speculative Memory Consistency Support for Multi-core Architectures with On-Chip Local Memories.
Proceedings of the Languages and Compilers for Parallel Computing, 2009
Proceedings of the Evolving OpenMP in an Age of Extreme Parallelism, 2009
Proceedings of the ICPP 2009, 2009
2008
Proceedings of the 9th workshop on MEmory performance, 2008
Automatic Pre-Fetch and Modulo Scheduling Transformations for the Cell BE Architecture.
Proceedings of the Languages and Compilers for Parallel Computing, 2008
Proceedings of the Sixth International Symposium on Code Generation and Optimization (CGO 2008), 2008
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008
2007
Proceedings of the Languages and Compilers for Parallel Computing, 2007
2006
Employing nested OpenMP for the parallelization of multi-zone computational fluid dynamics applications.
J. Parallel Distributed Comput., 2006
Proceedings of the Languages and Compilers for Parallel Computing, 2006
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
2005
Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2005
Proceedings of the 19th Annual International Conference on Supercomputing, 2005
2003
2002
Dual-Level Parallelism Exploitation with OpenMP in Coastal Ocean Circulation Modeling.
Proceedings of the High Performance Computing, 4th International Symposium, 2002
2001
Proceedings of the OpenMP Shared Memory Parallel Programming, 2001
Proceedings of the 2001 International Conference on Parallel Processing, 2001
2000
Concurr. Pract. Exp., 2000
Proceedings of the Languages and Compilers for Parallel Computing, 2000
Applying Interposition Techniques for Performance Analysis of OpenMP Parallel Applications.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000
1999
Thread fork/join techniques for multi-level parallelism exploitation in NUMA multiprocessors.
Proceedings of the 13th international conference on Supercomputing, 1999
Proceedings of the International Conference on Parallel Processing 1999, 1999
1997
Proceedings of the Languages and Compilers for Parallel Computing, 1997