Sally A. McKee

Petar Radojkovic

Eduard Ayguadé

Proceedings of the 2015 International Symposium on Memory Systems, 2015

Exploiting Program Semantics to Place Data in Hybrid Memory.

Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

2014

QBLESS: A case for QoS-aware bufferless NoCs.

Proceedings of the IEEE 22nd International Symposium of Quality of Service, 2014

Performance and Energy Analysis of the Restricted Transactional Memory Implementation on Haswell.

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Understanding the behavior of in-memory computing workloads.

Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

Characterizing and subsetting big data workloads.

Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

DTail: a flexible approach to DRAM refresh management.

Proceedings of the 2014 International Conference on Supercomputing, 2014

An Automated Performance-Aware Approach to Reliability Transformations.

Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

Digging deeper into cluster system logs for failure prediction and root cause diagnosis.

Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

2013

Improving data access efficiency by using a tagless access buffer (TAB).

Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013

2012

Active memory controller.

J. Supercomput., 2012

Techniques to Measure, Model, and Manage Power.

Bhavishya Goel

Magnus Själander

Adv. Comput., 2012

An LTE Uplink Receiver PHY benchmark and subframe-based power management.

Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2012

Parallelizing more Loops with Compiler Guided Refactoring.

Proceedings of the 41st International Conference on Parallel Processing, 2012

Topic 2: Performance Prediction and Evaluation.

Allen D. Malony

Helen D. Karatza

William J. Knottenbelt

Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

ROSE::FTTransform - A source-to-source translation framework for exascale fault-tolerance research.

Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2012

Design Principles for Synthesizable Processor Cores.

Pascal Schleuniger

Sven Karlsson

Proceedings of the Architecture of Computing Systems - ARCS 2012 - 25th International Conference, Munich, Germany, February 28, 2012

2011

Memory Wall.

Robert W. Wisniewski

Proceedings of the Encyclopedia of Parallel Computing, 2011

Guest Editors' Introduction.

Valentina Salapura

José E. Moreira

Int. J. Parallel Program., 2011

Power-Aware Resource Scheduling in Base Stations.

Proceedings of the MASCOTS 2011, 2011

SoftBeam: Precise tracking of transient faults and vulnerability analysis at processor design time.

Proceedings of the IEEE 29th International Conference on Computer Design, 2011

Performance optimization by dynamic code transformation.

Josef Weidendorfer

Tilman Küstner

Proceedings of the 8th Conference on Computing Frontiers, 2011

2010

An approach to resource-aware co-scheduling for CMPs.

Dimitrios S. Nikolopoulos

Proceedings of the 24th International Conference on Supercomputing, 2010

Portable, scalable, per-core power estimation for intelligent resource management.

Proceedings of the International Green Computing Conference 2010, 2010

Comparing Scalability Prediction Strategies on an SMP of CMPs.

Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010

Designing OS for HPC Applications: Scheduling.

Roberto Gioiosa

Mateo Valero

Proceedings of the 2010 IEEE International Conference on Cluster Computing, 2010

Global management of cache hierarchies.

Mohamed Zahran

Proceedings of the 7th Conference on Computing Frontiers, 2010

2009

Data Cache Techniques to Save Power and Deliver High Performance in Embedded Systems.

Trans. High Perform. Embed. Archit. Compil., 2009

Real time power estimation and thread scheduling via performance counters.

SIGARCH Comput. Archit. News, 2009

Machine learning based online performance prediction for runtime parallelization and task scheduling.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2009

Compiler-enhanced incremental checkpointing for OpenMP applications.

Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Understanding PARSEC performance on contemporary CMPs.

Proceedings of the 2009 IEEE International Symposium on Workload Characterization, 2009

Prediction-based power estimation and scheduling for CMPs.

Proceedings of the 23rd international conference on Supercomputing, 2009

Cancellation of loads that return zero using zero-value caches.

Md. Mafijul Islam

Per Stenström

Proceedings of the 23rd international conference on Supercomputing, 2009

PARSEC: hardware profiling of emerging workloads for CMP design.

Proceedings of the 23rd international conference on Supercomputing, 2009

Code density concerns for new architectures.

Proceedings of the 27th International Conference on Computer Design, 2009

Revisiting Cache Block Superloading.

Matthew A. Watkins

Lambert Schaelicke

Proceedings of the High Performance Embedded Architectures and Compilers, 2009

Accommodating Diversity in CMPs with Heterogeneous Frequencies.

Proceedings of the High Performance Embedded Architectures and Compilers, 2009

Core monitors: monitoring performance in multicore processors.

Proceedings of the 6th Conference on Computing Frontiers, 2009

2008

Efficient architectural design space exploration via predictive modeling.

ACM Trans. Archit. Code Optim., 2008

Augmenting priority rule heuristics with justification and rollout to solve the resource-constrained project scheduling problem.

Comput. Oper. Res., 2008

Can hardware performance counters be trusted?

Proceedings of the 4th International Symposium on Workload Characterization (IISWC 2008), 2008

A projection-based optimization framework for abstractions with application to the unstructured mesh domain.

Brian S. White

Daniel J. Quinlan

Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

Using Dynamic Binary Instrumentation to Generate Multi-platform SimPoints: Methodology and Accuracy.

Juan Julián Merelo Guervós

Proceedings of the High Performance Embedded Architectures and Compilers, 2008

Architecture Performance Prediction Using Evolutionary Artificial Neural Networks.

Pedro A. Castillo

Antonio Miguel Mora

Juan Luis Jiménez Laredo

Proceedings of the Applications of Evolutionary Computing, 2008

Archer: A Community Distributed Computing Infrastructure for Computer Architecture Research and Education.

Renato J. O. Figueiredo

Proceedings of the Collaborative Computing: Networking, 2008

Optimizing thread throughput for multithreaded workloads on memory constrained CMPs.

Pedro Ángel Castillo Valdivieso

Proceedings of the 5th Conference on Computing Frontiers, 2008

Evolutionary system for prediction and optimization of hardware architecture performance.

Juan Julián Merelo Guervós

Juan Luis Jiménez Laredo

Proceedings of the IEEE Congress on Evolutionary Computation, 2008

2007

METRIC: Memory tracing via dynamic binary rewriting to identify cache inefficiencies.

Andy Yoo

ACM Trans. Program. Lang. Syst., 2007

Introduction to Part 3.

Trans. High Perform. Embed. Archit. Compil., 2007

Specializing Cache Structures for High Performance and Energy Conservation in Embedded Systems.

Michael J. Geiger

Gary S. Tyson

Trans. High Perform. Embed. Archit. Compil., 2007

Editorial to special issue on reliable computing.

ACM J. Emerg. Technol. Comput. Syst., 2007

Guest Editor's Introduction.

Int. J. Parallel Program., 2007

Predicting parallel application performance via machine learning approaches.

Rich Caruana

Concurr. Comput. Pract. Exp., 2007

Methods of inference and learning for performance modeling of parallel applications.

Benjamin C. Lee

David M. Brooks

Dimitrios S. Nikolopoulos

Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

Leveraging High Performance Data Cache Techniques to Save Power in Embedded Systems.

Proceedings of the High Performance Embedded Architectures and Compilers, 2007

Identifying energy-efficient concurrency levels using machine learning.

Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007

A Phase-Adaptive Approach to Increasing Cache Performance.

Matthew A. Watkins

Lambert Schaelicke

Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007

2006

Dynamic program phase detection in distributed shared-memory multiprocessors.

José F. Martínez

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Rethinking Processor Design: Parameter Correlations.

Nana B. Sam

Prabhakar Kudva

Proceedings of the 13th IEEE International Conference on Electronics, 2006

Efficiently exploring architectural design spaces via predictive modeling.

Rich Caruana

Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006

2005

Improving the computational intensity of unstructured mesh applications.

Brian S. White

Brian Miller

Daniel J. Quinlan

Proceedings of the 19th Annual International Conference on Supercomputing, 2005

Beyond Basic Region Caching: Specializing Cache Structures for High Performance and Energy Conservation.

Michael J. Geiger

Gary S. Tyson

Proceedings of the High Performance Embedded Architectures and Compilers, 2005

An Approach to Performance Prediction for Parallel Applications.

Proceedings of the Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30, 2005

Owl: next generation system monitoring.

Proceedings of the Second Conference on Computing Frontiers, 2005

Drowsy region-based caches: minimizing both dynamic and static power dissipation.

Michael J. Geiger

Gary S. Tyson

Proceedings of the Second Conference on Computing Frontiers, 2005

2004

Formal hardware specification languages for protocol compliance verification.

Annette Bunker

Ganesh Gopalakrishnan

ACM Trans. Design Autom. Electr. Syst., 2004

Reflections on the memory wall.

Proceedings of the First Conference on Computing Frontiers, 2004

SimSnap: Fast-Forwarding via Native Execution and Application-Level Checkpointing.

Proceedings of the 8th Annual Workshop on Interaction between Compilers and Computer Architecture (INTERACT-8 2004), 2004

2003

A Cost Model For Integrated Restructuring Optimizations.

J. Instr. Level Parallelism, 2003

Restructuring Computations for Temporal Data Cache Locality.

Int. J. Parallel Program., 2003

Interactive Locality Optimization on NUMA Architectures.

Proceedings of the ACM 2003 Symposium on Software Visualization, 2003

Identifying and Exploiting Spatial Regularity in Data Memory References.

Tushar Mohan

Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

An MPEG-4 performance study for non-SIMD, general purpose architectures.

Zhen Fang

Mateo Valero

Proceedings of the 2003 IEEE International Symposium on Performance Analysis of Systems and Software, 2003

A Framework for Portable Shared Memory Programming.

Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

METRIC: Tracking Down Inefficiencies in the Memory Hierarchy via Binary Rewriting.

Jaydeep Marathe

Frank Mueller

Tushar Mohan

Andy Yoo

Proceedings of the 1st IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2003), 2003

2002

Computation regrouping: restructuring programs for temporal data cache locality.

Proceedings of the 16th international conference on Supercomputing, 2002

2001

The Impulse Memory Controller.

IEEE Trans. Computers, 2001

Reevaluating Online Superpage Promotion with Hardware Support.

Proceedings of the Seventh International Symposium on High-Performance Computer Architecture (HPCA'01), 2001

A Cost Framework for Evaluating Integrated Restructuring Optimizations.

Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques (PACT 2001), 2001

2000

Dynamic Access Ordering for Streamed Computations.

IEEE Trans. Computers, 2000

Algorithmic foundations for a parallel vector access memory system.

Proceedings of the Twelfth annual ACM Symposium on Parallel Algorithms and Architectures, 2000

Online superpage promotion revisited (poster).

Proceedings of the 2000 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 2000

Profiling I/O Interrupts in Modern Architectures.

Lambert Schaelicke

Al Davis

Proceedings of the MASCOTS 2000, Proceedings of the 8th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 29 August, 2000

Hardware-only stream prefetching and dynamic access ordering.

Chengqiang Zhang

Proceedings of the 14th international conference on Supercomputing, 2000

Design of a Parallel Vector Access Unit for SDRAM Memory Systems.

Binu K. Mathew