John Shalf

Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

2016

NANDFlashSim: High-Fidelity, Microarchitecture-Aware NAND Flash Memory Simulation.

Myoungsoo Jung

Wonil Choi

Shuwen Gao

Ellis Herbert Wilson III

Mahmut Taylan Kandemir

ACM Trans. Storage, 2016

BoxLib with Tiling: An Adaptive Mesh Refinement Software Framework.

SIAM J. Sci. Comput., 2016

BoxLib with Tiling: An AMR Software Framework.

CoRR, 2016

TiDA: High-Level Programming Abstractions for Data Locality Management.

Didem Unat

Tan Nguyen

Weiqun Zhang

Muhammed Nufail Farooqi

Burak Bastem

Ann S. Almgren

Proceedings of the High Performance Computing - 31st International Conference, 2016

Perilla: metadata-based optimizations of an asynchronous runtime for adaptive mesh refinement.

Muhammed Nufail Farooqi

Proceedings of the International Conference for High Performance Computing, 2016

Characterizing the Performance of Hybrid Memory Cube Using ApexMAP Application Probes.

Khaled Z. Ibrahim

Farzad Fatollahi-Fard

Proceedings of the Second International Symposium on Memory Systems, 2016

OpenSoC Fabric: On-chip network generator.

Farzad Fatollahi-Fard

Proceedings of the 2016 IEEE International Symposium on Performance Analysis of Systems and Software, 2016

Silicon photonic memory interconnect for many-core architectures.

Proceedings of the 2016 IEEE High Performance Extreme Computing Conference, 2016

2015

Extending Summation Precision for Network Reduction Operations.

Xiaoye S. Li

David H. Bailey

Int. J. Parallel Program., 2015

ExaSAT: An exascale co-design tool for performance modeling.

Int. J. High Perform. Comput. Appl., 2015

Computing beyond Moore's Law.

John M. Shalf

Robert Leland

Computer, 2015

OpenNVM: An open-sourced FPGA-based NVM controller for low level memory characterization.

Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

Memory Errors in Modern Systems: The Good, The Bad, and The Ugly.

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

NVMMU: A Non-volatile Memory Management Unit for Heterogeneous GPU-SSD Architectures.

Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

Integrating 3D Resistive Memory Cache into GPGPU for Energy-Efficient Data Processing.

Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

2014

Exploring the future of out-of-core computing with compute-local non-volatile memory.

Myoungsoo Jung

Ellis Herbert Wilson III

Sci. Program., 2014

Abstract machine models and proxy architectures for exascale computing.

Proceedings of the 1st International Workshop on Hardware-Software Co-Design for High Performance Computing, 2014

Variable-width datapath for on-chip network static power reduction.

Proceedings of the Eighth IEEE/ACM International Symposium on Networks-on-Chip, 2014

OpenSoC Fabric: On-Chip Network Generator: Using Chisel to Generate a Parameterizable On-Chip Interconnect Fabric.

Farzad Fatollahi-Fard

Proceedings of the 2014 International Workshop on Network on Chip Architectures, 2014

Collective memory transfers for multi-core chips.

Alexander Williams

Samuel Williams

Proceedings of the 2014 International Conference on Supercomputing, 2014

Triple-A: a Non-SSD based autonomic all-flash array for high performance storage systems.

Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014

2013

Exascale Computing Trends: Adjusting to the "New Normal" for Computer Architecture.

Peter M. Kogge

Comput. Sci. Eng., 2013

Software Design Space Exploration for Exascale Combustion Co-design.

Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

A communications simulation methodology for AMR codes using task dependency analysis.

Proceedings of the 3rd Workshop on Irregular Applications - Architectures and Algorithms, 2013

Design of a large-scale storage-class RRAM system.

Myoungsoo Jung

Mahmut T. Kandemir

Proceedings of the International Conference on Supercomputing, 2013

Topic 14+16: High-Performance and Scientific Applications and Extreme-Scale Computing - (Introduction).

Marie-Christine Sawley

Thomas C. Schulthess

Proceedings of the Euro-Par 2013 Parallel Processing, 2013

2012

A preliminary evaluation of the hardware acceleration of the Cray Gemini interconnect for PGAS languages and comparison with MPI.

SIGMETRICS Perform. Evaluation Rev., 2012

Optimization of geometric multigrid for emerging multi- and manycore processors.

Proceedings of the SC Conference on High Performance Computing Networking, 2012

The Analysis of Impact of Energy Efficiency Requirements on Programming Environments.

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Billion-particle SIMD-friendly two-point correlation on large-scale HPC cluster systems.

Proceedings of the SC Conference on High Performance Computing Networking, 2012

NANDFlashSim: Intrinsic latency variation aware NAND flash memory system modeling and simulation at microarchitecture level.

Proceedings of the IEEE 28th Symposium on Mass Storage Systems and Technologies, 2012

Toward codesign in high performance computing systems.

Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012

Experiences with 100Gbps network applications.

Proceedings of the DIDC'12, 2012

On the Role of Co-design in High Performance Computing.

Proceedings of the Transition of HPC Towards Exascale Computing, 2012

2011

Green Flash: Climate Machine (LBNL).

Proceedings of the Encyclopedia of Parallel Computing, 2011

The International Exascale Software Project roadmap.

Bertrand Braunschweig

Int. J. High Perform. Comput. Appl., 2011

Rethinking Hardware-Software Codesign for Exascale Systems.

Daniel J. Quinlan

Curtis L. Janssen

Computer, 2011

Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning.

Proceedings of the Conference on High Performance Computing Networking, 2011

Multithreaded global address space communication techniques for gyrokinetic fusion applications on ultra-scale platforms.

Proceedings of the Conference on High Performance Computing Networking, 2011

Gyrokinetic toroidal simulations on leading multi- and manycore HPC systems.

Proceedings of the Conference on High Performance Computing Networking, 2011

Hardware/software co-design for energy-efficient seismic modeling.

Proceedings of the Conference on High Performance Computing Networking, 2011

Let there be light!: the future of memory systems is photonics and 3D stacking.

Proceedings of the 2011 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '11, 2011

2010

Communication Requirements and Interconnect Optimization for High-End Scientific Applications.

IEEE Trans. Parallel Distributed Syst., 2010

Exascale Computing Technology Challenges.

Sudip S. Dosanjh

John Morrison

Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010

Parallel I/O performance: From events to ensembles.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

An auto-tuning framework for parallel multicore stencil computations.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Exascale Computing and the Role of Co-Design.

Proceedings of the High Performance Computing: From Grids and Clouds to Exascale, 2010

Silicon Nanophotonic Network-on-Chip Using TDM Arbitration.

Proceedings of the IEEE 18th Annual Symposium on High Performance Interconnects, 2010

Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud.

Proceedings of the Cloud Computing, Second International Conference, 2010

Defining future platform requirements for e-Science clouds.

Proceedings of the 1st ACM Symposium on Cloud Computing, 2010

Auto-Tuning Stencil Computations on Multicore and Accelerators.

Proceedings of the Scientific Computing with Multicore and Accelerators, 2010

2009

Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors.

SIAM Rev., 2009

Optimization of sparse matrix-vector multiplication on emerging multicore platforms.

Parallel Comput., 2009

HPC global file system performance analysis using a scientific-application derived benchmark.

Parallel Comput., 2009

Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms.

J. Parallel Distributed Comput., 2009

Energy-Efficient Computing for Extreme-Scale Science.

Computer, 2009

A design methodology for domain-optimized power-efficient supercomputing.

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Memory-efficient optimization of Gyrokinetic particle-to-grid interpolation for multicore processors.

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

A Comparison of Different Communication Structures for Scalable Parallel Three Dimensional FFTs in First Principles Codes.

Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

Analysis of photonic networks for a chip multiprocessor using scientific applications.

Proceedings of the Third International Symposium on Networks-on-Chips, 2009

Scalability challenges for massively parallel AMR applications.

Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture.

Proceedings of the Architecture of Computing Systems, 2009

Storage Technology.

Jason Hick

Proceedings of the Scientific Data Management - Challenges, Technology, and Deployment, 2009

2008

Towards Ultra-High Resolution Models of Climate and Weather.

Michael F. Wehner

Leonid Oliker

Int. J. High Perform. Comput. Appl., 2008

Scientific Application Performance on Leading Scalar and Vector Supercomputing Platforms.

Int. J. High Perform. Comput. Appl., 2008

Characterizing and predicting the I/O performance of HPC applications using a parameterized synthetic benchmark.

Hongzhang Shan

Katie Antypas

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures.

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Lattice Boltzmann simulation optimization on leading multicore platforms.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Power efficiency in high performance computing.

Shoaib Kamil

Erich Strohmaier

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2007

Scientific Computing Kernels on the Cell Processor.

Int. J. Parallel Program., 2007

Cactus Framework: Black Holes to Gamma Ray Bursts.

CoRR, 2007

Investigation of leading HPC I/O performance using a scientific-application derived benchmark.

Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Scientific Application Performance on Candidate PetaScale Platforms.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Reconfigurable hybrid interconnection for static and dynamic scientific applications.

Proceedings of the 4th Conference on Computing Frontiers, 2007

2006

Performance Evaluation of Scientific Applications on Modern Parallel Vector Systems.

Jonathan Carter

Leonid Oliker

Proceedings of the High Performance Computing for Computational Science, 2006

HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets using Fast Bitmap Indices.

Proceedings of the 18th International Conference on Scientific and Statistical Database Management, 2006

The potential of the cell processor for scientific computing.

Proceedings of the Third Conference on Computing Frontiers, 2006

Implicit and explicit optimizations for stencil computations.

Proceedings of the 2006 workshop on Memory System Performance and Correctness, 2006

2005

The Astrophysics Simulation Collaboratory Portal: a framework for effective distributed research.

Future Gener. Comput. Syst., 2005

Performance evaluation of the SX-6 vector architecture for scientific computations.

Rob F. Van der Wijngaart

Concurr. Pract. Exp., 2005

Query-Driven Visualization of Large Data Sets.

Proceedings of the 16th IEEE Visualization Conference, 2005

DEX: Increasing the Capability of Scientific Data Analysis Pipelines by Using Efficient Bitmap Indices to Accelerate Scientific Visualization.

Proceedings of the 17th International Conference on Scientific and Statistical Database Management, 2005

Analyzing Ultra-Scale Application Communication Requirements for a Reconfigurable Hybrid Interconnect.

Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Impact of modern memory subsystems on cache optimizations for stencil computations.

Proceedings of the 2005 workshop on Memory System Performance, 2005

Consuming Network Bandwidth with Visapult.

Wes Bethel

Proceedings of the Visualization Handbook, 2005

2004

Scientific Computations on Modern Parallel Vector Systems.

Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004

Identifying Performance Bottlenecks on Modern Microarchitectures Using an Adaptable Probe.

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

2003

Enabling Applications on the Grid: A Gridlab Overview.

Int. J. High Perform. Comput. Appl., 2003

The Grid and Future Visualization System Architectures.

E. Wes Bethel

IEEE Computer Graphics and Applications, 2003

Deploying Web-Based Visual Exploration Tools on the Grid.

IEEE Computer Graphics and Applications, 2003

Grid-Distributed Visualizations Using Connectionless Protocols.

E. Wes Bethel