John Shalf
ORCID: 0000-0002-0608-3690
Affiliations:
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
According to our database, John Shalf authored at least 157 papers between 1996 and 2024.
Links
Online presence:
- on zbmath.org
- on orcid.org
- on crd.lbl.gov
Bibliography
2024
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
Unlocking the Potential: Performance Portability of Graph Algorithms on Kokkos Framework.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
2023
The 2023 Society for Industrial and Applied Mathematics Conference on Computational Science and Engineering.
Comput. Sci. Eng., 2023
Fast Community Detection in Graphs with Infomap Method using Accelerated Sparse Accumulation.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
Fast Parallel Index Construction for Efficient K-truss-based Local Community Detection in Large Graphs.
Proceedings of the 52nd International Conference on Parallel Processing, 2023
Proceedings of the IEEE High Performance Extreme Computing Conference, 2023
Efficient Intra-Rack Resource Disaggregation for HPC Using Co-Packaged DWDM Photonics.
Proceedings of the IEEE International Conference on Cluster Computing, 2023
2022
ACM Trans. Archit. Code Optim., 2022
ACM Trans. Archit. Code Optim., 2022
2021
Comput. Sci. Eng., 2021
Comput. Sci. Eng., 2021
Proceedings of the 7th IEEE/ACM Workshop on the LLVM Compiler Infrastructure in HPC, 2021
A systematic approach to improving data locality across Fourier transforms and linear algebra operations.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021
HyPC-Map: A Hybrid Parallel Community Detection Algorithm Using Information-Theoretic Approach.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021
2020
PINE: Photonic Integrated Networked Energy efficient datacenters (ENLITENED Program) [Invited].
JOCN, 2020
TIGER: Topology-aware Assignment using Ising machines Application to Classical Algorithm Tasks and Quantum Circuit Gates.
CoRR, 2020
Proceedings of the International Conference for High Performance Computing, 2020
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020
Understanding Quantum Control Processor Capabilities and Limitations through Circuit Characterization.
Proceedings of the International Conference on Rebooting Computing, 2020
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020
2019
Proceedings of the International Conference for High Performance Computing, 2019
Proceedings of the Optical Fiber Communications Conference and Exhibition, 2019
PARADISE - Post-Moore Architecture and Accelerator Design Space Exploration Using Device Level Simulation and Experiments.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2019
Proceedings of the 16th ACM International Conference on Computing Frontiers, 2019
Proceedings of the 16th ACM International Conference on Computing Frontiers, 2019
2018
IEEE Comput. Archit. Lett., 2018
Proceedings of the International Conference for High Performance Computing, 2018
Architectural Opportunities and Challenges from Emerging Photonics in Future Systems.
Proceedings of the Photonics in Switching and Computing, 2018
Proceedings of the Platform for Advanced Scientific Computing Conference, 2018
Proceedings of the International Symposium on Memory Systems, 2018
2017
IEEE Trans. Parallel Distributed Syst., 2017
Towards an Integrated Strategy to Preserve Digital Computing Performance Scaling Using Emerging Technologies.
Proceedings of the High Performance Computing, 2017
Proceedings of the High Performance Computing, 2017
CASPER - Configurable design space exploration of programmable architectures for machine learning using beyond moore devices.
Proceedings of the IEEE/ACM International Symposium on Nanoscale Architectures, 2017
TraceTracker: Hardware/software co-evaluation for large-scale I/O workload reconstruction.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017
Proceedings of the 46th International Conference on Parallel Processing, 2017
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017
Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017
APHiD: Hierarchical Task Placement to Enable a Tapered Fat Tree Topology for Lower Power and Cost in HPC Networks.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017
2016
ACM Trans. Storage, 2016
SIAM J. Sci. Comput., 2016
Proceedings of the High Performance Computing - 31st International Conference, 2016
Perilla: metadata-based optimizations of an asynchronous runtime for adaptive mesh refinement.
Proceedings of the International Conference for High Performance Computing, 2016
Characterizing the Performance of Hybrid Memory Cube Using ApexMAP Application Probes.
Proceedings of the Second International Symposium on Memory Systems, 2016
Proceedings of the 2016 IEEE International Symposium on Performance Analysis of Systems and Software, 2016
Proceedings of the 2016 IEEE High Performance Extreme Computing Conference, 2016
2015
Int. J. Parallel Program., 2015
Int. J. High Perform. Comput. Appl., 2015
OpenNVM: An open-sourced FPGA-based NVM controller for low level memory characterization.
Proceedings of the 33rd IEEE International Conference on Computer Design, 2015
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015
NVMMU: A Non-volatile Memory Management Unit for Heterogeneous GPU-SSD Architectures.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015
Integrating 3D Resistive Memory Cache into GPGPU for Energy-Efficient Data Processing.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015
2014
Exploring the future of out-of-core computing with compute-local non-volatile memory.
Sci. Program., 2014
Proceedings of the 1st International Workshop on Hardware-Software Co-Design for High Performance Computing, 2014
Proceedings of the Eighth IEEE/ACM International Symposium on Networks-on-Chip, 2014
OpenSoC Fabric: On-Chip Network Generator: Using Chisel to Generate a Parameterizable On-Chip Interconnect Fabric.
Proceedings of the 2014 International Workshop on Network on Chip Architectures, 2014
Proceedings of the 2014 International Conference on Supercomputing, 2014
Triple-A: a Non-SSD based autonomic all-flash array for high performance storage systems.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014
2013
Comput. Sci. Eng., 2013
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013
A communications simulation methodology for AMR codes using task dependency analysis.
Proceedings of the 3rd Workshop on Irregular Applications - Architectures and Algorithms, 2013
Proceedings of the International Conference on Supercomputing, 2013
Topic 14+16: High-Performance and Scientific Applications and Extreme-Scale Computing - (Introduction).
Proceedings of the Euro-Par 2013 Parallel Processing, 2013
2012
A preliminary evaluation of the hardware acceleration of the Cray Gemini interconnect for PGAS languages and comparison with MPI.
SIGMETRICS Perform. Evaluation Rev., 2012
Proceedings of the SC Conference on High Performance Computing Networking, 2012
The Analysis of Impact of Energy Efficiency Requirements on Programming Environments.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
Billion-particle SIMD-friendly two-point correlation on large-scale HPC cluster systems.
Proceedings of the SC Conference on High Performance Computing Networking, 2012
NANDFlashSim: Intrinsic latency variation aware NAND flash memory system modeling and simulation at microarchitecture level.
Proceedings of the IEEE 28th Symposium on Mass Storage Systems and Technologies, 2012
Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012
Proceedings of the Transition of HPC Towards Exascale Computing, 2012
2011
Proceedings of the Encyclopedia of Parallel Computing, 2011
Int. J. High Perform. Comput. Appl., 2011
Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning.
Proceedings of the Conference on High Performance Computing Networking, 2011
Multithreaded global address space communication techniques for gyrokinetic fusion applications on ultra-scale platforms.
Proceedings of the Conference on High Performance Computing Networking, 2011
Proceedings of the Conference on High Performance Computing Networking, 2011
Proceedings of the Conference on High Performance Computing Networking, 2011
Proceedings of the 2011 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '11, 2011
2010
Communication Requirements and Interconnect Optimization for High-End Scientific Applications.
IEEE Trans. Parallel Distributed Syst., 2010
Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
Proceedings of the High Performance Computing: From Grids and Clouds to Exascale, 2010
Proceedings of the IEEE 18th Annual Symposium on High Performance Interconnects, 2010
Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud.
Proceedings of the Cloud Computing, Second International Conference, 2010
Proceedings of the 1st ACM Symposium on Cloud Computing, 2010
Proceedings of the Scientific Computing with Multicore and Accelerators, 2010
2009
Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors.
SIAM Rev., 2009
Parallel Comput., 2009
HPC global file system performance analysis using a scientific-application derived benchmark.
Parallel Comput., 2009
Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms.
J. Parallel Distributed Comput., 2009
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009
Memory-efficient optimization of Gyrokinetic particle-to-grid interpolation for multicore processors.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009
A Comparison of Different Communication Structures for Scalable Parallel Three Dimensional FFTs in First Principles Codes.
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009
Analysis of photonic networks for a chip multiprocessor using scientific applications.
Proceedings of the Third International Symposium on Networks-on-Chips, 2009
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009
Proceedings of the Architecture of Computing Systems, 2009
Proceedings of the Scientific Data Management - Challenges, Technology, and Deployment, 2009
2008
Int. J. High Perform. Comput. Appl., 2008
Scientific Application Performance On Leading Scalar and Vector Supercomputing Platforms.
Int. J. High Perform. Comput. Appl., 2008
Characterizing and predicting the I/O performance of HPC applications using a parameterized synthetic benchmark.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
2007
Investigation of leading HPC I/O performance using a scientific-application derived benchmark.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007
Reconfigurable hybrid interconnection for static and dynamic scientific applications.
Proceedings of the 4th Conference on Computing Frontiers, 2007
2006
Proceedings of the High Performance Computing for Computational Science, 2006
HDF5-FastQuery: Accelerating Complex Queries on HDF Datasets using Fast Bitmap Indices.
Proceedings of the 18th International Conference on Scientific and Statistical Database Management, 2006
Proceedings of the Third Conference on Computing Frontiers, 2006
Proceedings of the 2006 workshop on Memory System Performance and Correctness, 2006
2005
The Astrophysics Simulation Collaboratory Portal: a framework for effective distributed research.
Future Gener. Comput. Syst., 2005
Concurr. Pract. Exp., 2005
Proceedings of the 16th IEEE Visualization Conference, 2005
DEX: Increasing the Capability of Scientific Data Analysis Pipelines by Using Efficient Bitmap Indices to Accelerate Scientific Visualization.
Proceedings of the 17th International Conference on Scientific and Statistical Database Management, 2005
Analyzing Ultra-Scale Application Communication Requirements for a Reconfigurable Hybrid Interconnect.
Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005
Proceedings of the 2005 workshop on Memory System Performance, 2005
Proceedings of the Visualization Handbook, 2005
2004
Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004
Identifying Performance Bottlenecks on Modern Microarchitectures Using an Adaptable Probe.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004
2003
Int. J. High Perform. Comput. Appl., 2003
IEEE Computer Graphics and Applications, 2003
IEEE Computer Graphics and Applications, 2003
IEEE Computer Graphics and Applications, 2003
Interoperability of Visualization Software and Data Models is NOT an Achievable Goal.
Proceedings of the 14th IEEE Visualization Conference, 2003
Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003
Proceedings of the IEEE Symposium on Parallel and Large-Data Visualization and Graphics 2003, 2003
2002
Concurr. Comput. Pract. Exp., 2002
The Astrophysics Simulation Collaboratory: A Science Portal Enabling Community Software Development.
Clust. Comput., 2002
Proceedings of the High Performance Computing for Computational Science, 2002
2001
The Cactus Worm: Experiments with Dynamic Resource Discovery and Allocation in a Grid Environment.
Int. J. High Perform. Comput. Appl., 2001
Proceedings of the 6th International Fall Workshop on Vision, Modeling, and Visualization, 2001
Proceedings of the 3rd Joint Eurographics - IEEE TCVG Symposium on Visualization, 2001
The Astrophysics Simulation Collaboratory Portal: A Science Portal Enabling Community Software Development.
Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing (HPDC-10 2001), 2001
2000
Proceedings of the Ninth IEEE International Symposium on High Performance Distributed Computing, 2000
1999
Diving deep: data-management and visualization strategies for adaptive mesh refinement simulations.
Comput. Sci. Eng., 1999
Numerical Relativity in a Distributed Environment.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999
1996
Galaxies Collide On the I-Way: an Example of Heterogeneous Wide-Area Collaborative Supercomputing.
Int. J. High Perform. Comput. Appl., 1996