Joel S. Emer
ORCID: 0000-0002-3459-5466
According to our database, Joel S. Emer authored at least 150 papers between 1979 and 2024.
Awards
ACM Fellow 2004, "For contributions to computer architecture and performance analysis."
IEEE Fellow 2004, "For contributions to computer architecture and quantitative analysis of processor performance."
Bibliography
2024
Modeling Analog-Digital-Converter Energy and Area for Compute-In-Memory Accelerator Design.
CoRR, 2024
Onyx: A 12nm 756 GOPS/W Coarse-Grained Reconfigurable Array for Accelerating Dense and Sparse Applications.
Proceedings of the IEEE Symposium on VLSI Technology and Circuits 2024, 2024
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024
Azul: An Accelerator for Sparse Iterative Solvers Leveraging Distributed On-Chip Memory.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
Mind the Gap: Attainable Data Movement and Operational Intensity Bounds for Tensor Algorithms.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
Proceedings of the 36th IEEE Hot Chips Symposium, 2024
Proceedings of the 2024 ACM Workshop on Highlights of Parallel Computing, 2024
2023
Symphony: Orchestrating Sparse and Dense Tensors with Hierarchical Heterogeneous Processing.
ACM Trans. Comput. Syst., 2023
Penetrating Shields: A Systematic Analysis of Memory Corruption Mitigations in the Spectre Era.
CoRR, 2023
Unified Convolution Framework: A compiler-based approach to support sparse convolutions.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
RAELLA: Reforming the Arithmetic for Efficient, Low-Resolution, and Low-Loss Analog PIM: No Retraining Required!
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
Accelerating Sparse Data Orchestration via Dynamic Reflexive Tiling (Extended Abstract).
Proceedings of the 2023 ACM Workshop on Highlights of Parallel Computing, 2023
Proceedings of the Data Compression Conference, 2023
WACO: Learning Workload-Aware Co-optimization of the Format and Schedule of a Sparse Tensor Program.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
2022
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Ruby: Improving Hardware Efficiency for Tensor Algebra Accelerators Through Imperfect Factorization.
Proceedings of the International IEEE Symposium on Performance Analysis of Systems and Software, 2022
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022
2021
Commun. ACM, 2021
Mentoring Opportunities in Computer Architecture: Analyzing the Past to Develop the Future.
Proceedings of the ACM/IEEE Workshop on Computer Architecture Education, 2021
Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021
SpZip: Architectural Support for Effective Data Compression In Irregular Applications.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021
2020
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01766-7, 2020
A 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16 nm.
IEEE J. Solid State Circuits, 2020
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
An Architecture-Level Energy and Area Estimator for Processing-In-Memory Accelerator Designs.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020
2019
Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2019
A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator with Ground-Reference Signaling in 16nm.
Proceedings of the 2019 Symposium on VLSI Circuits, Kyoto, Japan, June 9-14, 2019, 2019
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2019
Accelergy: An Architecture-Level Energy Estimation Methodology for Accelerator Designs.
Proceedings of the International Conference on Computer-Aided Design, 2019
Proceedings of the International Conference on Computer-Aided Design, 2019
A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Accelerator Designed with a High-Productivity VLSI Methodology.
Proceedings of the 2019 IEEE Hot Chips 31 Symposium (HCS), 2019
Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019
2018
IACR Cryptol. ePrint Arch., 2018
Eyeriss v2: A Flexible and High-Performance Accelerator for Emerging Deep Neural Networks.
CoRR, 2018
Harmonizing Speculative and Non-Speculative Execution in Architectures for Ordered Parallelism.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Proceedings of the 55th Annual Design Automation Conference, 2018
Proceedings of the 2018 IEEE Custom Integrated Circuits Conference, 2018
2017
(FPL 2015) Scavenger: Automating the Construction of Application-Optimized Memory Hierarchies.
ACM Trans. Reconfigurable Technol. Syst., 2017
Proc. IEEE, 2017
IEEE Micro, 2017
Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks.
IEEE J. Solid State Circuits, 2017
CoRR, 2017
Understanding error propagation in deep learning neural network (DNN) accelerators and applications.
Proceedings of the International Conference for High Performance Computing, 2017
SASSIFI: An architecture-level fault injection tool for GPU application resilience evaluation.
Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software, 2017
Towards closing the energy gap between HOG and CNN features for embedded vision (Invited paper).
Proceedings of the IEEE International Symposium on Circuits and Systems, 2017
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017
Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017
2016
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016
Proceedings of the Second International Symposium on Memory Systems, 2016
14.5 Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks.
Proceedings of the 2016 IEEE International Solid-State Circuits Conference, 2016
Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016
Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016
2015
Efficient Control and Communication Paradigms for Coarse-Grained Spatial Architectures.
ACM Trans. Comput. Syst., 2015
A fast and accurate analytical technique to compute the AVF of sequential bits in a processor.
Proceedings of the 48th International Symposium on Microarchitecture, 2015
Proceedings of the 48th International Symposium on Microarchitecture, 2015
High performing cache hierarchies for server workloads: Relaxing inclusion to capture the latency benefits of exclusive caches.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Proceedings of the 25th International Conference on Field Programmable Logic and Applications, 2015
2014
IEEE Micro, 2014
Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software, 2014
Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014
Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014
2013
ACM Trans. Archit. Code Optim., 2013
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013
Proceedings of the 23rd International Conference on Field Programmable Logic and Applications, 2013
2012
ACM Trans. Archit. Code Optim., 2012
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012
Proceedings of the 2012 International Conference on Field-Programmable Technology, 2012
Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012
Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, 2012
2011
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011
Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, 2011
2010
Achieving Non-Inclusive Cache Performance with Inclusive Caches: Temporal Locality Aware (TLA) Cache Management Policies.
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010
Design contest overview: Combined architecture for network stream categorization and intrusion detection (CANSCID).
Proceedings of the 8th ACM/IEEE International Conference on Formal Methods and Models for Codesign (MEMOCODE 2010), 2010
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010
2009
A-Port Networks: Preserving the Timed Behavior of Synchronous Systems for Modeling on FPGAs.
ACM Trans. Reconfigurable Technol. Syst., 2009
Guest Editors' Introduction: Top Picks from the 2008 Computer Architecture Conferences.
IEEE Micro, 2009
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2009
CAMP: A technique to estimate per-structure power at run-time using a few simple parameters.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009
Proceedings of the 46th Design Automation Conference, 2009
2008
IEEE Micro, 2008
IEEE Comput. Archit. Lett., 2008
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2008
Proceedings of the ACM/SIGDA 16th International Symposium on Field Programmable Gate Arrays, 2008
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008
2007
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007
2005
Proceedings of the 32nd International Symposium on Computer Architecture (ISCA 2005), 2005
Proceedings of the 11th International Conference on High-Performance Computer Architecture (HPCA-11 2005), 2005
2004
Proceedings of the 10th IEEE Pacific Rim International Symposium on Dependable Computing (PRDC 2004), 2004
Proceedings of the 31st International Symposium on Computer Architecture (ISCA 2004), 2004
2003
A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor.
Proceedings of the 36th Annual International Symposium on Microarchitecture, 2003
2002
Proceedings of the 29th International Symposium on Computer Architecture (ISCA 2002), 2002
Proceedings of the Eighth International Symposium on High-Performance Computer Architecture (HPCA'02), 2002
Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-X), 2002
2000
Proceedings of the Sixth International Symposium on High-Performance Computer Architecture, 2000
1999
Proceedings of the 32nd Annual IEEE/ACM International Symposium on Microarchitecture, 1999
Proceedings of the 13th international conference on Supercomputing, 1999
1998
Proceedings of the 25 Years of the International Symposia on Computer Architecture (Selected Papers), 1998
Proceedings of the 25 Years of the International Symposia on Computer Architecture (Selected Papers), 1998
Proceedings of the 25th Annual International Symposium on Computer Architecture, 1998
1997
Converting Thread-Level Parallelism to Instruction-Level Parallelism via Simultaneous Multithreading.
ACM Trans. Comput. Syst., 1997
IEEE Micro, 1997
Proceedings of the 24th International Symposium on Computer Architecture, 1997
1996
Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor.
Proceedings of the 23rd Annual International Symposium on Computer Architecture, 1996
Proceedings of the Second International Symposium on High-Performance Computer Architecture, 1996
1995
Proceedings of the 28th Annual International Symposium on Microarchitecture, Ann Arbor, Michigan, USA, November 29, 1995
Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995
1989
IEEE Trans. Software Eng., 1989
1988
Proceedings of the 8th International Conference on Distributed Computing Systems, 1988
1986
Proceedings of the 2nd ACM SIGOPS European Workshop, 1986
1985
ACM Trans. Comput. Syst., 1985
1979
PhD thesis, 1979