David Brooks

Proceedings of the IEEE Symposium on VLSI Technology and Circuits 2024, 2024

14.5 A 12nm Linux-SMP-Capable RISC-V SoC with 14 Accelerator Types, Distributed Hardware Power Management and Flexible NoC-Based Data Orchestration.

Proceedings of the IEEE International Solid-State Circuits Conference, 2024

Generative AI Beyond LLMs: System Implications of Multi-Modal Generation.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

JointNF: Enhancing DNN Performance through Adaptive N:M Pruning across both Weight and Activation.

Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design, 2024

MAD-Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems.

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

BlitzCoin: Fully Decentralized Hardware Power Management for Accelerator-Rich SoCs.

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

VelociTI: An Architecture-level Performance Modeling Framework for Trapped Ion Quantum Computers.

Proceedings of the IEEE International Symposium on Workload Characterization, 2024

Guess & Sketch: Language Model Guided Transpilation.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

CAMEL: Co-Designing AI Models and eDRAMs for Efficient On-Device Learning.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

GPU-based Private Information Retrieval for On-Device Machine Learning Inference.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023

SoCProbe: Compositional Post-Silicon Validation of Heterogeneous NoC-Based SoCs.

IEEE Des. Test, December, 2023

Abisko: Deep codesign of an architecture for spiking neural networks using novel neuromorphic materials.

Marc González Tallada

Narasinga Rao Miniskar

Int. J. High Perform. Comput. Appl., July, 2023

Early DSE and Automatic Generation of Coarse-grained Merged Accelerators.

Iulian Brumar

Georgios Zacharopoulos

ACM Trans. Embed. Comput. Syst., March, 2023

A 16-nm SoC for Noise-Robust Speech and NLP Edge AI Inference With Bayesian Sound Source Separation and Attention-Based DNNs.

IEEE J. Solid State Circuits, February, 2023

A Binary-Activation, Multi-Level Weight RNN and Training Algorithm for ADC-/DAC-Free and Noise-Resilient Processing-in-Memory Inference With eNVM.

Siming Ma

IEEE Trans. Emerg. Top. Comput., 2023

Trireme: Exploration of Hierarchical Multi-level Parallelism for Hardware Acceleration.

Georgios Zacharopoulos

ACM Trans. Embed. Comput. Syst., 2023

Architectural CO2 Footprint Tool: Designing Sustainable Computer Systems With an Architectural Carbon Modeling Tool.

IEEE Micro, 2023

Carbon Responder: Coordinating Demand Response for the Datacenter Fleet.

CoRR, 2023

INT2.1: Towards Fine-Tunable Quantized Large Language Models with Error Correction through Low-Rank Adaptation.

CoRR, 2023

CAMEL: Co-Designing AI Models and Embedded DRAMs for Efficient On-Device Learning.

CoRR, 2023

Design Space Exploration and Optimization for Carbon-Efficient Extended Reality Systems.

CoRR, 2023

GreenScale: Carbon-Aware Systems for Edge Computing.

CoRR, 2023

PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices.

CoRR, 2023

Hardware Resilience Properties of Text-Guided Image Classifiers.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

S3: Increasing GPU Utilization during Generative Inference for Higher Throughput.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A 12nm 18.1TFLOPs/W Sparse Transformer Processor with Entropy-Based Early Exit, Mixed-Precision Predication and Fine-Grained Power Management.

Emmanouil-Ioannis Farsarakis

Proceedings of the IEEE International Solid-State Circuits Conference, 2023

Is the Future Cold or Tall? Design Space Exploration of Cryogenic and 3D Embedded Cache Memory.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Characterizing the Scalability of Graph Convolutional Networks on Intel® PIUMA.

Matthew Joseph Adiletta

Jesmin Jahan Tithi

Gerasimos Gerogiannis

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Carbon-Efficient Design Optimization for Computing Systems.

Proceedings of the 2nd Workshop on Sustainable Computer Systems, 2023

MAVFI: An End-to-End Fault Analysis Framework with Anomaly Detection and Recovery for Micro Aerial Vehicles.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

MP-Rec: Hardware-Software Co-design to Enable Multi-path Recommendation.

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

Carbon Explorer: A Holistic Framework for Designing Carbon Aware Datacenters.

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022

End-to-End Synthesis of Dynamically Controlled Machine Learning Accelerators.

Serena Curzel

IEEE Trans. Computers, 2022

Chasing Carbon: The Elusive Environmental Footprint of Computing.

IEEE Micro, 2022

Bridging Python to Silicon: The SODA Toolchain.

IEEE Micro, 2022

SMIV: A 16-nm 25-mm² SoC for IoT With Arm Cortex-A53, eFPGA, and Coherent Accelerators.

IEEE J. Solid State Circuits, 2022

Architectural Implications of Embedding Dimension during GCN on CPU and GPU.

Matthew Adiletta

CoRR, 2022

Impala: Low-Latency, Communication-Efficient Private Deep Learning Inference.

CoRR, 2022

Tabula: Efficiently Computing Nonlinear Activation Functions for Secure Neural Network Inference.

CoRR, 2022

A Holistic Approach for Designing Carbon Aware Datacenters.

CoRR, 2022

Trireme: Exploring Hierarchical Multi-Level Parallelism for Domain Specific Hardware Acceleration.

Georgios Zacharopoulos

CoRR, 2022

Sustainable AI: Environmental Implications, Challenges and Opportunities.

Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

Automatic Domain-Specific SoC Design for Autonomous Unmanned Aerial Vehicles.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

ACT: designing sustainable computer systems with an architectural carbon modeling tool.

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

ASAP: automatic synthesis of area-efficient and precision-aware CGRAs.

Ganesh Gopalakrishnan

Ang Li

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

A Scalable Methodology for Agile Chip Development with Open-Source Hardware Components.

Nandhini Chandramoorthy

Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

NVMExplorer: A Framework for Cross-Stack Comparisons of Embedded Non-Volatile Memories.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

CoopMC: Algorithm-Architecture Co-Optimization for Markov Chain Monte Carlo Accelerators.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

From High-Level Frameworks to custom Silicon with SODA.

Serena Curzel

Proceedings of the 2022 IEEE Hot Chips 34 Symposium, 2022

A 12nm Agile-Designed SoC for Swarm-Based Perception with Heterogeneous IP Blocks, a Reconfigurable Memory Hierarchy, and an 800MHz Multi-Plane NoC.

Tianyu Jia

Paolo Mantovani

Nandhini Chandramoorthy

Proceedings of the 48th IEEE European Solid State Circuits Conference, 2022

GoldenEye: A Platform for Evaluating Emerging Numerical Data Formats in DNN Accelerators.

Proceedings of the 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2022

OMU: A Probabilistic 3D Occupancy Mapping Accelerator for Real-time OctoMap at the Edge.

Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

A joint management middleware to improve training performance of deep recommendation systems with SSDs.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021

Exploiting Parallelism Opportunities with Deep Learning Frameworks.

ACM Trans. Archit. Code Optim., 2021

Sustainable AI: Environmental Implications, Challenges and Opportunities.

CoRR, 2021

MAVFI: An End-to-End Fault Analysis Framework with Anomaly Detection and Recovery for Micro Aerial Vehicles.

CoRR, 2021

Machine Learning-Based Automated Design Space Exploration for Autonomous Aerial Robots.

CoRR, 2021

EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

9.8 A 25mm² SoC for IoT Devices with 18ms Noise-Robust Speech-to-Text Latency via Bayesian Speech Denoising and Attention-Based Sequence-to-Sequence DNN Speech Recognition in 16nm FinFET.

Proceedings of the IEEE International Solid-State Circuits Conference, 2021

Application-driven Design Exploration for Dense Ferroelectric Embedded Non-volatile Memories.

Mohammad Mehdi Sharifi

Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2021

Gradient Disaggregation: Breaking Privacy in Federated Learning by Reconstructing the User Participant Matrix.

Proceedings of the 38th International Conference on Machine Learning, 2021

Cheetah: Optimizing and Accelerating Homomorphic Encryption for Private Inference.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

SM6: A 16nm System-on-Chip for Accurate and Noise-Robust Attention-Based NLP Applications: The 33rd Hot Chips Symposium, August 22-24, 2021.

Proceedings of the IEEE Hot Chips 33 Symposium, 2021

RecSSD: near data processing for solid state drive based recommendation inference.

Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

Towards Automatic and Agile AI/ML Accelerator Design with End-to-End Synthesis.

Jeff Jun Zhang

Antonino Tumeo

Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

FlexACC: A Programmable Accelerator with Application-Specific ISA for Flexible Deep Neural Network Inference.

Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

2020

SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads.

ACM Trans. Archit. Code Optim., 2020

CHIPKIT: An Agile, Reusable Open-Source Framework for Rapid Test Chip Development.

IEEE Micro, 2020

EdgeBERT: Optimizing On-Chip Inference for Multi-Task NLP.

CoRR, 2020

Cheetah: Optimizations and Methods for Privacy-Preserving Inference via Homomorphic Encryption.

CoRR, 2020

CHIPKIT: An agile, reusable open-source framework for rapid test chip development.

CoRR, 2020

The Sky Is Not the Limit: A Visual Performance Model for Cyber-Physical Co-Design in Autonomous Machines.

IEEE Comput. Archit. Lett., 2020

A 3mm² Programmable Bayesian Inference Accelerator for Unsupervised Machine Perception using Parallel Gibbs Sampling in 16nm.

Proceedings of the IEEE Symposium on VLSI Circuits, 2020

A Systematic Methodology for Analysis of Deep Learning Hardware and Software Platforms.

Yu Wang

Proceedings of the Third Conference on Machine Learning and Systems, 2020

MLPerf Training Benchmark.

Proceedings of the Third Conference on Machine Learning and Systems, 2020

A comprehensive methodology to determine optimal coherence interfaces for many-accelerator SoCs.

Proceedings of the ISLPED '20: ACM/IEEE International Symposium on Low Power Electronics and Design, 2020

RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing.

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

DeepRecSys: A System for Optimizing End-To-End At-Scale Neural Recommendation Inference.

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Cross-Stack Workload Characterization of Deep Recommendation Systems.

Proceedings of the IEEE International Symposium on Workload Characterization, 2020

SODA: a New Synthesis Infrastructure for Agile Hardware Design of Machine Learning Accelerators.

Marco Minutoli

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

The Architectural Implications of Facebook's DNN-Based Personalized Recommendation.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

A Scalable Bayesian Inference Accelerator for Unsupervised Learning.

Proceedings of the IEEE Hot Chips 32 Symposium, 2020

Emerging Neural Workloads and Their Impact on Hardware.

Ann Franchesca Laguna

Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

Invited: Software Defined Accelerators From Learning Tools Environment.

Antonino Tumeo

Marco Minutoli

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Algorithm-Hardware Co-Design of Adaptive Floating-Point Encodings for Resilient Deep Learning Inference.

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

2019

Predicting New Workload or CPU Performance by Analyzing Public Datasets.

ACM Trans. Archit. Code Optim., 2019

MEMTI: Optimizing On-Chip Nonvolatile Storage for Visual Multitask Inference at the Edge.

IEEE Micro, 2019

A 16-nm Always-On DNN Processor With Adaptive Clocking and Multi-Cycle Banked SRAMs.

IEEE J. Solid State Circuits, 2019

A binary-activation, multi-level weight RNN and training algorithm for processing-in-memory inference with eNVM.

Siming Ma

CoRR, 2019

MLPerf Training Benchmark.

CoRR, 2019

AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference.

CoRR, 2019

Benchmarking TPU, GPU, and CPU Platforms for Deep Learning.

Yu Wang

CoRR, 2019

The Architectural Implications of Facebook's DNN-based Personalized Recommendation.

CoRR, 2019

Determining Optimal Coherency Interface for Many-Accelerator SoCs Using Bayesian Optimization.

IEEE Comput. Archit. Lett., 2019

A 16nm 25mm² SoC with a 54.5x Flexibility-Efficiency Range from Dual-Core Arm Cortex-A53 to eFPGA and Cache-Coherent Accelerators.

Proceedings of the 2019 Symposium on VLSI Circuits, Kyoto, Japan, June 9-14, 2019

CHAMPVis: Comparative Hierarchical Analysis of Microarchitectural Performance.

Proceedings of the IEEE/ACM International Workshop on Programming and Performance Visualization Tools, 2019

MaxNVM: Maximizing DNN Storage Density and Inference Efficiency with Sparse Encoding and Error Mitigation.

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Demystifying Bayesian Inference Workloads.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2019

Application of Approximate Matrix Multiplication to Neural Networks and Distributed SLAM.

Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

Machine Learning at Facebook: Understanding Inference at the Edge.

Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

Accelerating Bayesian Inference on Structured Graphs Using Parallel Gibbs Sampling.

Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

FlexGibbs: Reconfigurable Parallel Gibbs Sampling Accelerator for Structured Graphs.

Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019

MASR: A Modular Accelerator for Sparse RNNs.

Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018

Assisting High-Level Synthesis Improve SpMV Benchmark Through Dynamic Dependence Analysis.

IEEE Trans. Circuits Syst. II Express Briefs, 2018

An Area-Efficient 8-Bit Single-Ended ADC With Extended Input Voltage Range.

Simon Chaput

IEEE Trans. Circuits Syst. II Express Briefs, 2018

DNN Engine: A 28-nm Timing-Error Tolerant Sparse Deep Neural Network Processor for IoT Applications.

IEEE J. Solid State Circuits, 2018

Cloud No Longer a Silver Bullet, Edge to the Rescue.

Yuhao Zhu

CoRR, 2018

Co-designed systems for deep learning hardware accelerators.

Proceedings of the 2018 International Symposium on VLSI Design, 2018

Weightless: Lossy weight encoding for deep neural network compression.

Proceedings of the 35th International Conference on Machine Learning, 2018

Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

A Wide Dynamic Range Sparse FC-DNN Processor with Multi-Cycle Banked SRAM Read and Adaptive Clocking in 16nm FinFET.

Proceedings of the 44th IEEE European Solid State Circuits Conference, 2018

Ares: a framework for quantifying the resilience of deep neural networks.

Proceedings of the 55th Annual Design Automation Conference, 2018

On-chip deep neural network storage with multi-level eNVM.

Proceedings of the 55th Annual Design Automation Conference, 2018

2017

Deep Learning for Computer Architects

Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01756-8, 2017

A 16-Core Voltage-Stacked System With Adaptive Clocking and an Integrated Switched-Capacitor DC-DC Converter.

IEEE Trans. Very Large Scale Integr. Syst., 2017

Cognitive Computing Safety: The New Horizon for Reliability / The Design and Evolution of Deep Learning Workloads.

IEEE Micro, 2017

Ultra-Low-Power Processors.

John Sartori

IEEE Micro, 2017

2017 International Symposium on Computer Architecture Influential Paper Award.

IEEE Micro, 2017

A Fully Integrated Battery-Powered System-on-Chip in 40-nm CMOS for Closed-Loop Control of Insect-Scale Pico-Aerial Vehicle.

Pierre-Emile J. Duhamel

Robert J. Wood

IEEE J. Solid State Circuits, 2017

CARB: A C-State Power Management Arbiter for Latency-Critical Workloads.

IEEE Comput. Archit. Lett., 2017

Automatically accelerating non-numerical programs by architecture-compiler co-design.

Commun. ACM, 2017

Methods and infrastructure in the era of accelerator-centric architectures.

Proceedings of the IEEE 60th International Midwest Symposium on Circuits and Systems, 2017

14.3 A 28nm SoC with a 1.2GHz 568nJ/prediction sparse deep-neural-network engine with >0.1 timing error rate tolerance for IoT applications.

Proceedings of the 2017 IEEE International Solid-State Circuits Conference, 2017

21.5 A 3-to-5V input 100Vpp output 57.7mW 0.42% THD+N highly integrated piezoelectric actuator driver.

Simon Chaput

Proceedings of the 2017 IEEE International Solid-State Circuits Conference, 2017

A case for efficient accelerator design space exploration via Bayesian optimization.

Brandon Reagen

Parthasarathy Ranganathan

Proceedings of the 2017 IEEE/ACM International Symposium on Low Power Electronics and Design, 2017

Using dynamic dependence analysis to improve the quality of high-level synthesis designs.

Proceedings of the IEEE International Symposium on Circuits and Systems, 2017

Applications of Deep Neural Networks for Ultra Low Power IoT.

Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

Very Low Voltage (VLV) Design.

Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

Ivory: Early-Stage Design Space Exploration Tool for Integrated Voltage Regulators.

Proceedings of the 54th Annual Design Automation Conference, 2017

Mallacc: Accelerating Memory Allocation.

Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

Sub-uJ deep neural networks for embedded applications.

Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017

2016

Profiling a Warehouse-Scale Computer.

Svilen Kanev

Juan Pablo Darago

Kim M. Hazelwood

Tipp Moseley

IEEE Micro, 2016

A Fully Integrated Reconfigurable Switched-Capacitor DC-DC Converter With Four Stacked Output Channels for Voltage Stacking Applications.

IEEE J. Solid State Circuits, 2016

Co-designing accelerators and SoC interfaces using gem5-Aladdin.

Sam Likun Xi

Vijayalakshmi Srinivasan

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators.

Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Fathom: reference workloads for modern deep learning methods.

Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

2015

Research Infrastructures for Hardware Accelerators

Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01750-6, 2015

The Aladdin Approach to Accelerator Design and Modeling.

IEEE Micro, 2015

A multi-chip system optimized for insect-scale flapping-wing robots.

Proceedings of the Symposium on VLSI Circuits, 2015

A 16-core voltage-stacked system with an integrated switched-capacitor DC-DC converter.

Proceedings of the Symposium on VLSI Circuits, 2015

Circuit and system design for robotic flying vehicles.

Proceedings of the VLSI Design, Automation and Test, 2015

Addressing the computing technology-capability gap: The coming Golden Age of design.

Elizabeth Farrell Helbling

Proceedings of the 10th IEEE International Conference on Networking, 2015

Quantifying sources of error in McPAT and potential impacts on architectural studies.

Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

A power electronics unit to drive piezoelectric actuators for flying microrobots.

Mario Lok

Xuan Zhang

Robert J. Wood

Proceedings of the 2015 IEEE Custom Integrated Circuits Conference, 2015

HELIX-UP: relaxing program semantics to unleash parallelization.

Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

2014

Evaluating Adaptive Clocking for Supply-Noise Resilience in Battery-Powered Aerial Microrobotic System-on-Chip.

IEEE Trans. Circuits Syst. I Regul. Pap., 2014

Aladdin: A pre-RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures.

Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

HELIX-RC: An architecture-compiler co-design for automatic parallelization of irregular programs.

Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

MachSuite: Benchmarks for accelerator design and customized architectures.

Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

Tradeoffs between power management and tail latency in warehouse-scale applications.

Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

Multi-accelerator system development with the ShrinkFit acceleration framework.

Michael J. Lyons

Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

2013

Shrink-Fit: A Framework for Flexible Accelerator Sizing.

Michael J. Lyons

IEEE Comput. Archit. Lett., 2013

ISA-independent workload characterization and its implications for specialized architectures.

Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

Characterizing and evaluating voltage noise in multi-core near-threshold processors.

Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

Energy characterization and instruction-level energy model of Intel's Xeon Phi processor.

Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

Quantifying acceleration: Power/performance trade-offs of application kernels in hardware.

Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

Supply-noise resilient adaptive clocking for battery-powered aerial microrobotic System-on-Chip in 40nm CMOS.

Proceedings of the IEEE 2013 Custom Integrated Circuits Conference, 2013

A fully integrated battery-connected switched-capacitor 4:1 voltage regulator with 70% peak efficiency using bottom-plate charge recycling.

Proceedings of the IEEE 2013 Custom Integrated Circuits Conference, 2013

2012

The accelerator store: A shared memory framework for accelerator-based systems.

ACM Trans. Archit. Code Optim., 2012

Helix: Making the Extraction of Thread-Level Parallelism Mainstream.

IEEE Micro, 2012

A Fully-Integrated 3-Level DC-DC Converter for Nanosecond-Scale DVFS.

Wonyoung Kim

IEEE J. Solid State Circuits, 2012

Evaluation of voltage stacking for near-threshold multicore computing.

Sae Kyu Lee

Proceedings of the International Symposium on Low Power Electronics and Design, 2012

XIOSim: power-performance modeling of mobile x86 cores.

Svilen Kanev

Proceedings of the International Symposium on Low Power Electronics and Design, 2012

The HELIX project: overview and directions.

Proceedings of the 49th Annual Design Automation Conference 2012, 2012

HELIX: automatic parallelization of irregular programs for chip multiprocessing.

Proceedings of the 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2012

2011

Automating Design of Voltage Interpolation to Address Process Variations.

IEEE Trans. Very Large Scale Integr. Syst., 2011

Resilient Architectures via Collaborative Design: Maximizing Commodity Processor Performance in the Presence of Variations.

Vijay Janapa Reddi

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2011

Voltage Noise in Production Processors.

IEEE Micro, 2011

CPUs, GPUs, and Hybrid Computing.

IEEE Micro, 2011

An Accelerator-Based Wireless Sensor Network Processor in 130 nm CMOS.

IEEE J. Emerg. Sel. Topics Circuits Syst., 2011

A fully-integrated 3-level DC/DC converter for nanosecond-scale DVS with fast shunt regulation.

Wonyoung Kim

Proceedings of the IEEE International Solid-State Circuits Conference, 2011

Hardware in the loop for optical flow sensing in a robotic bee.

Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011

Achieving uniform performance and maximizing throughput in the presence of heterogeneity.

Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

Implementing a hybrid SRAM / eDRAM NUCA architecture.

Proceedings of the 18th International Conference on High Performance Computing, 2011

Dimetrodon: processor-level preventive thermal management via idle cycle injection.

Proceedings of the 48th Design Automation Conference, 2011

The alarms project: A hardware/software approach to addressing parameter variations.

Proceedings of the 16th Asia South Pacific Design Automation Conference, 2011

2010

Eliminating voltage emergencies via software-guided code transformations.

ACM Trans. Archit. Code Optim., 2010

Applied inference: Case studies in microarchitectural design.

ACM Trans. Archit. Code Optim., 2010

Predicting Voltage Droops Using Recurring Program and Microarchitectural Event Activity.

IEEE Micro, 2010

Can Subthreshold and Near-Threshold Circuits Go Mainstream?

Benton H. Calhoun

IEEE Micro, 2010

The Accelerator Store framework for high-performance, low-power accelerator-based systems.

IEEE Comput. Archit. Lett., 2010

Voltage Smoothing: Characterizing and Mitigating Voltage Noise in Production Processors via Software-Guided Thread Scheduling.

Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010

2009

Energy- and area-efficient architectures through application clustering and architectural heterogeneity.

[BibT_eX]

[DOI]

Lukasz Strozek

ACM Trans. Archit. Code Optim., 2009

Revival: A Variation-Tolerant Architecture Using Voltage Interpolation and Variable Latency.

[BibT_eX]

[DOI]

IEEE Micro, 2009

Tribeca: design for PVT variations with local recovery and fine-grained adaptation.

[BibT_eX]

[DOI]

Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

Place and route considerations for voltage interpolated designs.

[BibT_eX]

[DOI]

Proceedings of the 10th International Symposium on Quality of Electronic Design (ISQED 2009), 2009

The design of a bloom filter hardware accelerator for ultra low power systems.

[BibT_eX]

[DOI]

Michael J. Lyons

Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009

Thread motion: fine-grained power management for multi-core systems.

[BibT_eX]

[DOI]

Krishna K. Rangan

Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

Empirical performance models for 3T1D memories.

[BibT_eX]

[DOI]

Proceedings of the 27th International Conference on Computer Design, 2009

Design and test strategies for microarchitectural post-fabrication tuning.

[BibT_eX]

[DOI]

Proceedings of the 27th International Conference on Computer Design, 2009

Voltage emergency prediction: Using signatures to reduce operating margins.

[BibT_eX]

[DOI]

Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

An event-guided approach to reducing voltage noise in processors.

[BibT_eX]

[DOI]

Proceedings of the Design, Automation and Test in Europe, 2009

Software-assisted hardware reliability: abstracting circuit-level challenges to the software stack.

[BibT_eX]

[DOI]

Proceedings of the 46th Design Automation Conference, 2009

An accelerator-based wireless sensor network processor in 130nm CMOS.

[BibT_eX]

[DOI]

Proceedings of the 2009 International Conference on Compilers, 2009

2008

Replacing 6T SRAMs with 3T1D DRAMs in the L1 Data Cache to Combat Process Variability.

[BibT_eX]

[DOI]

IEEE Micro, 2008

Guest Editors' Introduction: Top Picks from the Computer Architecture Conferences of 2007.

[BibT_eX]

[DOI]

Sarita V. Adve

Craig B. Zilles

IEEE Micro, 2008

Survey of Hardware Systems for Wireless Sensor Networks.

[BibT_eX]

[DOI]

J. Low Power Electron., 2008

CPR: Composable performance regression for scalable multiprocessor models.

[BibT_eX]

[DOI]

Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-41 2008), 2008

A Process-Variation-Tolerant Floating-Point Unit with Voltage Interpolation and Variable Latency.

[BibT_eX]

[DOI]

Proceedings of the 2008 IEEE International Solid-State Circuits Conference, 2008

Instruction-driven clock scheduling with glitch mitigation.

[BibT_eX]

[DOI]

Proceedings of the 2008 International Symposium on Low Power Electronics and Design, 2008

System design considerations for sensor network applications.

[BibT_eX]

[DOI]

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2008), 2008

Evaluation of voltage interpolation to address process variations.

[BibT_eX]

[DOI]

Kevin Brownell

Proceedings of the 2008 International Conference on Computer-Aided Design, 2008

Roughness of microarchitectural design topologies and its implications for optimization.

[BibT_eX]

[DOI]

Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008

System level analysis of fast, per-core DVFS using on-chip switching regulators.

[BibT_eX]

[DOI]

Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008

DeCoR: A Delayed Commit and Rollback mechanism for handling inductive noise in processors.

[BibT_eX]

[DOI]

Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008

Efficiency trends and limits from comprehensive microarchitectural adaptivity.

[BibT_eX]

[DOI]

Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, 2008

2007

Spatial Sampling and Regression Strategies.

[BibT_eX]

[DOI]

IEEE Micro, 2007

Power, Thermal, and Reliability Modeling in Nanometer-Scale Microprocessors.

[BibT_eX]

[DOI]

IEEE Micro, 2007

Methods of inference and learning for performance modeling of parallel applications.

[BibT_eX]

[DOI]

Bronis R. de Supinski

Martin Schulz

Karan Singh

Sally A. McKee

Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

Process Variation Tolerant 3T1D-Based Cache Architectures.

[BibT_eX]

[DOI]

Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-40 2007), 2007

Towards a software approach to mitigate voltage emergencies.

[BibT_eX]

[DOI]

Proceedings of the 2007 International Symposium on Low Power Electronics and Design, 2007

Architectural power models for SRAM and CAM structures based on hybrid analytical/empirical techniques.

[BibT_eX]

[DOI]

Kerem Turgay

Proceedings of the 2007 International Conference on Computer-Aided Design, 2007

Illustrative Design Space Studies with Microarchitectural Regression Models.

[BibT_eX]

[DOI]

Proceedings of the 13th International Conference on High-Performance Computer Architecture (HPCA-13 2007), 2007

Understanding voltage variations in chip multiprocessors using a distributed power-delivery network.

[BibT_eX]

[DOI]

Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

2006

Dynamic-Compiler-Driven Control for Microprocessor Energy and Performance.

[BibT_eX]

[DOI]

IEEE Micro, 2006

Mitigating the Impact of Process Variations on Processor Register Files and Execution Units.

[BibT_eX]

[DOI]

Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006

System-on-Chip Architecture Design for Intelligent Sensor Networks.

[BibT_eX]

[DOI]

Proceedings of the Second International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2006), 2006

Microarchitecture parameter selection to optimize system performance under process variation.

[BibT_eX]

[DOI]

Proceedings of the 2006 International Conference on Computer-Aided Design, 2006

CMP design space exploration subject to physical constraints.

[BibT_eX]

[DOI]

Proceedings of the 12th International Symposium on High-Performance Computer Architecture, 2006

Efficient architectures through application clustering and architectural heterogeneity.

[BibT_eX]

[DOI]

Lukasz Strozek

Proceedings of the 2006 International Conference on Compilers, 2006

Architecture and circuit techniques for low-throughput, energy-constrained systems across technology generations.

[BibT_eX]

[DOI]

Proceedings of the 2006 International Conference on Compilers, 2006

Accurate and efficient regression modeling for microarchitectural performance and power prediction.

[BibT_eX]

[DOI]

Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006

2005

A Dynamic Compilation Framework for Controlling Microprocessor Energy and Performance.

[BibT_eX]

[DOI]

Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005

Power and thermal effects of SRAM vs. Latch-Mux design styles and clock gating choices.

[BibT_eX]

[DOI]

Proceedings of the 2005 International Symposium on Low Power Electronics and Design, 2005

An Ultra Low Power System Architecture for Sensor Network Applications.

[BibT_eX]

[DOI]

Proceedings of the 32nd International Symposium on Computer Architecture (ISCA 2005), 2005

Performance, Energy, and Thermal Considerations for SMT and CMP Architectures.

[BibT_eX]

[DOI]

Proceedings of the 11th International Conference on High-Performance Computer Architecture (HPCA-11 2005), 2005

2004

Integrated Analysis of Power and Performance for Pipelined Microprocessors.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 2004

Power-performance simulation: design and validation strategies.

[BibT_eX]

[DOI]

Pradip Bose

SIGMETRICS Perform. Evaluation Rev., 2004

TinyBench: The Case For A Standardized Benchmark Suite for TinyOS Based Wireless Sensor Network Devices.

[BibT_eX]

[DOI]

Matt Welsh

Proceedings of the 29th Annual IEEE Conference on Local Computer Networks (LCN 2004), 2004

Understanding the energy efficiency of simultaneous multithreading.

[BibT_eX]

[DOI]

Proceedings of the 2004 International Symposium on Low Power Electronics and Design, 2004

Eliminating voltage emergencies via microarchitectural voltage control feedback and dynamic optimization.

[BibT_eX]

[DOI]

Kim M. Hazelwood

Proceedings of the 2004 International Symposium on Low Power Electronics and Design, 2004

Evaluating Techniques for Exploiting Instruction Slack.

[BibT_eX]

[DOI]

Yau Chin

John Sheu

Proceedings of the 22nd IEEE International Conference on Computer Design: VLSI in Computers & Processors (ICCD 2004), 2004

2003

New methodology for early-stage, microarchitecture-level power-performance analysis of microprocessors.

[BibT_eX]

[DOI]

Michael G. Rosenfield

IBM J. Res. Dev., 2003

Control Techniques to Eliminate Voltage Emergencies in High Performance Processors.

[BibT_eX]

[DOI]

Russ Joseph

Proceedings of the Ninth International Symposium on High-Performance Computer Architecture (HPCA'03), 2003

2002

Early-Stage Definition of LPX: A Low Power Issue-Execute Processor.

[BibT_eX]

[DOI]

Proceedings of the Power-Aware Computer Systems, Second International Workshop, 2002

Optimizing pipelines for power and performance.

[BibT_eX]

[DOI]

Proceedings of the 35th Annual International Symposium on Microarchitecture, 2002

2001

Dynamic Thermal Management for High-Performance Microprocessors.

[BibT_eX]

[DOI]

Proceedings of the Seventh International Symposium on High-Performance Computer Architecture (HPCA'01), 2001

A circuit level implementation of an adaptive issue queue for power-aware microprocessors.

[BibT_eX]

[DOI]

Proceedings of the 11th ACM Great Lakes Symposium on VLSI 2001, 2001

2000

Value-based clock gating and operation packing: dynamic strategies for improving processor power and performance.

[BibT_eX]

[DOI]

ACM Trans. Comput. Syst., 2000

Power-Aware Microarchitecture: Design and Modeling Challenges for Next-Generation Microprocessors.

[BibT_eX]

[DOI]

IEEE Micro, 2000

An Adaptive Issue Queue for Reduced Power at High Performance.

[BibT_eX]

[DOI]

Proceedings of the Power-Aware Computer Systems, First International Workshop, 2000

Power-Performance Modeling and Tradeoff Analysis for a High End Microprocessor.

[BibT_eX]

[DOI]

Proceedings of the Power-Aware Computer Systems, First International Workshop, 2000

Wattch: a framework for architectural-level power analysis and optimizations.

[BibT_eX]

[DOI]

Vivek Tiwari

Proceedings of the 27th International Symposium on Computer Architecture (ISCA 2000), 2000

1999

Dynamically Exploiting Narrow Width Operands to Improve Processor Power and Performance.

[BibT_eX]

[DOI]

Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999

Implementing Application-Specific Cache-Coherence Protocols in Configurable Hardware.

[BibT_eX]

[DOI]