Gu-Yeon Wei

Proceedings of the IEEE Symposium on VLSI Technology and Circuits 2024, 2024

14.5 A 12nm Linux-SMP-Capable RISC-V SoC with 14 Accelerator Types, Distributed Hardware Power Management and Flexible NoC-Based Data Orchestration.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Solid-State Circuits Conference, 2024

Generative AI Beyond LLMs: System Implications of Multi-Modal Generation.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

JointNF: Enhancing DNN Performance through Adaptive N:M Pruning across both Weight and Activation.

[BibT_eX]

[DOI]

Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design, 2024

MAD-Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems.

[BibT_eX]

[DOI]

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

BlitzCoin: Fully Decentralized Hardware Power Management for Accelerator-Rich SoCs.

[BibT_eX]

[DOI]

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

VelociTI: An Architecture-level Performance Modeling Framework for Trapped Ion Quantum Computers.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on Workload Characterization, 2024

Guess & Sketch: Language Model Guided Transpilation.

[BibT_eX]

[DOI]

Proceedings of the Twelfth International Conference on Learning Representations, 2024

CAMEL: Co-Designing AI Models and eDRAMs for Efficient On-Device Learning.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

GPU-based Private Information Retrieval for On-Device Machine Learning Inference.

[BibT_eX]

[DOI]

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023

SoCProbe: Compositional Post-Silicon Validation of Heterogeneous NoC-Based SoCs.

[BibT_eX]

[DOI]

IEEE Des. Test, December, 2023

Abisko: Deep codesign of an architecture for spiking neural networks using novel neuromorphic materials.

[BibT_eX]

[DOI]

Marc González Tallada

Narasinga Rao Miniskar

Int. J. High Perform. Comput. Appl., July, 2023

Early DSE and Automatic Generation of Coarse-grained Merged Accelerators.

[BibT_eX]

[DOI]

Iulian Brumar

Georgios Zacharopoulos

ACM Trans. Embed. Comput. Syst., March, 2023

A 16-nm SoC for Noise-Robust Speech and NLP Edge AI Inference With Bayesian Sound Source Separation and Attention-Based DNNs.

[BibT_eX]

[DOI]

IEEE J. Solid State Circuits, February, 2023

A Binary-Activation, Multi-Level Weight RNN and Training Algorithm for ADC-/DAC-Free and Noise-Resilient Processing-in-Memory Inference With eNVM.

[BibT_eX]

[DOI]

Siming Ma

IEEE Trans. Emerg. Top. Comput., 2023

Trireme: Exploration of Hierarchical Multi-level Parallelism for Hardware Acceleration.

[BibT_eX]

[DOI]

Georgios Zacharopoulos

ACM Trans. Embed. Comput. Syst., 2023

Architectural CO2 Footprint Tool: Designing Sustainable Computer Systems With an Architectural Carbon Modeling Tool.

[BibT_eX]

[DOI]

IEEE Micro, 2023

INT2.1: Towards Fine-Tunable Quantized Large Language Models with Error Correction through Low-Rank Adaptation.

[BibT_eX]

[DOI]

CoRR, 2023

CAMEL: Co-Designing AI Models and Embedded DRAMs for Efficient On-Device Learning.

[BibT_eX]

[DOI]

CoRR, 2023

Design Space Exploration and Optimization for Carbon-Efficient Extended Reality Systems.

[BibT_eX]

[DOI]

CoRR, 2023

PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices.

[BibT_eX]

[DOI]

CoRR, 2023

Hardware Resilience Properties of Text-Guided Image Classifiers.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

S3: Increasing GPU Utilization during Generative Inference for Higher Throughput.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A 12nm 18.1TFLOPs/W Sparse Transformer Processor with Entropy-Based Early Exit, Mixed-Precision Predication and Fine-Grained Power Management.

[BibT_eX]

[DOI]

Emmanouil-Ioannis Farsarakis

Proceedings of the IEEE International Solid-State Circuits Conference, 2023

Is the Future Cold or Tall? Design Space Exploration of Cryogenic and 3D Embedded Cache Memory.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Characterizing the Scalability of Graph Convolutional Networks on Intel® PIUMA.

[BibT_eX]

[DOI]

Matthew Joseph Adiletta

Jesmin Jahan Tithi

Gerasimos Gerogiannis

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Carbon-Efficient Design Optimization for Computing Systems.

[BibT_eX]

[DOI]

Proceedings of the 2nd Workshop on Sustainable Computer Systems, 2023

MAVFI: An End-to-End Fault Analysis Framework with Anomaly Detection and Recovery for Micro Aerial Vehicles.

[BibT_eX]

[DOI]

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

MP-Rec: Hardware-Software Co-design to Enable Multi-path Recommendation.

[BibT_eX]

[DOI]

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022

End-to-End Synthesis of Dynamically Controlled Machine Learning Accelerators.

[BibT_eX]

[DOI]

Serena Curzel

IEEE Trans. Computers, 2022

Chasing Carbon: The Elusive Environmental Footprint of Computing.

[BibT_eX]

[DOI]

IEEE Micro, 2022

Bridging Python to Silicon: The SODA Toolchain.

[BibT_eX]

[DOI]

IEEE Micro, 2022

SMIV: A 16-nm 25-mm² SoC for IoT With Arm Cortex-A53, eFPGA, and Coherent Accelerators.

[BibT_eX]

[DOI]

IEEE J. Solid State Circuits, 2022

Architectural Implications of Embedding Dimension during GCN on CPU and GPU.

[BibT_eX]

[DOI]

Matthew Adiletta

CoRR, 2022

Impala: Low-Latency, Communication-Efficient Private Deep Learning Inference.

[BibT_eX]

[DOI]

CoRR, 2022

Tabula: Efficiently Computing Nonlinear Activation Functions for Secure Neural Network Inference.

[BibT_eX]

[DOI]

CoRR, 2022

Specialized Accelerators and Compiler Flows: Replacing Accelerator APIs with a Formal Software/Hardware Interface.

[BibT_eX]

[DOI]

CoRR, 2022

Trireme: Exploring Hierarchical Multi-Level Parallelism for Domain Specific Hardware Acceleration.

[BibT_eX]

[DOI]

Georgios Zacharopoulos

CoRR, 2022

Automatic Domain-Specific SoC Design for Autonomous Unmanned Aerial Vehicles.

[BibT_eX]

[DOI]

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

ACT: designing sustainable computer systems with an architectural carbon modeling tool.

[BibT_eX]

[DOI]

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

ASAP: automatic synthesis of area-efficient and precision-aware CGRAs.

[BibT_eX]

[DOI]

Ganesh Gopalakrishnan

Ang Li

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

A Scalable Methodology for Agile Chip Development with Open-Source Hardware Components.

[BibT_eX]

[DOI]

Nandhini Chandramoorthy

Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

NVMExplorer: A Framework for Cross-Stack Comparisons of Embedded Non-Volatile Memories.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

CoopMC: Algorithm-Architecture Co-Optimization for Markov Chain Monte Carlo Accelerators.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

From High-Level Frameworks to custom Silicon with SODA.

[BibT_eX]

[DOI]

Serena Curzel

Proceedings of the 2022 IEEE Hot Chips 34 Symposium, 2022

A 12nm Agile-Designed SoC for Swarm-Based Perception with Heterogeneous IP Blocks, a Reconfigurable Memory Hierarchy, and an 800MHz Multi-Plane NoC.

[BibT_eX]

[DOI]

Tianyu Jia

Paolo Mantovani

Nandhini Chandramoorthy

Proceedings of the 48th IEEE European Solid State Circuits Conference, 2022

GoldenEye: A Platform for Evaluating Emerging Numerical Data Formats in DNN Accelerators.

[BibT_eX]

[DOI]

Proceedings of the 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2022

OMU: A Probabilistic 3D Occupancy Mapping Accelerator for Real-time OctoMap at the Edge.

[BibT_eX]

[DOI]

Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

A joint management middleware to improve training performance of deep recommendation systems with SSDs.

[BibT_eX]

[DOI]

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021

MAVFI: An End-to-End Fault Analysis Framework with Anomaly Detection and Recovery for Micro Aerial Vehicles.

[BibT_eX]

[DOI]

CoRR, 2021

Quantifying and Maximizing the Benefits of Back-End Noise Adaption on Attention-Based Speech Recognition Models.

[BibT_eX]

[DOI]

Coleman Hooper

Thierry Tambe

CoRR, 2021

Machine Learning-Based Automated Design Space Exploration for Autonomous Aerial Robots.

[BibT_eX]

[DOI]

CoRR, 2021

EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference.

[BibT_eX]

[DOI]

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance.

[BibT_eX]

[DOI]

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

9.8 A 25mm² SoC for IoT Devices with 18ms Noise-Robust Speech-to-Text Latency via Bayesian Speech Denoising and Attention-Based Sequence-to-Sequence DNN Speech Recognition in 16nm FinFET.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Solid-State Circuits Conference, 2021

Application-driven Design Exploration for Dense Ferroelectric Embedded Non-volatile Memories.

[BibT_eX]

[DOI]

Mohammad Mehdi Sharifi

Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2021

Gradient Disaggregation: Breaking Privacy in Federated Learning by Reconstructing the User Participant Matrix.

[BibT_eX]

[DOI]

Proceedings of the 38th International Conference on Machine Learning, 2021

Cheetah: Optimizing and Accelerating Homomorphic Encryption for Private Inference.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

SM6: A 16nm System-on-Chip for Accurate and Noise-Robust Attention-Based NLP Applications.

[BibT_eX]

[DOI]

Proceedings of the IEEE Hot Chips 33 Symposium, 2021

RecSSD: near data processing for solid state drive based recommendation inference.

[BibT_eX]

[DOI]

Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

Towards Automatic and Agile AI/ML Accelerator Design with End-to-End Synthesis.

[BibT_eX]

[DOI]

Jeff Jun Zhang

Antonino Tumeo

Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

FlexACC: A Programmable Accelerator with Application-Specific ISA for Flexible Deep Neural Network Inference.

[BibT_eX]

[DOI]

Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

2020

SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads.

[BibT_eX]

[DOI]

ACM Trans. Archit. Code Optim., 2020

CHIPKIT: An Agile, Reusable Open-Source Framework for Rapid Test Chip Development.

[BibT_eX]

[DOI]

IEEE Micro, 2020

MLPerf: An Industry Standard Benchmark Suite for Machine Learning Performance.

[BibT_eX]

[DOI]

IEEE Micro, 2020

EdgeBERT: Optimizing On-Chip Inference for Multi-Task NLP.

[BibT_eX]

[DOI]

CoRR, 2020

Cheetah: Optimizations and Methods for Privacy-Preserving Inference via Homomorphic Encryption.

[BibT_eX]

[DOI]

CoRR, 2020

CHIPKIT: An agile, reusable open-source framework for rapid test chip development.

[BibT_eX]

[DOI]

CoRR, 2020

The Sky Is Not the Limit: A Visual Performance Model for Cyber-Physical Co-Design in Autonomous Machines.

[BibT_eX]

[DOI]

IEEE Comput. Archit. Lett., 2020

A 3mm² Programmable Bayesian Inference Accelerator for Unsupervised Machine Perception using Parallel Gibbs Sampling in 16nm.

[BibT_eX]

[DOI]

Proceedings of the IEEE Symposium on VLSI Circuits, 2020

A Systematic Methodology for Analysis of Deep Learning Hardware and Software Platforms.

[BibT_eX]

[DOI]

Yu Wang

Proceedings of the Third Conference on Machine Learning and Systems, 2020

MLPerf Training Benchmark.

[BibT_eX]

[DOI]

Proceedings of the Third Conference on Machine Learning and Systems, 2020

A comprehensive methodology to determine optimal coherence interfaces for many-accelerator SoCs.

[BibT_eX]

[DOI]

Proceedings of the ISLPED '20: ACM/IEEE International Symposium on Low Power Electronics and Design, 2020

DeepRecSys: A System for Optimizing End-To-End At-Scale Neural Recommendation Inference.

[BibT_eX]

[DOI]

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Cross-Stack Workload Characterization of Deep Recommendation Systems.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on Workload Characterization, 2020

SODA: a New Synthesis Infrastructure for Agile Hardware Design of Machine Learning Accelerators.

[BibT_eX]

[DOI]

Marco Minutoli

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

A Scalable Bayesian Inference Accelerator for Unsupervised Learning.

[BibT_eX]

[DOI]

Proceedings of the IEEE Hot Chips 32 Symposium, 2020

Invited: Software Defined Accelerators From Learning Tools Environment.

[BibT_eX]

[DOI]

Antonino Tumeo

Marco Minutoli

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Algorithm-Hardware Co-Design of Adaptive Floating-Point Encodings for Resilient Deep Learning Inference.

[BibT_eX]

[DOI]

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Structured Compression by Weight Encryption for Unstructured Pruning and Quantization.

[BibT_eX]

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Predicting New Workload or CPU Performance by Analyzing Public Datasets.

[BibT_eX]

[DOI]

ACM Trans. Archit. Code Optim., 2019

MEMTI: Optimizing On-Chip Nonvolatile Storage for Visual Multitask Inference at the Edge.

[BibT_eX]

[DOI]

IEEE Micro, 2019

A 16-nm Always-On DNN Processor With Adaptive Clocking and Multi-Cycle Banked SRAMs.

[BibT_eX]

[DOI]

IEEE J. Solid State Circuits, 2019

A binary-activation, multi-level weight RNN and training algorithm for processing-in-memory inference with eNVM.

[BibT_eX]

[DOI]

Siming Ma

CoRR, 2019

MLPerf Training Benchmark.

[BibT_eX]

[DOI]

CoRR, 2019

AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference.

[BibT_eX]

[DOI]

CoRR, 2019

Benchmarking TPU, GPU, and CPU Platforms for Deep Learning.

[BibT_eX]

[DOI]

Yu Wang

CoRR, 2019

Learning Low-Rank Approximation for CNNs.

[BibT_eX]

[DOI]

CoRR, 2019

Structured Compression by Unstructured Pruning for Sparse Quantized Neural Networks.

[BibT_eX]

[DOI]

CoRR, 2019

Network Pruning for Low-Rank Binary Indexing.

[BibT_eX]

[DOI]

CoRR, 2019

Determining Optimal Coherency Interface for Many-Accelerator SoCs Using Bayesian Optimization.

[BibT_eX]

[DOI]

IEEE Comput. Archit. Lett., 2019

A 16nm 25mm² SoC with a 54.5x Flexibility-Efficiency Range from Dual-Core Arm Cortex-A53 to eFPGA and Cache-Coherent Accelerators.

[BibT_eX]

[DOI]

Proceedings of the 2019 Symposium on VLSI Circuits, Kyoto, Japan, June 9-14, 2019

CHAMPVis: Comparative Hierarchical Analysis of Microarchitectural Performance.

[BibT_eX]

[DOI]

Proceedings of the IEEE/ACM International Workshop on Programming and Performance Visualization Tools, 2019

MaxNVM: Maximizing DNN Storage Density and Inference Efficiency with Sparse Encoding and Error Mitigation.

[BibT_eX]

[DOI]

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Demystifying Bayesian Inference Workloads.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2019

Accelerating Bayesian Inference on Structured Graphs Using Parallel Gibbs Sampling.

[BibT_eX]

[DOI]

Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

FlexGibbs: Reconfigurable Parallel Gibbs Sampling Accelerator for Structured Graphs.

[BibT_eX]

[DOI]

Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019

MASR: A Modular Accelerator for Sparse RNNs.

[BibT_eX]

[DOI]

Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018

Assisting High-Level Synthesis Improve SpMV Benchmark Through Dynamic Dependence Analysis.

[BibT_eX]

[DOI]

IEEE Trans. Circuits Syst. II Express Briefs, 2018

An Area-Efficient 8-Bit Single-Ended ADC With Extended Input Voltage Range.

[BibT_eX]

[DOI]

Simon Chaput

IEEE Trans. Circuits Syst. II Express Briefs, 2018

DNN Engine: A 28-nm Timing-Error Tolerant Sparse Deep Neural Network Processor for IoT Applications.

[BibT_eX]

[DOI]

IEEE J. Solid State Circuits, 2018

Cloud No Longer a Silver Bullet, Edge to the Rescue.

[BibT_eX]

[DOI]

Yuhao Zhu

CoRR, 2018

Weightless: Lossy weight encoding for deep neural network compression.

[BibT_eX]

[DOI]

Proceedings of the 35th International Conference on Machine Learning, 2018

A Wide Dynamic Range Sparse FC-DNN Processor with Multi-Cycle Banked SRAM Read and Adaptive Clocking in 16nm FinFET.

[BibT_eX]

[DOI]

Proceedings of the 44th IEEE European Solid State Circuits Conference, 2018

Ares: a framework for quantifying the resilience of deep neural networks.

[BibT_eX]

[DOI]

Proceedings of the 55th Annual Design Automation Conference, 2018

On-chip deep neural network storage with multi-level eNVM.

[BibT_eX]

[DOI]

Proceedings of the 55th Annual Design Automation Conference, 2018

2017

Deep Learning for Computer Architects

[BibT_eX]

[DOI]

Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01756-8, 2017

A 16-Core Voltage-Stacked System With Adaptive Clocking and an Integrated Switched-Capacitor DC-DC Converter.

[BibT_eX]

[DOI]

IEEE Trans. Very Large Scale Integr. Syst., 2017

Cognitive Computing Safety: The New Horizon for Reliability / The Design and Evolution of Deep Learning Workloads.

[BibT_eX]

[DOI]

IEEE Micro, 2017

A Fully Integrated Battery-Powered System-on-Chip in 40-nm CMOS for Closed-Loop Control of Insect-Scale Pico-Aerial Vehicle.

[BibT_eX]

[DOI]

Pierre-Emile J. Duhamel

IEEE J. Solid State Circuits, 2017

Automatically accelerating non-numerical programs by architecture-compiler co-design.

[BibT_eX]

[DOI]

Commun. ACM, 2017

Methods and infrastructure in the era of accelerator-centric architectures.

[BibT_eX]

[DOI]

Proceedings of the IEEE 60th International Midwest Symposium on Circuits and Systems, 2017

14.3 A 28nm SoC with a 1.2GHz 568nJ/prediction sparse deep-neural-network engine with >0.1 timing error rate tolerance for IoT applications.

[BibT_eX]

[DOI]

Proceedings of the 2017 IEEE International Solid-State Circuits Conference, 2017

21.5 A 3-to-5V input 100Vpp output 57.7mW 0.42% THD+N highly integrated piezoelectric actuator driver.

[BibT_eX]

[DOI]

Simon Chaput

Proceedings of the 2017 IEEE International Solid-State Circuits Conference, 2017

A case for efficient accelerator design space exploration via Bayesian optimization.

[BibT_eX]

[DOI]

Brandon Reagen

Parthasarathy Ranganathan

Proceedings of the 2017 IEEE/ACM International Symposium on Low Power Electronics and Design, 2017

Using dynamic dependence analysis to improve the quality of high-level synthesis designs.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on Circuits and Systems, 2017

Applications of Deep Neural Networks for Ultra Low Power IoT.

[BibT_eX]

[DOI]

Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

Very Low Voltage (VLV) Design.

[BibT_eX]

[DOI]

Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

Ivory: Early-Stage Design Space Exploration Tool for Integrated Voltage Regulators.

[BibT_eX]

[DOI]

Proceedings of the 54th Annual Design Automation Conference, 2017

Mallacc: Accelerating Memory Allocation.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

Sub-uJ deep neural networks for embedded applications.

[BibT_eX]

[DOI]

Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017

2016

Profiling a Warehouse-Scale Computer.

[BibT_eX]

[DOI]

Svilen Kanev

Juan Pablo Darago

Kim M. Hazelwood

Tipp Moseley

IEEE Micro, 2016

A Fully Integrated Reconfigurable Switched-Capacitor DC-DC Converter With Four Stacked Output Channels for Voltage Stacking Applications.

[BibT_eX]

[DOI]

IEEE J. Solid State Circuits, 2016

Co-designing accelerators and SoC interfaces using gem5-Aladdin.

[BibT_eX]

[DOI]

Yakun Sophia Shao

Sam Likun Xi

Vijayalakshmi Srinivasan

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators.

[BibT_eX]

[DOI]

Elizabeth Farrell Helbling

Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Fathom: reference workloads for modern deep learning methods.

[BibT_eX]

[DOI]

Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

2015

The Aladdin Approach to Accelerator Design and Modeling.

[BibT_eX]

[DOI]

IEEE Micro, 2015

A multi-chip system optimized for insect-scale flapping-wing robots.

[BibT_eX]

[DOI]

Proceedings of the Symposium on VLSI Circuits, 2015

A 16-core voltage-stacked system with an integrated switched-capacitor DC-DC converter.

[BibT_eX]

[DOI]

Proceedings of the Symposium on VLSI Circuits, 2015

Quantifying sources of error in McPAT and potential impacts on architectural studies.

[BibT_eX]

[DOI]

Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

A power electronics unit to drive piezoelectric actuators for flying microrobots.

[BibT_eX]

[DOI]

Mario Lok

Xuan Zhang

Proceedings of the 2015 IEEE Custom Integrated Circuits Conference, 2015

HELIX-UP: relaxing program semantics to unleash parallelization.

[BibT_eX]

[DOI]

Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

2014

ADC-Based Backplane Receiver Design-Space Exploration.

[BibT_eX]

[DOI]

Hayun Chung

IEEE Trans. Very Large Scale Integr. Syst., 2014

Evaluating Adaptive Clocking for Supply-Noise Resilience in Battery-Powered Aerial Microrobotic System-on-Chip.

[BibT_eX]

[DOI]

IEEE Trans. Circuits Syst. I Regul. Pap., 2014

Aladdin: A pre-RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures.

[BibT_eX]

[DOI]

Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

HELIX-RC: An architecture-compiler co-design for automatic parallelization of irregular programs.

[BibT_eX]

[DOI]

Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

MachSuite: Benchmarks for accelerator design and customized architectures.

[BibT_eX]

[DOI]

Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

Tradeoffs between power management and tail latency in warehouse-scale applications.

[BibT_eX]

[DOI]

Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

Multi-accelerator system development with the ShrinkFit acceleration framework.

[BibT_eX]

[DOI]

Michael J. Lyons

Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

2013

Shrink-Fit: A Framework for Flexible Accelerator Sizing.

[BibT_eX]

[DOI]

Michael J. Lyons

IEEE Comput. Archit. Lett., 2013

Characterizing and evaluating voltage noise in multi-core near-threshold processors.

[BibT_eX]

[DOI]

Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

Quantifying acceleration: Power/performance trade-offs of application kernels in hardware.

[BibT_eX]

[DOI]

Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

Supply-noise resilient adaptive clocking for battery-powered aerial microrobotic System-on-Chip in 40nm CMOS.

[BibT_eX]

[DOI]

Proceedings of the IEEE 2013 Custom Integrated Circuits Conference, 2013

A fully integrated battery-connected switched-capacitor 4:1 voltage regulator with 70% peak efficiency using bottom-plate charge recycling.

[BibT_eX]

[DOI]

Proceedings of the IEEE 2013 Custom Integrated Circuits Conference, 2013

2012

The accelerator store: A shared memory framework for accelerator-based systems.

[BibT_eX]

[DOI]

ACM Trans. Archit. Code Optim., 2012

Helix: Making the Extraction of Thread-Level Parallelism Mainstream.

[BibT_eX]

[DOI]

IEEE Micro, 2012

A Fully-Integrated 3-Level DC-DC Converter for Nanosecond-Scale DVFS.

[BibT_eX]

[DOI]

Wonyoung Kim

IEEE J. Solid State Circuits, 2012

Evaluation of voltage stacking for near-threshold multicore computing.

[BibT_eX]

[DOI]

Sae Kyu Lee

Proceedings of the International Symposium on Low Power Electronics and Design, 2012

XIOSim: power-performance modeling of mobile x86 cores.

[BibT_eX]

[DOI]

Svilen Kanev

Proceedings of the International Symposium on Low Power Electronics and Design, 2012

The HELIX project: overview and directions.

[BibT_eX]

[DOI]

Proceedings of the 49th Annual Design Automation Conference 2012, 2012

HELIX: automatic parallelization of irregular programs for chip multiprocessing.

[BibT_eX]

[DOI]

Proceedings of the 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2012

2011

Automating Design of Voltage Interpolation to Address Process Variations.

[BibT_eX]

[DOI]

IEEE Trans. Very Large Scale Integr. Syst., 2011

Voltage Noise in Production Processors.

[BibT_eX]

[DOI]

IEEE Micro, 2011

An Accelerator-Based Wireless Sensor Network Processor in 130 nm CMOS.

[BibT_eX]

[DOI]

IEEE J. Emerg. Sel. Topics Circuits Syst., 2011

A fully-integrated 3-level DC/DC converter for nanosecond-scale DVS with fast shunt regulation.

[BibT_eX]

[DOI]

Wonyoung Kim

Proceedings of the IEEE International Solid-State Circuits Conference, 2011

Hardware in the loop for optical flow sensing in a robotic bee.

[BibT_eX]

[DOI]

Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011

Achieving uniform performance and maximizing throughput in the presence of heterogeneity.

[BibT_eX]

[DOI]

Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

Area efficient phase calibration of a 1.6 GHz multiphase DLL.

[BibT_eX]

[DOI]

Ankur Agrawal

Proceedings of the 2011 IEEE Custom Integrated Circuits Conference, 2011

2010

Eliminating voltage emergencies via software-guided code transformations.

[BibT_eX]

[DOI]

ACM Trans. Archit. Code Optim., 2010

Predicting Voltage Droops Using Recurring Program and Microarchitectural Event Activity.

[BibT_eX]

[DOI]

IEEE Micro, 2010

The Accelerator Store framework for high-performance, low-power accelerator-based systems.

[BibT_eX]

[DOI]

IEEE Comput. Archit. Lett., 2010

Voltage Smoothing: Characterizing and Mitigating Voltage Noise in Production Processors via Software-Guided Thread Scheduling.

[BibT_eX]

[DOI]

Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010

Energetics of flapping-wing robotic insects: towards autonomous hovering flight.

[BibT_eX]

[DOI]

Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

2009

Revival: A Variation-Tolerant Architecture Using Voltage Interpolation and Variable Latency.

[BibT_eX]

[DOI]

Xiaoyao Liang

IEEE Micro, 2009

An 8×5 Gb/s Parallel Receiver With Collaborative Timing Recovery.

[BibT_eX]

[DOI]

IEEE J. Solid State Circuits, 2009

Tribeca: design for PVT variations with local recovery and fine-grained adaptation.

[BibT_eX]

[DOI]

Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

Place and route considerations for voltage interpolated designs.

[BibT_eX]

[DOI]

Proceedings of the 10th International Symposium on Quality of Electronic Design (ISQED 2009), 2009

Thread motion: fine-grained power management for multi-core systems.

[BibT_eX]

[DOI]

Krishna K. Rangan

Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

Milligram-scale high-voltage power electronics for piezoelectric microrobots.

[BibT_eX]

[DOI]

Michael Karpelson

Proceedings of the 2009 IEEE International Conference on Robotics and Automation, 2009

Empirical performance models for 3T1D memories.

[BibT_eX]

[DOI]

Proceedings of the 27th International Conference on Computer Design, 2009

Design and test strategies for microarchitectural post-fabrication tuning.

[BibT_eX]

[DOI]

Proceedings of the 27th International Conference on Computer Design, 2009

Voltage emergency prediction: Using signatures to reduce operating margins.

[BibT_eX]

[DOI]

Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

An event-guided approach to reducing voltage noise in processors.

[BibT_eX]

[DOI]

Proceedings of the Design, Automation and Test in Europe, 2009

Software-assisted hardware reliability: abstracting circuit-level challenges to the software stack.

[BibT_eX]

[DOI]

Proceedings of the 46th Design Automation Conference, 2009

Digital wireline and PLL techniques.

[BibT_eX]

[DOI]

Afshin Momtaz

Proceedings of the IEEE Custom Integrated Circuits Conference, 2009

Design-space exploration of backplane receivers with high-speed ADCs and digital equalization.

[BibT_eX]

[DOI]

Hayun Chung

Proceedings of the IEEE Custom Integrated Circuits Conference, 2009

An accelerator-based wireless sensor network processor in 130nm CMOS.

[BibT_eX]

[DOI]

Proceedings of the 2009 International Conference on Compilers, 2009

2008

Replacing 6T SRAMs with 3T1D DRAMs in the L1 Data Cache to Combat Process Variability.

[BibT_eX]

[DOI]

IEEE Micro, 2008

A High-Throughput Maximum a Posteriori Probability Detector.

[BibT_eX]

[DOI]

Aleksandar Kavcic

IEEE J. Solid State Circuits, 2008

A Highly Digital MDLL-Based Clock Multiplier That Leverages a Self-Scrambling Time-to-Digital Converter to Achieve Subpicosecond Jitter Performance.

[BibT_eX]

[DOI]

IEEE J. Solid State Circuits, 2008

A Wide-Tracking Range Clock and Data Recovery Circuit.

[BibT_eX]

[DOI]

Un-Ku Moon

IEEE J. Solid State Circuits, 2008

A Sub-Picosecond Resolution 0.5-1.5 GHz Digital-to-Phase Converter.

[BibT_eX]

[DOI]

IEEE J. Solid State Circuits, 2008

Survey of Hardware Systems for Wireless Sensor Networks.

[BibT_eX]

[DOI]

J. Low Power Electron., 2008

A Process-Variation-Tolerant Floating-Point Unit with Voltage Interpolation and Variable Latency.

Xiaoyao Liang

Proceedings of the 2008 IEEE International Solid-State Circuits Conference, 2008

An 8×3.2Gb/s Parallel Receiver with Collaborative Timing Recovery.

Ankur Agrawal

Proceedings of the 2008 IEEE International Solid-State Circuits Conference, 2008

Instruction-driven clock scheduling with glitch mitigation.

Proceedings of the 2008 International Symposium on Low Power Electronics and Design, 2008

Design of low-power short-distance opto-electronic transceiver front-ends with scalable supply voltages and frequencies.

Xuning Chen

Li-Shiuan Peh

Proceedings of the 2008 International Symposium on Low Power Electronics and Design, 2008

System design considerations for sensor network applications.

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2008), 2008

A review of actuation and power electronics options for flapping-wing robotic insects.

Michael Karpelson

Proceedings of the 2008 IEEE International Conference on Robotics and Automation, 2008

Evaluation of voltage interpolation to address process variations.

Kevin Brownell

Proceedings of the 2008 International Conference on Computer-Aided Design, 2008

System level analysis of fast, per-core DVFS using on-chip switching regulators.

Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008

DeCoR: A Delayed Commit and Rollback mechanism for handling inductive noise in processors.

Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008

A 12.5-Gbps, 7-bit transmit DAC with 4-tap LUT-based equalization in 0.13μm CMOS.

Hayun Chung

Andrew Liu

Proceedings of the IEEE 2008 Custom Integrated Circuits Conference, 2008

A 8×5 Gb/s source-synchronous receiver with clock generator phase error correction.

Ankur Agrawal

Proceedings of the IEEE 2008 Custom Integrated Circuits Conference, 2008

2007

Process Variation Tolerant 3T1D-Based Cache Architectures.

Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-40 2007), 2007

Towards a software approach to mitigate voltage emergencies.

Proceedings of the 2007 International Symposium on Low Power Electronics and Design, 2007

Serial Sum-Product Architecture for Low-Density Parity-Check Codes.

Erich F. Haratsch

Proceedings of the 16th International Conference on Computer Communications and Networks, 2007

A Bit-Node Centric Architecture for Low-Density Parity-Check Decoders.

Erich F. Haratsch

Proceedings of the Global Communications Conference, 2007

Understanding voltage variations in chip multiprocessors using a distributed power-delivery network.

Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

Digitally-Enhanced Phase-Locking Circuits.

Proceedings of the IEEE 2007 Custom Integrated Circuits Conference, 2007

A Comprehensive Phase-Transfer Model for Delay-Locked Loops.

Proceedings of the IEEE 2007 Custom Integrated Circuits Conference, 2007

2006

System-on-Chip Architecture Design for Intelligent Sensor Networks.

Proceedings of the Second International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2006), 2006

Adaptive-Bandwidth Mixing PLL/DLL Based Multi-Phase Clock Generator for Optimal Jitter Performance.

Amber Han-Yuan Tan

Proceedings of the IEEE 2006 Custom Integrated Circuits Conference, 2006

Phase Mismatch Detection and Compensation for PLL/DLL Based Multi-Phase Clock Generator.

Amber Han-Yuan Tan

Proceedings of the IEEE 2006 Custom Integrated Circuits Conference, 2006

Pulsenet - A Parallel Flash Sampler and Digital Processor IC for Optical SETI.

Proceedings of the IEEE 2006 Custom Integrated Circuits Conference, 2006

A 1.6Gbps Digital Clock and Data Recovery Circuit.

Proceedings of the IEEE 2006 Custom Integrated Circuits Conference, 2006

Architecture and circuit techniques for low-throughput, energy-constrained systems across technology generations.

Proceedings of the 2006 International Conference on Compilers, 2006

2005

An Ultra Low Power System Architecture for Sensor Network Applications.

Proceedings of the 32nd International Symposium on Computer Architecture (ISCA 2005), 2005

Exploring the Design Space of Power-Aware Opto-Electronic Networked Systems.

Proceedings of the 11th International Conference on High-Performance Computer Architecture (HPCA-11 2005), 2005

2004

Pipelined parallel architecture for high throughput MAP detectors.

Aleksandar Kavcic

Proceedings of the 2004 International Symposium on Circuits and Systems, 2004

Jitter in high-speed serial and parallel links.

Proceedings of the 2004 International Symposium on Circuits and Systems, 2004

A mixed PLL/DLL architecture for low jitter clock generation.

Yong-Cheol Bae

Proceedings of the 2004 International Symposium on Circuits and Systems, 2004

2003

Guest editorial.

Michael H. Perrott

IEEE Trans. Circuits Syst. II Express Briefs, 2003

Design of CMOS adaptive-bandwidth PLL/DLLs: a general approach.

Jaeha Kim

Mark A. Horowitz

IEEE Trans. Circuits Syst. II Express Briefs, 2003

Analysis of PLL clock jitter in high-speed serial links.

IEEE Trans. Circuits Syst. II Express Briefs, 2003

An adaptive PAM-4 5-Gb/s backplane transceiver in 0.25-μm CMOS.

IEEE J. Solid State Circuits, 2003

2002

An adaptive PAM-4 5 Gb/s backplane transceiver in 0.25 μm CMOS.

David A. Yokoyama-Martin

Proceedings of the IEEE 2002 Custom Integrated Circuits Conference, 2002

2000

A variable-frequency parallel I/O interface with adaptive power-supply regulation.

Jaeha Kim

Dean Liu

Stefanos Sidiropoulos

Mark A. Horowitz

IEEE J. Solid State Circuits, 2000

1999

A fully digital, energy-efficient, adaptive power-supply regulator.

Mark A. Horowitz

IEEE J. Solid State Circuits, 1999

1996

A low power switching power supply for self-clocked systems.
