Vivienne Sze

ORCID: 0000-0003-4841-3990

Affiliations:
  • Massachusetts Institute of Technology, MA, USA


According to our database, Vivienne Sze authored at least 108 papers between 2006 and 2024.

Bibliography

2024
GMMap: Memory-Efficient Continuous Occupancy Map Using Gaussian Mixture Model.
IEEE Trans. Robotics, 2024

DecTrain: Deciding When to Train a DNN Online.
CoRR, 2024

LoopTree: Exploring the Fused-layer Dataflow Accelerator Design Space.
CoRR, 2024

GEVO: Memory-Efficient Monocular Visual Odometry Using Gaussians.
CoRR, 2024

Modeling Analog-Digital-Converter Energy and Area for Compute-In-Memory Accelerator Design.
CoRR, 2024

Gemino: Practical and Robust Neural Compression for Video Conferencing.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024

CiMLoop: A Flexible, Accurate, and Fast Compute-In-Memory Modeling Tool.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

Architecture-Level Modeling of Photonic Deep Neural Network Accelerators.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

2023
Unleashing the Power of Deep Learning.
Commun. ACM, July, 2023

Individualized Tracking of Neurocognitive-State-Dependent Eye-Movement Features Using Mobile Devices.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., March, 2023

Data Centers on Wheels: Emissions From Computing Onboard Autonomous Vehicles.
IEEE Micro, 2023

Tailors: Accelerating Sparse Tensor Algebra by Overbooking Buffer Capacity.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

LoopTree: Enabling Exploration of Fused-layer Dataflow Accelerators.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

RAELLA: Reforming the Arithmetic for Efficient, Low-Resolution, and Low-Loss Analog PIM: No Retraining Required!
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

2022
App-Based Saccade Latency and Directional Error Determination Across the Adult Age Spectrum.
IEEE Trans. Biomed. Eng., 2022

Developing a Series of AI Challenges for the United States Department of the Air Force.
CoRR, 2022

Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Uncertainty from Motion for DNN Monocular Depth Estimation.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

Memory-Efficient Gaussian Fitting for Depth Images in Real Time.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

2021
Searching for Efficient Multi-Stage Vision Transformers.
CoRR, 2021


Session 9 Overview: ML Processors From Cloud to Edge Machine Learning Subcommittee.
Proceedings of the IEEE International Solid-State Circuits Conference, 2021

Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021

Architecture-Level Energy Estimation for Heterogeneous Computing Systems.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021

Efficient Computation of Map-scale Continuous Mutual Information on Chip in Real Time.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Domain-Specific Language Abstractions for Compression.
Proceedings of the 31st Data Compression Conference, 2021

NetAdaptV2: Efficient Neural Architecture Search With Fast Super-Network Training and Architecture Optimization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Efficient Processing of Deep Neural Networks.
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01766-7, 2020

Measuring Saccade Latency Using Smartphone Cameras.
IEEE J. Biomed. Health Informatics, 2020

Low Power Depth Estimation of Rigid Objects for Time-of-Flight Imaging.
IEEE Trans. Circuits Syst. Video Technol., 2020

FSMI: Fast computation of Shannon mutual information for information-theoretic mapping.
Int. J. Robotics Res., 2020

Freely scalable and reconfigurable optical hardware for deep learning.
CoRR, 2020

Depth Map Estimation of Dynamic Scenes Using Prior Depth Information.
CoRR, 2020

Efficient Computing for AI and Robotics.
Proceedings of the 2020 International Symposium on VLSI Design, Automation and Test, 2020

An Architecture-Level Energy and Area Estimator for Processing-In-Memory Accelerator Designs.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020

Balancing Actuation and Computing Energy in Motion Planning.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

An Efficient and Continuous Approach to Information-Theoretic Exploration.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

2019
Navion: A 2-mW Fully Integrated Real-Time Visual-Inertial Odometry Accelerator for Autonomous Navigation of Nano Drones.
IEEE J. Solid State Circuits, 2019

Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2019

Design Considerations for Efficient Deep Neural Networks on Processing-in-Memory Accelerators.
CoRR, 2019

SysML: The New Frontier of Machine Learning Systems.
CoRR, 2019

DeeperLab: Single-Shot Image Parser.
CoRR, 2019

High-Throughput Computation of Shannon Mutual Information on Chip.
Proceedings of the Robotics: Science and Systems XV, 2019

FSMI: Fast Computation of Shannon Mutual Information for Information-Theoretic Mapping.
Proceedings of the International Conference on Robotics and Automation, 2019

FastDepth: Fast Monocular Depth Estimation on Embedded Systems.
Proceedings of the International Conference on Robotics and Automation, 2019

Low Power Adaptive Time-of-Flight Imaging for Multiple Rigid Objects.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Accelergy: An Architecture-Level Energy Estimation Methodology for Accelerator Designs.
Proceedings of the International Conference on Computer-Aided Design, 2019

2018
A Fully Integrated Energy-Efficient H.265/HEVC Decoder With eDRAM for Wearable Devices.
IEEE J. Solid State Circuits, 2018

Navion: A 2mW Fully Integrated Real-Time Visual-Inertial Odometry Accelerator for Autonomous Navigation of Nano Drones.
CoRR, 2018

Eyeriss v2: A Flexible and High-Performance Accelerator for Emerging Deep Neural Networks.
CoRR, 2018

NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications.
CoRR, 2018

Navion: A Fully Integrated Energy-Efficient Visual-Inertial Odometry Accelerator for Autonomous Navigation of Nano Drones.
Proceedings of the 2018 IEEE Symposium on VLSI Circuits, 2018

EE2: Workshop on circuits for social good.
Proceedings of the 2018 IEEE International Solid-State Circuits Conference, 2018

Depth Estimation of Non-Rigid Objects for Time-Of-Flight Imaging.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Enabling Saccade Latency Measurements with Consumer-Grade Cameras.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Determination of Saccade Latency Distributions using Video Recordings from Consumer-grade Devices.
Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications.
Proceedings of the Computer Vision - ECCV 2018, 2018

Hardware for machine learning: Challenges and opportunities.
Proceedings of the 2018 IEEE Custom Integrated Circuits Conference, 2018

2017
Efficient Processing of Deep Neural Networks: A Tutorial and Survey.
Proc. IEEE, 2017

Using Dataflow to Optimize Energy Efficiency of Deep Neural Network Accelerators.
IEEE Micro, 2017

A 58.6 mW 30 Frames/s Real-Time Programmable Multiobject Detection Accelerator With Deformable Parts Models on Full HD 1920×1080 Videos.
IEEE J. Solid State Circuits, 2017

Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks.
IEEE J. Solid State Circuits, 2017

Towards Closing the Energy Gap Between HOG and CNN Features for Embedded Vision.
CoRR, 2017

Visual-Inertial Odometry on Chip: An Algorithm-and-Hardware Co-design Approach.
Proceedings of the Robotics: Science and Systems XIII, 2017

Towards closing the energy gap between HOG and CNN features for embedded vision (Invited paper).
Proceedings of the IEEE International Symposium on Circuits and Systems, 2017

Low power depth estimation for time-of-flight imaging.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

FAST: A Framework to Accelerate Super-Resolution Processing on Compressed Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

Designing Energy-Efficient Convolutional Neural Networks Using Energy-Aware Pruning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Hardware for machine learning: Challenges and opportunities.
Proceedings of the 2017 IEEE Custom Integrated Circuits Conference, 2017

A method to estimate the energy consumption of deep neural networks.
Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017

2016
An Energy-Efficient Hardware Implementation of HOG-Based Object Detection at 1080HD 60 fps with Multi-Scale Support.
J. Signal Process. Syst., 2016

Introduction to the Special Issue on HEVC Extensions and Efficient HEVC Implementations.
IEEE Trans. Circuits Syst. Video Technol., 2016

FAST: Free Adaptive Super-Resolution via Transfer for Compressed Videos.
CoRR, 2016

A 58.6mW Real-Time Programmable Object Detector with Multi-Scale Multi-Object Support Using Deformable Parts Model on 1920x1080 Video at 30fps.
CoRR, 2016

A 58.6mW real-time programmable object detector with multi-scale multi-object support using deformable parts model on 1920×1080 video at 30fps.
Proceedings of the 2016 IEEE Symposium on VLSI Circuits, 2016

14.5 Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks.
Proceedings of the 2016 IEEE International Solid-State Circuits Conference, 2016

Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

2015
A Deeply Pipelined CABAC Decoder for HEVC Supporting Level 6.2 High-Tier Applications.
IEEE Trans. Circuits Syst. Video Technol., 2015

Rotate intra block copy for still image coding.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

2014
Decoder Hardware Architecture for HEVC.
High Efficiency Video Coding (HEVC): Algorithms and Architectures, 2014

Entropy Coding in HEVC.
High Efficiency Video Coding (HEVC): Algorithms and Architectures, 2014

A 249-Mpixel/s HEVC Video-Decoder Chip for 4K Ultra-HD Applications.
IEEE J. Solid State Circuits, 2014

Energy-efficient HOG-based object detection at 1080HD 60 fps with multi-scale support.
Proceedings of the 2014 IEEE Workshop on Signal Processing Systems, 2014

Energy and area-efficient hardware implementation of HEVC inverse transform and dequantization.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

A 2014 Mbin/s deeply pipelined CABAC decoder for HEVC.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

2013
Cost and Coding Efficient Motion Estimation Design Considerations for High Efficiency Video Coding (HEVC) Standard.
IEEE J. Sel. Top. Signal Process., 2013

Core Transform Design in the High Efficiency Video Coding (HEVC) Standard.
IEEE J. Sel. Top. Signal Process., 2013

A comparison of CABAC throughput for HEVC/H.265 VS. AVC/H.264.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2013

A 249Mpixel/s HEVC video-decoder chip for Quad Full HD applications.
Proceedings of the 2013 IEEE International Solid-State Circuits Conference, 2013

2012
Joint Algorithm-Architecture Optimization of CABAC.
J. Signal Process. Syst., 2012

High Throughput CABAC Entropy Coding in HEVC.
IEEE Trans. Circuits Syst. Video Technol., 2012

A Highly Parallel and Scalable CABAC Decoder for Next Generation Video Coding.
IEEE J. Solid State Circuits, 2012

Parallelization of CABAC transform coefficient coding for HEVC.
Proceedings of the 2012 Picture Coding Symposium, 2012

Memory cost vs. coding efficiency trade-offs for HEVC motion estimation engine.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Hardware-aware motion estimation search algorithm development for high-efficiency video coding (HEVC) standard.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Unified forward+inverse transform architecture for HEVC.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

2011
HEVC ALF decode complexity analysis and reduction.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

Joint algorithm-architecture optimization of CABAC to increase speed and reduce area cost.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Parallel algorithms and architectures for low power video decoding.
PhD thesis, 2010

Technologies for Ultradynamic Voltage Scaling.
Proc. IEEE, 2010

2009
Multicore Processing and Efficient On-Chip Caching for H.264 and Future Video Decoders.
IEEE Trans. Circuits Syst. Video Technol., 2009

Low-Power Impulse UWB Architectures and Circuits.
Proc. IEEE, 2009

A 0.7-V 1.8-mW H.264/AVC 720p Video Decoder.
IEEE J. Solid State Circuits, 2009

A high throughput CABAC algorithm using syntax element partitioning.
Proceedings of the International Conference on Image Processing, 2009

2008
Parallel CABAC for low power video coding.
Proceedings of the International Conference on Image Processing, 2008

2007
A 0.4-V UWB baseband processor.
Proceedings of the 2007 International Symposium on Low Power Electronics and Design, 2007

2006
An Energy Efficient Sub-Threshold Baseband Processor Architecture for Pulsed Ultra-Wideband Communications.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

