Swagath Venkataramani

Orcid: 0000-0002-0470-6364

According to our database, Swagath Venkataramani authored at least 83 papers between 2012 and 2024.

Bibliography

2024
SmartQuant: CXL-based AI Model Store in Support of Runtime Configurable Weight Quantization.
CoRR, 2024

Enhance DNN Adversarial Robustness and Efficiency via Injecting Noise to Non-Essential Neurons.
CoRR, 2024


2022
OnSRAM: Efficient Inter-Node On-Chip Scratchpad Management in Deep Learning Accelerators.
ACM Trans. Embed. Comput. Syst., November, 2022

A 7-nm Four-Core Mixed-Precision AI Chip With 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling.
IEEE J. Solid State Circuits, 2022

Deep Compression of Pre-trained Transformer Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Approximate Computing and the Efficient Machine Learning Expedition.
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

2021
Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021

Efficacy of Pruning in Ultra-Low Precision DNNs.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2021


4-Bit Quantization of LSTM-Based Speech Recognition Models.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Value Similarity Extensions for Approximate Computing in General-Purpose Processors.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

2020
DyVEDeep: Dynamic Variable Effort Deep Neural Networks.
ACM Trans. Embed. Comput. Syst., 2020

Logic Synthesis of Approximate Circuits.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Efficient AI System Design With Cross-Layer Approximate Computing.
Proc. IEEE, 2020


Ultra-Low Precision 4-bit Training of Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
SparCE: Sparsity Aware General-Purpose Core Extensions to Accelerate Deep Neural Networks.
IEEE Trans. Computers, 2019

DeepTools: Compiler and Execution Runtime Extensions for RaPiD AI Accelerator.
IEEE Micro, 2019

Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Accurate and Efficient 2-bit Quantized Neural Networks.
Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019

Dynamic Spike Bundling for Energy-Efficient Spiking Neural Networks.
Proceedings of the 2019 IEEE/ACM International Symposium on Low Power Electronics and Design, 2019

Performance-driven Programming of Multi-TFLOP Deep Learning Accelerators.
Proceedings of the IEEE International Symposium on Workload Characterization, 2019

Workload-aware Automatic Parallelization for Multi-GPU DNN Training.
Proceedings of the IEEE International Conference on Acoustics, 2019

Memory and Interconnect Optimizations for Peta-Scale Deep Learning Systems.
Proceedings of the 26th IEEE International Conference on High Performance Computing, 2019

Data Subsetting: A Data-Centric Approach to Approximate Computing.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

BiScaled-DNN: Quantizing Long-tailed Datastructures with Two Scale Factors for Deep Neural Networks.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

A Compiler for Deep Neural Network Accelerators to Generate Optimized Code for a Wide Range of Data Parameters from a Hand-crafted Computation Kernel.
Proceedings of the IEEE Symposium in Low-Power and High-Speed Chips, 2019

Automatic Synthesis Techniques for Approximate Circuits.
Proceedings of the Approximate Circuits, Methodologies and CAD., 2019

Approximate Computing Techniques for Deep Neural Networks.
Proceedings of the Approximate Circuits, Methodologies and CAD., 2019

2018
Energy-Efficient Neural Computing with Approximate Multipliers.
ACM J. Emerg. Technol. Comput. Syst., 2018

Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN).
CoRR, 2018

PACT: Parameterized Clipping Activation for Quantized Neural Networks.
CoRR, 2018


Taming the beast: Programming Peta-FLOP class Deep Learning Systems.
Proceedings of the International Symposium on Low Power Electronics and Design, 2018


Exploiting approximate computing for deep learning acceleration.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Dyhard-DNN: even more DNN acceleration with dynamic hardware reconfiguration.
Proceedings of the 55th Annual Design Automation Conference, 2018

Compensated-DNN: energy efficient low-precision deep neural networks by compensating quantization errors.
Proceedings of the 55th Annual Design Automation Conference, 2018

2017
Energy-Efficient Reduce-and-Rank Using Input-Adaptive Approximations.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Energy-Efficient Object Detection Using Semantic Decomposition.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Approximate Error Detection With Stochastic Checkers.
IEEE Trans. Very Large Scale Integr. Syst., 2017

DyVEDeep: Dynamic Variable Effort Deep Neural Networks.
CoRR, 2017

A Programmable Event-driven Architecture for Evaluating Spiking Neural Networks.
Proceedings of the 2017 IEEE/ACM International Symposium on Low Power Electronics and Design, 2017

ScaleDeep: A Scalable Compute Architecture for Learning and Evaluating Deep Networks.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017


Approximate computing for spiking neural networks.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

STAxCache: An approximate, energy efficient STT-MRAM cache.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Accelerator Design for Deep Learning Training: Extended Abstract: Invited.
Proceedings of the 54th Annual Design Automation Conference, 2017

POSTER: Design Space Exploration for Performance Optimization of Deep Neural Networks on Shared Memory Accelerators.
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016
Approximate computing: An integrated cross-layer framework.
PhD thesis, 2016

EMBIRA: An Accelerator for Model-Based Iterative Reconstruction.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Approximate Computing.
Proceedings of the 29th International Conference on VLSI Design and 15th International Conference on Embedded Systems, 2016

STOCK: Stochastic Checkers for Low-overhead Approximate Error Detection.
Proceedings of the 2016 International Symposium on Low Power Electronics and Design, 2016

Multiplier-less Artificial Neurons exploiting error resiliency for energy-efficient neural computing.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Approximation through logic isolation for the design of quality configurable circuits.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Invited - Cross-layer approximations for neuromorphic computing: from devices to circuits and systems.
Proceedings of the 53rd Annual Design Automation Conference, 2016

Designing approximate circuits using clock overgating.
Proceedings of the 53rd Annual Design Automation Conference, 2016

Efficient embedded learning for IoT devices.
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

2015
Exploring Spin-Transfer-Torque Devices for Logic Applications.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2015

Object Detection using Semantic Decomposition for Energy-Efficient Neural Computing.
CoRR, 2015

Spintastic: spin-based stochastic logic for energy-efficient computing.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Computing approximately, and efficiently.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

SAPPHIRE: an always-on context-aware computer vision system for portable devices.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Quality configurable reduce-and-rank for energy efficient approximate computing.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Scalable-effort classifiers for energy-efficient machine learning.
Proceedings of the 52nd Annual Design Automation Conference, 2015

Approximate computing and the quest for computing efficiency.
Proceedings of the 52nd Annual Design Automation Conference, 2015

Approximate storage for energy efficient spintronic memories.
Proceedings of the 52nd Annual Design Automation Conference, 2015

Exploiting On-Device Image Classification for Energy Efficiency in Ambient-Aware Systems.
Proceedings of the Mobile Cloud Visual Media Computing - From Interaction to Service, 2015

2014
AxNN: energy-efficient neuromorphic systems using approximate computing.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

Variation tolerant design of a vector processor for recognition, mining and synthesis.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

StoRM: a stochastic recognition and mining processor.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

STAG: Spintronic-Tape Architecture for GPGPU cache hierarchies.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

Approximate computing for efficient information processing.
Proceedings of the 12th IEEE Symposium on Embedded Systems for Real-time Multimedia, 2014

ASLAN: Synthesis of approximate sequential circuits.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

2013
Quality programmable vector processors for approximate computing.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

Substitute-and-simplify: a unified design paradigm for approximate and quality configurable circuits.
Proceedings of the Design, Automation and Test in Europe, 2013

Relax-and-retime: a methodology for energy-efficient recovery based design.
Proceedings of the 50th Annual Design Automation Conference 2013, 2013

Approximate computing: An integrated hardware approach.
Proceedings of the 2013 Asilomar Conference on Signals, 2013

2012
SALSA: systematic logic synthesis of approximate circuits.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

