Asit K. Mishra

Affiliations:
  • Penn State, University Park, USA


According to our database, Asit K. Mishra authored at least 46 papers between 2008 and 2021.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2021
Exploiting Activation based Gradient Output Sparsity to Accelerate Backpropagation in CNNs.
CoRR, 2021

Accelerating Sparse Deep Neural Networks.
CoRR, 2021

2019
Opportunistic computing in GPU architectures.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

2018
Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs.
CoRR, 2018

WRPN & Apprentice: Methods for Training and Inference using Low-Precision Numerics.
CoRR, 2018

WRPN: Wide Reduced-Precision Networks.
Proceedings of the 6th International Conference on Learning Representations, 2018

Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy.
Proceedings of the 6th International Conference on Learning Representations, 2018

In-Package Domain-Specific ASICs for Intel® Stratix® 10 FPGAs: A Case Study of Accelerating Deep Learning Using TensorTile ASIC.
Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

In-Package Domain-Specific ASICs for Intel® Stratix® 10 FPGAs: A Case Study of Accelerating Deep Learning Using TensorTile ASIC (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

A Customizable Matrix Multiplication Framework for the Intel HARPv2 Xeon+FPGA Platform: A Deep Learning Case Study.
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

Exploration of Low Numeric Precision Deep Learning Inference Using Intel® FPGAs: (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

Exploration of Low Numeric Precision Deep Learning Inference Using Intel® FPGAs.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

2017
Low Precision RNNs: Quantizing RNNs Without Losing Accuracy.
CoRR, 2017

WRPN: Training and Inference using Wide Reduced-Precision Networks.
CoRR, 2017

High performance binary neural networks on the Xeon+FPGA™ platform.
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

Fine-grained accelerators for sparse machine learning workloads.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

2016
ScalCore: Designing a core for voltage scalability.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Accelerating Binarized Neural Networks: Comparison of FPGA, CPU, GPU, and ASIC.
Proceedings of the 2016 International Conference on Field-Programmable Technology, 2016

Accelerating recurrent neural networks in analytics servers: Comparison of FPGA, CPU, GPU, and ASIC.
Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

Hardware accelerator for analytics of sparse data.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
A sparse matrix vector multiply accelerator for support vector machine.
Proceedings of the 2015 International Conference on Compilers, 2015

2014
Tangle: Route-oriented dynamic voltage minimization for variation-afflicted, energy-efficient on-chip networks.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

2013
Orchestrated scheduling and prefetching for GPGPUs.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

Runnemede: An architecture for Ubiquitous High-Performance Computing.
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

A heterogeneous multiple network-on-chip design: an application-aware approach.
Proceedings of the 50th Annual Design Automation Conference 2013, 2013

OWL: cooperative thread array aware scheduling techniques for improving GPGPU performance.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

2012
Cache revive: architecting volatile STT-RAM caches for enhanced performance in CMPs.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

PEPON: performance-aware hierarchical power budgeting for NoC based multicores.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

Application-aware prefetch prioritization in on-chip networks.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011
RAFT: A router architecture with frequency tuning for on-chip networks.
J. Parallel Distributed Comput., 2011

Exploiting Heterogeneity for Energy Efficiency in Chip Multiprocessors.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2011

METE: meeting end-to-end QoS in multicores through system-wide resource management.
Proceedings of the SIGMETRICS 2011, 2011

A case for heterogeneous on-chip interconnects for CMPs.
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

Architecting on-chip interconnects for stacked 3D STT-RAM caches in CMPs.
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

ACCESS: Smart scheduling for asymmetric cache CMPs.
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

An energy-efficient heterogeneous CMP based on hybrid TFET-CMOS cores.
Proceedings of the 48th Design Automation Conference, 2011

2010
Towards characterizing cloud backend workloads: insights from Google compute clusters.
SIGMETRICS Perform. Evaluation Rev., 2010

Coordinated power management of voltage islands in CMPs.
Proceedings of the SIGMETRICS 2010, 2010

CPM in CMPs: Coordinated Power Management in Chip-Multiprocessors.
Proceedings of the Conference on High Performance Computing Networking, 2010

2009
A case for integrated processor-cache partitioning in chip multiprocessors.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

A case for dynamic frequency tuning in on-chip networks.
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

Design and evaluation of a hierarchical on-chip interconnect for next-generation CMPs.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

2008
MIRA: A Multi-layered On-Chip Interconnect Router Architecture.
Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008

Detection of Arcing in Low Voltage Distribution Systems.
Proceedings of the IEEE Region 10 Colloquium and Third International Conference on Industrial and Information Systems, 2008

Performance and power optimization through data compression in Network-on-Chip architectures.
Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008

