Sreenivas Subramoney
ORCID: 0000-0001-5372-0173
According to our database, Sreenivas Subramoney authored at least 65 papers between 2000 and 2024.
Bibliography
2024
ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-efficient Genome Analysis.
ACM Trans. Archit. Code Optim., March, 2024
CiMNet: Towards Joint Optimization for DNN Architecture and Configuration for Compute-In-Memory Hardware.
CoRR, 2024
Proceedings of the 2024 USENIX Annual Technical Conference, 2024
Constable: Improving Performance and Power Efficiency by Safely Eliminating Load Instruction Execution.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
2023
Enhanced regularization for on-chip training using analog and temporary memory weights.
Neural Networks, August, 2023
CoRR, 2023
Reclaimer: A Reinforcement Learning Approach to Dynamic Resource Allocation for Cloud Microservices.
CoRR, 2023
VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
2022
A Unified Programmable Edge Matrix Processor for Deep Neural Networks and Matrix Algebra.
ACM Trans. Embed. Comput. Syst., September, 2022
IEEE Trans. Neural Networks Learn. Syst., 2022
CoRR, 2022
Disrupting Low-Write-Energy vs. Fast-Read Dilemma in RRAM to Enable L1 Instruction Cache.
Proceedings of the VLSI Design and Test - 26th International Symposium, 2022
Speculative Code Compaction: Eliminating Dead Code via Speculative Microcode Transformations.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
SeGraM: a universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
CoRR, 2021
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
Proceedings of the ISMM '21: 2021 ACM SIGPLAN International Symposium on Memory Management, 2021
REDUCT: Keep it Close, Keep it Cool! : Efficient Scaling of DNN Inference on Multi-core CPUs with Near-Cache Compute.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Proceedings of the 29th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2021
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021
2020
ACM Trans. Embed. Comput. Syst., 2020
Proximu: Efficiently Scaling DNN Inference in Multi-core CPUs through Near-Cache Compute.
CoRR, 2020
Look-Up Table based Energy Efficient Processing in Cache Support for Neural Network Acceleration.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
Characterization of Data Generating Neural Network Applications on x86 CPU Architecture.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020
Proceedings of the IEEE International Conference on Image Processing, 2020
PSB-RNN: A Processing-in-Memory Systolic Array Architecture using Block Circulant Matrices for Recurrent Neural Networks.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020
2019
Towards the adoption of Local Branch Predictors in Modern Out-of-Order Superscalar Processors.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
Bandwidth-Aware Last-Level Caching: Efficiently Coordinating Off-Chip Read and Write Bandwidth.
Proceedings of the 37th IEEE International Conference on Computer Design, 2019
Visual Inertial Odometry At the Edge: A Hardware-Software Co-design Approach for Ultra-low Latency and Power.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019
2018
Proceedings of the International Symposium on Memory Systems, 2018
Criticality Aware Tiered Cache Hierarchy: A Fundamental Relook at Multi-Level Cache Hierarchies.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018
Density Tradeoffs of Non-Volatile Memory as a Replacement for SRAM Based Last Level Cache.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018
Closed yet open DRAM: achieving low latency and high performance in DRAM memory systems.
Proceedings of the 55th Annual Design Automation Conference, 2018
2017
Cooperative Multi-Agent Reinforcement Learning-Based Co-optimization of Cores, Caches, and On-chip Network.
ACM Trans. Archit. Code Optim., 2017
ACM Trans. Archit. Code Optim., 2017
Near-Optimal Access Partitioning for Memory Hierarchies with Multiple Heterogeneous Bandwidth Sources.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017
A coordinated multi-agent reinforcement learning approach to multi-level cache co-partitioning.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017
2016
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016
Machine Learned Machines: Adaptive co-optimization of caches, cores, and On-chip Network.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016
2014
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014
2013
Efficient management of last-level caches in graphics processors for 3D scene rendering workloads.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013
2012
Introducing hierarchy-awareness in replacement and bypass algorithms for last-level caches.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
2011
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011
2004
Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation 2004, 2004
2000
Proceedings of the ISMM 2000, 2000