Arrvindh Shriraman

Anagha Molakalmur Anil Kumar

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2024

METAL: Caching Multi-level Indexes in Domain-Specific Architectures.

Aditya Prasanna

Jonathan Balkind

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2022

OptimizedDP: An Efficient, User-friendly Library For Optimal Control and Dynamic Programming.

CoRR, 2022

X-cache: a modular architecture for domain-specific caches.

Proceedings of ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

mu-grind: A Framework for Dynamically Instrumenting HLS-Generated RTL.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021

Real-Time Hamilton-Jacobi Reachability Analysis of Autonomous System With An FPGA.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

SPAGHETTI: Streaming Accelerators for Highly Sparse GEMM on FPGAs.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

X-Layer: Building Composable Pipelined Dataflows for Low-Rank Convolutions.

Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

2020

Real-Time Formal Verification of Autonomous Systems With An FPGA.

CoRR, 2020

Safety-Guaranteed Real-Time Trajectory Planning for Underwater Vehicles in Plane-Progressive Waves.

Proceedings of the 59th IEEE Conference on Decision and Control, 2020

2019

μIR - An intermediate representation for transforming and optimizing the microarchitecture of application accelerators.

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Deepframe: A Profile-Driven Compiler for Spatial Hardware Accelerators.

Apala Guha

Naveen Vedula

Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018

Scalable distributed visual computing for line-rate video streams.

Proceedings of the 9th ACM Multimedia Systems Conference, 2018

TAPAS: Generating Parallel Accelerators from Parallel Programs.

Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

NACHOS: Software-Driven Hardware-Assisted Memory Disambiguation for Accelerators.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

2017

Needle: Leveraging Program Analysis to Analyze and Extract Accelerators from Whole Programs.

Nick Sumner

Steve Margerm

Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

2016

Chainsaw: Von-Neumann accelerators to leverage fused instruction chains.

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

SPEC-AX and PARSEC-AX: extracting accelerator benchmarks from microprocessor benchmarks.

William N. Sumner

Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

Peruse and Profit: Estimating the Accelerability of Loops.

Amirali Sharifian

Nick Sumner

Proceedings of the 2016 International Conference on Supercomputing, 2016

How to Speed Up CUDA-WSat-PcL by 5x.

Heng Liu

Evgenia Ternovska

Proceedings of the Fourth International Symposium on Computing and Networking, 2016

2015

Fusion: design tradeoffs in coherent cache hierarchies for accelerators.

Naveen Vedula

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

DASX: Hardware Accelerator for Software Data Structures.

Naveen Vedula

Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

2014

Cache Coherence for GPU Architectures.

IEEE Micro, 2014

SQRL: hardware accelerator for collecting software data structures.

Dan Lin

Jordon Phillips

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2014

Bitwise data parallelism in regular expression matching.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2014

2013

An Application-Tailored Approach to Hardware Cache Coherence.

Hongzhou Zhao

Computer, 2013

Protozoa: adaptive granularity cache coherence.

Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

Verifying safety and liveness for the FlexTM hybrid transactional memory.

Proceedings of the Design, Automation and Test in Europe, 2013

Power containers: an OS facility for fine-grained power and energy management on multicore servers.

Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

2012

Power and energy containers for multicore servers.

Proceedings of the ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, 2012

Amoeba-Cache: Adaptive Blocks for Eliminating Waste in the Memory Hierarchy.

Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012

Parabix: Boosting the efficiency of text processing on commodity processors.

Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

2011

Analyzing Conflicts in Hardware-Supported Memory Transactions.

Int. J. Parallel Program., 2011

SPATL: Honey, I Shrunk the Coherence Directory.

Hongzhou Zhao

Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010

Implementation tradeoffs in the design of flexible transactional memory support.

Michael L. Scott

J. Parallel Distributed Comput., 2010

Sentry: light-weight auxiliary memory access control.

Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

SPACE: sharing pattern-based directory coherence for multicore scalability.

Hongzhou Zhao

Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009

Tapping into Parallelism with Transactional Memory.

Michael L. Scott

Refereeing conflicts in hardware transactional memory.

Proceedings of the 23rd International Conference on Supercomputing, 2009

2008

Flexible Decoupled Transactional Memory Support.

Michael L. Scott

Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008

2007

Nonblocking transactions without indirection using alert-on-update.

SPAA 2007: Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2007

Alert-on-update: a communication aid for shared memory multiprocessors.

Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

An integrated hardware-software approach to flexible transactional memory.

Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

2006

PASCOM: Power Model for Supercomputers.

Nagarajan Venkateswaran

Niranjan Soundararajan

Proceedings of the Architecture of Computing Systems, 2006

2005

Memory In Processor-Supercomputer On a Chip: Processor Design and Execution Semantics for Massive Single-Chip Performance.

Nagarajan Venkateswaran