Paul Gratz

Orcid: 0000-0001-7120-7189

According to our database, Paul Gratz authored at least 92 papers between 2006 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Coherence Attacks and Countermeasures in Interposer-based Chiplet Systems.
ACM Trans. Archit. Code Optim., June, 2024

Exposing Shadow Branches.
CoRR, 2024

Correct Wrong Path.
CoRR, 2024

Aiding Microprocessor Performance Validation with Machine Learning.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

Flow Correlator: A Flow Table Cache Management Strategy.
Proceedings of the 33rd International Conference on Computer Communications and Networks, 2024

2023
KVRangeDB: Range Queries for a Hash-based Key-Value Device.
ACM Trans. Storage, August, 2023

Machine Learning for Microprocessor Performance Bug Localization.
CoRR, 2023

Last-Level Cache Insertion and Promotion Policy in the Presence of Aggressive Prefetching.
IEEE Comput. Archit. Lett., 2023

A Characterization of the Effects of Software Instruction Prefetching on an Aggressive Front-end.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

2022
Software Hint-Driven Data Management for Hybrid Memory in Mobile Systems.
ACM Trans. Embed. Comput. Syst., 2022

SIMD-Matcher: A SIMD-based Arbitrary Matching Framework.
ACM Trans. Archit. Code Optim., 2022

Reducing Minor Page Fault Overheads through Enhanced Page Walker.
ACM Trans. Archit. Code Optim., 2022

The Championship Simulator: Architectural Simulation for Education and Competition.
CoRR, 2022

Hardware Trojan Threats to Cache Coherence in Modern 2.5D Chiplet Systems.
IEEE Comput. Archit. Lett., 2022

Page Size Aware Cache Prefetching.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Composite Instruction Prefetching.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

SLAP-CC: Set-Level Adaptive Prefetching for Compressed Caches.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

Stay in your Lane: A NoC with Low-overhead Multi-packet Bypassing.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

2021
Interposer-Based Root of Trust.
CoRR, 2021

KVRAID: high performance, write efficient, update friendly erasure coding scheme for KV-SSDs.
Proceedings of the SYSTOR '21: The 14th ACM International Systems and Storage Conference, 2021

SEEC: stochastic escape express channel.
Proceedings of the International Conference for High Performance Computing, 2021

Pitstop: Enabling a Virtual Network Free Network-on-Chip.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

Automatic Microprocessor Performance Bug Detection.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

An FPGA-based Hybrid Memory Emulation System.
Proceedings of the 31st International Conference on Field-Programmable Logic and Applications, 2021

CMRC: Comprehensive Microarchitectural Register Coalescing for GPGPUs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

OpenMem: Hardware/Software Cooperative Management for Mobile Memory System.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

2020
Hardware Memory Management for Future Mobile Hybrid Memory Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

FPGA-based Hybrid Memory Emulation System.
CoRR, 2020

Virtualize and share non-volatile memories in user space.
CCF Trans. High Perform. Comput., 2020

SB-Fetch: synchronization aware hardware prefetching for chip multiprocessors.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

DRAIN: Deadlock Removal for Arbitrary Irregular Networks.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Exploiting Zero Data to Reduce Register File and Execution Unit Dynamic Power Consumption in GPGPUs.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

A Generic FPGA Accelerator for Minimum Storage Regenerating Codes.
Proceedings of the 25th Asia and South Pacific Design Automation Conference, 2020

2019
GenMatcher: A Generic Clustering-Based Arbitrary Matching Framework.
ACM Trans. Archit. Code Optim., 2019

Synchronized Progress in Interconnection Networks (SPIN): A New Theory for Deadlock Freedom.
IEEE Micro, 2019

Optimizing Post-Copy Live Migration with System-Level Checkpoint Using Fabric-Attached Memory.
Proceedings of the 2019 IEEE/ACM Workshop on Memory Centric High Performance Computing, 2019

vNVML: An Efficient User Space Library for Virtualizing and Sharing Non-Volatile Memories.
Proceedings of the 35th Symposium on Mass Storage Systems and Technologies, 2019

SWAP: Synchronized Weaving of Adjacent Packets for Network Deadlock Resolution.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Perceptron-based prefetch filtering.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

SpecLock: Speculative Lock Forwarding.
Proceedings of the 37th IEEE International Conference on Computer Design, 2019

The Best of IEEE Computer Architecture Letters in 2018.
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

2018
Guest Editorial: Emerging Technologies and Architectures for Manycore Computing Part 1: Hardware Techniques.
IEEE Trans. Multi Scale Comput. Syst., 2018

SDPR: Improving Latency and Bandwidth in On-Chip Interconnect Through Simultaneous Dual-Path Routing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

MTB-Fetch: Multithreading Aware Hardware Prefetching for Chip Multiprocessors.
IEEE Comput. Archit. Lett., 2018

2017
Speculative paging for future NVM storage.
Proceedings of the International Symposium on Memory Systems, 2017

Minimal exercise vector generation for reliability improvement.
Proceedings of the 23rd IEEE International Symposium on On-Line Testing and Robust System Design, 2017

Kill the Program Counter: Reconstructing Program Behavior in the Processor Cache Hierarchy.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

2016
GCA: Global Congestion Awareness for Load Balance in Networks-on-Chip.
IEEE Trans. Parallel Distributed Syst., 2016

Resource Sharing Centric Dynamic Voltage and Frequency Scaling for CMP Cores, Uncore, and Memory.
ACM Trans. Design Autom. Electr. Syst., 2016

Path confidence based lookahead prefetching.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

2015
Use It or Lose It: Proactive, Deterministic Longevity in Future Chip Multiprocessors.
ACM Trans. Design Autom. Electr. Syst., 2015

Wear-Aware Adaptive Routing for Networks-on-Chips.
Proceedings of the 9th International Symposium on Networks-on-Chip, 2015

Dynamic Memory Pressure Aware Ballooning.
Proceedings of the 2015 International Symposium on Memory Systems, 2015

Shared Last-Level Caches and The Case for Longer Timeslices.
Proceedings of the 2015 International Symposium on Memory Systems, 2015

Having your cake and eating it too: Energy savings without performance loss through resource sharing driven power management.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

Clotho: Proactive wearout deceleration in Chip-Multiprocessor interconnects.
Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

Energy-efficient implementations of GF(p) and GF(2^m) elliptic curve cryptography.
Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

A control-theoretic approach for energy efficient CPU-GPU subsystem in mobile platforms.
Proceedings of the 52nd Annual Design Automation Conference, 2015

Bandwidth-efficient on-chip interconnect designs for GPGPUs.
Proceedings of the 52nd Annual Design Automation Conference, 2015

2014
WaveSync: Low-Latency Source-Synchronous Bypass Network-on-Chip Architecture.
ACM Trans. Design Autom. Electr. Syst., 2014

LumiNOC: A Power-Efficient, High-Performance, Photonic Network-on-Chip.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2014

Spatial Locality Speculation to Reduce Energy in Chip-Multiprocessor Networks-on-Chip.
IEEE Trans. Computers, 2014

Towards platform level power management in mobile systems.
Proceedings of the 27th IEEE International System-on-Chip Conference, 2014

STORM: A Simple Traffic-Optimized Router Microarchitecture for Networks-on-Chip.
Proceedings of the Eighth IEEE/ACM International Symposium on Networks-on-Chip, 2014

B-Fetch: Branch Prediction Directed Prefetching for Chip-Multiprocessors.
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

The design space of ultra-low energy asymmetric cryptography.
Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software, 2014

Stochastic Pre-classification for SDN Data Plane Matching.
Proceedings of the 22nd IEEE International Conference on Network Protocols, 2014

Up by their bootstraps: Online learning in Artificial Neural Networks for CMP uncore power management.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

ILP and TLP in shared memory applications: a limit study.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013
In-network monitoring and control policy for DVFS of CMP networks-on-chip and last level caches.
ACM Trans. Design Autom. Electr. Syst., 2013

ARI: Adaptive LLC-memory traffic management.
ACM Trans. Archit. Code Optim., 2013

LumiNOC: A low-latency, high-bandwidth per Watt, photonic Network-on-Chip.
Proceedings of the ACM/IEEE International Workshop on System Level Interconnect Prediction, 2013

GCA: Global congestion awareness for load balance in Networks-on-Chip.
Proceedings of the 2013 Seventh IEEE/ACM International Symposium on Networks-on-Chip (NoCS), 2013

Use it or lose it: wear-out and lifetime in future chip multiprocessors.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

Bidirectional interconnect design for low latency high bandwidth NoC.
Proceedings of 2013 International Conference on IC Design & Technology, 2013

Power gating with block migration in chip-multiprocessor last-level caches.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

Stochastic Pre-Classification for Software Defined Firewalls.
Proceedings of the 22nd International Conference on Computer Communication and Networks, 2013

Dynamic voltage and frequency scaling for shared resources in multicore processor designs.
Proceedings of the 50th Annual Design Automation Conference 2013, 2013

2012
B-Fetch: Branch Prediction Directed Prefetching for In-Order Processors.
IEEE Comput. Archit. Lett., 2012

Exploiting path diversity for low-latency and high-bandwidth with the dual-path NoC router.
Proceedings of the 2012 IEEE International Symposium on Circuits and Systems, 2012

LumiNOC: a power-efficient, high-performance, photonic network-on-chip for future parallel architectures.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011
Asynchronous Bypass Channels for Multi-Synchronous NoCs: A Router Microarchitecture, Topology, and Routing Algorithm.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2011

AcENoCs: A Configurable HW/SW Platform for FPGA Accelerated NoC Emulation.
Proceedings of the VLSI Design 2011: 24th International Conference on VLSI Design, 2011

Reducing network-on-chip energy consumption through spatial locality speculation.
Proceedings of the NOCS 2011, 2011

2010
Leveraging Unused Cache Block Words to Reduce Power in CMP Interconnect.
IEEE Comput. Archit. Lett., 2010

Asynchronous Bypass Channels: Improving Performance for Multi-synchronous NoCs.
Proceedings of the NOCS 2010, 2010

2009
An evaluation of the TRIPS computer system.
Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, 2009

2008
Regional congestion awareness for load balance in networks-on-chip.
Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008

2007
On-Chip Interconnection Networks of the TRIPS Chip.
IEEE Micro, 2007

Implementation and Evaluation of a Dynamically Routed Processor Operand Network.
Proceedings of the First International Symposium on Networks-on-Chips, 2007

2006
Distributed Microarchitectural Protocols in the TRIPS Prototype Processor.
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006

Implementation and Evaluation of On-Chip Network Architectures.
Proceedings of the 24th International Conference on Computer Design (ICCD 2006), 2006

