Chen Ding

Proceedings of the SC24-W: Workshops of the International Conference for High Performance Computing, 2024

Measuring Data Access Latency in Large CPU Caches.

Proceedings of the International Symposium on Memory Systems, 2024

Implementation of a Two-Level Programmable Cache Emulation and Test System.

Proceedings of the International Symposium on Memory Systems, 2024

Parallel Loop Locality Analysis for Symbolic Thread Counts.

Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 2024

2023

Cache Programming for Scientific Loops Using Leases.

ACM Trans. Archit. Code Optim., September, 2023

DMC4ML: Data Movement Complexity for Machine Learning.

CoRR, 2023

Scalable CP Decomposition for Tensor Learning using GPU Tensor Cores.

CoRR, 2023

Memory Workload Synthesis Using Generative AI.

Proceedings of the International Symposium on Memory Systems, 2023

E=MC^2: Efficient Mobility Centric Caching.

Proceedings of the International Symposium on Memory Systems, 2023

Blast from the Past: Least Expected Use (LEU) Cache Replacement with Statistical History.

Proceedings of the 2023 ACM SIGPLAN International Symposium on Memory Management, 2023

2022

CARL: Compiler Assigned Reference Leasing.

ACM Trans. Archit. Code Optim., 2022

Cache-coherent CLAM (WIP).

Benjamin Reber

Dorin Patru

Proceedings of the LCTES '22: 23rd ACM SIGPLAN/SIGBED International Conference on Languages, 2022

Beyond time complexity: data movement complexity analysis for matrix multiplication.

Wesley Smith

Aidan Goldfarb

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

2021

Writeback Modeling: Theory and Application to Zipfian Workloads.

Wesley Smith

Daniel Byrne

Proceedings of the MEMSYS 2021: The International Symposium on Memory Systems, Washington, USA, September 27, 2021

Uniform lease vs. LRU cache: analysis and evaluation.

Proceedings of the ISMM '21: 2021 ACM SIGPLAN International Symposium on Memory Management, 2021

Measuring Cache Complexity Using Data Movement Distance (DMD).

Donovan Snyder

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

AWLCO: All-Window Length Co-Occurrence.

Proceedings of the 32nd Annual Symposium on Combinatorial Pattern Matching, 2021

2020

PLUM: static parallel program locality analysis under uniform multiplexing.

Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

CLAM: Compiler Lease of Cache Memory.

Proceedings of the MEMSYS 2020: The International Symposium on Memory Systems, 2020

2019

A Relational Theory of Locality.

ACM Trans. Archit. Code Optim., 2019

Cacheap: Portable and Collaborative I/O Optimization for Graph Processing.

J. Comput. Sci. Technol., 2019

Statistical caching for near memory management.

Proceedings of the International Symposium on Memory Systems, 2019

CLAM: Compiler Leasing of Accelerator Memory.

Dong Chen

Dorin Patru

Proceedings of the Languages and Compilers for Parallel Computing, 2019

Timescale functions for parallel memory allocation.

Proceedings of the 2019 ACM SIGPLAN International Symposium on Memory Management, 2019

Codestitcher: inter-procedural basic block layout optimization.

Rahman Lavaee

John Criswell

Proceedings of the 28th International Conference on Compiler Construction, 2019

Beating OPT with Statistical Clairvoyance and Variable Size Caching.

Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018

Fast Miss Ratio Curve Modeling for Storage Cache.

ACM Trans. Storage, 2018

A Measurement Theory of Locality.

CoRR, 2018

Locality analysis through static parallel sampling.

Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2018

Fine-grained data usage analysis by access sampling: seeing the data that is not there.

Proceedings of the International Symposium on Memory Systems, 2018

Footprint modeling of cache associativity and granularity.

Proceedings of the International Symposium on Memory Systems, 2018

Footmark: A New Formulation for Working Set Statistics.

Proceedings of the Languages and Compilers for Parallel Computing, 2018

Prediction and bounds on shared cache demand from memory access interleaving.

Proceedings of the 2018 ACM SIGPLAN International Symposium on Memory Management, 2018

PAYJIT: space-optimal JIT compilation and its practical implementation.

Proceedings of the 27th International Conference on Compiler Construction, 2018

All timescale window co-occurrence: efficient analysis and a possible use.

Proceedings of the 28th Annual International Conference on Computer Science and Software Engineering, 2018

2017

Optimal Symbiosis and Fair Scheduling in Shared Cache.

IEEE Trans. Parallel Distributed Syst., 2017

Optimizing Locality-Aware Memory Management of Key-Value Caches.

IEEE Trans. Computers, 2017

Cache Exclusivity and Sharing: Theory and Optimization.

ACM Trans. Archit. Code Optim., 2017

LD: Low-Overhead GPU Race Detection Without Access Monitoring.

ACM Trans. Archit. Code Optim., 2017

Rochester Elastic Cache Utility (RECU): Unequal Cache Sharing is Good Economics.

Int. J. Parallel Program., 2017

Thread Data Sharing in Cache: Theory and Measurement.

Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

Memory equalizer for lateral management of heterogeneous memory.

Hai Jin

Proceedings of the International Symposium on Memory Systems, 2017

Adaptive Software Caching for Efficient NVRAM Data Persistence.

Dhruva R. Chakrabarti

Liang Yuan

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

2016

Rethinking Memory Management in Modern Operating System: Horizontal, Vertical or Random?

IEEE Trans. Computers, 2016

Kinetic Modeling of Data Eviction in Cache.

Proceedings of the 2016 USENIX Annual Technical Conference, 2016

Data-centric combinatorial optimization of parallel code.

Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016

ElCached: Elastic Multi-Level Key-Value Cache.

Proceedings of the 4th Workshop on Interactions of NVM/Flash with Operating Systems and Workloads, 2016

Write Locality and Optimization for Persistent Memory.

Dong Chen

Proceedings of the Second International Symposium on Memory Systems, 2016

Replacement Policies for Heterogeneous Memories.

Jacob Brock

Proceedings of the Second International Symposium on Memory Systems, 2016

Hardware support for protective and collaborative cache sharing.

Proceedings of the 2016 ACM SIGPLAN International Symposium on Memory Management, Santa Barbara, CA, USA, June 14, 2016

Rethinking a heap hierarchy as a cache hierarchy: a higher-order theory of memory demand (HOTM).

Proceedings of the 2016 ACM SIGPLAN International Symposium on Memory Management, Santa Barbara, CA, USA, June 14, 2016

Compositional model of coherence and NUMA effects for optimizing thread and data placement.

Proceedings of the 2016 IEEE International Symposium on Performance Analysis of Systems and Software, 2016

2015

LAMA: Optimized Locality-aware Memory Allocation for Key-value Cache.

Proceedings of the 2015 USENIX Annual Technical Conference, 2015

MMC: a Many-core Memory Connection Model.

Hao Lu

Proceedings of the 2015 International Symposium on Memory Systems, 2015

Optimal Cache Partition-Sharing.

Proceedings of the 44th International Conference on Parallel Processing, 2015

Optimal Footprint Symbiosis in Shared Cache.

Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

Assessing Safe Task Parallelism in SPEC 2006 INT.

Tongxin Bai

Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

2014

Locality analysis: a nonillion time window problem.

Jacob Brock

SIGMETRICS Perform. Evaluation Rev., 2014

Performance Metrics and Models for Shared Cache.

J. Comput. Sci. Technol., 2014

Affinity-based hash tables.

Brian Gernhardt

Rahman Lavaee

Proceedings of the workshop on Memory Systems Performance and Correctness, 2014

Modeling heap data growth using average liveness.

Mohammad Fozlul Haque Bhuiyan

Proceedings of the International Symposium on Memory Management, 2014

Code Layout Optimization for Defensiveness and Politeness in Shared Cache.

Proceedings of the 43rd International Conference on Parallel Processing, 2014

13th compiler-driven performance workshop (CDP).

Proceedings of the 24th Annual International Conference on Computer Science and Software Engineering, 2014

Prioritizing and Scheduling Service Requests under Time Constraints.

Farnaz Dargahi

Chun Wang

Proceedings of the IEEE International Conference on Services Computing, SCC 2014, Anchorage, AK, USA, June 27, 2014

Protection and utilization in shared cache through rationing.

Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013

A coldness metric for cache optimization.

Raj Parihar

Michael C. Huang

Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, 2013

All-window data liveness.

Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, 2013

Cache rationing for multicore.

Jacob Brock

Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, 2013

Access Annotation for Safe Program Parallelization.

Lei Liu

Proceedings of the Network and Parallel Computing - 10th IFIP International Conference, 2013

Pacman: program-assisted cache management.

Proceedings of the International Symposium on Memory Management, 2013

Defensive loop tiling for shared cache.

Bin Bao

Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013

HOTL: a higher order theory of locality.

Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

2012

A higher order theory of locality.

Xiaoya Xiang

Proceedings of the 2012 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '12, 2012

A generalized theory of collaborative caching.

Xiaoming Gu

Proceedings of the International Symposium on Memory Management, 2012

Cache Conscious Task Regrouping on Multicore Processors.

Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

Delta Send-Recv for Dynamic Pipelining in MPI Programs.

Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

2011

All-window profiling and composable models of cache sharing.

Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011

Two examples of parallel programming without concurrency constructs (PP-CC).

Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011

Safe parallel programming using dynamic dependence hints.

Proceedings of the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2011

Parallel programming by hints.

Zachary Fletcher

Proceedings of the SPLASH'11 Workshops, 2011

Parallel programming by hints.

Proceedings of the Companion to the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2011

Waste not, want not: resource-based garbage collection in a shared environment.

Proceedings of the 10th International Symposium on Memory Management, 2011

On the theory and potential of LRU-MRU collaborative cache management.

Xiaoming Gu

Proceedings of the 10th International Symposium on Memory Management, 2011

Linear-time Modeling of Program Working Set in Shared Cache.

Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010

Continuous speculative program parallelization in software.

Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

2009

Program locality analysis using reuse distance.

ACM Trans. Program. Lang. Syst., 2009

Fastpath Speculative Parallelization.

Proceedings of the Languages and Compilers for Parallel Computing, 2009

A component model of spatial locality.

Proceedings of the 8th International Symposium on Memory Management, 2009

Fast Track: A Software System for Speculative Program Optimization.

Proceedings of the CGO 2009, 2009

2008

All-window profiling of concurrent executions.

Trishul M. Chilimbi

Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008

P-OPT: Program-Directed Optimal Cache Management.

Proceedings of the Languages and Compilers for Parallel Computing, 2008

2007

Miss Rate Prediction Across Program Inputs and Cache Configurations.

IEEE Trans. Computers, 2007

Predicting locality phases for dynamic memory optimization.

J. Parallel Distributed Comput., 2007

Locality approximation using time.

Proceedings of the 34th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2007

Software behavior oriented parallelization.

Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, 2007

A Key-based Adaptive Transactional Memory Executor.

Tongxin Bai

Chengliang Zhang

William N. Scherer III

Michael L. Scott

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Analysis of input-dependent program behavior using active profiling.

Proceedings of the Workshop on Experimental Computer Science, 2007

Quantifying the cost of context switch.

Chuanpeng Li

Kai Shen

Proceedings of the Workshop on Experimental Computer Science, 2007

Fast Track: Supporting Unsafe Optimizations with Software Speculation.

Kirk Kelsey

Chengliang Zhang

Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007

2006

A hierarchical model of data locality.

Proceedings of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2006

Program-level adaptive memory management.

Proceedings of the 5th International Symposium on Memory Management, 2006

Program phase detection and exploitation.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

2005

Parallelization of Utility Programs Based on Behavior Phase Analysis.

Proceedings of the Languages and Compilers for Parallel Computing, 2005

Lightweight reference affinity analysis.

Proceedings of the 19th Annual International Conference on Supercomputing, 2005

Gated memory control for memory monitoring, leak detection and garbage collection.

Proceedings of the 2005 workshop on Memory System Performance, 2005

2004

The Potential of Computation Regrouping for Improving Locality.

Maksim Orlovich

Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004

Array regrouping and structure splitting using whole-program reference affinity.

Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation 2004, 2004

Phase-Based Miss Rate Prediction Across Program Inputs.

Proceedings of the Languages and Compilers for High Performance Computing, 2004

Adaptive Data Partition for Sorting Using Probability Distribution.

Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

Locality phase prediction.

Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004

The Energy Impact of Aggressive Loop Fusion.

Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT 2004), 29 September, 2004

2003

Predicting whole-program locality through reuse distance analysis.

Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation 2003, 2003

A Hierarchical Model of Reference Affinity.

Proceedings of the Languages and Compilers for Parallel Computing, 2003

Miss Rate Prediction across All Program Inputs.

Steve Dropsho

Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques (PACT 2003), 27 September, 2003

2002

Compiler-directed run-time monitoring of program data access.
