Mark D. Hill

Orcid: 0000-0002-9717-5741

Affiliations:
  • University of Wisconsin-Madison, Madison, USA


According to our database, Mark D. Hill authored at least 168 papers between 1983 and 2024.

Awards

ACM Fellow 2004, "For contributions to memory consistency models and memory system design."

Bibliography

2024
Managing Memory Tiers with CXL in Virtualized Environments.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

2023
Design Tradeoffs in CXL-Based Memory Pools for Public Cloud Platforms.
IEEE Micro, 2023

Pond: CXL-Based Memory Pooling Systems for Cloud Platforms.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
First-generation Memory Disaggregation for Cloud Platforms.
CoRR, 2022

2021
A National Discovery Cloud: Preparing the US for Global Competitiveness in the New Era of 21st Century Digital Transformation.
CoRR, 2021

Advancing Computing's Foundation of US Industry & Society.
CoRR, 2021

A vision to compute like nature: thermodynamically.
Commun. ACM, 2021

Accelerator-level parallelism.
Commun. ACM, 2021

2020
A Primer on Memory Consistency and Cache Coherence, Second Edition
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01764-3, 2020

Opportunities and Challenges for Next Generation Computing.
CoRR, 2020

Nanotechnology-inspired Information Processing Systems of the Future.
CoRR, 2020

Technical perspective: Why 'correct' computers can leak your information.
Commun. ACM, 2020

MOD: Minimally Ordered Durable Datastructures for Persistent Memory.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019
On the Spectre and Meltdown Processor Security Vulnerabilities.
IEEE Micro, 2019

Reflections and Research Advice Upon Receiving the 2019 Eckert-Mauchly Award.
IEEE Micro, 2019

Thermodynamic Computing.
CoRR, 2019

Don't Persist All: Efficient Persistent Data Structures.
CoRR, 2019

Three Other Models of Computer System Performance.
CoRR, 2019

Gables: A Roofline Model for Mobile SoCs.
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

2018
Devirtualizing Memory in Heterogeneous Systems.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017
Agile Paging for Efficient Memory Virtualization.
IEEE Micro, 2017

Advanced Cyberinfrastructure for Science, Engineering, and Public Policy.
CoRR, 2017

Challenges to Keeping the Computer Industry Centered in the US.
CoRR, 2017

Democratizing Design for Future Computing Platforms.
CoRR, 2017

Retrospective on Amdahl's Law in the Multicore Era.
Computer, 2017

Crossing Guard: Mediating Host-Accelerator Coherence Interactions.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

An Analysis of Persistent Memory Use with WHISPER.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

2016
Probabilistic Directed Writebacks for Exclusive Caches.
SIGARCH Comput. Archit. News, 2016

Proprietary versus Open Instruction Sets.
IEEE Micro, 2016

Range Translations for Fast Virtual Memory.
IEEE Micro, 2016

When to use 3D Die-Stacked Memory for Bandwidth-Constrained Big Data Workloads.
CoRR, 2016

Accelerating Science: A Computing Research Agenda.
CoRR, 2016

21st Century Computer Architecture.
CoRR, 2016

Arch2030: A Vision of Computer Architecture Research over the Next 15 Years.
CoRR, 2016

Security Implications of Third-Party Accelerators.
IEEE Comput. Archit. Lett., 2016

Agile Paging: Exceeding the Best of Nested and Shadow Paging.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Energy-efficient address translation.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

2015
Implications of Emerging 3D GPU Architecture on the Scan Primitive.
SIGMOD Rec., 2015

gem5-gpu: A Heterogeneous CPU-GPU Simulator.
IEEE Comput. Archit. Lett., 2015

Border control: sandboxing accelerators.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

Redundant memory mappings for fast access to large memories.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Toward GPUs being mainstream in analytic processing: An initial argument using simple scan-aggregate queries.
Proceedings of the 11th International Workshop on Data Management on New Hardware, 2015

Synchronization Using Remote-Scope Promotion.
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

2014
BadgerTrap: a tool to instrument x86-64 TLB misses.
SIGARCH Comput. Archit. News, 2014

21st century computer architecture.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

Efficient Memory Virtualization: Reducing Dimensionality of Nested Page Walks.
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

21st century computer architecture keynote at 2014 international conference on supercomputing (ICS).
Proceedings of the 2014 International Conference on Supercomputing, 2014

Supporting x86-64 address translation for 100s of GPU lanes.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

QuickRelease: A throughput-oriented approach to release consistency on GPUs.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

Heterogeneous-race-free memory models.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014

2013
Heterogeneous system coherence for integrated CPU-GPU systems.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

Efficient virtual memory for big memory servers.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

FreshCache: Statically and dynamically exploiting dataless ways.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

Research directions for 21st century computer systems: asplos 2013 panel.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

2012
Supporting Very Large DRAM Caches with Compound-Access Scheduling and MissMap.
IEEE Micro, 2012

Why on-chip cache coherence is here to stay.
Commun. ACM, 2012

Reducing memory reference energy with opportunistic virtual caching.
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012

2011
A Primer on Memory Consistency and Cache Coherence
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01733-9, 2011

The gem5 simulator.
SIGARCH Comput. Archit. News, 2011

Efficiently enabling conventional block sizes for very large die-stacked DRAM caches.
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011

Karma: scalable deterministic record-replay.
Proceedings of the 25th International Conference on Supercomputing, Tucson, AZ, USA, May 31, 2011

Calvin: Deterministic or not? Free will to choose.
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

Safe and efficient supervised memory systems.
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

2009
Two hardware-based approaches for deterministic multiprocessor replay.
Commun. ACM, 2009

Opportunities beyond single-core microprocessors.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

StealthTest: Low Overhead Online Software Testing Using Transactional Memory.
Proceedings of the PACT 2009, 2009

2008
Is transactional memory an oxymoron?
Proc. VLDB Endow., 2008

Virtual Hierarchies.
IEEE Micro, 2008

Performance Pathologies in Hardware Transactional Memory.
IEEE Micro, 2008

Amdahl's Law in the Multicore Era.
Computer, 2008

Notary: Hardware techniques to enhance signatures.
Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-41 2008), 2008

Rerun: Exploiting Episodes for Lightweight Memory Race Recording.
Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008

TokenTM: Efficient Execution of Large Transactions with Hardware Transactional Memory.
Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008

Amdahl's Law in the multicore era.
Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008

2007
A Hardware Memory Race Recorder for Deterministic Replay.
IEEE Micro, 2007

Single-Threaded vs. Multithreaded: Where Should We Focus?
IEEE Micro, 2007

Implementing Signatures for Transactional Memory.
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-40 2007), 2007

Virtual hierarchies to support server consolidation.
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

LogTM-SE: Decoupling Hardware Transactional Memory from Caches.
Proceedings of the 13th International Conference on High-Performance Computer Architecture (HPCA-13 2007), 2007

A Case for Deconstructing Hardware Transactional Memory Systems.
Proceedings of the Programming Models for Ubiquitous Parallelism, 2-7 September 2007

2006
A Wiki for discussing and promoting best practices in research.
Commun. ACM, 2006

Coherence Ordering for Ring-based Chip Multiprocessors.
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006

LogTM: log-based transactional memory.
Proceedings of the 12th International Symposium on High-Performance Computer Architecture, 2006

A regulated transitive reduction (RTR) for longer memory race recording.
Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006

Supporting nested transactional memory in logTM.
Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006

2005
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset.
SIGARCH Comput. Archit. News, 2005

A serializability violation detector for shared-memory server programs.
Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, 2005

Improving Multiple-CMP Systems Using Token Coherence.
Proceedings of the 11th International Conference on High-Performance Computer Architecture (HPCA-11 2005), 2005

2004
Interaction cost and shotgun profiling.
ACM Trans. Archit. Code Optim., 2004

Interaction Cost: For When Event Counts Just Don't Add Up.
IEEE Micro, 2004

Using Speculation to Simplify Multiprocessor Design.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

A Future of Parallel Computer Architectures.
Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

2003
Token Coherence: A New Framework for Shared-Memory Multiprocessors.
IEEE Micro, 2003

Challenges in Computer Architecture Evaluation.
Computer, 2003

Simulating a $2M Commercial Server on a $2K PC.
Computer, 2003

Using Interaction Costs for Microarchitectural Bottleneck Analysis.
Proceedings of the 36th Annual International Symposium on Microarchitecture, 2003

A "Flight Data Recorder" for Enabling Full-System Multiprocessor Deterministic Replay.
Proceedings of the 30th International Symposium on Computer Architecture (ISCA 2003), 2003

Token Coherence: Decoupling Performance and Correctness.
Proceedings of the 30th International Symposium on Computer Architecture (ISCA 2003), 2003

Using Destination-Set Prediction to Improve the Latency/Bandwidth Tradeoff in Shared-Memory Multiprocessors.
Proceedings of the 30th International Symposium on Computer Architecture (ISCA 2003), 2003

Dynamic Verification of End-to-End Multiprocessor Invariants.
Proceedings of the 2003 International Conference on Dependable Systems and Networks (DSN 2003), 2003

2002
Data page layouts for relational databases on deep memory hierarchies.
VLDB J., 2002

Specifying and Verifying a Broadcast and a Multicast Snooping Cache Coherence Protocol.
IEEE Trans. Parallel Distributed Syst., 2002

Full-system timing-first simulation.
Proceedings of the International Conference on Measurements and Modeling of Computer Systems, 2002

SafetyNet: Improving the Availability of Shared Memory Multiprocessors with Global Checkpoint/Recovery.
Proceedings of the 29th International Symposium on Computer Architecture (ISCA 2002), 2002

Slack: Maximizing Performance Under Technological Constraints.
Proceedings of the 29th International Symposium on Computer Architecture (ISCA 2002), 2002

Bandwidth Adaptive Snooping.
Proceedings of the Eighth International Symposium on High-Performance Computer Architecture (HPCA'02), 2002

2001
Cache performance for selected SPEC CPU2000 benchmarks.
SIGARCH Comput. Archit. News, 2001

Weaving Relations for Cache Performance.
Proceedings of the VLDB 2001, 2001

Facile: A Language and Compiler for High-Performance Processor Simulators.
Proceedings of the 2001 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2001

Correctly implementing value prediction in microprocessors that support multithreading or multiprocessing.
Proceedings of the 34th Annual International Symposium on Microarchitecture, 2001

2000
Wisconsin Wind Tunnel II: a fast, portable parallel architecture simulator.
IEEE Concurr., 2000

Making Pointer-Based Data Structures Cache Conscious.
Computer, 2000

How computer architecture trends may affect future distributed systems: from infiniBand clusters to inter-processor speculation (abstract).
Proceedings of the Nineteenth Annual ACM Symposium on Principles of Distributed Computing, 2000

Timestamp snooping: an approach for extending SMPs.
Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IX), 2000

1999
DBMSs on a Modern Processor: Where Does Time Go?
Proceedings of the VLDB'99, 1999

A System-Level Specification Framework for I/O Architectures.
Proceedings of the Eleventh Annual ACM Symposium on Parallel Algorithms and Architectures, 1999

Cache-Conscious Structure Layout.
Proceedings of the 1999 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 1999

Multicast Snooping: A New Coherence Method Using a Multicast Address Network.
Proceedings of the 26th Annual International Symposium on Computer Architecture, 1999

Using Lamport Clocks to Reason about Relaxed Memory Models.
Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999

1998
Making Network Interfaces Less Peripheral.
Computer, 1998

Multiprocessors Should Support Simple Memory-Consistency Models.
Computer, 1998

Design Challenges for High-Performance Network Interfaces - Guest Editors' Introduction.
Computer, 1998

Lamport Clocks: Verifying a Directory Cache-Coherence Protocol.
Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures, 1998

Using Prediction to Accelerate Coherence Protocols.
Proceedings of the 25th Annual International Symposium on Computer Architecture, 1998

Weak Ordering - A New Definition.
Proceedings of the 25 Years of the International Symposia on Computer Architecture (Selected Papers), 1998

Retrospective: Weak Ordering - A New Definition.
Proceedings of the 25 Years of the International Symposia on Computer Architecture (Selected Papers), 1998

Address Translation Mechanisms In Network Interfaces.
Proceedings of the Fourth International Symposium on High-Performance Computer Architecture, Las Vegas, Nevada, USA, January 31, 1998

The Impact of Data Transfer and Buffering Alternatives on Network Interface Design.
Proceedings of the Fourth International Symposium on High-Performance Computer Architecture, Las Vegas, Nevada, USA, January 31, 1998

Sirocco: Cost-Effective Fine-Grain Distributed Shared Memory.
Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques, 1998

1997
Relaxed Consistency and Coherence Granularity in DSM Systems: A Performance Evaluation.
Proceedings of the Sixth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), 1997

1996
Optimistic Simulation of Parallel Architectures Using Program Executables.
Proceedings of the Tenth Workshop on Parallel and Distributed Simulation, 1996

Coherent Network Interfaces for Fine-Grain Communication.
Proceedings of the 23rd Annual International Symposium on Computer Architecture, 1996

The Tempest approach to distributed shared memory.
Proceedings of the 1996 International Conference on Computer Design (ICCD '96), 1996

1995
Cost-Effective Parallel Computing.
Computer, 1995

Where Is Software Headed? A Virtual Roundtable.
Computer, 1995

A New Page Table for 64-bit Address Spaces.
Proceedings of the Fifteenth ACM Symposium on Operating System Principles, 1995

Efficient Support for Irregular Applications on Distributed-Memory Machines.
Proceedings of the Fifth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP), 1995

Tempest: A Substrate for Portable Parallel Programs.
Proceedings of the COMPCON '95: Technologies for the Information Superhighway, 1995

1994
A Comparison of Trace-Sampling Techniques for Multi-Megabyte Caches.
IEEE Trans. Computers, 1994

The Wisconsin Wind Tunnel project: an annotated bibliography.
SIGARCH Comput. Archit. News, 1994

Application-specific protocols for user-level shared memory.
Proceedings of Supercomputing '94, 1994

An evaluation of directory protocols for medium-scale shared-memory multiprocessors.
Proceedings of the 8th international conference on Supercomputing, 1994

Surpassing the TLB Performance of Superpages with Less Operating System Support.
Proceedings of ASPLOS-VI, 1994

1993
A Unified Formalization of Four Shared-Memory Models.
IEEE Trans. Parallel Distributed Syst., 1993

Cooperative Shared Memory: Software and Hardware Support for Scalable Multiprocessors.
ACM Trans. Comput. Syst., 1993

Performance Implications of Tolerating Cache Faults.
IEEE Trans. Computers, 1993

Wisconsin Architectural Research Tool Set.
SIGARCH Comput. Archit. News, 1993

Cache performance of the SPEC92 benchmark suite.
IEEE Micro, 1993

The Wisconsin Wind Tunnel: Virtual Prototyping of Parallel Computers.
Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems, 1993

Mechanisms for Cooperative Shared Memory.
Proceedings of the 20th Annual International Symposium on Computer Architecture, 1993

1992
Page Placement Algorithms for Large Real-Indexed Caches.
ACM Trans. Comput. Syst., 1992

Programming for Different Memory Consistency Models.
J. Parallel Distributed Comput., 1992

Tradeoffs in Supporting Two Page Sizes.
Proceedings of the 19th Annual International Symposium on Computer Architecture. Gold Coast, 1992

1991
A Model for Estimating Trace-Sample Miss Ratios.
Proceedings of the 1991 ACM SIGMETRICS conference on Measurement and modeling of computer systems, 1991

Implementing Stack Simulation for Highly-Associative Memories.
Proceedings of the 1991 ACM SIGMETRICS conference on Measurement and modeling of computer systems, 1991

Detecting Data Races on Weak Memory Systems.
Proceedings of the 18th Annual International Symposium on Computer Architecture. Toronto, 1991

Comparison of Hardware and Software Cache Coherence Schemes.
Proceedings of the 18th Annual International Symposium on Computer Architecture. Toronto, 1991

1990
Cache performance of the integer SPEC benchmarks on a RISC.
SIGARCH Comput. Archit. News, 1990

What is scalability?
SIGARCH Comput. Archit. News, 1990

Cache Considerations for Multiprocessor Programmers.
Commun. ACM, 1990

Implementing Sequential Consistency in Cache-Based Systems.
Proceedings of the 1990 International Conference on Parallel Processing, 1990

1989
A VLSI chip set for a multiprocessor workstation. I. An RISC microprocessor with coprocessor interface and support for symbolic processing.
IEEE J. Solid State Circuits, December 1989

Evaluating Associativity in CPU Caches.
IEEE Trans. Computers, 1989

Inexpensive Implementations of Set-Associativity.
Proceedings of the 16th Annual International Symposium on Computer Architecture. Jerusalem, 1989

1988
A Case for Direct-Mapped Caches.
Computer, 1988

1986
An In-Cache Address Translation Mechanism.
Proceedings of the 13th Annual Symposium on Computer Architecture, Tokyo, Japan, June 1986

1984
Experimental Evaluation of On-Chip Microprocessor Cache Memories.
Proceedings of the 11th Annual Symposium on Computer Architecture, 1984

1983
Architecture of a VLSI Instruction Cache for a RISC
Proceedings of the 10th Annual Symposium on Computer Architecture, 1983
