Onur Mutlu
ORCID: 0000-0002-0075-2312
Affiliations:
- ETH Zurich
- Carnegie Mellon University, Pittsburgh, USA
According to our database, Onur Mutlu authored at least 554 papers between 2003 and 2024.
Awards
ACM Fellow 2017, "For contributions to computer architecture research, especially in memory systems".
Links
Online presence:
- on zbmath.org
- on orcid.org
- on ece.cmu.edu
Bibliography
2024
IEEE Trans. Computers, September, 2024
Sectored DRAM: A Practical Energy-Efficient and High-Performance Fine-Grained DRAM Architecture.
ACM Trans. Archit. Code Optim., September, 2024
ACM Trans. Archit. Code Optim., September, 2024
IEEE Trans. Computers, May, 2024
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., March, 2024
ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-efficient Genome Analysis.
ACM Trans. Archit. Code Optim., March, 2024
Hardware Acceleration for Knowledge Graph Processing: Challenges & Recent Developments.
CoRR, 2024
Understanding the Security Benefits and Overheads of Emerging Industry Solutions to DRAM Read Disturbance.
CoRR, 2024
Leveraging Adversarial Detection to Enable Scalable and Low Overhead RowHammer Mitigations.
CoRR, 2024
Amplifying Main Memory-Based Timing Covert and Side Channels using Processing-in-Memory Operations.
CoRR, 2024
Analysis of Distributed Optimization Algorithms on a Real Processing-In-Memory System.
CoRR, 2024
Virtuoso: An Open-Source, Comprehensive and Modular Simulation Framework for Virtual Memory Research.
CoRR, 2024
PUMA: Efficient and Low-Cost Memory Allocation and Alignment Support for Processing-Using-Memory Architectures.
CoRR, 2024
MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Processing.
CoRR, 2024
CoRR, 2024
CoRR, 2024
Material modeling and recent findings in transcatheter aortic valve implantation simulations.
Comput. Methods Programs Biomed., 2024
Address Scaling: Architectural Support for Fine-Grained Thread-Safe Metadata Management.
IEEE Comput. Archit. Lett., 2024
IEEE Comput. Archit. Lett., 2024
SpyHammer: Understanding and Exploiting RowHammer Under Fine-Grained Temperature Variations.
IEEE Access, 2024
IEEE Access, 2024
ABACuS: All-Bank Activation Counters for Scalable and Low Overhead RowHammer Mitigation.
Proceedings of the 33rd USENIX Security Symposium, 2024
Self-Managing DRAM: A Low-Cost Framework for Enabling Autonomous and Efficient DRAM Maintenance Operations.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024
BreakHammer: Enhancing RowHammer Mitigations by Carefully Throttling Suspect Threads.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024
SwiftRL: Towards Efficient Reinforcement Learning on Real Processing-In-Memory Systems.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024
QUETZAL: Vector Acceleration Framework for Modern Genome Sequence Analysis Algorithms.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
MegIS: High-Performance, Energy-Efficient, and Low-Cost Metagenomic Analysis with In-Storage Processing.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
Constable: Improving Performance and Power Efficiency by Safely Eliminating Load Instruction Execution.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
Functionally-Complete Boolean Logic in Real DRAM Chips: Experimental Characterization and Analysis.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Spatial Variation-Aware Read Disturbance Defenses: Experimental Analysis of Real DRAM Chips and Implications on Future Solutions.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Computing.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Simultaneous Many-Row Activation in Off-the-Shelf DRAM Chips: Experimental Characterization and Analysis.
Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2024
Read Disturbance in High Bandwidth Memory: A Detailed Experimental Study on HBM2 DRAM Chips.
Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2024
An Experimental Characterization of Combined RowHammer and RowPress Read Disturbance in Modern DRAM Chips.
Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2024
AERO: Adaptive Erase Operation for Improving Lifetime and Performance of Modern NAND Flash-Based SSDs.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
PIM-Opt: Demystifying Distributed Optimization Algorithms on a Real-World Processing-In-Memory System.
Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 2024
2023
DRAM Bender: An Extensible and Versatile FPGA-Based Infrastructure to Easily Test State-of-the-Art DRAM Chips.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., December, 2023
IEEE Trans. Computers, October, 2023
Scrooge: a fast and memory-frugal genomic sequence aligner for CPUs, GPUs, and ASICs.
Bioinform., May, 2023
A framework for high-throughput sequence alignment using real processing-in-memory systems.
Bioinform., May, 2023
ACM Trans. Archit. Code Optim., March, 2023
How does hemodynamics affect rupture tissue mechanics in abdominal aortic aneurysm: Focus on wall shear stress derived parameters, time-averaged wall shear stress, oscillatory shear index, endothelial cell activation potential, and relative residence time.
Comput. Biol. Medicine, March, 2023
IEEE Trans. Emerg. Top. Comput., 2023
ACM SIGOPS Oper. Syst. Rev., 2023
Eng. Appl. Artif. Intell., 2023
PULSAR: Simultaneous Many-Row Activation for Reliable and High-Performance Computing in Off-the-Shelf DRAM Chips.
CoRR, 2023
CoRR, 2023
MetaFast: Enabling Fast Metagenomic Classification via Seed Counting and Edit Distance Approximation.
CoRR, 2023
SequenceLab: A Comprehensive Benchmark of Computational Methods for Comparing Genomic Sequences.
CoRR, 2023
CoRR, 2023
Understanding Read Disturbance in High Bandwidth Memory: An Experimental Analysis of Real HBM2 DRAM Chips.
CoRR, 2023
CoRR, 2023
Retrospective: Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors.
CoRR, 2023
Retrospective: An Experimental Study of Data Retention Behavior in Modern DRAM Devices: Implications for Retention Time Profiling Mechanisms.
CoRR, 2023
Retrospective: A Scalable Processing-in-Memory Accelerator for Parallel Graph Processing.
CoRR, 2023
TransPimLib: A Library for Efficient Transcendental Functions on Processing-in-Memory Systems.
CoRR, 2023
RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs.
CoRR, 2023
RawHash: enabling fast and accurate real-time analysis of raw nanopore signals for large genomes.
Bioinform., 2023
Extending Memory Capacity in Modern Consumer Systems With Emerging Non-Volatile Memory: Experimental Analysis and Characterization Using the Intel Optane SSD.
IEEE Access, 2023
IEEE Access, 2023
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023
Swordfish: A Framework for Evaluating Deep Neural Network-based Basecalling using Computation-In-Memory with Non-Ideal Memristors.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Victima: Drastically Increasing Address Translation Reach by Leveraging Underutilized Cache Resources.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Utopia: Fast and Efficient Address Translation via Hybrid Restrictive & Flexible Virtual-to-Physical Address Mappings.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023
Venice: Improving Solid-State Drive Parallelism at Low Cost via Conflict-Free Accesses.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
Proceedings of the IEEE International Symposium on Workload Characterization, 2023
SPARTA: Spatial Acceleration for Efficient and Scalable Horizontal Diffusion Weather Stencil Computation.
Proceedings of the 37th International Conference on Supercomputing, 2023
Proceedings of the 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2023
Proceedings of the 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2023
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023
Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023
2022
Dataset, July, 2022
ACM Trans. Reconfigurable Technol. Syst., 2022
POCLib: A High-Performance Framework for Enabling Near Orthogonal Processing on Compression.
IEEE Trans. Parallel Distributed Syst., 2022
IEEE Trans. Parallel Distributed Syst., 2022
CoPA: Cold Page Awakening to Overcome Retention Failures in STT-MRAM Based I/O Buffers.
IEEE Trans. Parallel Distributed Syst., 2022
MetaSys: A Practical Open-source Metadata Management System to Implement and Evaluate Cross-layer Optimizations.
ACM Trans. Archit. Code Optim., 2022
SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures.
Proc. ACM Meas. Anal. Comput. Syst., 2022
Accelerating Neural Network Inference With Processing-in-DRAM: From the Edge to the Cloud.
IEEE Micro, 2022
J. Parallel Distributed Comput., 2022
IEEE Des. Test, 2022
TargetCall: Eliminating the Wasted Computation in Basecalling via Pre-Basecalling Filtering.
CoRR, 2022
Utopia: Efficient Address Translation using Hybrid Virtual-to-Physical Address Mapping.
CoRR, 2022
CoRR, 2022
NEON: Enabling Efficient Support for Nonlinear Operations in Resistive RAM-based Neural Network Accelerators.
CoRR, 2022
CoRR, 2022
CoRR, 2022
RevaMp3D: Architecting the Processor Core and Cache Hierarchy for Systems with Monolithically-Integrated Logic and Memory.
CoRR, 2022
Sectored DRAM: An Energy-Efficient High-Throughput and Practical Fine-Grained DRAM Architecture.
CoRR, 2022
A Case for Self-Managing DRAM Chips: Improving Performance, Efficiency, Reliability, and Security via Autonomous in-DRAM Maintenance Operations.
CoRR, 2022
An Experimental Evaluation of Machine Learning Training on a Real Processing-in-Memory System.
CoRR, 2022
COVIDHunter: COVID-19 pandemic wave prediction and mitigation via seasonality-aware modeling.
CoRR, 2022
Going From Molecules to Genomic Variations to Scientific Discovery: Intelligent Algorithms and Architectures for Intelligent Genome Analysis.
CoRR, 2022
Enabling High-Performance and Energy-Efficient Hybrid Transactional/Analytical Databases with Hardware/Software Cooperation.
CoRR, 2022
Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems.
CoRR, 2022
Packaging, containerization, and virtualization of computational omics methods: Advances, challenges, and opportunities.
CoRR, 2022
DeepSketch: A New Machine Learning-Based Reference Search Technique for Post-Deduplication Delta Compression.
CoRR, 2022
GenStore: A High-Performance and Energy-Efficient In-Storage Computing System for Genome Sequence Analysis.
CoRR, 2022
SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems.
CoRR, 2022
Bioinform., 2022
Bioinform., 2022
Demeter: A Fast and Energy-Efficient Food Profiler Using Hyperdimensional Computing in Memory.
IEEE Access, 2022
Benchmarking a New Paradigm: Experimental Analysis and Characterization of a Real Processing-in-Memory System.
IEEE Access, 2022
Adv. Comput., 2022
Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures.
Proceedings of the SIGMETRICS/PERFORMANCE '22: ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, Mumbai, India, June 6, 2022
ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations.
Proceedings of the SC22: International Conference for High Performance Computing, 2022
HiRA: Hidden Row Activation for Reducing Refresh Latency of Off-the-Shelf DRAM Chips.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
GenPIP: In-Memory Acceleration of Genome Analysis via Tight Integration of Basecalling and Read Mapping.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
AgileWatts: An Energy-Efficient CPU Core Idle-State Architecture for Latency-Sensitive Server Applications.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Morpheus: Extending the Last Level Cache Capacity in GPU Systems Using Idle GPU Core Resources.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Hermes: Accelerating Long-Latency Load Requests via Perceptron-Based Off-Chip Load Prediction.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Flash-Cosmos: In-Flash Bulk Bitwise Operations Using Inherent Computation Capability of NAND Flash Memory.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Methodologies, Workloads, and Tools for Processing-in-Memory: Enabling the Adoption of Data-Centric Architectures.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022
Heterogeneous Data-Centric Architectures for Modern Data-Intensive Applications: Case Studies in Machine Learning and Databases.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022
PiDRAM: An FPGA-based Framework for End-to-end Evaluation of Processing-in-DRAM Techniques.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022
GenStore: In-Storage Filtering of Genomic Data for High-Performance and Energy-Efficient Genome Analysis.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022
SparseP: Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022
SeGraM: a universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Sibyl: adaptive and extensible data placement in hybrid storage systems using online reinforcement learning.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022
High-throughput Pairwise Alignment with the Wavefront Algorithm using Processing-in-Memory.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022
Polynesia: Enabling High-Performance and Energy-Efficient Hybrid Transactional/Analytical Databases with Hardware/Software Co-Design.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022
LEAPER: Fast and Accurate FPGA-based System Performance Prediction via Transfer Learning.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022
DarkGates: A Hybrid Power-Gating Architecture to Mitigate the Performance Impact of Dark-Silicon in High Performance Processors.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022
DeepSketch: A New Machine Learning-Based Reference Search Technique for Post-Deduplication Delta Compression.
Proceedings of the 20th USENIX Conference on File and Storage Technologies, 2022
Understanding RowHammer Under Reduced Wordline Voltage: An Experimental Study Using Real DRAM Devices.
Proceedings of the 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2022
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022
GenStore: a high-performance in-storage processing system for genome sequence analysis.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022
2021
IEEE Trans. Parallel Distributed Syst., 2021
Unified Holistic Memory Management Supporting Multiple Big Data Processing Frameworks over Hybrid Memories.
ACM Trans. Comput. Syst., 2021
Refresh Triggered Computation: Improving the Energy Efficiency of Convolutional Neural Network Accelerators.
ACM Trans. Archit. Code Optim., 2021
GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra.
Proc. VLDB Endow., 2021
IEEE Micro, 2021
GenShare: Sharing Accurate Differentially-Private Statistics for Genomic Datasets with Dependent Tuples.
CoRR, 2021
CoRR, 2021
Extending Memory Capacity in Consumer Devices with Emerging Non-Volatile Memory: An Experimental Study.
CoRR, 2021
A Deeper Look into RowHammer's Sensitivities: Experimental Analysis of Real DRAM Chips and Implications on Future Attacks and Defenses.
CoRR, 2021
CoRR, 2021
CoRR, 2021
CoRR, 2021
Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture.
CoRR, 2021
pLUTo: In-DRAM Lookup Tables to Enable Massively Parallel General-Purpose Computation.
CoRR, 2021
SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems.
CoRR, 2021
BurstLink: Techniques for Energy-Efficient Conventional and Virtual Reality Video Display.
CoRR, 2021
GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra.
CoRR, 2021
Polynesia: Enabling Effective Hybrid Transactional/Analytical Databases with Specialized Hardware/Software Co-Design.
CoRR, 2021
Mitigating Edge Machine Learning Inference Bottlenecks: An Empirical Study on Accelerating Google Edge Models.
CoRR, 2021
COVIDHunter: An Accurate, Flexible, and Environment-Aware Open-Source COVID-19 Outbreak Simulation Model.
CoRR, 2021
SneakySnake: a fast and accurate universal genome pre-alignment filter for CPUs, GPUs and FPGAs.
Bioinform., 2021
DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks.
IEEE Access, 2021
HARP: Practically and Effectively Identifying Uncorrectable Errors in Memory Chips That Use On-Die Error-Correcting Codes.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
Uncovering In-DRAM RowHammer Protection Mechanisms: A New Methodology, Custom RowHammer Patterns, and Implications.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
BurstLink: Techniques for Energy-Efficient Video Display for Conventional and Virtual Reality Systems.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
A Deeper Look into RowHammer's Sensitivities: Experimental Analysis of Real DRAM Chips and Implications on Future Attacks and Defenses.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
QUAC-TRNG: High-Throughput True Random Number Generation Using Quadruple Row Activation in Commodity DRAM Chips.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
IChannels: Exploiting Current Management Mechanisms to Create Covert Channels in Modern Processors.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
CODIC: A Low-Cost Substrate for Enabling Custom In-DRAM Functionalities and Optimizations.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Energy-Efficient Mobile Robot Control via Run-time Monitoring of Environmental Complexity and Computing Workload.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021
BlockHammer: Preventing RowHammer at Low Cost by Blacklisting Rapidly-Accessed DRAM Rows.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021
Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-In-Memory Hardware.
Proceedings of the 12th International Green and Sustainable Computing Workshops, 2021
Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021
Understanding Power Consumption and Reliability of High-Bandwidth Memory with Voltage Underscaling.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021
Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021
Google Neural Network Models for Edge Devices: Analyzing and Mitigating Machine Learning Inference Bottlenecks.
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021
2020
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020
IEEE Internet Comput., 2020
IEEE Des. Test, 2020
Robust Machine Learning Systems: Challenges, Current Trends, Perspectives, and the Road Ahead.
IEEE Des. Test, 2020
Enabling High-Capacity, Latency-Tolerant, and Highly-Concurrent GPU Register Files via Software/Hardware Cooperation.
CoRR, 2020
CoRR, 2020
BioDynaMo: an agent-based simulation platform for scalable computational biology research.
CoRR, 2020
Accelerating B-spline interpolation on GPUs: Application to medical image registration.
Comput. Methods Programs Biomed., 2020
IEEE Comput. Archit. Lett., 2020
Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm.
Bioinform., 2020
Proceedings of the 2020 International Symposium on VLSI Design, Automation and Test, 2020
Proceedings of the 2020 IEEE Symposium on Security and Privacy, 2020
Proceedings of the 2020 IEEE Symposium on Security and Privacy, 2020
Proceedings of the 32nd IEEE International Symposium on Computer Architecture and High Performance Computing, 2020
FIGARO: Improving System Performance via Fine-Grained In-DRAM Data Relocation and Caching.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
Bit-Exact ECC Recovery (BEER): Determining DRAM On-Die ECC Functions by Exploiting DRAM Data Retention Characteristics.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
FlexWatts: A Power- and Workload-Aware Hybrid Power Delivery Network for Energy-Efficient Microprocessors.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
Proceedings of the ISMM '20: 2020 ACM SIGPLAN International Symposium on Memory Management, 2020
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020
Revisiting RowHammer: An Experimental Analysis of Modern DRAM Devices and Mitigation Techniques.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020
The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020
SysScale: Exploiting Multi-domain Dynamic Voltage and Frequency Scaling for Energy Efficient Mobile Processors.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020
Proceedings of the 37th International Conference on Machine Learning, 2020
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020
WoLFRaM: Enhancing Wear-Leveling and Fault Tolerance in Resistive Memories using Programmable Address Decoders.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020
Proceedings of the 38th IEEE International Conference on Computer Design, 2020
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020
NERO: A Near High-Bandwidth Memory Stencil Accelerator for Weather Prediction Modeling.
Proceedings of the 30th International Conference on Field-Programmable Logic and Applications, 2020
Boyi: A Systematic Framework for Automatically Deciding the Right Execution Model of OpenCL Applications on FPGAs.
Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020
An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration.
Proceedings of the 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2020
Evanesco: Architectural Support for Efficient Data Sanitization in Modern Flash-Based Storage Systems.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020
2019
ACM Trans. Comput. Syst., 2019
ACM Trans. Embed. Comput. Syst., 2019
An Analytical Model for Performance and Lifetime Estimation of Hybrid DRAM-NVM Main Memories.
IEEE Trans. Computers, 2019
ACM Trans. Archit. Code Optim., 2019
AVPP: Address-first Value-next Predictor with Value Prefetching for Improving the Efficiency of Load Value Prediction.
ACM Trans. Archit. Code Optim., 2019
Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning.
Proc. VLDB Endow., 2019
Proc. ACM Meas. Anal. Comput. Syst., 2019
Microprocess. Microsystems, 2019
AirLift: A Fast and Comprehensive Technique for Remapping Alignments between Reference Genomes.
CoRR, 2019
CoRR, 2019
Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning (Technical Report).
CoRR, 2019
Understanding the Interactions of Workloads and DRAM Types: A Comprehensive Experimental Study.
CoRR, 2019
Bioinform., 2019
Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions.
Briefings Bioinform., 2019
Analysis and Modeling of Collaborative Execution Strategies for Heterogeneous CPU-FPGA Architectures.
Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, 2019
Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2019
Binary Star: Coordinated Reliability in Heterogeneous Memory Systems for High Performance and Scalability.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
EDEN: Enabling Energy-Efficient, High-Performance Deep Neural Network Inference Using Approximate DRAM.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
SMASH: Co-designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
CROW: a low-cost substrate for improving DRAM performance, energy efficiency, and reliability.
Proceedings of the 46th International Symposium on Computer Architecture, 2019
Proceedings of the 46th International Symposium on Computer Architecture, 2019
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019
D-RaNGe: Using Commodity DRAM Devices to Generate True Random Numbers with Low Latency and High Throughput.
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019
Proceedings of the Workshop on Hot Topics in Operating Systems, 2019
Understanding and Modeling On-Die Error Correction in Modern DRAM: An Experimental Study Using Real Devices.
Proceedings of the 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2019
NAPEL: Near-Memory Computing Application Performance Prediction via Ensemble Learning.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Proceedings of the Constructive Side-Channel Analysis and Secure Design, 2019
Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019
Proceedings of the Approximate Circuits, Methodologies and CAD., 2019
2018
Mosaic: Enabling Application-Transparent Support for Multiple Page Sizes in Throughput Processors.
ACM SIGOPS Oper. Syst. Rev., 2018
Efficient Document Analytics on Compressed Data: Method, Challenges, Algorithms, Insights.
Proc. VLDB Endow., 2018
Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation.
Proc. ACM Meas. Anal. Comput. Syst., 2018
What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study.
Proc. ACM Meas. Anal. Comput. Syst., 2018
ECI-Cache: A High-Endurance and Cost-Efficient I/O Caching Scheme for Virtualized Platforms.
Proc. ACM Meas. Anal. Comput. Syst., 2018
Enabling Efficient RDMA-based Synchronous Mirroring of Persistent Memory Transactions.
CoRR, 2018
D-RaNGe: Violating DRAM Timing Constraints for High-Throughput True Random Number Generation using Commodity DRAM Devices.
CoRR, 2018
Techniques for Efficiently Handling Power Surges in Fuel Cell Powered Data Centers: Modeling, Analysis, Results.
CoRR, 2018
Recent Advances in Overcoming Bottlenecks in Memory Systems and Managing Memory Resources in GPU Systems.
CoRR, 2018
Predictable Performance and Fairness Through Accurate Slowdown Estimation in Shared Main Memory Systems.
CoRR, 2018
CoRR, 2018
Characterizing, Exploiting, and Mitigating Vulnerabilities in MLC NAND Flash Memory Programming.
CoRR, 2018
CoRR, 2018
LISA: Increasing Internal Connectivity in DRAM for Fast Data Movement and Low Latency.
CoRR, 2018
Voltron: Understanding and Exploiting the Voltage-Latency-Reliability Trade-Offs in Modern DRAM Chips to Improve Energy Efficiency.
CoRR, 2018
Flexible-Latency DRAM: Understanding and Exploiting Latency Variation in Modern DRAM Chips.
CoRR, 2018
CoRR, 2018
Experimental Characterization, Optimization, and Recovery of Data Retention Errors in MLC NAND Flash Memory.
CoRR, 2018
Decoupling GPU Programming Models from Resource Management for Enhanced Programming Ease, Portability, and Performance.
CoRR, 2018
CoRR, 2018
Mosaic: An Application-Transparent Hardware-Software Cooperative Memory Manager for GPUs.
CoRR, 2018
High-Performance and Energy-Efficient Memory Scheduler Design for Heterogeneous Systems.
CoRR, 2018
CoRR, 2018
Holistic Management of the GPGPU Memory Hierarchy to Manage Warp-level Latency Tolerance.
CoRR, 2018
Zorua: Enhancing Programming Ease, Portability, and Performance in GPUs by Decoupling Programming Models from Resource Management.
CoRR, 2018
Enabling the Adoption of Processing-in-Memory: Challenges, Mechanisms, Future Research Directions.
CoRR, 2018
GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies.
BMC Genom., 2018
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018
Processing data where it makes sense in modern computing systems: Enabling in-memory computation.
Proceedings of the 7th Mediterranean Conference on Embedded Computing, 2018
A Case for Richer Cross-Layer Abstractions: Bridging the Semantic Gap with Expressive Memory.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018
The Locality Descriptor: A Holistic Cross-Layer Abstraction to Express Data Locality In GPUs.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018
Proceedings of the Internet Measurement Conference 2018, 2018
Zwift: A Programming Framework for High Performance Text Analytics on Compressed Data.
Proceedings of the 32nd International Conference on Supercomputing, 2018
Solar-DRAM: Reducing DRAM Access Latency by Exploiting the Variation in Local Bitlines.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018
HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018
The DRAM Latency PUF: Quickly Evaluating Physical Unclonable Functions by Exploiting the Latency-Reliability Tradeoff in Modern Commodity DRAM Devices.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018
Proceedings of the 16th USENIX Conference on File and Storage Technologies, 2018
Proceedings of the 55th Annual Design Automation Conference, 2018
LTRF: Enabling High-Capacity Register Files for GPUs via Hardware/Software Cooperative Register Prefetching.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018
SPECTR: Formal Supervisory Control and Coordination for Many-core Systems Resource Management.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018
Slim NoC: A Low-Diameter On-Chip Network Topology for High Energy Efficiency and Scalability.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018
2017
Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms.
Proc. ACM Meas. Anal. Comput. Syst., 2017
Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms.
Proc. ACM Meas. Anal. Comput. Syst., 2017
Error Characterization, Mitigation, and Recovery in Flash-Memory-Based Solid-State Drives.
Proc. IEEE, 2017
Improving the reliability of chip-off forensic analysis of NAND flash memory devices.
Digit. Investig., 2017
CoRR, 2017
CoRR, 2017
Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency.
CoRR, 2017
Understanding Reduced-Voltage Operation in Modern DRAM Chips: Characterization, Analysis, and Mechanisms.
CoRR, 2017
LazyPIM: Efficient Support for Cache Coherence in Processing-in-Memory Architectures.
CoRR, 2017
A Case for Memory Content-Based Detection and Mitigation of Data-Dependent Failures in DRAM.
IEEE Comput. Archit. Lett., 2017
IEEE Comput. Archit. Lett., 2017
GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.
Bioinform., 2017
Adv. Comput., 2017
Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, 2017
Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, 2017
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
Ambit: in-memory accelerator for bulk bitwise operations using commodity DRAM technology.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
Detecting and mitigating data-dependent DRAM failures by exploiting current memory content.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
Mosaic: a GPU memory manager with application-transparent support for multiple page sizes.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
The Reach Profiler (REAPER): Enabling the Mitigation of DRAM Retention Failures via Profiling at Aggressive Conditions.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017
Carpool: a bufferless on-chip network supporting adaptive multicast and hotspot alleviation.
Proceedings of the International Conference on Supercomputing, 2017
SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017
Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017
Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017
2016
IEEE Trans. Parallel Distributed Syst., 2016
ACM Trans. Archit. Code Optim., 2016
DASH: Deadline-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators.
ACM Trans. Archit. Code Optim., 2016
ACM Trans. Archit. Code Optim., 2016
Real Time Syst., 2016
A case for hierarchical rings with deflection routing: An energy-efficient on-chip communication substrate.
Parallel Comput., 2016
Common Bonds: MIPS, HPS, Two-Level Branch Prediction, and Compressed Code RISC Processor.
IEEE Micro, 2016
Enabling Accurate and Practical Online Flash Channel Modeling for Modern MLC NAND Flash Memory.
IEEE J. Sel. Areas Commun., 2016
IEEE Des. Test, 2016
CoRR, 2016
The Processing Using Memory Paradigm: In-DRAM Bulk Copy, Initialization, Bitwise AND and OR.
CoRR, 2016
Buddy-RAM: Improving the Performance and Efficiency of Bulk Bitwise Operations Using DRAM.
CoRR, 2016
Heterogeneous-Reliability Memory: Exploiting Application-Level Memory Error Tolerance.
CoRR, 2016
Reducing DRAM Latency by Exploiting Design-Induced Latency Variation in Modern DRAM Chips.
CoRR, 2016
Enabling Efficient Dynamic Resizing of Large DRAM Caches via A Hardware Consistent Hashing Mechanism.
CoRR, 2016
Reducing Performance Impact of DRAM Refresh by Parallelizing Refreshes with Accesses.
CoRR, 2016
Achieving both High Energy Efficiency and High Performance in On-Chip Communication using Hierarchical Rings with Deflection Routing.
CoRR, 2016
GateKeeper: Enabling Fast Pre-Alignment in DNA Short Read Mapping with a New Streaming Accelerator Architecture.
CoRR, 2016
Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2016
Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization.
Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2016
Proceedings of the 2016 International Symposium on Rapid System Prototyping, 2016
Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, 2016
Proceedings of the 4th Workshop on Interactions of NVM/Flash with Operating Systems and Workloads, 2016
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016
Continuous runahead: Transparent hardware acceleration for memory intensive workloads.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016
Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016
A model for Application Slowdown Estimation in on-chip networks and its use for improving system fairness and performance.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016
Accelerating pointer chasing in 3D-stacked memory: Challenges, mechanisms, evaluation.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016
Low-Cost Inter-Linked Subarrays (LISA): Enabling fast inter-subarray data movement in DRAM.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016
PARBOR: An Efficient System-Level Technique to Detect Data-Dependent Failures in DRAM.
Proceedings of the 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2016
Invited - Who is the major threat to tomorrow's security?: you, the hardware designer.
Proceedings of the 53rd Annual Design Automation Conference, 2016
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016
2015
IEEE Trans. Computers, 2015
Introducing the MICRO Test of Time Awards: Concept, Process, 2014 Winners, and the Future.
IEEE Micro, 2015
SQUASH: Simple QoS-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators.
CoRR, 2015
CoRR, 2015
CoRR, 2015
Simultaneous Multi Layer Access: A High Bandwidth and Low Cost 3D-Stacked Memory Interface.
CoRR, 2015
Shifted Hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping.
Bioinform., 2015
Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2015
Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2015
Proceedings of the 2015 International Conference on Embedded Computer Systems: Architectures, 2015
A Low-Overhead, Fully-Distributed, Guaranteed-Delivery Routing Algorithm for Faulty Network-on-Chips.
Proceedings of the 9th International Symposium on Networks-on-Chip, 2015
WARM: Improving NAND flash memory lifetime with write-hotness aware retention management.
Proceedings of the IEEE 31st Symposium on Mass Storage Systems and Technologies, 2015
Proceedings of the IEEE 31st Symposium on Mass Storage Systems and Technologies, 2015
The application slowdown model: quantifying and controlling the impact of inter-application interference at shared caches and main memory.
Proceedings of the 48th International Symposium on Microarchitecture, 2015
Gather-scatter DRAM: in-DRAM address translation to improve the spatial locality of non-unit strided accesses.
Proceedings of the 48th International Symposium on Microarchitecture, 2015
ThyNVM: enabling software-transparent crash consistency in persistent memory systems.
Proceedings of the 48th International Symposium on Microarchitecture, 2015
Proceedings of the 8th International Workshop on Network on Chip Architectures, 2015
Comparative evaluation of FPGA and ASIC implementations of bufferless and buffered routing algorithms for on-chip networks.
Proceedings of the Sixteenth International Symposium on Quality Electronic Design, 2015
A case for core-assisted bottleneck acceleration in GPUs: enabling flexible data compression with assist warps.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
Page overlays: an enhanced virtual memory framework to enable fine-grained memory management.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
PIM-enabled instructions: a low-overhead, locality-aware processing-in-memory architecture.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Data retention in MLC NAND flash memory: Characterization, optimization, and recovery.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015
Revisiting Memory Errors in Large-Scale Production Data Centers: Analysis and Modeling of New Trends from the Field.
Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015
Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery.
Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015
Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015
2014
Efficient Data Mapping and Buffering Techniques for Multilevel Cell Phase-Change Memories.
ACM Trans. Archit. Code Optim., 2014
Mitigating Prefetcher-Caused Pollution Using Informed Caching Policies for Prefetched Blocks.
ACM Trans. Archit. Code Optim., 2014
Supercomput. Front. Innov., 2014
The efficacy of error mitigation techniques for DRAM retention failures: a comparative experimental study.
Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2014
Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2014
Proceedings of the 26th IEEE International Symposium on Computer Architecture and High Performance Computing, 2014
Proceedings of the 20th IEEE Real-Time and Embedded Technology and Applications Symposium, 2014
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014
Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014
The Blacklisting Memory Scheduler: Achieving high performance and fairness at low cost.
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014
Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory.
Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2014
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014
Memory Systems.
Proceedings of the Computing Handbook, 2014
2013
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013
Linearly compressed pages: a low-complexity, low-latency main memory compression framework.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013
EMERALD: Characterization of emerging applications and algorithms for low-power devices.
Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems & Software, 2013
Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems & Software, 2013
An experimental study of data retention behavior in modern DRAM devices: implications for retention time profiling mechanisms.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013
LightTx: A lightweight transactional design in flash-based SSDs to support flexible transactions.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013
Program interference in MLC NAND flash memory: Characterization, modeling, and mitigation.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013
MISE: Providing performance predictability and improving fairness in shared main memory systems.
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013
Application-to-core mapping policies to reduce memory system interference in multi-core systems.
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013
Threshold voltage distribution in MLC NAND flash memory: characterization, analysis, and modeling.
Proceedings of the Design, Automation and Test in Europe, 2013
Proceedings of the 50th Annual Design Automation Conference 2013, 2013
OWL: cooperative thread array aware scheduling techniques for improving GPGPU performance.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013
2012
Fairness via Source Throttling: A Configurable and High-Performance Fairness Substrate for Multicore Memory Systems.
ACM Trans. Comput. Syst., 2012
Enabling Efficient and Scalable Hybrid Memories Using Fine-Granularity DRAM Cache Management.
IEEE Comput. Archit. Lett., 2012
On-chip networks from a networking perspective: congestion and scalability in many-core interconnects.
Proceedings of the ACM SIGCOMM 2012 Conference, 2012
Proceedings of the IEEE 24th International Symposium on Computer Architecture and High Performance Computing, 2012
Proceedings of the 2012 Sixth IEEE/ACM International Symposium on Networks-on-Chip (NoCS), 2012
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012
Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems.
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012
Proceedings of the 30th International IEEE Conference on Computer Design, 2012
Proceedings of the 30th International IEEE Conference on Computer Design, 2012
Flash correct-and-refresh: Retention-aware error management for increased flash memory lifetime.
Proceedings of the 30th International IEEE Conference on Computer Design, 2012
Error patterns in MLC NAND flash memory: Measurement, characterization, and analysis.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012
Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, 2012
The evicted-address filter: a unified mechanism to address both cache pollution and thrashing.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
Linearly compressed pages: a main memory compression framework with low complexity and low latency.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
Application-to-core mapping policies to reduce memory interference in multi-core systems.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
2011
FIST: A fast, lightweight, FPGA-friendly packet latency estimator for NoC modeling in full-system simulations.
Proceedings of the NOCS 2011, 2011
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011
Reducing memory interference in multicore systems via application-aware memory channel partitioning.
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011
Memory systems in the many-core era: challenges, opportunities, and solution directions.
Proceedings of the 10th International Symposium on Memory Management, 2011
Kilo-NOC: a heterogeneous network-on-chip architecture for scalability and service guarantees.
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011
Poster: revisiting virtual channel memory for performance and fairness on multi-core architecture.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011
Proceedings of the 8th International Conference on Autonomic Computing, 2011
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011
2010
IEEE Micro, 2010
Proceedings of the 28th IEEE VLSI Test Symposium, 2010
QuaLe: A Quantum-Leap Inspired Model for Non-stationary Analysis of NoC Traffic in Chip Multi-processors.
Proceedings of the NOCS 2010, 2010
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010
Proceedings of the Computer Architecture, 2010
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010
ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers.
Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010
Proceedings of the 9th ACM Workshop on Hot Topics in Networks. HotNets 2010, Monterey, CA, USA - October 20, 2010
Fairness via source throttling: a configurable and high-performance fairness substrate for multi-core memory systems.
Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, 2010
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010
2009
Virtual Program Counter (VPC) Prediction: Very Low Cost Indirect Branch Prediction Using Conditional Branch Prediction Hardware.
IEEE Trans. Computers, 2009
IEEE Trans. Computers, 2009
Parallelism-Aware Batch Scheduling: Enabling High-Performance and Fair Shared Memory Controllers.
IEEE Micro, 2009
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009
Preemptive virtual clock: a flexible, efficient, and cost-effective QOS scheme for networks-on-chip.
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009
Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009
Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009
Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009
Proceedings of the 2009 International Conference on Computer-Aided Design, 2009
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009
Techniques for bandwidth-efficient prefetching of linked data structures in hybrid prefetching systems.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009
Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, 2009
2008
Guest Editors' Introduction: Interaction of Many-Core Computer Architecture and Operating Systems.
IEEE Micro, 2008
Proceedings of the Twenty-Seventh Annual ACM Symposium on Principles of Distributed Computing, 2008
Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-41 2008), 2008
Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-41 2008), 2008
Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems.
Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008
Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008
Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008
Improving the performance of object-oriented languages with dynamic predication of indirect jumps.
Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, 2008
2007
IEEE Micro, 2007
Proceedings of the 16th USENIX Security Symposium, Boston, MA, USA, August 6-10, 2007
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-40 2007), 2007
Software-Based Online Detection of Hardware Defects: Mechanisms, Architectural Support, and Evaluation.
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-40 2007), 2007
VPC prediction: reducing the cost of indirect branches via hardware-based dynamic devirtualization.
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007
Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers.
Proceedings of the 13th International Conference on High-Performance Computer Architecture (HPCA-13 2007), 2007
Profile-assisted Compiler Support for Dynamic Predication in Diverge-Merge Processors.
Proceedings of the Fifth International Symposium on Code Generation and Optimization (CGO 2007), 2007
2006
Address-Value Delta (AVD) Prediction: A Hardware Technique for Efficiently Parallelizing Dependent Cache Misses.
IEEE Trans. Computers, 2006
IEEE Micro, 2006
IEEE Micro, 2006
Diverge-Merge Processor (DMP): Dynamic Predicated Execution of Complex Control-Flow Graphs Based on Frequently Executed Paths.
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006
Proceedings of the 33rd International Symposium on Computer Architecture (ISCA 2006), 2006
Proceedings of the Fourth IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2006), 2006
2005
An Analysis of the Performance Impact of Wrong-Path Memory References on Out-of-Order and Runahead Execution Processors.
IEEE Trans. Computers, 2005
Using the First-Level Caches as Filters to Reduce the Pollution Caused by Speculative Memory References.
Int. J. Parallel Program., 2005
On Reusing the Results of Pre-Executed Instructions in a Runahead Execution Processor.
IEEE Comput. Archit. Lett., 2005
Address-Value Delta (AVD) Prediction: Increasing the Effectiveness of Runahead Execution by Exploiting Regular Memory Allocation Patterns.
Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005
Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution.
Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005
Proceedings of the 32nd International Symposium on Computer Architecture (ISCA 2005), 2005
Microarchitecture-Based Introspection: A Technique for Transient-Fault Tolerance in Microprocessors.
Proceedings of the 2005 International Conference on Dependable Systems and Networks (DSN 2005), 28 June, 2005
2004
Proceedings of the 3rd Workshop on Memory Performance Issues, 2004
Cache Filtering Techniques to Reduce the Negative Impact of Useless Speculative Memory References on Processor Performance.
Proceedings of the 16th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2004), 2004
Wrong Path Events: Exploiting Unusual and Illegal Program Behavior for Early Misprediction Detection and Recovery.
Proceedings of the 37th Annual International Symposium on Microarchitecture (MICRO-37 2004), 2004
2003
IEEE Micro, 2003
Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-Order Processors.
Proceedings of the Ninth International Symposium on High-Performance Computer Architecture (HPCA'03), 2003