Onur Mutlu

CoRR, 2024

Accelerating Graph Neural Networks on Real Processing-In-Memory Systems.

CoRR, 2024

Rethinking the Producer-Consumer Relationship in Modern DRAM-Based Systems.

Aditya Manglik

A. Rahman D. M. H. Al-Nabti

CoRR, 2024

Topologies of Reasoning: Demystifying Chains, Trees, and Graphs of Thoughts.

CoRR, 2024

Material modeling and recent findings in transcatheter aortic valve implantation simulations.

Huseyin Cagatay Yalcin

Comput. Methods Programs Biomed., 2024

Address Scaling: Architectural Support for Fine-Grained Thread-Safe Metadata Management.

Deepanjali Mishra

IEEE Comput. Archit. Lett., 2024

Ramulator 2.0: A Modern, Modular, and Extensible DRAM Simulator.

IEEE Comput. Archit. Lett., 2024

SpyHammer: Understanding and Exploiting RowHammer Under Fine-Grained Temperature Variations.

Ulrich Rührmair

IEEE Access, 2024

MATSA: An MRAM-Based Energy-Efficient Accelerator for Time Series Analysis.

IEEE Access, 2024

ABACuS: All-Bank Activation Counters for Scalable and Low Overhead RowHammer Mitigation.

Proceedings of the 33rd USENIX Security Symposium, 2024

Self-Managing DRAM: A Low-Cost Framework for Enabling Autonomous and Efficient DRAM Maintenance Operations.

Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

BreakHammer: Enhancing RowHammer Mitigations by Carefully Throttling Suspect Threads.

Oguz Ergin

Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

SwiftRL: Towards Efficient Reinforcement Learning on Real Processing-In-Memory Systems.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

QUETZAL: Vector Acceleration Framework for Modern Genome Sequence Analysis Algorithms.

Julian Pavon

Iván Vargas Valdivieso

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

MegIS: High-Performance, Energy-Efficient, and Low-Cost Metagenomic Analysis with In-Storage Processing.

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

Constable: Improving Performance and Power Efficiency by Safely Eliminating Load Instruction Execution.

Sreenivas Subramoney

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

Functionally-Complete Boolean Logic in Real DRAM Chips: Experimental Characterization and Analysis.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

Spatial Variation-Aware Read Disturbance Defenses: Experimental Analysis of Real DRAM Chips and Implications on Future Solutions.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Computing.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

CoMeT: Count-Min-Sketch-based Row Tracking to Mitigate RowHammer at Low Cost.

F. Nisa Bostanci

Ismail Emir Yüksel

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

Simultaneous Many-Row Activation in Off-the-Shelf DRAM Chips: Experimental Characterization and Analysis.

Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2024

Read Disturbance in High Bandwidth Memory: A Detailed Experimental Study on HBM2 DRAM Chips.

Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2024

An Experimental Characterization of Combined RowHammer and RowPress Read Disturbance in Modern DRAM Chips.

Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2024

AERO: Adaptive Erase Operation for Improving Lifetime and Performance of Modern NAND Flash-Based SSDs.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

PIM-Opt: Demystifying Distributed Optimization Algorithms on a Real-World Processing-In-Memory System.

Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 2024

2023

DRAM Bender: An Extensible and Versatile FPGA-Based Infrastructure to Easily Test State-of-the-Art DRAM Chips.

Mohammad Hashem Haghbayan

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., December, 2023

Run-Time Resource Management in CMPs Handling Multiple Aging Mechanisms.

Antonio Miele

Juha Plosila

IEEE Trans. Computers, October, 2023

Scrooge: a fast and memory-frugal genomic sequence aligner for CPUs, GPUs, and ASICs.

Bioinform., May, 2023

A framework for high-throughput sequence alignment using real processing-in-memory systems.

Bioinform., May, 2023

PiDRAM: A Holistic End-to-end FPGA-based Framework for Processing-in-DRAM.

ACM Trans. Archit. Code Optim., March, 2023

How does hemodynamics affect rupture tissue mechanics in abdominal aortic aneurysm: Focus on wall shear stress derived parameters, time-averaged wall shear stress, oscillatory shear index, endothelial cell activation potential, and relative residence time.

Huseyin Cagatay Yalcin

Comput. Biol. Medicine, March, 2023

ALP: Alleviating CPU-Memory Data Movement Overheads in Memory-Centric Systems.

Nastaran Hajinazar

IEEE Trans. Emerg. Top. Comput., 2023

Using Local Cache Coherence for Disaggregated Memory Systems.

ACM SIGOPS Oper. Syst. Rev., 2023

Fetal ECG extraction from maternal ECG using deeply supervised LinkNet++ model.

Arafat Rahman

Sakib Mahmud

Muhammad E. H. Chowdhury

Huseyin Cagatay Yalcin

Eng. Appl. Artif. Intell., 2023

PULSAR: Simultaneous Many-Row Activation for Reliable and High-Performance Computing in Off-the-Shelf DRAM Chips.

Ismail Emir Yüksel

Yahya Can Tugrul

F. Nisa Bostanci

CoRR, 2023

MetaStore: High-Performance Metagenomic Analysis via In-Storage Computing.

CoRR, 2023

MetaFast: Enabling Fast Metagenomic Classification via Seed Counting and Edit Distance Approximation.

Maximilian-David Rumpf

Serghei Mangul

CoRR, 2023

SequenceLab: A Comprehensive Benchmark of Computational Methods for Comparing Genomic Sequences.

Maximilian-David Rumpf

CoRR, 2023

An In-Memory Architecture for High-Performance Long-Read Pre-Alignment Filtering.

CoRR, 2023

Understanding Read Disturbance in High Bandwidth Memory: An Experimental Analysis of Real HBM2 DRAM Chips.

Majd Osseiran

CoRR, 2023

DaPPA: A Data-Parallel Framework for Processing-in-Memory Architectures.

CoRR, 2023

GateSeeder: Near-memory CPU-FPGA Acceleration of Short and Long Read Mapping.

CoRR, 2023

Retrospective: Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors.

CoRR, 2023

Retrospective: An Experimental Study of Data Retention Behavior in Modern DRAM Devices: Implications for Retention Time Profiling Mechanisms.

CoRR, 2023

Retrospective: RAIDR: Retention-Aware Intelligent DRAM Refresh.

CoRR, 2023

Retrospective: A Scalable Processing-in-Memory Accelerator for Parallel Graph Processing.

CoRR, 2023

Memory-Centric Computing.

CoRR, 2023

Accelerating Genome Analysis via Algorithm-Architecture Co-Design.

CoRR, 2023

TransPimLib: A Library for Efficient Transcendental Functions on Processing-in-Memory Systems.

CoRR, 2023

RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs.

CoRR, 2023

RawHash: enabling fast and accurate real-time analysis of raw nanopore signals for large genomes.

Bioinform., 2023

Extending Memory Capacity in Modern Consumer Systems With Emerging Non-Volatile Memory: Experimental Analysis and Characterization Using the Intel Optane SSD.

IEEE Access, 2023

Casper: Accelerating Stencil Computations Using Near-Cache Processing.

IEEE Access, 2023

High-Performance and Scalable Agent-Based Simulation with BioDynaMo.

Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

Swordfish: A Framework for Evaluating Deep Neural Network-based Basecalling using Computation-In-Memory with Non-Ideal Memristors.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Victima: Drastically Increasing Address Translation Reach by Leveraging Underutilized Cache Resources.

Davide Basilio Bartolini

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Utopia: Fast and Efficient Address Translation via Hybrid Restrictive & Flexible Virtual-to-Physical Address Mappings.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

TransPimLib: Efficient Transcendental Functions for Processing-in-Memory Systems.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Evaluating Machine Learning Workloads on Memory-Centric Computing Systems.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

Venice: Improving Solid-State Drive Parallelism at Low Cost via Conflict-Free Accesses.

Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

RowPress: Amplifying Read Disturbance in Modern DRAM Chips.

Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

Evaluating Homomorphic Operations on a Real-World Processing-In-Memory System.

Harshita Gupta

Mayank Kabra

Proceedings of the IEEE International Symposium on Workload Characterization, 2023

SPARTA: Spatial Acceleration for Efficient and Scalable Horizontal Diffusion Weather Stencil Computation.

Proceedings of the 37th International Conference on Supercomputing, 2023

An Experimental Analysis of RowHammer in HBM2 DRAM Chips.

Majd Osseiran

Proceedings of the 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2023

Message from the DSN 2023 Program Chairs.

Xavier Défago

Proceedings of the 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2023

Invited: Accelerating Genome Analysis via Algorithm-Architecture Co-Design.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Lightning Talk: Memory-Centric Computing.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Fundamentally Understanding and Solving RowHammer.

Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023

SimplePIM: A Software Framework for Productive and Efficient Processing-in-Memory.

Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023

2022

pLUTo: Enabling Massively Parallel Computation In DRAM via Lookup Tables.

Dataset, July, 2022

Accelerating Weather Prediction Using Near-Memory Reconfigurable Fabric.

ACM Trans. Reconfigurable Technol. Syst., 2022

POCLib: A High-Performance Framework for Enabling Near Orthogonal Processing on Compression.

IEEE Trans. Parallel Distributed Syst., 2022

Exploring Data Analytics Without Decompression on Embedded GPU Systems.

IEEE Trans. Parallel Distributed Syst., 2022

CoPA: Cold Page Awakening to Overcome Retention Failures in STT-MRAM Based I/O Buffers.

IEEE Trans. Parallel Distributed Syst., 2022

MetaSys: A Practical Open-source Metadata Management System to Implement and Evaluate Cross-layer Optimizations.

ACM Trans. Archit. Code Optim., 2022

SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures.

Proc. ACM Meas. Anal. Comput. Syst., 2022

Accelerating Neural Network Inference With Processing-in-DRAM: From the Edge to the Cloud.

IEEE Micro, 2022

Optically connected memory for disaggregated data centers.

J. Parallel Distributed Comput., 2022

Guest Editors' Introduction: Near-Memory and In-Memory Processing.

Hai Li

Alaa R. Alameldeen

IEEE Des. Test, 2022

TargetCall: Eliminating the Wasted Computation in Basecalling via Pre-Basecalling Filtering.

CoRR, 2022

Utopia: Efficient Address Translation using Hybrid Virtual-to-Physical Address Mapping.

Rahul Bera

Kosta Stojiljkovic

CoRR, 2022

TuRaN: True Random Number Generation Using Supply Voltage Underscaling in SRAMs.

Nika Mansouri-Ghiasi

Oguz Ergin

CoRR, 2022

Taming Large-Scale Genomic Analyses via Sparsified Genomics.

Julien Eudine

CoRR, 2022

NEON: Enabling Efficient Support for Nonlinear Operations in Resistive RAM-based Neural Network Accelerators.

CoRR, 2022

Accelerating Time Series Analysis via Processing using Non-Volatile Memories.

CoRR, 2022

A Framework for Designing Efficient Deep Learning-Based Genomic Basecallers.

CoRR, 2022

RevaMp3D: Architecting the Processor Core and Cache Hierarchy for Systems with Monolithically-Integrated Logic and Memory.

Nika Mansouri-Ghiasi

CoRR, 2022

SpyHammer: Using RowHammer to Remotely Spy on Temperature.

Ulrich Rührmair

CoRR, 2022

LEAPER: Modeling Cloud FPGA-based Systems via Transfer Learning.

CoRR, 2022

Sectored DRAM: An Energy-Efficient High-Throughput and Practical Fine-Grained DRAM Architecture.

Oguz Ergin

CoRR, 2022

A Case for Self-Managing DRAM Chips: Improving Performance, Efficiency, Reliability, and Security via Autonomous in-DRAM Maintenance Operations.

CoRR, 2022

An Experimental Evaluation of Machine Learning Training on a Real Processing-in-Memory System.

CoRR, 2022

COVIDHunter: COVID-19 pandemic wave prediction and mitigation via seasonality-aware modeling.

Stefan W. Tell

CoRR, 2022

Going From Molecules to Genomic Variations to Scientific Discovery: Intelligent Algorithms and Architectures for Intelligent Genome Analysis.

CoRR, 2022

Enabling High-Performance and Energy-Efficient Hybrid Transactional/Analytical Databases with Hardware/Software Cooperation.

CoRR, 2022

A Case for Transparent Reliability in DRAM Systems.

Aditya Manglik

Andre M. Ribeiro-dos-Santos

CoRR, 2022

Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems.

CoRR, 2022

Packaging, containerization, and virtualization of computational omics methods: Advances, challenges, and opportunities.

Malak S. Abedalthagafi

Serghei Mangul

CoRR, 2022

DeepSketch: A New Machine Learning-Based Reference Search Technique for Post-Deduplication Delta Compression.

CoRR, 2022

GenStore: A High-Performance and Energy-Efficient In-Storage Computing System for Genome Sequence Analysis.

CoRR, 2022

SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems.

CoRR, 2022

FastRemap: a tool for quickly remapping reads between genome assemblies.

Bioinform., 2022

BioDynaMo: a modular platform for high-performance agent-based simulation.

Bioinform., 2022

Demeter: A Fast and Energy-Efficient Food Profiler Using Hyperdimensional Computing in Memory.

IEEE Access, 2022

Benchmarking a New Paradigm: Experimental Analysis and Characterization of a Real Processing-in-Memory System.

IEEE Access, 2022

Chapter Eight - The design of an energy-efficient deflection-based on-chip network.

Adv. Comput., 2022

Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures.

Proceedings of the SIGMETRICS/PERFORMANCE '22: ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, Mumbai, India, June 6, 2022

ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations.

Raghavendra Kanakagiri

Proceedings of the SC22: International Conference for High Performance Computing, 2022

HiRA: Hidden Row Activation for Reducing Refresh Latency of Off-the-Shelf DRAM Chips.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

GenPIP: In-Memory Acceleration of Genome Analysis via Tight Integration of Basecalling and Read Mapping.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

AgileWatts: An Energy-Efficient CPU Core Idle-State Architecture for Latency-Sensitive Server Applications.

Kleovoulos Kalaitzidis

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

pLUTo: Enabling Massively Parallel Computation in DRAM via Lookup Tables.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Morpheus: Extending the Last Level Cache Capacity in GPU Systems Using Idle GPU Core Resources.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Hermes: Accelerating Long-Latency Load Requests via Perceptron-Based Off-Chip Load Prediction.

Rahul Bera

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Flash-Cosmos: In-Flash Bulk Bitwise Operations Using Inherent Computation Capability of NAND Flash Memory.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Methodologies, Workloads, and Tools for Processing-in-Memory: Enabling the Adoption of Data-Centric Architectures.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

Heterogeneous Data-Centric Architectures for Modern Data-Intensive Applications: Case Studies in Machine Learning and Databases.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

PiDRAM: An FPGA-based Framework for End-to-end Evaluation of Processing-in-DRAM Techniques.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

GenStore: In-Storage Filtering of Genomic Data for High-Performance and Energy-Efficient Genome Analysis.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

Machine Learning Training on a Real Processing-in-Memory System.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

SparseP: Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

Exploiting Near-Data Processing to Accelerate Time Series Analysis.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

SeGraM: a universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping.

Damla Senol Cali

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Sibyl: adaptive and extensible data placement in hybrid storage systems using online reinforcement learning.

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Algorithmic Improvement and GPU Acceleration of the GenASM Algorithm.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

High-throughput Pairwise Alignment with the Wavefront Algorithm using Processing-in-Memory.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Polynesia: Enabling High-Performance and Energy-Efficient Hybrid Transactional/Analytical Databases with Hardware/Software Co-Design.

Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

LEAPER: Fast and Accurate FPGA-based System Performance Prediction via Transfer Learning.

Proceedings of the IEEE 40th International Conference on Computer Design, 2022

DarkGates: A Hybrid Power-Gating Architecture to Mitigate the Performance Impact of Dark-Silicon in High Performance Processors.

Jawad Haj-Yahya

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

DR-STRaNGe: End-to-End System Design for DRAM-based True Random Number Generators.

F. Nisa Bostanci

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

DeepSketch: A New Machine Learning-Based Reference Search Technique for Post-Deduplication Delta Compression.

Proceedings of the 20th USENIX Conference on File and Storage Technologies, 2022

Understanding RowHammer Under Reduced Wordline Voltage: An Experimental Study Using Real DRAM Devices.

Geraldo F. de Oliveira

Proceedings of the 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2022

A Compiler Framework for Optimizing Dynamic Parallelism on GPUs.

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022

GenStore: a high-performance in-storage processing system for genome sequence analysis.

Syed Mohammad Asad Hassan Jafri

Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021

TADOC: Text analytics directly on compression.

VLDB J., 2021

ETICA: Efficient Two-Level I/O Caching Architecture for Virtualized Platforms.

IEEE Trans. Parallel Distributed Syst., 2021

Unified Holistic Memory Management Supporting Multiple Big Data Processing Frameworks over Hybrid Memories.

ACM Trans. Comput. Syst., 2021

Refresh Triggered Computation: Improving the Energy Efficiency of Convolutional Neural Network Accelerators.

Ahmed Hemani

ACM Trans. Archit. Code Optim., 2021

GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra.

Proc. VLDB Endow., 2021

FPGA-Based Near-Memory Acceleration of Modern Data-Intensive Applications.

Damla Senol Cali

Henk Corporaal

IEEE Micro, 2021

GenShare: Sharing Accurate Differentially-Private Statistics for Genomic Datasets with Dependent Tuples.

Özgür Ulusoy

Erman Ayday

CoRR, 2021

Casper: Accelerating Stencil Computation using Near-cache Processing.

CoRR, 2021

Energy-Efficient Deflection-based On-chip Networks: Topology, Routing, Flow Control.

CoRR, 2021

Extending Memory Capacity in Consumer Devices with Emerging Non-Volatile Memory: An Experimental Study.

CoRR, 2021

A Deeper Look into RowHammer's Sensitivities: Experimental Analysis of Real DRAM Chips and Implications on Future Attacks and Defenses.

CoRR, 2021

NERO: Accelerating Weather Prediction using Near-Memory Reconfigurable Fabric.

CoRR, 2021

Security Analysis of the Silver Bullet Technique for RowHammer Prevention.

Fabrice Devaux

CoRR, 2021

Near-Optimal Privacy-Utility Tradeoff in Genomic Studies Using Selective SNP Hiding.

CoRR, 2021

SIMDRAM: An End-to-End Framework for Bit-Serial SIMD Computing in DRAM.

CoRR, 2021

Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture.

CoRR, 2021

pLUTo: In-DRAM Lookup Tables to Enable Massively Parallel General-Purpose Computation.

CoRR, 2021

SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems.

Raghavendra Kanakagiri

Grzegorz Kwasniewski

Jakub Beránek

Kacper Janda

Salvatore Di Girolamo

Marek Konieczny

Torsten Hoefler

CoRR, 2021

BurstLink: Techniques for Energy-Efficient Conventional and Virtual Reality Video Display.

CoRR, 2021

GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra.

CoRR, 2021

Polynesia: Enabling Effective Hybrid Transactional/Analytical Databases with Specialized Hardware/Software Co-Design.

CoRR, 2021

Mitigating Edge Machine Learning Inference Bottlenecks: An Empirical Study on Accelerating Google Edge Models.

CoRR, 2021

COVIDHunter: An Accurate, Flexible, and Environment-Aware Open-Source COVID-19 Outbreak Simulation Model.

Stefan W. Tell

CoRR, 2021

SneakySnake: a fast and accurate universal genome pre-alignment filter for CPUs, GPUs and FPGAs.

Bioinform., 2021

DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks.

IEEE Access, 2021

HARP: Practically and Effectively Identifying Uncorrectable Errors in Memory Chips That Use On-Die Error-Correcting Codes.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Uncovering In-DRAM RowHammer Protection Mechanisms: A New Methodology, Custom RowHammer Patterns, and Implications.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

BurstLink: Techniques for Energy-Efficient Video Display for Conventional and Virtual Reality Systems.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems.

Raghavendra Kanakagiri

Grzegorz Kwasniewski

Jakub Beránek

Kacper Janda

Salvatore Di Girolamo

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning.

Rahul Bera

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

A Deeper Look into RowHammer's Sensitivities: Experimental Analysis of Real DRAM Chips and Implications on Future Attacks and Defenses.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

QUAC-TRNG: High-Throughput True Random Number Generation Using Quadruple Row Activation in Commodity DRAM Chips.

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

IChannels: Exploiting Current Management Mechanisms to Create Covert Channels in Modern Processors.

Ivan Puddu

Sherif Abdelmonem Sayed Mohamed

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

CODIC: A Low-Cost Substrate for Enabling Custom In-DRAM Functionalities and Optimizations.

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Energy-Efficient Mobile Robot Control via Run-time Monitoring of Environmental Complexity and Computing Workload.

Mohammad Hashem Haghbayan

Antonio Miele

Juha Plosila

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

G-TADOC: Enabling Efficient GPU-Based Text Analytics without Decompression.

Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

BlockHammer: Preventing RowHammer at Low Cost by Blacklisting Rapidly-Accessed DRAM Rows.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

SynCron: Efficient Synchronization Support for Near-Data-Processing Architectures.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-In-Memory Hardware.

Proceedings of the 12th International Green and Sustainable Computing Workshops, 2021

Modeling FPGA-Based Systems via Few-Shot Learning.

Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

Intelligent Architectures for Intelligent Computing Systems.

Seyed Saber Nabavi Larimi

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

Understanding Power Consumption and Reliability of High-Bandwidth Memory with Voltage Underscaling.

Behzad Salami

Osman S. Unsal

Adrián Cristal Kestelman

Hamid Sarbazi-Azad

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

SIMDRAM: a framework for bit-serial SIMD processing using DRAM.

Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

Rethinking software runtimes for disaggregated memory.

Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

Reducing solid-state drive read latency by optimizing read-retry.

Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

Aging-Aware Request Scheduling for Non-Volatile Main Memory.

Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

Google Neural Network Models for Edge Devices: Analyzing and Mitigating Machine Learning Inference Bottlenecks.

Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

2020

RowHammer: A Retrospective.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Accelerating Genome Analysis: A Primer on an Ongoing Journey.

IEEE Micro, 2020

Rethinking Divide and Conquer - Towards Holistic Interfaces of the Computing Stack.

Schahram Dustdar

IEEE Internet Comput., 2020

Guest Editorial: Robust Resource-Constrained Systems for Machine Learning.

[BibT_eX]

[DOI]

Theocharis Theocharides

Muhammad Shafique

Jungwook Choi

IEEE Des. Test, 2020

Robust Machine Learning Systems: Challenges, Current Trends, Perspectives, and the Road Ahead.

[BibT_eX]

[DOI]

Muhammad Shafique

Mahum Naseer

Theocharis Theocharides

IEEE Des. Test, 2020

A Modern Primer on Processing in Memory.

[BibT_eX]

[DOI]

CoRR, 2020

Enabling High-Capacity, Latency-Tolerant, and Highly-Concurrent GPU Register Files via Software/Hardware Cooperation.

[BibT_eX]

[DOI]

Amirhossein Mirhosseini

Seyyed Hossein Seyyedaghaei Rezaei

CoRR, 2020

Intelligent Management of Mobile Systems through Computational Self-Awareness.

[BibT_eX]

[DOI]

CoRR, 2020

BioDynaMo: an agent-based simulation platform for scalable computational biology research.

[BibT_eX]

[DOI]

CoRR, 2020

Accelerating B-spline interpolation on GPUs: Application to medical image registration.

[BibT_eX]

[DOI]

Comput. Methods Programs Biomed., 2020

NoM: Network-on-Memory for Inter-Bank Data Transfer in Highly-Banked Memories.

[BibT_eX]

[DOI]

Mehdi Modarressi

Masoud Daneshtalab

IEEE Comput. Archit. Lett., 2020

Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm.

[BibT_eX]

[DOI]

Bioinform., 2020

Intelligent Architectures for Intelligent Machines.

[BibT_eX]

[DOI]

Proceedings of the 2020 International Symposium on VLSI Design, Automation and Test, 2020

TRRespass: Exploiting the Many Sides of Target Row Refresh.

[BibT_eX]

[DOI]

Proceedings of the 2020 IEEE Symposium on Security and Privacy, 2020

Are We Susceptible to Rowhammer? An End-to-End Methodology for Cloud Providers.

[BibT_eX]

[DOI]

Proceedings of the 2020 IEEE Symposium on Security and Privacy, 2020

Optically Connected Memory for Disaggregated Data Centers.

[BibT_eX]

[DOI]

Proceedings of the 32nd IEEE International Symposium on Computer Architecture and High Performance Computing, 2020

FIGARO: Improving System Performance via Fine-Grained In-DRAM Data Relocation and Caching.

[BibT_eX]

[DOI]

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

Bit-Exact ECC Recovery (BEER): Determining DRAM On-Die ECC Functions by Exploiting DRAM Data Retention Characteristics.

[BibT_eX]

[DOI]

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

FlexWatts: A Power- and Workload-Aware Hybrid Power Delivery Network for Energy-Efficient Microprocessors.

[BibT_eX]

[DOI]

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis.

[BibT_eX]

[DOI]

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

Improving phase change memory performance with data content aware access.

[BibT_eX]

[DOI]

Proceedings of the ISMM '20: 2020 ACM SIGPLAN International Symposium on Memory Management, 2020

CLR-DRAM: A Low-Cost DRAM Architecture Enabling Dynamic Capacity-Latency Trade-Off.

[BibT_eX]

[DOI]

Jisung Park

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Revisiting RowHammer: An Experimental Analysis of Modern DRAM Devices and Mitigation Techniques.

[BibT_eX]

[DOI]

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework.

[BibT_eX]

[DOI]

Nastaran Hajinazar

Pratyush Patel

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

SysScale: Exploiting Multi-domain Dynamic Voltage and Frequency Scaling for Energy Efficient Mobile Processors.

[BibT_eX]

[DOI]

Jawad Haj-Yahya

Efraim Rotem

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

The Non-IID Data Quagmire of Decentralized Machine Learning.

[BibT_eX]

[DOI]

Proceedings of the 37th International Conference on Machine Learning, 2020

Enabling Efficient Random Access to Hierarchically-Compressed Data.

[BibT_eX]

[DOI]

Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

WoLFRaM: Enhancing Wear-Leveling and Fault Tolerance in Resistive Memories using Programmable Address Decoders.

[BibT_eX]

[DOI]

Proceedings of the 38th IEEE International Conference on Computer Design, 2020

NATSA: A Near-Data Processing Accelerator for Time Series Analysis.

[BibT_eX]

[DOI]

Proceedings of the 38th IEEE International Conference on Computer Design, 2020

Techniques for Reducing the Connected-Standby Energy Consumption of Mobile Devices.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

NERO: A Near High-Bandwidth Memory Stencil Accelerator for Weather Prediction Modeling.

[BibT_eX]

[DOI]

Proceedings of the 30th International Conference on Field-Programmable Logic and Applications, 2020

Boyi: A Systematic Framework for Automatically Deciding the Right Execution Model of OpenCL Applications on FPGAs.

[BibT_eX]

[DOI]

Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration.

[BibT_eX]

[DOI]

Adrián Cristal Kestelman

Osman S. Unsal

Hamid Sarbazi-Azad

Proceedings of the 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2020

Evanesco: Architectural Support for Efficient Data Sanitization in Modern Flash-Based Storage Systems.

[BibT_eX]

[DOI]

Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019

Highly Concurrent Latency-tolerant Register Files for GPUs.

[BibT_eX]

[DOI]

Amirhossein Mirhosseini

ACM Trans. Comput. Syst., 2019

Enabling and Exploiting Partition-Level Parallelism (PALP) in Phase Change Memories.

[BibT_eX]

[DOI]

ACM Trans. Embed. Comput. Syst., 2019

An Analytical Model for Performance and Lifetime Estimation of Hybrid DRAM-NVM Main Memories.

[BibT_eX]

[DOI]

Reza Salkhordeh

Hossein Asadi

IEEE Trans. Computers, 2019

ITAP: Idle-Time-Aware Power Management for GPU Execution Units.

[BibT_eX]

[DOI]

Seyed Borna Ehsani

Hajar Falahati

ACM Trans. Archit. Code Optim., 2019

AVPP: Address-first Value-next Predictor with Value Prefetching for Improving the Efficiency of Load Value Prediction.

[BibT_eX]

[DOI]

Rodolfo Azevedo

ACM Trans. Archit. Code Optim., 2019

Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning.

[BibT_eX]

[DOI]

Proc. VLDB Endow., 2019

Demystifying Complex Workload-DRAM Interactions: An Experimental Study.

[BibT_eX]

[DOI]

Proc. ACM Meas. Anal. Comput. Syst., 2019

Processing data where it makes sense: Enabling in-memory computation.

[BibT_eX]

[DOI]

Microprocess. Microsystems, 2019

Processing-in-memory: A workload-driven perspective.

[BibT_eX]

[DOI]

IBM J. Res. Dev., 2019

AirLift: A Fast and Comprehensive Technique for Remapping Alignments between Reference Genomes.

[BibT_eX]

[DOI]

CoRR, 2019

A Workload and Programming Ease Driven Perspective of Processing-in-Memory.

[BibT_eX]

[DOI]

CoRR, 2019

In-DRAM Bulk Bitwise Execution Engine.

[BibT_eX]

[DOI]

CoRR, 2019

Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning (Technical Report).

[BibT_eX]

[DOI]

CoRR, 2019

Understanding the Interactions of Workloads and DRAM Types: A Comprehensive Experimental Study.

[BibT_eX]

[DOI]

CoRR, 2019

Dataplant: In-DRAM Security Mechanisms for Low-Cost Devices.

[BibT_eX]

[DOI]

CoRR, 2019

Shouji: a fast and efficient pre-alignment filter for sequence alignment.

[BibT_eX]

[DOI]

Bioinform., 2019

Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions.

[BibT_eX]

[DOI]

Briefings Bioinform., 2019

Analysis and Modeling of Collaborative Execution Strategies for Heterogeneous CPU-FPGA Architectures.

[BibT_eX]

[DOI]

Sitao Huang

Li-Wen Chang

Izzat El Hajj

Simon Garcia De Gonzalo

Sai Rahul Chalamalasetti

Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, 2019

Panthera: holistic memory management for big data processing over hybrid memories.

[BibT_eX]

[DOI]

Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2019

Binary Star: Coordinated Reliability in Heterogeneous Memory Systems for High Performance and Scalability.

[BibT_eX]

[DOI]

Xiao Liu

David Roberts

Jishen Zhao

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

EDEN: Enabling Energy-Efficient, High-Performance Deep Neural Network Inference Using Approximate DRAM.

[BibT_eX]

[DOI]

Skanda Koppula

Roknoddin Azizi

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

SMASH: Co-designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations.

[BibT_eX]

[DOI]

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

DSPatch: Dual Spatial Pattern Prefetcher.

[BibT_eX]

[DOI]

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

CROW: a low-cost substrate for improving DRAM performance, energy efficiency, and reliability.

[BibT_eX]

[DOI]

Proceedings of the 46th International Symposium on Computer Architecture, 2019

CoNDA: efficient cache coherence support for near-data accelerators.

[BibT_eX]

[DOI]

Proceedings of the 46th International Symposium on Computer Architecture, 2019

A Scalable Priority-Aware Approach to Managing Data Center Server Power.

[BibT_eX]

[DOI]

Yang Li

Charles R. Lefurgy

Karthick Rajamani

Malcolm S. Allen-Ware

Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

D-RaNGe: Using Commodity DRAM Devices to Generate True Random Numbers with Low Latency and High Throughput.

[BibT_eX]

[DOI]

Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

Project PBerry: FPGA Acceleration for Remote Memory.

[BibT_eX]

[DOI]

Proceedings of the Workshop on Hot Topics in Operating Systems, 2019

Understanding and Modeling On-Die Error Correction in Modern DRAM: An Experimental Study Using Real Devices.

[BibT_eX]

[DOI]

Proceedings of the 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2019

NAPEL: Near-Memory Computing Application Performance Prediction via Ensemble Learning.

[BibT_eX]

[DOI]

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Enabling Practical Processing in and near Memory for Data-Intensive Computing.

[BibT_eX]

[DOI]

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

RowHammer and Beyond.

[BibT_eX]

[DOI]

Proceedings of the Constructive Side-Channel Analysis and Secure Design, 2019

Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs.

[BibT_eX]

[DOI]

Simon Garcia De Gonzalo

Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

A Framework for Memory Oversubscription Management in Graphics Processing Units.

[BibT_eX]

[DOI]

Chen Li

Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

Towards Breaking the Memory Bandwidth Wall Using Approximate Value Prediction.

[BibT_eX]

[DOI]

Proceedings of the Approximate Circuits, Methodologies and CAD, 2019

2018

Mosaic: Enabling Application-Transparent Support for Multiple Page Sizes in Throughput Processors.

[BibT_eX]

[DOI]

ACM SIGOPS Oper. Syst. Rev., 2018

Efficient Document Analytics on Compressed Data: Method, Challenges, Algorithms, Insights.

[BibT_eX]

[DOI]

Proc. VLDB Endow., 2018

Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation.

[BibT_eX]

[DOI]

Proc. ACM Meas. Anal. Comput. Syst., 2018

What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study.

[BibT_eX]

[DOI]

Proc. ACM Meas. Anal. Comput. Syst., 2018

ECI-Cache: A High-Endurance and Cost-Efficient I/O Caching Scheme for Virtualized Platforms.

[BibT_eX]

[DOI]

Saba Ahmadian

Hossein Asadi

Proc. ACM Meas. Anal. Comput. Syst., 2018

Iterative Modulo Scheduling.

[BibT_eX]

[DOI]

IEEE Micro, 2018

Evaluating Row Buffer Locality in Future Non-Volatile Main Memories.

[BibT_eX]

[DOI]

Jing Li

CoRR, 2018

Enabling Efficient RDMA-based Synchronous Mirroring of Persistent Memory Transactions.

[BibT_eX]

[DOI]

CoRR, 2018

SLIDER: Fast and Efficient Computation of Banded Sequence Alignment.

[BibT_eX]

[DOI]

CoRR, 2018

D-RaNGe: Violating DRAM Timing Constraints for High-Throughput True Random Number Generation using Commodity DRAM Devices.

[BibT_eX]

[DOI]

CoRR, 2018

Techniques for Efficiently Handling Power Surges in Fuel Cell Powered Data Centers: Modeling, Analysis, Results.

[BibT_eX]

[DOI]

CoRR, 2018

Recent Advances in DRAM and Flash Memory Architectures.

[BibT_eX]

[DOI]

CoRR, 2018

Recent Advances in Overcoming Bottlenecks in Memory Systems and Managing Memory Resources in GPU Systems.

[BibT_eX]

[DOI]

CoRR, 2018

Predictable Performance and Fairness Through Accurate Slowdown Estimation in Shared Main Memory Systems.

[BibT_eX]

[DOI]

CoRR, 2018

Exploiting Row-Level Temporal Locality in DRAM to Reduce the Memory Access Latency.

[BibT_eX]

[DOI]

CoRR, 2018

RowClone: Accelerating Data Movement and Initialization Using DRAM.

[BibT_eX]

[DOI]

CoRR, 2018

Characterizing, Exploiting, and Mitigating Vulnerabilities in MLC NAND Flash Memory Programming.

[BibT_eX]

[DOI]

CoRR, 2018

Read Disturb Errors in MLC NAND Flash Memory.

[BibT_eX]

[DOI]

CoRR, 2018

SoftMC: Practical DRAM Characterization Using an FPGA-Based Infrastructure.

[BibT_eX]

[DOI]

CoRR, 2018

LISA: Increasing Internal Connectivity in DRAM for Fast Data Movement and Low Latency.

[BibT_eX]

[DOI]

CoRR, 2018

Voltron: Understanding and Exploiting the Voltage-Latency-Reliability Trade-Offs in Modern DRAM Chips to Improve Energy Efficiency.

[BibT_eX]

[DOI]

Kevin K. Chang

CoRR, 2018

Flexible-Latency DRAM: Understanding and Exploiting Latency Variation in Modern DRAM Chips.

[BibT_eX]

[DOI]

CoRR, 2018

Tiered-Latency DRAM: Enabling Low-Latency Main Memory at Low Cost.

[BibT_eX]

[DOI]

CoRR, 2018

Adaptive-Latency DRAM: Reducing DRAM Latency by Exploiting Timing Margins.

[BibT_eX]

[DOI]

CoRR, 2018

Experimental Characterization, Optimization, and Recovery of Data Retention Errors in MLC NAND Flash Memory.

[BibT_eX]

[DOI]

CoRR, 2018

Decoupling GPU Programming Models from Resource Management for Enhanced Programming Ease, Portability, and Performance.

[BibT_eX]

[DOI]

CoRR, 2018

Exploiting the DRAM Microarchitecture to Increase Memory-Level Parallelism.

[BibT_eX]

[DOI]

CoRR, 2018

Reducing DRAM Refresh Overheads with Refresh-Access Parallelism.

[BibT_eX]

[DOI]

CoRR, 2018

Mosaic: An Application-Transparent Hardware-Software Cooperative Memory Manager for GPUs.

[BibT_eX]

[DOI]

CoRR, 2018

High-Performance and Energy-Efficient Memory Scheduler Design for Heterogeneous Systems.

[BibT_eX]

[DOI]

CoRR, 2018

A Memory Controller with Row Buffer Locality Awareness for Hybrid Memory Systems.

[BibT_eX]

[DOI]

HanBin Yoon

Rachael A. Harding

CoRR, 2018

Holistic Management of the GPGPU Memory Hierarchy to Manage Warp-level Latency Tolerance.

[BibT_eX]

[DOI]

CoRR, 2018

Zorua: Enhancing Programming Ease, Portability, and Performance in GPUs by Decoupling Programming Models from Resource Management.

[BibT_eX]

[DOI]

CoRR, 2018

Enabling the Adoption of Processing-in-Memory: Challenges, Mechanisms, Future Research Directions.

[BibT_eX]

[DOI]

Kevin Hsieh

Amirali Boroumand

CoRR, 2018

Focus: Querying Large Video Datasets with Low Latency and Low Cost.

[BibT_eX]

[DOI]

Kevin Hsieh

Ganesh Ananthanarayanan

CoRR, 2018

GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies.

[BibT_eX]

[DOI]

BMC Genom., 2018

Focus: Querying Large Video Datasets with Low Latency and Low Cost.

[BibT_eX]

[DOI]

Kevin Hsieh

Ganesh Ananthanarayanan

Peter Bodík

Shivaram Venkataraman

Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018

Reducing DRAM Latency via Charge-Level-Aware Look-Ahead Partial Restoration.

[BibT_eX]

[DOI]

Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Processing data where it makes sense in modern computing systems: Enabling in-memory computation.

[BibT_eX]

[DOI]

Proceedings of the 7th Mediterranean Conference on Embedded Computing, 2018

A Case for Richer Cross-Layer Abstractions: Bridging the Semantic Gap with Expressive Memory.

[BibT_eX]

[DOI]

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

The Locality Descriptor: A Holistic Cross-Layer Abstraction to Express Data Locality In GPUs.

[BibT_eX]

[DOI]

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

FLIN: Enabling Fairness and Enhancing Performance in Modern NVMe Solid State Drives.

[BibT_eX]

[DOI]

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

HICOMB Keynote 2.

[BibT_eX]

[DOI]

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

A Large Scale Study of Data Center Network Reliability.

[BibT_eX]

[DOI]

Tianyin Xu

Kaushik Veeraraghavan

Proceedings of the Internet Measurement Conference 2018, 2018

Zwift: A Programming Framework for High Performance Text Analytics on Compressed Data.

[BibT_eX]

[DOI]

Proceedings of the 32nd International Conference on Supercomputing, 2018

Solar-DRAM: Reducing DRAM Access Latency by Exploiting the Variation in Local Bitlines.

[BibT_eX]

[DOI]

Proceedings of the 36th IEEE International Conference on Computer Design, 2018

HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

The DRAM Latency PUF: Quickly Evaluating Physical Unclonable Functions by Exploiting the Latency-Reliability Tradeoff in Modern Commodity DRAM Devices.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices.

[BibT_eX]

[DOI]

Proceedings of the 16th USENIX Conference on File and Storage Technologies, 2018

VRL-DRAM: improving DRAM performance via variable refresh latency.

[BibT_eX]

[DOI]

Anup Das

Proceedings of the 55th Annual Design Automation Conference, 2018

LTRF: Enabling High-Capacity Register Files for GPUs via Hardware/Software Cooperative Register Prefetching.

[BibT_eX]

[DOI]

Amirhossein Mirhosseini

Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

SPECTR: Formal Supervisory Control and Coordination for Many-core Systems Resource Management.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks.

[BibT_eX]

[DOI]

Amirali Boroumand

Youngsok Kim

Parthasarathy Ranganathan

Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

Slim NoC: A Low-Diameter On-Chip Network Topology for High Energy Efficiency and Scalability.

[BibT_eX]

[DOI]

Syed Minhaj Hassan

Sudhakar Yalamanchili

Torsten Hoefler

Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

MASK: Redesigning the GPU Memory Hierarchy to Support Multi-Application Concurrency.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017

Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms.

[BibT_eX]

[DOI]

Gennady Pekhimenko

Proc. ACM Meas. Anal. Comput. Syst., 2017

Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms.

[BibT_eX]

[DOI]

Kevin K. Chang

Proc. ACM Meas. Anal. Comput. Syst., 2017

Error Characterization, Mitigation, and Recovery in Flash-Memory-Based Solid-State Drives.

[BibT_eX]

[DOI]

Proc. IEEE, 2017

Improving the reliability of chip-off forensic analysis of NAND flash memory devices.

[BibT_eX]

[DOI]

Digit. Investig., 2017

Improving DRAM Performance by Parallelizing Refreshes with Accesses.

[BibT_eX]

[DOI]

CoRR, 2017

Errors in Flash-Memory-Based Solid-State Drives: Analysis, Mitigation, and Recovery.

[BibT_eX]

[DOI]

CoRR, 2017

Improving Multi-Application Concurrency Support Within the GPU Memory System.

[BibT_eX]

[DOI]

CoRR, 2017

Using ECC DRAM to Adaptively Increase Memory Capacity.

[BibT_eX]

[DOI]

CoRR, 2017

Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency.

[BibT_eX]

[DOI]

CoRR, 2017

Understanding Reduced-Voltage Operation in Modern DRAM Chips: Characterization, Analysis, and Mechanisms.

[BibT_eX]

[DOI]

Kevin K. Chang

CoRR, 2017

LazyPIM: Efficient Support for Cache Coherence in Processing-in-Memory Architectures.

[BibT_eX]

[DOI]

CoRR, 2017

A Case for Memory Content-Based Detection and Mitigation of Data-Dependent Failures in DRAM.

[BibT_eX]

[DOI]

IEEE Comput. Archit. Lett., 2017

LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory.

[BibT_eX]

[DOI]

IEEE Comput. Archit. Lett., 2017

GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.

[BibT_eX]

[DOI]

Bioinform., 2017

Chapter Four - Simple Operations in Memory to Reduce Data Movement.

[BibT_eX]

[DOI]

Adv. Comput., 2017

Concurrent Data Structures for Near-Memory Computing.

[BibT_eX]

[DOI]

Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, 2017

Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds.

[BibT_eX]

[DOI]

Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, 2017

Banshee: bandwidth-efficient DRAM caching via software/hardware cooperation.

[BibT_eX]

[DOI]

Xiangyao Yu

Christopher J. Hughes

Nadathur Satish

Srinivas Devadas

Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

Ambit: in-memory accelerator for bulk bitwise operations using commodity DRAM technology.

[BibT_eX]

[DOI]

Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

Detecting and mitigating data-dependent DRAM failures by exploiting current memory content.

[BibT_eX]

[DOI]

Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

Mosaic: a GPU memory manager with application-transparent support for multiple page sizes.

[BibT_eX]

[DOI]

Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

The Reach Profiler (REAPER): Enabling the Mitigation of DRAM Retention Failures via Profiling at Aggressive Conditions.

[BibT_eX]

[DOI]

Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Carpool: a bufferless on-chip network supporting adaptive multicast and hotspot alleviation.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Supercomputing, 2017

SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies.

[BibT_eX]

[DOI]

Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques.

[BibT_eX]

[DOI]

Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

FPGA-Accelerated Dense Linear Machine Learning: A Precision-Convergence Trade-Off.

[BibT_eX]

[DOI]

Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

The RowHammer problem and other issues we may face as memory becomes denser.

[BibT_eX]

[DOI]

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Utility-Based Hybrid Memory Management.

[BibT_eX]

[DOI]

Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

2016

BLISS: Balancing Performance, Fairness and Complexity in Memory Access Scheduling.

[BibT_eX]

[DOI]

IEEE Trans. Parallel Distributed Syst., 2016

RFVP: Rollback-Free Value Prediction with Safe-to-Approximate Loads.

[BibT_eX]

[DOI]

ACM Trans. Archit. Code Optim., 2016

DASH: Deadline-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators.

[BibT_eX]

[DOI]

ACM Trans. Archit. Code Optim., 2016

Simultaneous Multi-Layer Access: Improving 3D-Stacked Memory Bandwidth at Low Cost.

[BibT_eX]

[DOI]

ACM Trans. Archit. Code Optim., 2016

Bounding and reducing memory interference in COTS-based multi-core systems.

[BibT_eX]

[DOI]

Real Time Syst., 2016

A case for hierarchical rings with deflection routing: An energy-efficient on-chip communication substrate.

[BibT_eX]

[DOI]

Parallel Comput., 2016

The 2014 MICRO Test of Time Award Winners: From 1978 to 1992.

[BibT_eX]

[DOI]

IEEE Micro, 2016

Common Bonds: MIPS, HPS, Two-Level Branch Prediction, and Compressed Code RISC Processor.

[BibT_eX]

[DOI]

IEEE Micro, 2016

Enabling Accurate and Practical Online Flash Channel Modeling for Modern MLC NAND Flash Memory.

[BibT_eX]

[DOI]

IEEE J. Sel. Areas Commun., 2016

Mitigating the Memory Bottleneck With Approximate Load Value Prediction.

[BibT_eX]

[DOI]

IEEE Des. Test, 2016

A Framework for Accelerating Bottlenecks in GPU Execution with Assist Warps.

[BibT_eX]

[DOI]

CoRR, 2016

The Processing Using Memory Paradigm: In-DRAM Bulk Copy, Initialization, Bitwise AND and OR.

[BibT_eX]

[DOI]

CoRR, 2016

Buddy-RAM: Improving the Performance and Efficiency of Bulk Bitwise Operations Using DRAM.

[BibT_eX]

[DOI]

CoRR, 2016

Heterogeneous-Reliability Memory: Exploiting Application-Level Memory Error Tolerance.

[BibT_eX]

[DOI]

Badriddine M. Khessib

Kushagra Vaid

CoRR, 2016

Tiered-Latency DRAM (TL-DRAM).

[BibT_eX]

[DOI]

CoRR, 2016

Reducing DRAM Latency by Exploiting Design-Induced Latency Variation in Modern DRAM Chips.

[BibT_eX]

[DOI]

Donghyuk Lee

Samira Manabi Khan

Lavanya Subramanian

CoRR, 2016

Adaptive-Latency DRAM (AL-DRAM).

[BibT_eX]

[DOI]

CoRR, 2016

RowHammer: Reliability Analysis and Security Implications.

[BibT_eX]

[DOI]

CoRR, 2016

Enabling Efficient Dynamic Resizing of Large DRAM Caches via A Hardware Consistent Hashing Mechanism.

[BibT_eX]

[DOI]

CoRR, 2016

Reducing Performance Impact of DRAM Refresh by Parallelizing Refreshes with Accesses.

[BibT_eX]

[DOI]

CoRR, 2016

Achieving both High Energy Efficiency and High Performance in On-Chip Communication using Hierarchical Rings with Deflection Routing.

[BibT_eX]

[DOI]

CoRR, 2016

GateKeeper: Enabling Fast Pre-Alignment in DNA Short Read Mapping with a New Streaming Accelerator Architecture.

[BibT_eX]

[DOI]

CoRR, 2016

Ramulator: A Fast and Extensible DRAM Simulator.

[BibT_eX]

[DOI]

Yoongu Kim

Weikun Yang

IEEE Comput. Archit. Lett., 2016

Optimal seed solver: optimizing seed selection in read mapping.

[BibT_eX]

[DOI]

Bioinform., 2016

Exploiting Core Criticality for Enhanced GPU Performance.

[BibT_eX]

[DOI]

Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2016

Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization.

[BibT_eX]

[DOI]

Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2016

Keynote: rethinking memory system design.

[BibT_eX]

[DOI]

Proceedings of the 2016 International Symposium on Rapid System Prototyping, 2016

Yak: A High-Performance Big-Data-Friendly Garbage Collector.

[BibT_eX]

[DOI]

Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, 2016

NVMOVE: Helping Programmers Move to Byte-Based Persistence.

[BibT_eX]

[DOI]

Proceedings of the 4th Workshop on Interactions of NVM/Flash with Operating Systems and Workloads, 2016

Zorua: A holistic approach to resource virtualization in GPUs.

[BibT_eX]

[DOI]

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Continuous runahead: Transparent hardware acceleration for memory intensive workloads.

[BibT_eX]

[DOI]

Milad Hashemi

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems.

[BibT_eX]

[DOI]

Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Accelerating Dependent Cache Misses with an Enhanced Memory Controller.

[BibT_eX]

[DOI]

Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

A model for Application Slowdown Estimation in on-chip networks and its use for improving system fairness and performance.

[BibT_eX]

[DOI]

Proceedings of the 34th IEEE International Conference on Computer Design, 2016

Accelerating pointer chasing in 3D-stacked memory: Challenges, mechanisms, evaluation.

[BibT_eX]

[DOI]

Proceedings of the 34th IEEE International Conference on Computer Design, 2016

A case for toggle-aware compression for GPU systems.

[BibT_eX]

[DOI]

Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

SizeCap: Efficiently handling power surges in fuel cell powered data centers.

[BibT_eX]

[DOI]

Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

ChargeCache: Reducing DRAM latency by exploiting row access locality.

[BibT_eX]

[DOI]

Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Low-Cost Inter-Linked Subarrays (LISA): Enabling fast inter-subarray data movement in DRAM.

[BibT_eX]

[DOI]

Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

PARBOR: An Efficient System-Level Technique to Detect Data-Dependent Failures in DRAM.

[BibT_eX]

[DOI]

Samira Manabi Khan

Donghyuk Lee

Proceedings of the 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2016

Invited - Who is the major threat to tomorrow's security?: you, the hardware designer.

[BibT_eX]

[DOI]

Wayne P. Burleson

Mohit Tiwari

Proceedings of the 53rd Annual Design Automation Conference, 2016

Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities.

[BibT_eX]

[DOI]

Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

μC-States: Fine-grained GPU Datapath Power Management.

[BibT_eX]

[DOI]

Onur Kayiran

Adwait Jog

Ashutosh Pattnaik

Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015

High-Performance and Lightweight Transaction Support in Flash-Based SSDs.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 2015

Introducing the MICRO Test of Time Awards: Concept, Process, 2014 Winners, and the Future.

[BibT_eX]

[DOI]

Rich Belgard

IEEE Micro, 2015

SQUASH: Simple QoS-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators.

[BibT_eX]

[DOI]

CoRR, 2015

The Blacklisting Memory Scheduler: Balancing Performance, Fairness and Complexity.

[BibT_eX]

[DOI]

CoRR, 2015

Managing Hybrid Main Memories with a Page-Utility Driven Performance Model.

[BibT_eX]

[DOI]

CoRR, 2015

Simultaneous Multi Layer Access: A High Bandwidth and Low Cost 3D-Stacked Memory Interface.

[BibT_eX]

[DOI]

CoRR, 2015

Fast Bulk Bitwise AND and OR in DRAM.

[BibT_eX]

[DOI]

IEEE Comput. Archit. Lett., 2015

Toggle-Aware Compression for GPUs.

[BibT_eX]

[DOI]

IEEE Comput. Archit. Lett., 2015

Shifted Hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping.

[BibT_eX]

[DOI]

Bioinform., 2015

A-DRM: Architecture-aware Distributed Resource Management of Virtualized Clusters.

[BibT_eX]

[DOI]

Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2015

A Large-Scale Study of Flash Memory Failures in the Field.

[BibT_eX]

[DOI]

Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2015

Rethinking memory system design for data-intensive computing.

[BibT_eX]

[DOI]

Proceedings of the 2015 International Conference on Embedded Computer Systems: Architectures, 2015

A Low-Overhead, Fully-Distributed, Guaranteed-Delivery Routing Algorithm for Faulty Network-on-Chips.

[BibT_eX]

[DOI]

Mohammad Fattah

Antti Airola

Proceedings of the 9th International Symposium on Networks-on-Chip, 2015

WARM: Improving NAND flash memory lifetime with write-hotness aware retention management.

[BibT_eX]

[DOI]

Proceedings of the IEEE 31st Symposium on Mass Storage Systems and Technologies, 2015

Amnesic cache management for non-volatile memory.

[BibT_eX]

[DOI]

Proceedings of the IEEE 31st Symposium on Mass Storage Systems and Technologies, 2015

The application slowdown model: quantifying and controlling the impact of inter-application interference at shared caches and main memory.

[BibT_eX]

[DOI]

Proceedings of the 48th International Symposium on Microarchitecture, 2015

Gather-scatter DRAM: in-DRAM address translation to improve the spatial locality of non-unit strided accesses.

[BibT_eX]

[DOI]

Proceedings of the 48th International Symposium on Microarchitecture, 2015

ThyNVM: enabling software-transparent crash consistency in persistent memory systems.

[BibT_eX]

[DOI]

Proceedings of the 48th International Symposium on Microarchitecture, 2015

Rethinking Memory System Design (along with Interconnects).

[BibT_eX]

[DOI]

Proceedings of the 8th International Workshop on Network on Chip Architectures, 2015

Comparative evaluation of FPGA and ASIC implementations of bufferless and buffered routing algorithms for on-chip networks.

[BibT_eX]

[DOI]

Yu Cai

Ken Mai

Proceedings of the Sixteenth International Symposium on Quality Electronic Design, 2015

A case for core-assisted bottleneck acceleration in GPUs: enabling flexible data compression with assist warps.

[BibT_eX]

[DOI]

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Page overlays: an enhanced virtual memory framework to enable fine-grained memory management.

[BibT_eX]

[DOI]

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

PIM-enabled instructions: a low-overhead, locality-aware processing-in-memory architecture.

[BibT_eX]

[DOI]

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

A scalable processing-in-memory accelerator for parallel graph processing.

[BibT_eX]

[DOI]

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Exploiting compressed block size as an indicator of future reuse.

[BibT_eX]

[DOI]

Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Adaptive-latency DRAM: Optimizing DRAM timing for the common-case.

[BibT_eX]

[DOI]

Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Data retention in MLC NAND flash memory: Characterization, optimization, and recovery.

[BibT_eX]

[DOI]

Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

AVATAR: A Variable-Retention-Time (VRT) Aware Refresh for DRAM Systems.

[BibT_eX]

[DOI]

Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015

Revisiting Memory Errors in Large-Scale Production Data Centers: Analysis and Modeling of New Trends from the Field.

[BibT_eX]

[DOI]

Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015

Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery.

[BibT_eX]

[DOI]

Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015

Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM.

[BibT_eX]

[DOI]

Donghyuk Lee

Lavanya Subramanian

Jongmoo Choi

Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance.

[BibT_eX]

[DOI]

Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

2014

Efficient Data Mapping and Buffering Techniques for Multilevel Cell Phase-Change Memories.

[BibT_eX]

[DOI]

ACM Trans. Archit. Code Optim., 2014

Mitigating Prefetcher-Caused Pollution Using Informed Caching Policies for Prefetched Blocks.

[BibT_eX]

[DOI]

ACM Trans. Archit. Code Optim., 2014

Research Problems and Opportunities in Memory Systems.

[BibT_eX]

[DOI]

Lavanya Subramanian

Supercomput. Front. Innov., 2014

The efficacy of error mitigation techniques for DRAM retention failures: a comparative experimental study.

[BibT_eX]

[DOI]

Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2014

Neighbor-cell assisted error correction for MLC NAND flash memories.

[BibT_eX]

[DOI]

Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2014

Design and Evaluation of Hierarchical Rings with Deflection Routing.

[BibT_eX]

[DOI]

Proceedings of the 26th IEEE International Symposium on Computer Architecture and High Performance Computing, 2014

Bounding memory interference delay in COTS-based multi-core systems.

[BibT_eX]

[DOI]

Proceedings of the 20th IEEE Real-Time and Embedded Technology and Applications Symposium, 2014

FIRM: Fair and High-Performance Memory Control for Persistent Memory Systems.

[BibT_eX]

[DOI]

Jishen Zhao

Nachiappan Chidambaram Nachiappan

Yuan Xie

Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

Managing GPU Concurrency in Heterogeneous Architectures.

[BibT_eX]

[DOI]

Onur Kayiran

Adwait Jog

Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

The Dirty-Block Index.

[BibT_eX]

[DOI]

Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors.

[BibT_eX]

[DOI]

Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

The Blacklisting Memory Scheduler: Achieving high performance and fairness at low cost.

[BibT_eX]

[DOI]

Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

Loose-Ordering Consistency for persistent memory.

[BibT_eX]

[DOI]

Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

The heterogeneous block architecture.

[BibT_eX]

[DOI]

Chris Fallin

Chris Wilkerson

Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

Improving cache performance using read-write partitioning.

[BibT_eX]

[DOI]

Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

Improving DRAM performance by parallelizing refreshes with accesses.

[BibT_eX]

[DOI]

Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory.

[BibT_eX]

[DOI]

Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2014

Rollback-free value prediction with approximate loads.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

Warp-aware trace scheduling for GPUs.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

Memory Systems.

[BibT_eX]

Yoongu Kim

Proceedings of the Computing Handbook, 2014

2013

Accelerating read mapping with FastHASH.

[BibT_eX]

[DOI]

BMC Genom., 2013

RowClone: fast and energy-efficient in-DRAM bulk data copy and initialization.

[BibT_eX]

[DOI]

Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

Linearly compressed pages: a low-complexity, low-latency main memory compression framework.

[BibT_eX]

[DOI]

Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

EMERALD: Characterization of emerging applications and algorithms for low-power devices.

[BibT_eX]

[DOI]

Vijaykrishnan Narayanan

Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

Evaluating STT-RAM as an energy-efficient main memory alternative.

[BibT_eX]

[DOI]

Emre Kultursay

Mahmut T. Kandemir

Anand Sivasubramaniam

Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

An experimental study of data retention behavior in modern DRAM devices: implications for retention time profiling mechanisms.

[BibT_eX]

[DOI]

Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

Orchestrated scheduling and prefetching for GPGPUs.

[BibT_eX]

[DOI]

Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

Utility-based acceleration of multithreaded applications on asymmetric CMPs.

[BibT_eX]

[DOI]

Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

LightTx: A lightweight transactional design in flash-based SSDs to support flexible transactions.

[BibT_eX]

[DOI]

Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

Program interference in MLC NAND flash memory: Characterization, modeling, and mitigation.

[BibT_eX]

[DOI]

Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

MISE: Providing performance predictability and improving fairness in shared main memory systems.

[BibT_eX]

[DOI]

Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

Tiered-latency DRAM: A low latency and low cost DRAM architecture.

[BibT_eX]

[DOI]

Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

Application-to-core mapping policies to reduce memory system interference in multi-core systems.

[BibT_eX]

[DOI]

Reetuparna Das

Akhilesh Kumar

Mani Azimi

Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

Threshold voltage distribution in MLC NAND flash memory: characterization, analysis, and modeling.

[BibT_eX]

[DOI]

Proceedings of the Design, Automation and Test in Europe, 2013

A heterogeneous multiple network-on-chip design: an application-aware approach.

[BibT_eX]

[DOI]

Asit K. Mishra

Nachiappan Chidambaram Nachiappan

Chita R. Das

Proceedings of the 50th Annual Design Automation Conference 2013, 2013

OWL: cooperative thread array aware scheduling techniques for improving GPGPU performance.

[BibT_eX]

[DOI]

Adwait Jog

Onur Kayiran

Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

2012

Fairness via Source Throttling: A Configurable and High-Performance Fairness Substrate for Multicore Memory Systems.

[BibT_eX]

[DOI]

ACM Trans. Comput. Syst., 2012

A QoS-Enabled On-Die Interconnect Fabric for Kilo-Node Chips.

[BibT_eX]

[DOI]

IEEE Micro, 2012

Enabling Efficient and Scalable Hybrid Memories Using Fine-Granularity DRAM Cache Management.

[BibT_eX]

[DOI]

Parthasarathy Ranganathan

IEEE Comput. Archit. Lett., 2012

On-chip networks from a networking perspective: congestion and scalability in many-core interconnects.

[BibT_eX]

[DOI]

Proceedings of the ACM SIGCOMM 2012 Conference, 2012

HAT: Heterogeneous Adaptive Throttling for On-Chip Networks.

[BibT_eX]

[DOI]

Kevin Kai-Wei Chang

Chris Fallin

Proceedings of the IEEE 24th International Symposium on Computer Architecture and High Performance Computing, 2012

MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect.

[BibT_eX]

[DOI]

Proceedings of the 2012 Sixth IEEE/ACM International Symposium on Networks-on-Chip (NoCS), 2012

RAIDR: Retention-aware intelligent DRAM refresh.

[BibT_eX]

[DOI]

Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012

A case for exploiting subarray-level parallelism (SALP) in DRAM.

[BibT_eX]

[DOI]

Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012

Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems.

[BibT_eX]

[DOI]

Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012

Row buffer locality aware caching policies for hybrid memories.

[BibT_eX]

[DOI]

HanBin Yoon

Rachael Harding

Proceedings of the 30th International IEEE Conference on Computer Design, 2012

A case for small row buffers in non-volatile main memories.

[BibT_eX]

[DOI]

Jing Li

Proceedings of the 30th International IEEE Conference on Computer Design, 2012

Flash correct-and-refresh: Retention-aware error management for increased flash memory lifetime.

[BibT_eX]

[DOI]

Proceedings of the 30th International IEEE Conference on Computer Design, 2012

Error patterns in MLC NAND flash memory: Measurement, characterization, and analysis.

[BibT_eX]

[DOI]

Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

Bottleneck identification and scheduling in multithreaded applications.

[BibT_eX]

[DOI]

Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, 2012

The evicted-address filter: a unified mechanism to address both cache pollution and thrashing.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

Base-delta-immediate compression: practical data compression for on-chip caches.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

Linearly compressed pages: a main memory compression framework with low complexity and low latency.

[BibT_eX]

[DOI]

Gennady Pekhimenko

Todd C. Mowry

Nachiappan Chidambaram Nachiappan

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

Application-aware prefetch prioritization in on-chip networks.

[BibT_eX]

[DOI]

Asit K. Mishra

Mahmut T. Kandemir

Anand Sivasubramaniam

Chita R. Das

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

Application-to-core mapping policies to reduce memory interference in multi-core systems.

[BibT_eX]

[DOI]

Reetuparna Das

Akhilesh Kumar

Mani Azimi

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011

Prefetch-Aware Memory Controllers.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 2011

Data Marshaling for Multicore Systems.

[BibT_eX]

[DOI]

IEEE Micro, 2011

Top Picks [Guest editors' introduction].

[BibT_eX]

[DOI]

IEEE Micro, 2011

Thread Cluster Memory Scheduling.

[BibT_eX]

[DOI]

IEEE Micro, 2011

Aérgia: A Network-on-Chip Exploiting Packet Latency Slack.

[BibT_eX]

[DOI]

IEEE Micro, 2011

FIST: A fast, lightweight, FPGA-friendly packet latency estimator for NoC modeling in full-system simulations.

[BibT_eX]

[DOI]

Michael Papamichael

James C. Hoe

Proceedings of the NOCS 2011, 2011

Improving GPU performance via large warps and two-level warp scheduling.

[BibT_eX]

[DOI]

Veynu Narasiman

Michael Shebanow

Chang Joo Lee

Rustam Miftakhutdinov

Sai Prashanth Muralidhara

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011

Reducing memory interference in multicore systems via application-aware memory channel partitioning.

[BibT_eX]

[DOI]

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011

Parallel application memory scheduling.

[BibT_eX]

[DOI]

Eiman Ebrahimi

Rustam Miftakhutdinov

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011

Memory systems in the many-core era: challenges, opportunities, and solution directions.

[BibT_eX]

[DOI]

Proceedings of the 10th International Symposium on Memory Management, 2011

Kilo-NOC: a heterogeneous network-on-chip architecture for scalability and service guarantees.

[BibT_eX]

[DOI]

Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

Prefetch-aware shared resource management for multi-core systems.

[BibT_eX]

[DOI]

Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

Poster: revisiting virtual channel memory for performance and fairness on multi-core architecture.

[BibT_eX]

[DOI]

Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

Memory power management via dynamic voltage/frequency scaling.

[BibT_eX]

[DOI]

Proceedings of the 8th International Conference on Autonomic Computing, 2011

CHIPPER: A low-complexity bufferless deflection router.

[BibT_eX]

[DOI]

Chris Fallin

Chris Craik

Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

2010

Accelerating Critical Section Execution with Asymmetric Multicore Architectures.

[BibT_eX]

[DOI]

IEEE Micro, 2010

Phase-Change Technology and the Future of Main Memory.

[BibT_eX]

[DOI]

IEEE Micro, 2010

Phase change memory architecture and the quest for scalability.

[BibT_eX]

[DOI]

Commun. ACM, 2010

Concurrent autonomous self-test for uncore components in system-on-chips.

[BibT_eX]

[DOI]

Proceedings of the 28th IEEE VLSI Test Symposium, 2010

QuaLe: A Quantum-Leap Inspired Model for Non-stationary Analysis of NoC Traffic in Chip Multi-processors.

[BibT_eX]

[DOI]

Proceedings of the NOCS 2010, 2010

Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior.

[BibT_eX]

[DOI]

Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010

Data marshaling for multi-core architectures.

[BibT_eX]

[DOI]

Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

Topology-Aware Quality-of-Service Support in Highly Integrated Chip Multiprocessors.

[BibT_eX]

[DOI]

Boris Grot

Stephen W. Keckler

Proceedings of the Computer Architecture, 2010

Aérgia: exploiting packet latency slack in on-chip networks.

[BibT_eX]

[DOI]

Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers.

[BibT_eX]

[DOI]

Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010

Next generation on-chip networks: what kind of congestion control do we need?

[BibT_eX]

[DOI]

Proceedings of the 9th ACM Workshop on Hot Topics in Networks. HotNets 2010, Monterey, CA, USA - October 20, 2010

Fairness via source throttling: a configurable and high-performance fairness substrate for multi-core memory systems.

[BibT_eX]

[DOI]

Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, 2010

Efficient runahead threads.

[BibT_eX]

[DOI]

Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009

Virtual Program Counter (VPC) Prediction: Very Low Cost Indirect Branch Prediction Using Conditional Branch Prediction Hardware.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 2009

A Flexible Software-Based Framework for Online Detection of Hardware Defects.

[BibT_eX]

[DOI]

Kypros Constantinides

Todd M. Austin

Valeria Bertacco

IEEE Trans. Computers, 2009

Parallelism-Aware Batch Scheduling: Enabling High-Performance and Fair Shared Memory Controllers.

[BibT_eX]

[DOI]

IEEE Micro, 2009

Improving memory bank-level parallelism in the presence of prefetching.

[BibT_eX]

[DOI]

Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

Preemptive virtual clock: a flexible, efficient, and cost-effective QOS scheme for networks-on-chip.

[BibT_eX]

[DOI]

Boris Grot

Stephen W. Keckler

Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

Coordinated control of multiple prefetchers in multi-core systems.

[BibT_eX]

[DOI]

Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

Application-aware prioritization mechanisms for on-chip networks.

[BibT_eX]

[DOI]

Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

A case for bufferless routing in on-chip networks.

[BibT_eX]

[DOI]

Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

Architecting phase change memory as a scalable DRAM alternative.

[BibT_eX]

[DOI]

Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

Flexible reference-counting-based hardware acceleration for garbage collection.

[BibT_eX]

[DOI]

José A. Joao

Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

Operating system scheduling for efficient online self-test in robust systems.

[BibT_eX]

[DOI]

Yanjing Li

Subhasish Mitra

Proceedings of the 2009 International Conference on Computer-Aided Design, 2009

Express Cube Topologies for on-Chip Interconnects.

[BibT_eX]

[DOI]

Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

Techniques for bandwidth-efficient prefetching of linked data structures in hybrid prefetching systems.

[BibT_eX]

[DOI]

Eiman Ebrahimi

Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

Accelerating critical section execution with asymmetric multi-core architectures.

[BibT_eX]

[DOI]

Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, 2009

2008

Guest Editors' Introduction: Interaction of Many-Core Computer Architecture and Operating Systems.

[BibT_eX]

[DOI]

Sangyeun Cho

Tao Li

IEEE Micro, 2008

Dynamic Predication of Indirect Jumps.

[BibT_eX]

[DOI]

IEEE Comput. Archit. Lett., 2008

Distributed order scheduling and its application to multi-core DRAM controllers.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Seventh Annual ACM Symposium on Principles of Distributed Computing, 2008

Prefetch-Aware DRAM Controllers.

[BibT_eX]

[DOI]

Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-41 2008), 2008

Online design bug detection: RTL analysis, flexible mechanisms, and evaluation.

[BibT_eX]

[DOI]

Kypros Constantinides

Todd M. Austin

Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-41 2008), 2008

Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems.

[BibT_eX]

[DOI]

Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008

Self-Optimizing Memory Controllers: A Reinforcement Learning Approach.

[BibT_eX]

[DOI]

Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008

Performance-aware speculation control using wrong path usefulness prediction.

[BibT_eX]

[DOI]

Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008

Improving the performance of object-oriented languages with dynamic predication of indirect jumps.

[BibT_eX]

[DOI]

Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, 2008

2007

Diverge-Merge Processor: Generalized and Energy-Efficient Dynamic Predication.

[BibT_eX]

[DOI]

IEEE Micro, 2007

Memory Performance Attacks: Denial of Memory Service in Multi-Core Systems.

[BibT_eX]

[DOI]

Proceedings of the 16th USENIX Security Symposium, Boston, MA, USA, August 6-10, 2007, 2007

Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors.

[BibT_eX]

[DOI]

Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-40 2007), 2007

Software-Based Online Detection of Hardware Defects: Mechanisms, Architectural Support, and Evaluation.

[BibT_eX]

[DOI]

Kypros Constantinides

Todd M. Austin

Valeria Bertacco

Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-40 2007), 2007

VPC prediction: reducing the cost of indirect branches via hardware-based dynamic devirtualization.

[BibT_eX]

[DOI]

Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers.

[BibT_eX]

[DOI]

Proceedings of the 13th International Conference on High-Performance Computer Architecture (HPCA-13 2007), 2007

Profile-assisted Compiler Support for Dynamic Predication in Diverge-Merge Processors.

[BibT_eX]

[DOI]

Proceedings of the Fifth International Symposium on Code Generation and Optimization (CGO 2007), 2007

2006

Address-Value Delta (AVD) Prediction: A Hardware Technique for Efficiently Parallelizing Dependent Cache Misses.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 2006

Efficient Runahead Execution: Power-Efficient Memory Latency Tolerance.

[BibT_eX]

[DOI]

IEEE Micro, 2006

Wish Branches: Enabling Adaptive and Aggressive Predicated Execution.

[BibT_eX]

[DOI]

IEEE Micro, 2006

Diverge-Merge Processor (DMP): Dynamic Predicated Execution of Complex Control-Flow Graphs Based on Frequently Executed Paths.

[BibT_eX]

[DOI]

Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006

A Case for MLP-Aware Cache Replacement.

[BibT_eX]

[DOI]

Proceedings of the 33rd International Symposium on Computer Architecture (ISCA 2006), 2006

2D-Profiling: Detecting Input-Dependent Branches with a Single Input Data Set.

[BibT_eX]

[DOI]

Proceedings of the Fourth IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2006), 2006

2005

An Analysis of the Performance Impact of Wrong-Path Memory References on Out-of-Order and Runahead Execution Processors.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 2005

Using the First-Level Caches as Filters to Reduce the Pollution Caused by Speculative Memory References.

[BibT_eX]

[DOI]

Int. J. Parallel Program., 2005

On Reusing the Results of Pre-Executed Instructions in a Runahead Execution Processor.

[BibT_eX]

[DOI]

IEEE Comput. Archit. Lett., 2005

Address-Value Delta (AVD) Prediction: Increasing the Effectiveness of Runahead Execution by Exploiting Regular Memory Allocation Patterns.

[BibT_eX]

[DOI]

Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005

Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution.

[BibT_eX]

[DOI]

Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005

Techniques for Efficient Processing in Runahead Execution Engines.

[BibT_eX]

[DOI]

Proceedings of the 32nd International Symposium on Computer Architecture (ISCA 2005), 2005

Microarchitecture-Based Introspection: A Technique for Transient-Fault Tolerance in Microprocessors.

[BibT_eX]

[DOI]

Moinuddin K. Qureshi