Carole-Jean Wu

Raghuraman Krishnamoorthi

Proceedings of the IEEE International Symposium on Workload Characterization, 2022

Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

A joint management middleware to improve training performance of deep recommendation systems with SSDs.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

RecShard: statistical feature-based memory optimization for industry-scale neural recommendation.

Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021

Exploiting Parallelism Opportunities with Deep Learning Frameworks.

ACM Trans. Archit. Code Optim., 2021

Dynamic Temperature Management of Near-Sensor Processing for Energy-Efficient High-Fidelity Imaging.

Sensors, 2021

The Vision Behind MLPerf: Understanding AI Inference Performance.

IEEE Micro, 2021

Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale.

IEEE Micro, 2021

SecNDP: Secure Near-Data Processing with Untrusted Memory.

IACR Cryptol. ePrint Arch., 2021

Sustainable AI: Environmental Implications, Challenges and Opportunities.

CoRR, 2021

Understanding and Co-designing the Data Ingestion Pipeline for Industry-Scale RecSys Training.

CoRR, 2021

Socio-Technological Challenges and Opportunities: Paths Forward.

Parthasarathy Ranganathan

Srilatha Manne

Sarah Bird

Shane Greenstein

CoRR, 2021

SVP-CF: Selection via Proxy for Collaborative Filtering Data.

Noveen Sachdeva

Julian J. McAuley

CoRR, 2021

Energy-Efficient Mapping for a Network of DNN Models at the Edge.

Proceedings of the IEEE International Conference on Smart Computing, 2021

TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models.

Proceedings of the Fourth Conference on Machine Learning and Systems, 2021

Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery.

Proceedings of the Fourth Conference on Machine Learning and Systems, 2021

AutoFL: Enabling Heterogeneity-Aware Energy Efficient Federated Learning.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Understanding Capacity-Driven Scale-Out Neural Recommendation Inference.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021

Understanding Training Efficiency of Deep Learning Recommendation Models at Scale.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

RecSSD: near data processing for solid state drive based recommendation inference.

Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

2020

GEVO: GPU Code Optimization Using Evolutionary Computation.

ACM Trans. Archit. Code Optim., 2020

MLPerf: An Industry Standard Benchmark Suite for Machine Learning Performance.

IEEE Micro, 2020

CPR: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery.

CoRR, 2020

AutoScale: Optimizing Energy Efficiency of End-to-End Edge Inference under Stochastic Variance.

CoRR, 2020

Developing a Recommendation Benchmark for MLPerf Training and Inference.

CoRR, 2020

MLPerf Training Benchmark.

Proceedings of the Third Conference on Machine Learning and Systems, 2020

AutoScale: Energy Efficiency Optimization for Stochastic Edge Inference Using Reinforcement Learning.

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

MLPerf Inference Benchmark.

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing.

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

DeepRecSys: A System for Optimizing End-To-End At-Scale Neural Recommendation Inference.

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Cross-Stack Workload Characterization of Deep Recommendation Systems.

Proceedings of the IEEE International Symposium on Workload Characterization, 2020

The Architectural Implications of Facebook's DNN-Based Personalized Recommendation.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

GEVO-ML: a proposal for optimizing ML code with evolutionary computation.

Proceedings of the GECCO '20: Genetic and Evolutionary Computation Conference, 2020

Emerging Neural Workloads and Their Impact on Hardware.

Ann Franchesca Laguna

Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

2019

Optimizing User Satisfaction of Mobile Workloads Subject to Various Sources of Uncertainties.

Benjamin Gaudette

Sarma B. K. Vrudhula

IEEE Trans. Mob. Comput., 2019

Configurable-ECC: Architecting a Flexible ECC Scheme to Support Different Sized Accesses in High Bandwidth Memory Systems.

IEEE Trans. Computers, 2019

MLPerf Training Benchmark.

CoRR, 2019

The Architectural Implications of Facebook's DNN-based Personalized Recommendation.

CoRR, 2019

Deep Learning Recommendation Model for Personalization and Recommendation Systems.

CoRR, 2019

Genetic improvement of GPU code.

Jhe-Yu Liou

Stephanie Forrest

Proceedings of the 6th International Workshop on Genetic Improvement, 2019

Machine Learning at Facebook: Understanding Inference at the Edge.

Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

Understanding the Future of Energy Efficiency in Multi-Module GPUs.

Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

2018

DORA: Optimizing Smartphone Energy Efficiency and Web Browser Performance under Interference.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2018

LATTE-CC: Latency Tolerance Aware Adaptive Cache Compression Management for Energy Efficient GPUs.

Vignesh Soundararajan

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

2017

MCM-GPU: Multi-Chip-Module GPUs for Continued Performance Scalability.

Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Understanding the thermal challenges of high-performance mobile devices with a detailed platform temperature model.

Ying-Ju Yu

Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

Performance characterization, prediction, and optimization for heterogeneous systems with multi-level memory interference.

Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

2016

Using Low Cost Erasure and Error Correction Schemes to Improve Reliability of Commodity DRAM Systems.

IEEE Trans. Computers, 2016

RATT-ECC: Rate Adaptive Two-Tiered Error Correction Codes for Reliable 3D Die-Stacked Memory.

ACM Trans. Archit. Code Optim., 2016

ID-cache: instruction and memory divergence based cache management for GPUs.

Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

Ctrl-C: Instruction-Aware Control Loop Based Adaptive Cache Bypassing for GPUs.

Proceedings of the 34th IEEE International Conference on Computer Design, 2016

Improving smartphone user experience by balancing performance and energy with probabilistic QoS guarantee.

Benjamin Gaudette

Sarma B. K. Vrudhula

Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

2015

E-ECC: Low Power Erasure and Error Correction Schemes for Increasing Reliability of Commodity DRAM Systems.

Proceedings of the 2015 International Symposium on Memory Systems, 2015

A study of mobile device utilization.

Proceedings of the 2015 IEEE International Symposium on Performance Analysis of Systems and Software, 2015

CAWA: coordinated warp scheduling and cache prioritization for critical warp acceleration of GPGPU workloads.

Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Characterization and Throttling-Based Mitigation of Memory Interference for Heterogeneous Smartphones.

Davesh Shingari

Proceedings of the 2015 IEEE International Symposium on Workload Characterization, 2015

2014

STEAM: A Smart Temperature and Energy Aware Multicore Controller.

ACM Trans. Embed. Comput. Syst., 2014

Architectural Thermal Energy Harvesting Opportunities for Sustainable Computing.

IEEE Comput. Archit. Lett., 2014

Characterizing the latency hiding ability of GPUs.

Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software, 2014

Quantifying the energy cost of data movement for emerging smart phone workloads on mobile platforms.

Dhinakaran Pandiyan

Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

ReMAP: Reuse and memory access cost aware eviction policy for last level cache management.

Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

Quantitative Analysis of Control Flow Checking Mechanisms for Soft Errors.

Aviral Shrivastava

Abhishek Rhisheekesan

Reiley Jeyapaul

Proceedings of the 51st Annual Design Automation Conference 2014, 2014

CAWS: criticality-aware warp scheduling for GPGPU workloads.

Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013

Performance, energy characterizations and architectural implications of an emerging mobile platform benchmark suite - MobileBench.

Dhinakaran Pandiyan

Proceedings of the IEEE International Symposium on Workload Characterization, 2013

2011

Adaptive timekeeping replacement: Fine-grained capacity management for shared CMP caches.

Margaret Martonosi

ACM Trans. Archit. Code Optim., 2011

PACMan: prefetch-aware cache management for high performance caching.

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011

SHiP: signature-based hit predictor for high performance caching.

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011

Characterization and dynamic mitigation of intra-application cache interference.
