Keren Zhou

ORCID: 0000-0002-7977-3182

Affiliations:
  • George Mason University, VA, USA
  • OpenAI
  • Rice University, TX, USA (PhD)


According to our database, Keren Zhou authored at least 25 papers between 2015 and 2024.

Bibliography

2024
Centimani: Enabling Fast AI Accelerator Selection for DNN Training with a Novel Performance Predictor.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024

FASTEN: Fast GPU-accelerated Segmented Matrix Multiplication for Heterogenous Graph Neural Networks.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024


2023
Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing.
Proceedings of the International Conference on Machine Learning, 2023

DrGPUM: Guiding Memory Optimization for GPU-Accelerated Applications.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
An Automated Tool for Analysis and Tuning of GPU-Accelerated Code in HPC Applications.
IEEE Trans. Parallel Distributed Syst., 2022

Paw-Net: Stacking ensemble deep learning for segmenting scanning electron microscopy images of fine-grained shale samples.
Comput. Geosci., 2022

Efficient model compression with Random Operation Access Specific Tile (ROAST) hashing.
CoRR, 2022

Accelerating high-order stencils on GPUs.
Concurr. Comput. Pract. Exp., 2022

Low overhead and context sensitive profiling of GPU-accelerated applications.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

ValueExpert: exploring value patterns in GPU-accelerated applications.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021
Measurement and analysis of GPU-accelerated applications with HPCToolkit.
Parallel Comput., 2021

Measurement and Analysis of GPU-Accelerated OpenCL Computations on Intel GPUs.
Proceedings of the IEEE/ACM International Workshop on Programming and Performance Visualization Tools, 2021



GPA: A GPU Performance Advisor Based on Instruction Sampling.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2021

2020
GVProf: a value profiler for GPU-based clusters.
Proceedings of the International Conference for High Performance Computing, 2020

A tool for top-down performance analysis of GPU-accelerated applications.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

Tools for top-down performance analysis of GPU-accelerated applications.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

2019
A Tool for Performance Analysis of GPU-Accelerated Applications.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

2018
Quadboost: A Scalable Concurrent Quadtree.
IEEE Trans. Parallel Distributed Syst., 2018

2017
Understanding the GPU Microarchitecture to Achieve Bare-Metal Performance Tuning.
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

A performance analysis framework for exploiting GPU microarchitectural capability.
Proceedings of the International Conference on Supercomputing, 2017

2015
Multi-Classes Feature Engineering with Sliding Window for Purchase Prediction in Mobile Commerce.
Proceedings of the IEEE International Conference on Data Mining Workshop, 2015

BF-MapReduce: A Bloom Filter Based Efficient Lightweight Search.
Proceedings of the IEEE Conference on Collaboration and Internet Computing, 2015

