Peng Jiang

Orcid: 0000-0001-7743-6062

Affiliations:
  • University of Iowa, Iowa City, IA, USA
  • Ohio State University, Department of Computer Science and Engineering, OH, USA (PhD 2019)


According to our database, Peng Jiang authored at least 30 papers between 2016 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
GCSM: GPU-Accelerated Continuous Subgraph Matching for Large Graphs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

cuKE: An Efficient Code Generator for Score Function Computation in Knowledge Graph Embedding.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

CereSZ: Enabling and Scaling Error-bounded Lossy Compression on Cerebras CS-2.
Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, 2024

2023
PIMMiner: A High-performance PIM Architecture-aware Graph Mining Framework.
CoRR, 2023

End-to-End LU Factorization of Large Matrices on GPUs.
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

2022
STMatch: Accelerating Graph Pattern Matching on GPU with Stack-Based Loop Optimizations.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

Exposing and Exploiting Fine-Grained Block Structures for Fast and Accurate Sparse Training.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Scaling and Selecting GPU Methods for All Pairs Shortest Paths (APSP) Computations.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Rethinking graph data placement for graph neural network training on multiple GPUs.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

SampleMine: A Framework for Applying Random Sampling to Subgraph Pattern Mining through Loop Perforation.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021
Communication-Efficient Sampling for Distributed Training of Graph Convolutional Networks.
CoRR, 2021

An Efficient Graph Mining System for Large Patterns.
CoRR, 2021

Exploring PIM Architecture for High-Performance Graph Pattern Mining.
IEEE Comput. Archit. Lett., 2021

Scaling Sparse Matrix Multiplication on CPU-GPU Nodes.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

2020
Combining SIMD and Many/Multi-core Parallelism for Finite-state Machines with Enumerative Speculation.
ACM Trans. Parallel Comput., 2020

Adaptive Periodic Averaging: A Practical Approach to Reducing Communication in Distributed Learning.
CoRR, 2020

Scaling out speculative execution of finite-state machines with parallel merge.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

A novel data transformation and execution strategy for accelerating sparse matrix multiplication on GPUs.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

Accelerating Sparse CNN Inference on GPUs with Performance-Aware Weight Pruning.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019
Accelerating distributed stochastic gradient descent with adaptive periodic parameter averaging: poster.
Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019

Enabling prefix sum parallelism pattern for recurrences with principled function reconstruction.
Proceedings of the 28th International Conference on Compiler Construction, 2019

A Methodology for Characterizing Sparse Datasets and Its Application to SIMD Performance Prediction.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
Revealing parallel scans and reductions in sequential loops through function reconstruction.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

A Linear Speedup Analysis of Distributed Deep Learning with Sparse and Quantized Communication.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Conflict-free vectorization of associative irregular applications with recent SIMD architectural advances.
Proceedings of the 2018 International Symposium on Code Generation and Optimization, 2018

Revealing parallel scans and reductions in recurrences through function reconstruction.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2017
Combining SIMD and Many/Multi-core Parallelism for Finite State Machines with Enumerative Speculation.
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

Efficient SIMD and MIMD parallelization of hash-based aggregation by conflict mitigation.
Proceedings of the International Conference on Supercomputing, 2017

2016
Reusing Data Reorganization for Efficient SIMD Parallelization of Adaptive Irregular Applications.
Proceedings of the 2016 International Conference on Supercomputing, 2016

Exploiting recent SIMD architectural advances for irregular applications.
Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016

