Kun Wu

Orcid: 0000-0002-0149-1409

Affiliations:
  • University of Illinois Urbana-Champaign, IL, USA


According to our database, Kun Wu authored at least 13 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
TBA: Faster Large Language Model Training Using SSD-Based Activation Offloading.
CoRR, 2024

LSM-GNN: Large-scale Storage-based Multi-GPU GNN Training by Optimizing Data Transfer Scheme.
CoRR, 2024

Hector: An Efficient Programming and Compilation Framework for Implementing Relational Graph Neural Networks in GPU Architectures.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
PIGEON: Optimizing CUDA Code Generator for End-to-End Training and Inference of Relational Graph Neural Networks.
CoRR, 2023

2022
Graph Neural Network Training and Data Tiering.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

2021
PyLog: An Algorithm-Centric Python-Based FPGA Programming and Synthesis Flow.
IEEE Trans. Computers, 2021

Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture.
Proc. VLDB Endow., 2021

Graph Neural Network Training with Data Tiering.
CoRR, 2021

PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses.
CoRR, 2021

A Python-based High-Level Programming Flow for CPU-FPGA Heterogeneous Systems (Invited Paper).
Proceedings of the IEEE/ACM Programming Environments for Heterogeneous Computing, 2021

TEMPI: An Interposed MPI Library with a Canonical Representation of CUDA-aware Datatypes.
Proceedings of the HPDC '21: The 30th International Symposium on High-Performance Parallel and Distributed Computing, 2021

2020
Fast CUDA-Aware MPI Datatypes without Platform Support.
CoRR, 2020

2019
Memory-Bound Proof-of-Work Acceleration for Blockchain Applications.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

