Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021

Efficient Sparse Matrix Kernels based on Adaptive Workload-Balancing and Parallel-Reduction.

CoRR, 2021

CogDL: An Extensive Toolkit for Deep Learning on Graphs.

CoRR, 2021

Exploiting Online Locality and Reduction Parallelism for Sampled Dense Matrix Multiplication on GPUs.

Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Rerec: In-ReRAM Acceleration with Access-Aware Mapping for Personalized Recommendation.

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

3M-AI: A Multi-task and Multi-core Virtualization Framework for Multi-FPGA AI Systems in the Cloud.

Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

2020

GE-SpMM: general-purpose sparse matrix-matrix multiplication on GPUs for graph neural networks.

Proceedings of the International Conference for High Performance Computing, 2020

GraphSDH: A General Graph Sampling Framework with Distribution and Hierarchy.

Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

LessMine: Reducing Sample Space and Data Access for Dense Pattern Mining.

Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

MNSIM 2.0: A Behavior-Level Modeling Tool for Memristor-based Neuromorphic Computing Systems.

Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

An Order Sampling Processing-in-Memory Architecture for Approximate Graph Pattern Mining.

Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

Enable Efficient and Flexible FPGA Virtualization for Deep Learning in the Cloud.

Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

INCAME: INterruptible CNN Accelerator for Multi-robot Exploration.

Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

Enabling Efficient and Flexible FPGA Virtualization for Deep Learning in the Cloud.

Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

INCA: INterruptible CNN Accelerator for Multi-tasking in Embedded Robots.

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

2019

GraphH: A Processing-in-Memory Architecture for Large-Scale Graph Processing.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

HyVE: Hybrid Vertex-Edge Memory Hierarchy for Energy-Efficient Graph Processing.

IEEE Trans. Computers, 2019

Centrifuge: Evaluating full-system HLS-generated heterogenous-accelerator SoCs using FPGA-Acceleration.

Proceedings of the International Conference on Computer-Aided Design, 2019

A Configurable Multi-Precision CNN Computing Framework Based on Single Bit RRAM.

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Memory-Bound Proof-of-Work Acceleration for Blockchain Applications.

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

GraphSAR: a sparsity-aware processing-in-memory architecture for large-scale graph processing on ReRAMs.

Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

2018

GraphIA: an in-situ accelerator for large-scale graph processing.

Proceedings of the International Symposium on Memory Systems, 2018

NewGraph: Balanced Large-Scale Graph Processing on FPGAs with Low Preprocessing Overheads.

Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

HyVE: Hybrid vertex-edge memory hierarchy for energy-efficient graph processing.

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

2017

ForeGraph: Exploring Large-scale Graph Processing on Multi-FPGA Architecture.

Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

2016

NXgraph: An efficient graph processing system on a single machine.

Proceedings of the 32nd IEEE International Conference on Data Engineering, 2016

Approximate Frequent Itemset Mining for streaming data on FPGA.

Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

FPGP: Graph Processing Framework on FPGA: A Case Study of Breadth-First Search.

Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

2015

A self-aware data compression system on FPGA in Hadoop.

Proceedings of the 2015 International Conference on Field Programmable Technology, 2015

2014

Online scheduling for FPGA computation in the Cloud.

Proceedings of the 2014 International Conference on Field-Programmable Technology, 2014

2013

DTW-Based Subsequence Similarity Search on AMD Heterogeneous Computing Platform.

Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

Guohao Dai
