Hongwu Peng

ORCID: 0000-0003-2025-2195

According to our database, Hongwu Peng authored at least 33 papers between 2021 and 2024.

Bibliography

2024
ASPLOS 2024 Artifact for "MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training".
Dataset, February, 2024

APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking.
CoRR, 2024

SSNet: A Lightweight Multi-Party Computation Scheme for Practical Privacy-Preserving Machine Learning Service in the Cloud.
CoRR, 2024

Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate.
CoRR, 2024

Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM.
CoRR, 2024

Evaluating Emerging AI/ML Accelerators: IPU, RDU, and NVIDIA/AMD GPUs.
Proceedings of the Companion of the 15th ACM/SPEC International Conference on Performance Engineering, 2024

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
MaxK-GNN: Towards Theoretical Speed Limits for Accelerating Graph Neural Networks Training.
CoRR, 2023

Advanced Large Language Model (LLM)-Driven Verilog Development: Enhancing Power, Performance, and Area Optimization in Code Synthesis.
CoRR, 2023

RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference.
CoRR, 2023

LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

AQ2PNN: Enabling Two-party Privacy-Preserving Deep Neural Network Inference with Adaptive Quantization.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

AutoReP: Automatic ReLU Replacement for Fast Private Network Inference.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

PASNet: Polynomial Architecture Search Framework for Two-party Computation-based Secure Neural Network Deployment.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022
Aerial Manipulation Using a Novel Unmanned Aerial Vehicle Cyber-Physical System.
CoRR, 2022

An Automatic and Efficient BERT Pruning for Edge AI Systems.
Proceedings of the 23rd International Symposium on Quality Electronic Design, 2022

Towards Sparsification of Graph Neural Networks.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

CoDG-ReRAM: An Algorithm-Hardware Co-design to Accelerate Semi-Structured GNNs on ReRAM.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

A length adaptive algorithm-hardware co-design of transformer on FPGA through sparse attention and dynamic pipelining.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
Detecting Gender Bias in Transformer-based Models: A Case Study on BERT.
CoRR, 2021

Optimizing FPGA-based Accelerator Design for Large-Scale Molecular Similarity Search.
CoRR, 2021

Binary Complex Neural Network Acceleration on FPGA.
CoRR, 2021

Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI.
Proceedings of the 22nd International Symposium on Quality Electronic Design, 2021

Accelerating Transformer-based Deep Learning Models on FPGAs using Column Balanced Block Pruning.
Proceedings of the 22nd International Symposium on Quality Electronic Design, 2021

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2021

Optimizing FPGA-based Accelerator Design for Large-Scale Molecular Similarity Search (Special Session Paper).
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2021

Accommodating Transformer onto FPGA: Coupling the Balanced Model Compression and FPGA-Implementation Optimization.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

HMC-TRAN: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

Binary Complex Neural Network Acceleration on FPGA (Invited Paper).
Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

