2025

DS-TPU: Dynamical System for on-Device Lifelong Graph Learning with Nonlinear Node Interaction.

Chunshu Wu

Ruibing Song

Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

DS-LLM: Leveraging Dynamical Systems to Enhance Both Training and Inference of Large Language Models.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Diff-PIC: Revolutionizing Particle-In-Cell Nuclear Fusion Simulation with Diffusion Models.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

InstaTrain: Adaptive Training via Ultra-Fast Natural Annealing within Dynamical Systems.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Nature-GL: A Revolutionary Learning Paradigm Unleashing Nature's Power in Real-World Spatial-Temporal Graph Learning.

Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025

2024

FPGA-Accelerated Range-Limited Molecular Dynamics.

IEEE Trans. Computers, June, 2024

Diff-PIC: Revolutionizing Particle-In-Cell Simulation for Advancing Nuclear Fusion with Diffusion Models.

CoRR, 2024

Inertial Confinement Fusion Forecasting via LLMs.

CoRR, 2024

Visual Fourier Prompt Tuning.

Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Bridging the Gap Between LLMs and LNS with Dynamic Data Format and Architecture Codesign.

Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

DS-GL: Advancing Graph Learning via Harnessing Nature's Power within Scalable Dynamical Systems.

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

SmartFuse: Reconfigurable Smart Switches to Accelerate Fused Collectives in HPC Applications.

Proceedings of the 38th ACM International Conference on Supercomputing, 2024

Extending Power of Nature from Binary to Real-Valued Graph Learning in Real World.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

FASDA: An FPGA-Aided, Scalable and Distributed Accelerator for Range-Limited Molecular Dynamics.

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2023

FLASH: FPGA-Accelerated Smart Switches with GCN Case Study.

Proceedings of the 37th International Conference on Supercomputing, 2023

Software-Hardware Co-design of Heterogeneous SmartNIC System for Recommendation Models Inference and Training.

Proceedings of the 37th International Conference on Supercomputing, 2023

2022

Optimized Mappings for Symmetric Range-Limited Molecular Force Calculations on FPGAs.

Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

A Framework for Neural Network Inference on FPGA-Centric SmartNICs.

Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

FCsN: A FPGA-Centric SmartNIC Framework for Neural Networks.

Proceedings of the 30th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2022

2021

O3BNN-R: An Out-of-Order Architecture for High-Performance and Regularized BNN Inference.

IEEE Trans. Parallel Distributed Syst., 2021

I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement through Islandization.

Proceedings of the 54th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO '21), 2021

System-Level Modeling of GPU/FPGA Clusters for Molecular Dynamics Simulations.

Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

A Survey: Handling Irregularities in Neural Network Acceleration with FPGAs.

Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

Upgrade of FPGA Range-Limited Molecular Dynamics to Handle Hundreds of Processors.

Proceedings of the 29th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2021

2020

AWB-GCN: A Graph Convolutional Network Accelerator with Runtime Workload Rebalancing.

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

A Communication-Efficient Multi-Chip Design for Range-Limited Molecular Dynamics.

Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

CQNN: a CGRA-based QNN Framework.

Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

2019

UWB-GCN: Hardware Acceleration of Graph-Convolution-Network through Runtime Workload Rebalancing.

CoRR, 2019

Fully integrated FPGA molecular dynamics simulations.

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2019

O3BNN: an out-of-order architecture for high-performance binarized neural network inference with fine-grained pruning.

Proceedings of the ACM International Conference on Supercomputing, 2019

LP-BNN: Ultra-low-Latency BNN Inference with Layer Parallelism.

Proceedings of the 30th IEEE International Conference on Application-specific Systems, Architectures and Processors, 2019