Tong Geng

Uday Kumar Reddy Vengalam

Proceedings of the International Joint Conference on Neural Networks, 2024

SmartFuse: Reconfigurable Smart Switches to Accelerate Fused Collectives in HPC Applications.

Proceedings of the 38th ACM International Conference on Supercomputing, 2024

Prototypical Transformer As Unified Motion Learners.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Extending Power of Nature from Binary to Real-Valued Graph Learning in Real World.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Dissecting Tensor Cores via Microbenchmarks: Latency, Throughput and Numeric Behaviors.

IEEE Trans. Parallel Distributed Syst., 2023

ClusterFormer: Clustering As A Universal Visual Learner.

CoRR, 2023

Machine Learning Automated Approach for Enormous Synchrotron X-Ray Diffraction Data Interpretation.

CoRR, 2023

RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference.

CoRR, 2023

FASDA: An FPGA-Aided, Scalable and Distributed Accelerator for Range-Limited Molecular Dynamics.

Proceedings of the International Conference for High Performance Computing, 2023

MGG: Accelerating Graph Neural Networks with Fine-Grained Intra-Kernel Communication-Computation Pipelining on Multi-GPU Platforms.

Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

ClusterFormer: Clustering As A Universal Visual Learner.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Supporting Energy-based Learning with an Ising Machine substrate: a Case Study on RBM.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

FLASH: FPGA-Accelerated Smart Switches with GCN Case Study.

Proceedings of the 37th International Conference on Supercomputing, 2023

Software-Hardware Co-design of Heterogeneous SmartNIC System for Recommendation Models Inference and Training.

Proceedings of the 37th International Conference on Supercomputing, 2023

Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks.

Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

PASNet: Polynomial Architecture Search Framework for Two-party Computation-based Secure Neural Network Deployment.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

ML-CGRA: An Integrated Compilation Framework to Enable Efficient Machine Learning Acceleration on CGRAs.

Yixuan Luo

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Ising-CF: A Pathbreaking Collaborative Filtering Method Through Efficient Ising Machine Learning.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

TransFlow: Transformer as Flow Learner.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Ising-Traffic: Using Ising Machine Learning to Predict Traffic Congestion under Uncertainty.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Arctic Sea Ice Freeboard Estimation and Variations From Operation IceBridge.

IEEE Trans. Geosci. Remote. Sens., 2022

An improved algorithm for extracting crossovers of satellite ground tracks.

Comput. Geosci., 2022

Empowering GNNs with Fine-grained Communication-Computation Pipelining on Multi-GPU Platforms.

CoRR, 2022

GMI-DRL: Empowering Multi-GPU Deep Reinforcement Learning with GPU Spatial Multiplexing.

CoRR, 2022

GAAF: Searching Activation Functions for Binary Neural Networks through Genetic Algorithm.

Yanfei Li

Samuel Alexander Stein

Ang Li

Huimin Yu

CoRR, 2022

Dissecting Tensor Cores via Microbenchmarks: Latency, Throughput and Numerical Behaviors.

CoRR, 2022

Reconfigurable switches for high performance and flexible MPI collectives.

Concurr. Comput. Pract. Exp., 2022

ASAP: automatic synthesis of area-efficient and precision-aware CGRAs.

Ganesh Gopalakrishnan

Ang Li

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

CEAZ: accelerating parallel I/O via hardware-algorithm co-designed adaptive lossy compression.

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

Towards Sparsification of Graph Neural Networks.

Proceedings of the IEEE 40th International Conference on Computer Design, 2022

CoDG-ReRAM: An Algorithm-Hardware Co-design to Accelerate Semi-Structured GNNs on ReRAM.

Proceedings of the IEEE 40th International Conference on Computer Design, 2022

On the Design of Quantum Graph Convolutional Neural Network in the NISQ-Era and Beyond.

Proceedings of the IEEE 40th International Conference on Computer Design, 2022

Towards Real-Time Temporal Graph Learning.

Proceedings of the IEEE 40th International Conference on Computer Design, 2022

The Viability of Using Online Prediction to Perform Extra Work while Executing BSP Applications.

Proceedings of the IEEE High Performance Extreme Computing Conference, 2022

GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

Optimized Mappings for Symmetric Range-Limited Molecular Force Calculations on FPGAs.

Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

A Framework for Neural Network Inference on FPGA-Centric SmartNICs.

Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

H-GCN: A Graph Convolutional Network Accelerator on Versal ACAP Architecture.

Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

FCsN: A FPGA-Centric SmartNIC Framework for Neural Networks.

Proceedings of the 30th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2022

A length adaptive algorithm-hardware co-design of transformer on FPGA through sparse attention and dynamic pipelining.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021

BCNN: Binary complex neural network.

Microprocess. Microsystems, November, 2021

FPGA-based high-performance neural network acceleration.

PhD thesis, 2021

ARENA: Asynchronous Reconfigurable Accelerator Ring to Enable Data-Centric Parallel Computing.

IEEE Trans. Parallel Distributed Syst., 2021

O3BNN-R: An Out-of-Order Architecture for High-Performance and Regularized BNN Inference.

IEEE Trans. Parallel Distributed Syst., 2021

Arctic Sea Ice Freeboard Retrieval from Envisat Altimetry Data.

Remote. Sens., 2021

DEM Generation with ICESat-2 Altimetry Data for the Three Antarctic Ice Shelves: Ross, Filchner-Ronne and Amery.

Remote. Sens., 2021

Optimizing FPGA-based Accelerator Design for Large-Scale Molecular Similarity Search.

CoRR, 2021

Binary Complex Neural Network Acceleration on FPGA.

CoRR, 2021

CEAZ: Accelerating Parallel I/O via Hardware-Algorithm Co-Design of Efficient and Adaptive Lossy Compression.

CoRR, 2021

APNN-TC: accelerating arbitrary precision neural networks on ampere GPU tensor cores.

Proceedings of the International Conference for High Performance Computing, 2021

I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement through Islandization.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Accelerating Transformer-based Deep Learning Models on FPGAs using Column Balanced Block Pruning.

Proceedings of the 22nd International Symposium on Quality Electronic Design, 2021

DynPaC: Coarse-Grained, Dynamic, and Partially Reconfigurable Array for Streaming Applications.

Chenhao Xie

Proceedings of the 39th IEEE International Conference on Computer Design, 2021

G-CoS: GNN-Accelerator Co-Search Towards Both Better Accuracy and Efficiency.

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Optimizing FPGA-based Accelerator Design for Large-Scale Molecular Similarity Search (Special Session Paper).

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

FL-DISCO: Federated Generative Adversarial Network for Graph-based Molecule Drug Discovery: Special Session Paper.

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

System-Level Modeling of GPU/FPGA Clusters for Molecular Dynamics Simulations.

Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

Workload Imbalance in HPC Applications: Effect on Performance of In-Network Processing.

Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

A Survey: Handling Irregularities in Neural Network Acceleration with FPGAs.

Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

Upgrade of FPGA Range-Limited Molecular Dynamics to Handle Hundreds of Processors.

Proceedings of the 29th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2021

Binary Complex Neural Network Acceleration on FPGA : (Invited Paper).

Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

OpenCGRA: Democratizing Coarse-Grained Reconfigurable Arrays.

Jeff Zhang

Marco Minutoli

Vito Giovanni Castellana

Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

Comparison Lift: Bandit-based Experimentation System for Online Advertising.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters.

IEEE Trans. Computers, 2020

Estimating Arctic Sea Ice Thickness with CryoSat-2 Altimetry Data Using the Least Squares Adjustment Method.

Sensors, 2020

AWB-GCN: A Graph Convolutional Network Accelerator with Runtime Workload Rebalancing.

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

CSB-RNN: a faster-than-realtime RNN acceleration framework with compressed structured blocks.

Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

A Reconfigurable Compute-in-the-Network FPGA Assistant for High-Level Collective Support with Distributed Matrix Multiply Case Study.

Proceedings of the International Conference on Field-Programmable Technology, 2020

A Communication-Efficient Multi-Chip Design for Range-Limited Molecular Dynamics.

Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

FPGAs in the Network and Novel Communicator Support Accelerate MPI Collectives.

Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

CQNN: a CGRA-based QNN Framework.

Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

FP-AMG: FPGA-Based Acceleration Framework for Algebraic Multigrid Solvers.

Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

Online Evaluation of Audiences for Targeted Advertising via Bandit Experiments.
