Practical Online Reinforcement Learning for Microprocessors With Micro-Armed Bandit.
IEEE Micro, 2024
Distributed-Memory Parallel Algorithms for Sparse Matrix and Sparse Tall-and-Skinny Matrix Multiplication.
Proceedings of the International Conference for High Performance Computing, 2024
HotTiles: Accelerating SpMM with Heterogeneous Accelerator Architectures.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Two-Face: Combining Collective and One-Sided Communication for Efficient Distributed SpMM.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Input-sensitive dense-sparse primitive compositions for GNN acceleration.
CoRR, 2023
Micro-Armed Bandit: Lightweight & Reusable Reinforcement Learning for Microarchitecture Decision-Making.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Characterizing the Scalability of Graph Convolutional Networks on Intel® PIUMA.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023
SPADE: A Flexible and Scalable Accelerator for SpMM and SDDMM.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
Deep Reinforcement Learning Acceleration for Real-Time Edge Computing Mixed Integer Programming Problems.
IEEE Access, 2022