Adaptive Blockwise Task-interleaved Pipeline Parallelism.
CoRR, 2024
A Holistic Functionalization Approach to Optimizing Imperative Tensor Programs in Deep Learning.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024
EasyView: Enabling and Scheduling Tensor Views in Deep Learning Compilers.
Proceedings of the 51st International Conference on Parallel Processing, 2022
Fast and Accurate Stereo Vision System on FPGA.
ACM Trans. Reconfigurable Technol. Syst., 2014
A fast and high quality stereo matching algorithm on FPGA.
Proceedings of the 22nd International Conference on Field Programmable Logic and Applications (FPL), 2012
A real-time stereo vision system using a tree-structured dynamic programming on FPGA.
Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012