Customizable Computing - From Single Chip to Datacenters.
Proc. IEEE, 2019
CPU-FPGA Coscheduling for Big Data Applications.
IEEE Des. Test, 2018
CPU-FPGA Co-Optimization for Big Data Applications: A Case Study of In-Memory Samtool Sorting (Abstract Only).
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017
Resource and Data Management in Accelerator-Rich Architectures.
PhD thesis, 2016
Software Infrastructure for Enabling FPGA-Based Accelerations in Data Centers: Invited Paper.
Proceedings of the 2016 International Symposium on Low Power Electronics and Design, 2016
Invited - Heterogeneous datacenters: options and opportunities.
Proceedings of the 53rd Annual Design Automation Conference, 2016
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale.
Proceedings of the Seventh ACM Symposium on Cloud Computing, 2016
Source-to-Source Optimization for HLS.
FPGAs for Software Programmers, 2016
CMOST: a system-level FPGA compilation framework.
Proceedings of the 52nd Annual Design Automation Conference, 2015
A scalable, high-performance customized priority queue.
Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014
Combining computation and communication optimizations in system synthesis for streaming applications.
Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014
Energy-efficient computing using adaptive table lookup based on nonvolatile memories.
Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013
Accelerator-rich CMPs: From concept to real hardware.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013
Efficient system-level mapping from streaming applications to FPGAs (abstract only).
Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013
Combining module selection and replication for throughput-driven streaming programs.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012
3D recursive Gaussian IIR on GPU and FPGAs - A case for accelerating bandwidth-bounded applications.
Proceedings of the IEEE 9th Symposium on Application Specific Processors, 2011
Accelerating Fluid Registration Algorithm on Multi-FPGA Platforms.
Proceedings of the International Conference on Field Programmable Logic and Applications, 2011
Domain-specific processor with 3D integration for medical image processing.
Proceedings of the 22nd IEEE International Conference on Application-specific Systems, 2011