FLARE: Flexibly Sharing Commodity GPUs to Enforce QoS and Improve Utilization.
Proceedings of the International Workshop on Languages and Compilers for Parallel Computing, 2019
Analysis of classic algorithms on highly-threaded many-core architectures.
Future Gener. Comput. Syst., 2018
Lite-Service: A Framework to Build and Schedule Telecom Applications in Device, Edge and Cloud.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018
Task-D: A Task Based Programming Framework for Distributed System.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015
A memory access model for highly-threaded many-core architectures.
Future Gener. Comput. Syst., 2014
Theoretical analysis of classic algorithms on highly-threaded many-core GPUs.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014
Analysis of classic algorithms on GPUs.
Proceedings of the International Conference on High Performance Computing & Simulation, 2014
Performance modeling for highly-threaded many-core GPUs.
Proceedings of the 25th IEEE International Conference on Application-Specific Systems, Architectures and Processors, 2014
A Performance Model for Memory Bandwidth Constrained Applications on Graphics Engines.
Proceedings of the 23rd IEEE International Conference on Application-Specific Systems, Architectures and Processors, 2012
Bloom Filter Performance on Graphics Engines.
Proceedings of the International Conference on Parallel Processing, 2011