2019
FLARE: Flexibly Sharing Commodity GPUs to Enforce QoS and Improve Utilization.
Proceedings of the Languages and Compilers for Parallel Computing, 2019

2018
Analysis of classic algorithms on highly-threaded many-core architectures.
Future Gener. Comput. Syst., 2018

Lite-Service: A Framework to Build and Schedule Telecom Applications in Device, Edge and Cloud.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

2015
Task-D: A Task Based Programming Framework for Distributed System.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

2014
A memory access model for highly-threaded many-core architectures.
Future Gener. Comput. Syst., 2014

Theoretical analysis of classic algorithms on highly-threaded many-core GPUs.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

Analysis of classic algorithms on GPUs.
Proceedings of the International Conference on High Performance Computing & Simulation, 2014

Performance modeling for highly-threaded many-core GPUs.
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

2012
A Performance Model for Memory Bandwidth Constrained Applications on Graphics Engines.
Proceedings of the 23rd IEEE International Conference on Application-Specific Systems, 2012

2011
Bloom Filter Performance on Graphics Engines.
Proceedings of the International Conference on Parallel Processing, 2011