QURE: AI-Assisted and Automatically Verified UDF Inlining.
Proc. ACM Manag. Data, February, 2025
Buffer Management for Out-of-GPU LLM Execution.
Proceedings of the Workshop on Data Management for End-to-End Machine Learning, 2025
Hydro: Adaptive Query Processing of ML Queries.
CoRR, 2024
GPU Database Systems Characterization and Optimization.
Proc. VLDB Endow., November, 2023
Interactive Demonstration of EVA.
Proc. VLDB Endow., 2023
Revisiting Query Performance in GPU Database Systems.
CoRR, 2023
EHT-SR: An Entropy-Based Hybrid Approach for Faster Super-Resolution.
Proceedings of the IEEE International Symposium on Multimedia, 2023
Reducing Inference Latency with Concurrent Architectures for Image Recognition at Edge.
Proceedings of the IEEE International Conference on Edge Computing and Communications, 2023
Creating Robust Deep Neural Networks with Coded Distributed Computing for IoT.
Proceedings of the IEEE International Conference on Edge Computing and Communications, 2023
EVA: An End-to-End Exploratory Video Analytics System.
Proceedings of the Seventh Workshop on Data Management for End-to-End Machine Learning, 2023
FiGO: Fine-Grained Query Optimization in Video Analytics.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022
Securing GPU via region-based bounds checking.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Creating Robust Deep Neural Networks With Coded Distributed Computing for IoT Systems.
CoRR, 2021
THIA: Accelerating Video Analytics using Early Inference and Fine-Grained Query Planning.
CoRR, 2021
FAFNIR: Accelerating Sparse Gathering by Using Efficient Near-Memory Intelligent Reduction.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021
Toward Collaborative Inferencing of Deep Neural Networks on Internet-of-Things Devices.
IEEE Internet Things J., 2020
Reducing Inference Latency with Concurrent Architectures for Image Recognition.
CoRR, 2020
Edge-Tailored Perception: Fast Inferencing in-the-Edge with Efficient Model Distribution.
CoRR, 2020
Collaborative Execution of Deep Neural Networks on Internet of Things Devices.
CoRR, 2019
Characterizing the Execution of Deep Neural Networks on Collaborative Robots and Edge Devices.
Proceedings of the Practice and Experience in Advanced Research Computing on Rise of the Machines (learning), 2019
Characterizing the Deployment of Deep Neural Networks on Commercial Edge Devices.
Proceedings of the IEEE International Symposium on Workload Characterization, 2019
Capella: Customizing Perception for Edge Devices by Efficiently Allocating FPGAs to DNNs.
Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019
Robustly Executing DNNs in IoT Systems Using Coded Distributed Computing.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Video analytics from edge to server: work-in-progress.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis Companion, 2019
Distributed Perception by Collaborative Robots.
IEEE Robotics Autom. Lett., 2018
Musical Chair: Efficient Real-Time Recognition Using Collaborative IoT Devices.
CoRR, 2018
Real-Time Image Recognition Using Collaborative IoT Devices.
Proceedings of the 1st on Reproducible Quality-Efficient Systems Tournament on Co-designing Pareto-efficient Deep Learning, 2018