MoDA: Modulation Adapter for Fine-Grained Visual Grounding in Instructional MLLMs.
CoRR, June 2025
FT2TF: First-Person Statement Text-to-Talking Face Generation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025
Is Attention All You Need For Actigraphy? Foundation Models of Wearable Accelerometer Data for Mental Health Research.
CoRR, 2024
Multi-layer Learnable Attention Mask for Multimodal Tasks.
CoRR, 2024
LangNav: Language as a Perceptual Representation for Navigation.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024
Learning Human Action Recognition Representations Without Real Humans.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Leveraging Temporal Context in Low Representational Power Regimes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
How Transferable are Video Representations Based on Synthetic Data?
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Cross-Modal Discrete Representation Learning.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
Spoken Moments: Learning Joint Audio-Visual Representations From Video Descriptions.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Half&Half: New Tasks and Benchmarks for Studying Visual Common Sense.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019
Automatic Adaptation of Object Detectors to New Domains Using Self-Training.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Unsupervised Hard Example Mining from Videos for Improved Object Detection.
Proceedings of Computer Vision - ECCV 2018, 2018
End-to-End Face Detection and Cast Grouping in Movies Using Erdős-Rényi Clustering.
Proceedings of the IEEE International Conference on Computer Vision, 2017
A randomized algorithm for natural object colorization.
Comput. Graph. Forum, 2014
Jigsaw puzzle image retrieval via pairwise compatibility measurement.
Proceedings of the International Conference on Big Data and Smart Computing, BIGCOMP 2014, 2014
Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model.
KSII Trans. Internet Inf. Syst., 2013
Depth consistency evaluation for error-pose detection.
Proceedings of the Sixth International Conference on Machine Vision, 2013
Clustering space-time interest points for action representation.
Proceedings of the Sixth International Conference on Machine Vision, 2013
An Intelligent Multi-Sensor Surveillance System for Elderly Care.
Smart Comput. Rev., 2012
Essential Body-Joint and Atomic Action Detection for Human Activity Recognition Using Longest Common Subsequence Algorithm.
Proceedings of Computer Vision - ACCV 2012 Workshops, 2012