What can Off-the-Shelves Large Multi-Modal Models do for Dynamic Scene Graph Generation?
CoRR, March, 2025
Pro-Cap: Leveraging a Frozen Vision-Language Model for Hateful Meme Detection.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
MERMAID: A Dataset and Framework for Multimodal Meme Semantic Understanding.
Proceedings of the IEEE International Conference on Big Data, 2023