Divide and Conquer: Rethinking Ambiguous Candidate Identification in Multimodal Dialogues with Pseudo-Labelling.
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2024
RECANTFormer: Referring Expression Comprehension with Varying Numbers of Targets.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
SimpleMTOD: A Simple Language Model for Multimodal Task-Oriented Dialogue with Symbolic Scene Representation.
CoRR, 2023
Multitask Multimodal Prompted Training for Interactive Embodied Task Completion.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Demonstrating EMMA: Embodied MultiModal Agent for Language-guided Action Execution in 3D Simulated Environments.
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2022