SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation.
CoRR, January, 2025
AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation.
CoRR, 2024
Manipulate-Anything: Automating Real-World Robots using Vision-Language Models.
Proceedings of the Conference on Robot Learning, 6-9 November 2024, Munich, Germany, 2024
Read My Mind: A Multi-Modal Dataset for Human Belief Prediction.
CoRR, 2023
A Study of Comfortability between Interactive AI and Human.
CoRR, 2023
MVTrans: Multi-View Perception of Transparent Objects.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023
NEWTON: Are Large Language Models Capable of Physical Reasoning?
Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
AR2-D2: Training a Robot Without a Robot.
Proceedings of the Conference on Robot Learning, 2023
Predicting 3D shapes, masks, and properties of materials, liquids, and objects inside transparent containers, using the TransProteus CGI dataset.
CoRR, 2021
CONetV2: Efficient Auto-Channel Size Optimization for CNNs.
Proceedings of the 20th IEEE International Conference on Machine Learning and Applications, 2021
Seeing Glass: Joint Point-Cloud and Depth Completion for Transparent Objects.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021