Self-supervised vision transformers for semantic segmentation.
Comput. Vis. Image Underst., 2025
Seer: Language Instructed Video Prediction with Latent Diffusion Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Reinforcement Learning with Foundation Priors: Let Embodied Agent Efficiently Learn on Its Own.
Proceedings of the Conference on Robot Learning, 6-9 November 2024, Munich, Germany, 2024
Cross-modality image translation: CT image synthesis of MR brain images using multi generative network with perceptual supervision.
Comput. Methods Programs Biomed., July, 2023
Foundation Reinforcement Learning: towards Embodied Generalist Agents with Foundation Prior Assistance.
CoRR, 2023