2025
AssetDropper: Asset Extraction via Diffusion Models with Reward-Driven Optimization.
CoRR, June, 2025

Improving Foundation Model for Endoscopy Video Analysis via Representation Learning on Long Sequences.
IEEE J. Biomed. Health Informatics, May, 2025

StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation.
CoRR, May, 2025

DanceGRPO: Unleashing GRPO on Visual Generation.
CoRR, May, 2025

GarmentX: Autoregressive Parametric Representations for High-Fidelity 3D Garment Generation.
CoRR, April, 2025

MuMA: 3D PBR Texturing via Multi-Channel Multi-View Generation and Agentic Post-Processing.
CoRR, March, 2025

Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation.
CoRR, March, 2025

Unleash the Power of State Space Model for Whole Slide Image With Local Aware Scanning and Importance Resampling.
IEEE Trans. Medical Imaging, February, 2025

AnyCharV: Bootstrap Controllable Character Video Generation with Fine-to-Coarse Guidance.
CoRR, February, 2025

Multi-Sensor Learning Enables Information Transfer Across Different Sensory Data and Augments Multi-Modality Imaging.
IEEE Trans. Pattern Anal. Mach. Intell., January, 2025

Generalizable Human Gaussians from Single-View Image.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Large Images Are Gaussians: High-Quality Large Image Representation with Levels of 2D Gaussian Splatting.
Proceedings of AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
ToolBridge: An Open-Source Dataset to Equip LLMs with External Tool Capabilities.
CoRR, 2024

Generative Enhancement for 3D Medical Images.
CoRR, 2024

Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting.
CoRR, 2024

CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects.
CoRR, 2024

Low-to-High Frequency Progressive K-Space Learning for MRI Reconstruction.
Proceedings of Machine Learning in Medical Imaging - 15th International Workshop, 2024

EndoGS: Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting.
Proceedings of Medical Image Computing and Computer Assisted Intervention - MICCAI 2024 Workshops, 2024

HFGS: 4D Gaussian Splatting with Emphasis on Spatial and Temporal High-Frequency Components for Endoscopic Scene Reconstruction.
Proceedings of the 35th British Machine Vision Conference, 2024

2023
IDRNet: Intervention-Driven Relation Network for Semantic Segmentation.
Proceedings of Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Make-A-Volume: Leveraging Latent Diffusion Models for Cross-Modality 3D Brain MRI Synthesis.
Proceedings of Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Cheap Lunch for Medical Image Segmentation by Fine-Tuning SAM on Few Exemplars.
Proceedings of Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, 2023

2020
Machine Learning-Based Resource Optimization for D2D Communication Underlaying Networks.
Proceedings of the 92nd IEEE Vehicular Technology Conference, 2020