AssetDropper: Asset Extraction via Diffusion Models with Reward-Driven Optimization.
CoRR, June, 2025
Improving Foundation Model for Endoscopy Video Analysis via Representation Learning on Long Sequences.
IEEE J. Biomed. Health Informatics, May, 2025
StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation.
CoRR, May, 2025
DanceGRPO: Unleashing GRPO on Visual Generation.
CoRR, May, 2025
GarmentX: Autoregressive Parametric Representations for High-Fidelity 3D Garment Generation.
CoRR, April, 2025
MuMA: 3D PBR Texturing via Multi-Channel Multi-View Generation and Agentic Post-Processing.
CoRR, March, 2025
Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation.
CoRR, March, 2025
Unleash the Power of State Space Model for Whole Slide Image With Local Aware Scanning and Importance Resampling.
IEEE Trans. Medical Imaging, February, 2025
AnyCharV: Bootstrap Controllable Character Video Generation with Fine-to-Coarse Guidance.
CoRR, February, 2025
Multi-Sensor Learning Enables Information Transfer Across Different Sensory Data and Augments Multi-Modality Imaging.
IEEE Trans. Pattern Anal. Mach. Intell., January, 2025
Generalizable Human Gaussians from Single-View Image.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Large Images Are Gaussians: High-Quality Large Image Representation with Levels of 2D Gaussian Splatting.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
ToolBridge: An Open-Source Dataset to Equip LLMs with External Tool Capabilities.
CoRR, 2024
Generative Enhancement for 3D Medical Images.
CoRR, 2024
Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting.
CoRR, 2024
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects.
CoRR, 2024
Low-to-High Frequency Progressive K-Space Learning for MRI Reconstruction.
Proceedings of the Machine Learning in Medical Imaging - 15th International Workshop, 2024
EndoGS: Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024 Workshops, 2024
HFGS: 4D Gaussian Splatting with Emphasis on Spatial and Temporal High-Frequency Components for Endoscopic Scene Reconstruction.
Proceedings of the 35th British Machine Vision Conference, 2024
IDRNet: Intervention-Driven Relation Network for Semantic Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Make-A-Volume: Leveraging Latent Diffusion Models for Cross-Modality 3D Brain MRI Synthesis.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Cheap Lunch for Medical Image Segmentation by Fine-Tuning SAM on Few Exemplars.
Proceedings of the Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, 2023
Machine Learning-Based Resource Optimization for D2D Communication Underlaying Networks.
Proceedings of the 92nd IEEE Vehicular Technology Conference, 2020