2025
Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models.
CoRR, April, 2025

ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos.
CoRR, March, 2025

Temporal Regularization Makes Your Video Generator Stronger.
CoRR, March, 2025

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization.
CoRR, March, 2025

MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation.
CoRR, February, 2025

2024
Test-Time Adaptation for Nighttime Color-Thermal Semantic Segmentation.
IEEE Trans. Artif. Intell., October, 2024

VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation.
CoRR, 2024

GoodSAM++: Bridging Domain and Capacity Gaps via Segment Anything Model for Panoramic Semantic Segmentation.
CoRR, 2024

Seeing Clearly, Answering Incorrectly: A Multimodal Robustness Benchmark for Evaluating MLLMs on Leading Questions.
CoRR, 2024

EyeFound: A Multimodal Generalist Foundation Model for Ophthalmic Imaging.
CoRR, 2024

Efficient Multimodal Large Language Models: A Survey.
CoRR, 2024

Evaluating Large Language Models in Medical Applications: A Survey.
CoRR, 2024

Learning High-Quality Navigation and Zooming on Omnidirectional Images in Virtual Reality.
CoRR, 2024

MYCloth: Towards Intelligent and Interactive Online T-Shirt Customization based on User's Preference.
CoRR, 2024

Unsupervised Visible-Infrared ReID via Pseudo-label Correction and Modality-level Alignment.
CoRR, 2024

GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation.
CoRR, 2024

SynArtifact: Classifying and Alleviating Artifacts in Synthetic Images via Vision-Language Model.
CoRR, 2024

Efficient Multimodal Learning from Data-centric Perspective.
CoRR, 2024

GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-Aware Panoramic Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks.
CoRR, 2023

Both Style and Distortion Matter: Dual-Path Unsupervised Domain Adaptation for Panoramic Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
FCP-Net: A Feature-Compression-Pyramid Network Guided by Game-Theoretic Interactions for Medical Image Segmentation.
IEEE Trans. Medical Imaging, 2022

A Novel MCF-Net: Multi-Level Context Fusion Network for 2D Medical Image Segmentation.
Comput. Methods Programs Biomed., 2022

2020
The Impacts of Technology Management on Product Innovation: The Role of Technological Capability.
IEEE Access, 2020

2019
Exploring the Different Combinations of Technological Capability and Technology Management Capability in Different Stages of New Product Development.
IEEE Access, 2019