ROICtrl: Boosting Instance Control for Visual Generation.
CoRR, 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.
CoRR, 2024
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation.
CoRR, 2024
Towards A Better Metric for Text-to-Video Generation.
CoRR, 2024
Point Cloud Segmentation Algorithm Based on Improved Euclidean Clustering.
IEEE Access, 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
MotionDirector: Motion Customization of Text-to-Video Diffusion Models.
Proceedings of the Computer Vision - ECCV 2024, 2024
DragAnything: Motion Control for Anything Using Entity Representation.
Proceedings of the Computer Vision - ECCV 2024, 2024
MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing.
CoRR, 2023
MotionDirector: Motion Customization of Text-to-Video Diffusion Models.
CoRR, 2023
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation.
CoRR, 2023
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
MobileSal: Extremely Efficient RGB-D Salient Object Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2022
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation.
CoRR, 2022
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis.
CoRR, 2022
VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder.
Proceedings of the Computer Vision - ECCV 2022, 2022
Lightweight Salient Object Detection via Hierarchical Visual Perception Learning.
IEEE Trans. Cybern., 2021
iNAS: Integral NAS for Device-Aware Salient Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
DOTS: Decoupling Operation and Topology in Differentiable Architecture Search.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
DOTS: Decoupling Operation and Topology in Differentiable Architecture Search.
CoRR, 2020
Generalized Zero-Shot Learning via VAE-Conditioned Generative Flow.
CoRR, 2020
Pyramid Constrained Self-Attention Network for Fast Video Salient Object Detection.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020