2024

ROICtrl: Boosting Instance Control for Visual Generation.

Yuchao Gu

Yipin Zhou

CoRR, 2024

EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.

CoRR, 2024

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation.

CoRR, 2024

Towards A Better Metric for Text-to-Video Generation.

CoRR, 2024

Point Cloud Segmentation Algorithm Based on Improved Euclidean Clustering.

IEEE Access, 2024

EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

MotionDirector: Motion Customization of Text-to-Video Diffusion Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

DragAnything: Motion Control for Anything Using Entity Representation.

Proceedings of the Computer Vision - ECCV 2024, 2024

MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers.

Haoyu Ma

Shahin Mahdizadehaghdam

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing.

CoRR, 2023

MotionDirector: Motion Customization of Text-to-Video Diffusion Models.

CoRR, 2023

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation.

CoRR, 2023

DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

MobileSal: Extremely Efficient RGB-D Salient Object Detection.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation.

CoRR, 2022

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis.

CoRR, 2022

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

Lightweight Salient Object Detection via Hierarchical Visual Perception Learning.

IEEE Trans. Cybern., 2021

iNAS: Integral NAS for Device-Aware Salient Object Detection.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

DOTS: Decoupling Operation and Topology in Differentiable Architecture Search.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

DOTS: Decoupling Operation and Topology in Differentiable Architecture Search.

CoRR, 2020

Generalized Zero-Shot Learning via VAE-Conditioned Generative Flow.

CoRR, 2020

Pyramid Constrained Self-Attention Network for Fast Video Salient Object Detection.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020