Xintao Wang

Orcid: 0000-0001-6585-8604

According to our database, Xintao Wang authored at least 134 papers between 2017 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance.

IEEE Trans. Vis. Comput. Graph., February, 2025

2024

ToonCrafter: Generative Cartoon Interpolation.

ACM Trans. Graph., December, 2024

StyleCrafter: Taming Artistic Video Diffusion with Reference-Augmented Adapter Learning.

ACM Trans. Graph., December, 2024

Empowering Real-World Image Super-Resolution With Flexible Interactive Modulation.

IEEE Trans. Pattern Anal. Mach. Intell., November, 2024

Temporally consistent video colorization with deep feature propagation and self-regularization learning.

Comput. Vis. Media, April, 2024

Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos.

IEEE Trans. Image Process., 2024

Consistent Human Image and Video Generation with Spatially Conditioned Diffusion.

CoRR, 2024

SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints.

CoRR, 2024

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation.

CoRR, 2024

StyleMaster: Stylize Your Video with Artistic Generation and Translation.

CoRR, 2024

NovelGS: Consistent Novel-view Denoising via Large Gaussian Reconstruction Model.

CoRR, 2024

CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities.

CoRR, 2024

Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models.

CoRR, 2024

MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions.

CoRR, 2024

MINDECHO: Role-Playing Language Agents for Key Opinion Leaders.

CoRR, 2024

Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation.

CoRR, 2024

Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs.

CoRR, 2024

Image Conductor: Precision Control for Interactive Video Synthesis.

CoRR, 2024

Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals.

CoRR, 2024

VideoTetris: Towards Compositional Text-to-Video Generation.

CoRR, 2024

ReVideo: Remake a Video with Motion and Content Control.

CoRR, 2024

From Persona to Personalization: A Survey on Role-Playing Language Agents.

CoRR, 2024

Character is Destiny: Can Large Language Models Simulate Persona-Driven Decisions in Role-Playing?

CoRR, 2024

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models.

CoRR, 2024

SurveyAgent: A Conversational System for Personalized and Efficient Research Survey.

CoRR, 2024

SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model.

CoRR, 2024

Towards A Better Metric for Text-to-Video Generation.

CoRR, 2024

ConcEPT: Concept-Enhanced Pre-Training for Language Models.

CoRR, 2024

SiamMFF: UAV Object Tracking Algorithm Based on Multi-Scale Feature Fusion.

IEEE Access, 2024

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation.

Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

CustomNet: Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Unifying Image Processing as Visual Prompting Question Answering.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Making LLaMA SEE and Draw with SEED Tokenizer.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Reinforced Multi-teacher Knowledge Distillation for Unsupervised Sentence Representation.

Xintao Wang

Rize Jin

Shibo Qi

Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2024, 2024

Evaluating Character Understanding of Large Language Models via Character Profiling from Fictional Works.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Capturing Minds, Not Just Words: Enhancing Role-Playing Language Models with Personality-Indicative Data.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

DynamiCrafter: Animating Open-Domain Images with Video Diffusion Priors.

Proceedings of the Computer Vision - ECCV 2024, 2024

MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.

Proceedings of the Computer Vision - ECCV 2024, 2024

BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion.

Proceedings of the Computer Vision - ECCV 2024, 2024

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation.

Proceedings of the Computer Vision - ECCV 2024, 2024

DreamDiffusion: High-Quality EEG-to-Image Generation with Temporal Masked Signal Modeling and CLIP Alignment.

Proceedings of the Computer Vision - ECCV 2024, 2024

Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

X-Adapter: Universal Compatibility of Plugins for Upgraded Diffusion Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DiffEditor: Boosting Accuracy and Flexibility on Diffusion-Based Image Editing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

EvalCrafter: Benchmarking and Evaluating Large Video Generation Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SmartEdit: Exploring Complex Instruction-Based Image Editing with Multimodal Large Language Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Source Prompt: Coordinated Pre-training of Language Models on Diverse Corpora from Multiple Sources.

Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024

Light Up the Shadows: Enhance Long-Tailed Entity Grounding with Concept-Guided Vision-Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion Models.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Follow Your Pose: Pose-Guided Text-to-Video Generation Using Pose-Free Videos.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Prototypical Concept Representation.

IEEE Trans. Knowl. Data Eng., July, 2023

Reference-Based Image and Video Super-Resolution via $C^{2}$-Matching.

IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

GLEAN: Generative Latent Bank for Image Super-Resolution and Beyond.

IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

AnimateZero: Video Diffusion Models are Zero-Shot Image Animators.

CoRR, 2023

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation.

CoRR, 2023

MagicStick: Controllable Video Editing via Control Handle Transformations.

CoRR, 2023

X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model.

CoRR, 2023

StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter.

CoRR, 2023

Source Prompt: Coordinated Pre-training of Language Models on Diverse Corpora from Multiple Sources.

CoRR, 2023

CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models.

CoRR, 2023

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation.

CoRR, 2023

New Boolean satisfiability problem heuristic strategy: Minimal Positive Negative Product Strategy.

Qun Zhao

Xintao Wang

Menghui Yang

CoRR, 2023

Does Role-Playing Chatbots Capture the Character Personalities? Assessing Personality Traits for Role-Playing Chatbots.

CoRR, 2023

DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors.

CoRR, 2023

Can Large Language Models Understand Real-World Complex Instructions?

CoRR, 2023

HAT: Hybrid Attention Transformer for Image Restoration.

CoRR, 2023

StyleAdapter: A Single-Pass LoRA-Free Model for Stylized Image Generation.

CoRR, 2023

KnowledGPT: Enhancing Large Language Models with Retrieval and Storage Access on Knowledge Bases.

CoRR, 2023

GET3D--: Learning GET3D from Unconstrained Image Collections.

CoRR, 2023

Planting a SEED of Vision in Large Language Model.

CoRR, 2023

Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation.

CoRR, 2023

DreamDiffusion: Generating High-Quality Images from Brain EEG Signals.

CoRR, 2023

InstructP2P: Learning to Edit 3D Point Clouds with Text Instructions.

CoRR, 2023

TaleCrafter: Interactive Story Visualization with Multiple Characters.

CoRR, 2023

Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos.

CoRR, 2023

T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models.

CoRR, 2023

Interactive Story Visualization with Multiple Characters.

Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023

Inserting Anybody in Diffusion Models via Celeb Basis.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models.

Proceedings of the International Conference on Machine Learning, 2023

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Activating More Pixels in Image Super-Resolution Transformer.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

NTIRE 2023 Challenge on 360° Omnidirectional Image and Video Super-Resolution: Datasets, Methods and Results.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Mitigating Artifacts in Real-World Video Super-resolution Models.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Accelerating the Training of Video Super-resolution Models.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

MAPS-KB: A Million-Scale Probabilistic Simile Knowledge Base.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Path-Restore: Learning Network Path Selection for Image Restoration.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Hybrid Warping Fusion for Video Frame Interpolation.

Int. J. Comput. Vis., 2022

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation.

CoRR, 2022

Reference-based Image and Video Super-Resolution via C2-Matching.

CoRR, 2022

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis.

CoRR, 2022

FaceFormer: Scale-aware Blind Face Restoration with Transformers.

CoRR, 2022

Activating More Pixels in Image Super-Resolution Transformer.

CoRR, 2022

AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Rethinking Alignment in Video Super-Resolution Transformers.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Composite Photograph Harmonization with Complete Background Cues.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization.

Xintao Wang

Chao Dong

Ying Shan

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Language Models as Knowledge Embeddings.

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Metric Learning Based Interactive Modulation for Real-World Super-Resolution.

Proceedings of the Computer Vision - ECCV 2022, 2022

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder.

Proceedings of the Computer Vision - ECCV 2022, 2022

NTIRE 2022 Challenge on Super-Resolution and Quality Enhancement of Compressed Video: Dataset, Methods and Results.

Pablo Navarrete Michelini

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results.

Eduardo Pérez-Pellitero

Truong Thanh Nhat Mai

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

2021

Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data.

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Towards Vivid and Diverse Image Colorization with Generative Color Prior.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Positional Encoding As Spatial Inductive Bias in GANs.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Towards Real-World Blind Face Restoration With Generative Facial Prior.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Robust Reference-Based Super-Resolution via C2-Matching.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Understanding Deformable Alignment in Video Super-Resolution.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2019

Deep Network Interpolation for Continuous Imagery Effect Transition.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

EDVR: Video Restoration With Enhanced Deformable Convolutional Networks.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

NTIRE 2019 Challenge on Video Super-Resolution: Methods and Results.

Rudrabha Mukhopadhyay

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

NTIRE 2019 Challenge on Video Deblurring: Methods and Results.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

2018

Study of effervescent jet breakup under gas expansion disturbance.

J. Vis., 2018

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks.

CoRR, 2018

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks.

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Recovering Realistic Texture in Image Super-Resolution by Deep Spatial Feature Transform.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017
