Xintao Wang

Orcid: 0000-0001-6585-8604

According to our database, Xintao Wang authored at least 129 papers between 2017 and 2024.

Bibliography

2024
ToonCrafter: Generative Cartoon Interpolation.
ACM Trans. Graph., December, 2024

StyleCrafter: Taming Artistic Video Diffusion with Reference-Augmented Adapter Learning.
ACM Trans. Graph., December, 2024

Empowering Real-World Image Super-Resolution With Flexible Interactive Modulation.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2024

Temporally consistent video colorization with deep feature propagation and self-regularization learning.
Comput. Vis. Media, April, 2024

Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos.
IEEE Trans. Image Process., 2024

CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities.
CoRR, 2024

Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models.
CoRR, 2024

MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions.
CoRR, 2024

MINDECHO: Role-Playing Language Agents for Key Opinion Leaders.
CoRR, 2024

Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation.
CoRR, 2024

Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs.
CoRR, 2024

Image Conductor: Precision Control for Interactive Video Synthesis.
CoRR, 2024

Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals.
CoRR, 2024

VideoTetris: Towards Compositional Text-to-Video Generation.
CoRR, 2024

MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
CoRR, 2024

ReVideo: Remake a Video with Motion and Content Control.
CoRR, 2024

From Persona to Personalization: A Survey on Role-Playing Language Agents.
CoRR, 2024

Character is Destiny: Can Large Language Models Simulate Persona-Driven Decisions in Role-Playing?
CoRR, 2024

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models.
CoRR, 2024

SurveyAgent: A Conversational System for Personalized and Efficient Research Survey.
CoRR, 2024

SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model.
CoRR, 2024

Towards A Better Metric for Text-to-Video Generation.
CoRR, 2024

ConcEPT: Concept-Enhanced Pre-Training for Language Models.
CoRR, 2024

SiamMFF: UAV Object Tracking Algorithm Based on Multi-Scale Feature Fusion.
IEEE Access, 2024

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

CustomNet: Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Unifying Image Processing as Visual Prompting Question Answering.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Making LLaMA SEE and Draw with SEED Tokenizer.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Reinforced Multi-teacher Knowledge Distillation for Unsupervised Sentence Representation.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2024, 2024

Evaluating Character Understanding of Large Language Models via Character Profiling from Fictional Works.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Capturing Minds, Not Just Words: Enhancing Role-Playing Language Models with Personality-Indicative Data.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

DynamiCrafter: Animating Open-Domain Images with Video Diffusion Priors.
Proceedings of the Computer Vision - ECCV 2024, 2024

BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion.
Proceedings of the Computer Vision - ECCV 2024, 2024

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation.
Proceedings of the Computer Vision - ECCV 2024, 2024

DreamDiffusion: High-Quality EEG-to-Image Generation with Temporal Masked Signal Modeling and CLIP Alignment.
Proceedings of the Computer Vision - ECCV 2024, 2024

Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

X-Adapter: Universal Compatibility of Plugins for Upgraded Diffusion Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DiffEditor: Boosting Accuracy and Flexibility on Diffusion-Based Image Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

EvalCrafter: Benchmarking and Evaluating Large Video Generation Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SmartEdit: Exploring Complex Instruction-Based Image Editing with Multimodal Large Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Source Prompt: Coordinated Pre-training of Language Models on Diverse Corpora from Multiple Sources.
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024

Light Up the Shadows: Enhance Long-Tailed Entity Grounding with Concept-Guided Vision-Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion Models.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Follow Your Pose: Pose-Guided Text-to-Video Generation Using Pose-Free Videos.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Prototypical Concept Representation.
IEEE Trans. Knowl. Data Eng., July, 2023

Reference-Based Image and Video Super-Resolution via $C^{2}$-Matching.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

GLEAN: Generative Latent Bank for Image Super-Resolution and Beyond.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

AnimateZero: Video Diffusion Models are Zero-Shot Image Animators.
CoRR, 2023

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation.
CoRR, 2023

MagicStick: Controllable Video Editing via Control Handle Transformations.
CoRR, 2023

X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model.
CoRR, 2023

StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter.
CoRR, 2023

Source Prompt: Coordinated Pre-training of Language Models on Diverse Corpora from Multiple Sources.
CoRR, 2023

CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models.
CoRR, 2023

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation.
CoRR, 2023

New Boolean satisfiability problem heuristic strategy: Minimal Positive Negative Product Strategy.
CoRR, 2023

Does Role-Playing Chatbots Capture the Character Personalities? Assessing Personality Traits for Role-Playing Chatbots.
CoRR, 2023

DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors.
CoRR, 2023

Can Large Language Models Understand Real-World Complex Instructions?
CoRR, 2023

HAT: Hybrid Attention Transformer for Image Restoration.
CoRR, 2023

StyleAdapter: A Single-Pass LoRA-Free Model for Stylized Image Generation.
CoRR, 2023

KnowledGPT: Enhancing Large Language Models with Retrieval and Storage Access on Knowledge Bases.
CoRR, 2023

GET3D--: Learning GET3D from Unconstrained Image Collections.
CoRR, 2023

Planting a SEED of Vision in Large Language Model.
CoRR, 2023

Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation.
CoRR, 2023

DreamDiffusion: Generating High-Quality Images from Brain EEG Signals.
CoRR, 2023

InstructP2P: Learning to Edit 3D Point Clouds with Text Instructions.
CoRR, 2023

Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance.
CoRR, 2023

TaleCrafter: Interactive Story Visualization with Multiple Characters.
CoRR, 2023

Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos.
CoRR, 2023

T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models.
CoRR, 2023

Interactive Story Visualization with Multiple Characters.
Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023

Inserting Anybody in Diffusion Models via Celeb Basis.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models.
Proceedings of the International Conference on Machine Learning, 2023

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Activating More Pixels in Image Super-Resolution Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Mitigating Artifacts in Real-World Video Super-resolution Models.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Accelerating the Training of Video Super-resolution Models.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

MAPS-KB: A Million-Scale Probabilistic Simile Knowledge Base.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Path-Restore: Learning Network Path Selection for Image Restoration.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Hybrid Warping Fusion for Video Frame Interpolation.
Int. J. Comput. Vis., 2022

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation.
CoRR, 2022

Reference-based Image and Video Super-Resolution via $C^{2}$-Matching.
CoRR, 2022

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis.
CoRR, 2022

FaceFormer: Scale-aware Blind Face Restoration with Transformers.
CoRR, 2022

Activating More Pixels in Image Super-Resolution Transformer.
CoRR, 2022

AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Rethinking Alignment in Video Super-Resolution Transformers.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Composite Photograph Harmonization with Complete Background Cues.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Language Models as Knowledge Embeddings.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Metric Learning Based Interactive Modulation for Real-World Super-Resolution.
Proceedings of the Computer Vision - ECCV 2022, 2022

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder.
Proceedings of the Computer Vision - ECCV 2022, 2022

VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

2021
Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Towards Vivid and Diverse Image Colorization with Generative Color Prior.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Positional Encoding As Spatial Inductive Bias in GANs.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Towards Real-World Blind Face Restoration With Generative Facial Prior.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Robust Reference-Based Super-Resolution via $C^{2}$-Matching.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Understanding Deformable Alignment in Video Super-Resolution.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2019
Deep Network Interpolation for Continuous Imagery Effect Transition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

EDVR: Video Restoration With Enhanced Deformable Convolutional Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

2018
Study of effervescent jet breakup under gas expansion disturbance.
J. Vis., 2018

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks.
CoRR, 2018

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Recovering Realistic Texture in Image Super-Resolution by Deep Spatial Feature Transform.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017