Xihui Liu

Orcid: 0000-0003-1831-9952

According to our database, Xihui Liu authored at least 74 papers between 2016 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
WorldSimBench: Towards Video Generation Models as World Simulators.
CoRR, 2024

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation.
CoRR, 2024

LVD-2M: A Long-take Video Dataset with Temporally Dense Captions.
CoRR, 2024

Loong: Generating Minute-level Long Videos with Autoregressive Language Models.
CoRR, 2024

Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding.
CoRR, 2024

Scene Graph Disentanglement and Composition for Generalizable Complex Image Generation.
CoRR, 2024

LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness.
CoRR, 2024

DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion.
CoRR, 2024

T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation.
CoRR, 2024

OVExp: Open Vocabulary Exploration for Object-Oriented Navigation.
CoRR, 2024

Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images.
CoRR, 2024

GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing.
CoRR, 2024

BEACON: Benchmark for Comprehensive RNA Tasks and Language Models.
CoRR, 2024

4Diffusion: Multi-view Video Diffusion Model for 4D Generation.
CoRR, 2024

DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis.
CoRR, 2024

Editing Massive Concepts in Text-to-Image Diffusion Models.
CoRR, 2024

Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation.
CoRR, 2024

Shape-Guided Diffusion with Inside-Outside Attention.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Hierarchical Diffusion Autoencoders and Disentangled Image Manipulation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

FiT: Flexible Vision Transformer for Diffusion Model.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities.
Proceedings of the Computer Vision - ECCV 2024, 2024

PredBench: Benchmarking Spatio-Temporal Prediction Across Diverse Disciplines.
Proceedings of the Computer Vision - ECCV 2024, 2024

TC4D: Trajectory-Conditioned Text-to-4D Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

DreamComposer: Controllable 3D Object Generation via Multi-View Conditions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Towards Large-Scale 3D Representation Learning with Multi-Dataset Point Prompt Training.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Point Transformer V3: Simpler, Faster, Stronger.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
A Survey of Reasoning with Foundation Models.
CoRR, 2023

EgoPlan-Bench: Benchmarking Egocentric Embodied Planning with Multimodal Large Language Models.
CoRR, 2023

Drag-A-Video: Non-rigid Video Editing with Point-based Interaction.
CoRR, 2023

Understanding Masked Autoencoders From a Local Contrastive Perspective.
CoRR, 2023

Object2Scene: Putting Objects in Context for Open-Vocabulary 3D Detection.
CoRR, 2023

Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training.
CoRR, 2023

UniG3D: A Unified 3D Object Generation Dataset.
CoRR, 2023

SAM3D: Segment Anything in 3D Scenes.
CoRR, 2023

TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale.
CoRR, 2023

Seeing is not always believing: A Quantitative Study on Human Perception of AI-Generated Images.
CoRR, 2023

RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer.
CoRR, 2023

More Control for Free! Image Synthesis with Semantic Diffusion Guidance.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

OV-PARTS: Towards Open-Vocabulary Part Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

CorresNeRF: Image Correspondence Priors for Neural Radiance Fields.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DDP: Diffusion Model for Dense Visual Prediction.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Transferable Spatiotemporal Representations from Natural Script Knowledge.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

RIFormer: Keep Your Vision Backbone Effective But Removing Token Mixer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Back to the Source: Diffusion-Driven Adaptation to Test-Time Corruption.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

GLeaD: Improving GANs with A Generator-Leading Task.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Internal Tides and Their Intraseasonal Variability on the Continental Slope Northeast of Taiwan Island Derived from Mooring Observations and Satellite Data.
Remote. Sens., 2022

Back to the Source: Diffusion-Driven Test-Time Adaptation.
CoRR, 2022

The ArtBench Dataset: Benchmarking Generative Models with Artworks.
CoRR, 2022

MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval.
CoRR, 2022

BridgeFormer: Bridging Video-text Retrieval with Multiple Choice Questions.
CoRR, 2022

Point Transformer V2: Grouped Vector Attention and Partition-based Pooling.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval.
Proceedings of the Computer Vision - ECCV 2022, 2022

Bridging Video-text Retrieval with Multiple Choice Questions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Benchmark for Compositional Text-to-Image Synthesis.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Nonlinear dynamics analysis of involute spur gear transmission system.
Proceedings of the AIAM 2021: 3rd International Conference on Artificial Intelligence and Advanced Manufacture, Manchester, United Kingdom, October 23, 2021

2020
Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Improving Referring Expression Grounding With Cross-Modal Attention-Guided Erasing.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association.
CoRR, 2018

Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data.
Proceedings of the Computer Vision - ECCV 2018, 2018

Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association.
Proceedings of the Computer Vision - ECCV 2018, 2018

Localization Guided Learning for Pedestrian Attribute Recognition.
Proceedings of the British Machine Vision Conference 2018, 2018

2017
Orientation Invariant Feature Embedding and Spatial Temporal Regularization for Vehicle Re-identification.
Proceedings of the IEEE International Conference on Computer Vision, 2017

HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Object Detection in Videos with Tubelet Proposal Networks.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
A low-complexity precoding scheme for two-user massive MIMO downlink.
Proceedings of the 17th IEEE International Workshop on Signal Processing Advances in Wireless Communications, 2016

Measurement-Driven Capability Modeling for Mobile Network in Large-Scale Urban Environment.
Proceedings of the 13th IEEE International Conference on Mobile Ad Hoc and Sensor Systems, 2016

