Ziwei Liu

ORCID: 0000-0002-4220-5958

Affiliations:
  • Nanyang Technological University, S-Lab, Singapore
  • Chinese University of Hong Kong, Department of Information Engineering, Hong Kong (PhD)


According to our database, Ziwei Liu authored at least 328 papers between 2014 and 2024.

Bibliography

2024
Fast-Vid2Vid++: Spatial-Temporal Distillation for Real-Time Video-to-Video Synthesis.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Class-Incremental Learning: A Survey.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Pair Then Relation: Pair-Net for Panoptic Scene Graph Generation.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Transformer-Based Visual Segmentation: A Survey.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Playing for 3D Human Recovery.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Generalized Out-of-Distribution Detection: A Survey.
Int. J. Comput. Vis., December, 2024

PERF: Panoramic Neural Radiance Field From a Single Panorama.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2024

Exploring Point-BEV Fusion for 3D Point Cloud Object Tracking With Transformer.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2024

Detecting and Grounding Multi-Modal Media Manipulation and Beyond.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2024

ReliTalk: Relightable Talking Portrait Generation from a Single Video.
Int. J. Comput. Vis., August, 2024

MotionDiffuse: Text-Driven Human Motion Generation With Diffusion Model.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2024

Talk-to-Edit: Fine-Grained 2D and 3D Facial Editing via Dialog.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2024

Unified 3D and 4D Panoptic Segmentation via Dynamic Shifting Networks.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2024

Exploiting Hierarchical Interactions for Protein Surface Learning.
IEEE J. Biomed. Health Informatics, April, 2024

Guest Editorial: Special Issue on the Promises and Dangers of Large Vision Models.
Int. J. Comput. Vis., April, 2024

Open Long-Tailed Recognition in a Dynamic World.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2024

Robust Partial-to-Partial Point Cloud Registration in a Full Range.
IEEE Robotics Autom. Lett., 2024

High-Fidelity Virtual Try-on with Large-Scale Unpaired Learning.
CoRR, 2024

FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality.
CoRR, 2024

DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes.
CoRR, 2024

VistaDream: Sampling multiview consistent images for single-view scene reconstruction.
CoRR, 2024

AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation.
CoRR, 2024

EgoLM: Multi-Modal Language Model of Egocentric Motions.
CoRR, 2024

Disco4D: Disentangled 4D Human Generation and Animation from a Single Image.
CoRR, 2024

Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution.
CoRR, 2024

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion.
CoRR, 2024

Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion.
CoRR, 2024

LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation.
CoRR, 2024

LLaVA-OneVision: Easy Visual Task Transfer.
CoRR, 2024

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey.
CoRR, 2024

Point Transformer V3 Extreme: 1st Place Solution for 2024 Waymo Open Dataset Challenge in Semantic Segmentation.
CoRR, 2024

LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models.
CoRR, 2024

VEnhancer: Generative Space-Time Enhancement for Video Generation.
CoRR, 2024

CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation.
CoRR, 2024

WildAvatar: Web-scale In-the-wild Video Dataset for 3D Avatar Creation.
CoRR, 2024

Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT.
CoRR, 2024

FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models.
CoRR, 2024

Long Context Transfer from Language to Vision.
CoRR, 2024

GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation.
CoRR, 2024

Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving.
CoRR, 2024

Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo.
CoRR, 2024

The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition.
CoRR, 2024

DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation.
CoRR, 2024

Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving.
CoRR, 2024

WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning.
CoRR, 2024

Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials.
CoRR, 2024

MMInA: Benchmarking Multihop Multimodal Internet Agents.
CoRR, 2024

Move Anything with Layered Scene Diffusion.
CoRR, 2024

FashionEngine: Interactive Generation and Editing of 3D Clothed Humans.
CoRR, 2024

Large Motion Model for Unified Multi-Modal Motion Generation.
CoRR, 2024

SurMo: Surface-based 4D Motion Modeling for Dynamic Human Rendering.
CoRR, 2024

Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models.
CoRR, 2024

Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding.
CoRR, 2024

InTeX: Interactive Text-to-texture Synthesis via Unified Depth-aware Inpainting.
CoRR, 2024

3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors.
CoRR, 2024

A Comprehensive Survey on 3D Content Generation.
CoRR, 2024

Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization.
CoRR, 2024

Latte: Latent Diffusion Transformer for Video Generation.
CoRR, 2024

ReVersion: Diffusion-Based Relation Inversion from Images.
Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024

TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model.
Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024

ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

MMHead: Towards Fine-grained Multi-modal 3D Facial Animation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Large-Vocabulary 3D Diffusion Model with Transformer.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Large Motion Model for Unified Multi-modal Motion Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

WHAC: World-Grounded Humans and Cameras.
Proceedings of the Computer Vision - ECCV 2024, 2024

Deep Nets with Subsampling Layers Unwittingly Discard Useful Activations at Test-Time.
Proceedings of the Computer Vision - ECCV 2024, 2024

Octopus: Embodied Vision-Language Programmer from Environmental Feedback.
Proceedings of the Computer Vision - ECCV 2024, 2024

4D Contrastive Superflows are Dense 3D Representation Learners.
Proceedings of the Computer Vision - ECCV 2024, 2024

FunQA: Towards Surprising Video Comprehension.
Proceedings of the Computer Vision - ECCV 2024, 2024

FreeInit: Bridging Initialization Gap in Video Diffusion Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

LGM: Large Multi-view Gaussian Model for High-Resolution 3D Content Creation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Nymeria: A Massive Collection of Multimodal Egocentric Daily Motion in the Wild.
Proceedings of the Computer Vision - ECCV 2024, 2024

MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo.
Proceedings of the Computer Vision - ECCV 2024, 2024

MMBench: Is Your Multi-modal Model an All-Around Player?
Proceedings of the Computer Vision - ECCV 2024, 2024

GroupDiff: Diffusion-Based Group Portrait Editing.
Proceedings of the Computer Vision - ECCV 2024, 2024

StructLDM: Structured Latent Diffusion for 3D Human Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

ReSyncer: Rewiring Style-Based Generator for Unified Audio-Visually Synced Facial Performer.
Proceedings of the Computer Vision - ECCV 2024, 2024

ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance.
Proceedings of the Computer Vision - ECCV 2024, 2024

TC4D: Trajectory-Conditioned Text-to-4D Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Vlogger: Make Your Dream A Vlog.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

InstructVideo: Instructing Video Diffusion Models with Human Feedback.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Fresco: Spatial-Temporal Correspondence for Zero-Shot Video Translation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

CityDreamer: Compositional Generative Model of Unbounded 3D Cities.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Towards Language-Driven Video Inpainting via Multimodal Large Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SinSR: Diffusion-Based Image Super-Resolution in a Single Step.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

FreeU: Free Lunch in Diffusion U-Net.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Move Anything with Layered Scene Diffusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Multi-Space Alignments Towards Universal LiDAR Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VBench: Comprehensive Benchmark Suite for Video Generative Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Digital Life Project: Autonomous 3D Characters with Social Intelligence.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024


VideoBooth: Diffusion-based Video Generation with Image Prompts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Few-shot forgery detection via Guided Adversarial Interpolation.
Pattern Recognit., December, 2023

Bailando++: 3D Dance GPT With Choreographic Memory.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

SceneDreamer: Unbounded 3D Scene Generation From 2D Image Collections.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Towards Real-World Visual Tracking With Temporal Contexts.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

GP-UNIT: Generative Prior for Versatile Unsupervised Image-to-Image Translation.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Full-Spectrum Out-of-Distribution Detection.
Int. J. Comput. Vis., October, 2023

Variational Relational Point Completion Network for Robust 3D Classification.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2023

Semi-Supervised Domain Generalization with Stochastic StyleMatch.
Int. J. Comput. Vis., September, 2023

Reference-Based Image and Video Super-Resolution via $C^{2}$-Matching.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup.
Pattern Recognit., June, 2023

Lifting 2D Human Pose to 3D with Domain Adapted 3D Body Concept.
Int. J. Comput. Vis., May, 2023

Domain Generalization: A Survey.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

DreamGaussian4D: Generative 4D Gaussian Splatting.
CoRR, 2023

Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases.
CoRR, 2023

GauHuman: Articulated Gaussian Splatting from Monocular Human Videos.
CoRR, 2023

OtterHD: A High-Resolution Multi-modality Model.
CoRR, 2023

Multimodal Foundation Models for Zero-shot Animal Species Recognition in Camera Trap Images.
CoRR, 2023

SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation.
CoRR, 2023

LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models.
CoRR, 2023

Robust Sequential DeepFake Detection.
CoRR, 2023

MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation.
CoRR, 2023

DeformToon3D: Deformable 3D Toonification from Neural Radiance Fields.
CoRR, 2023

PointHPS: Cascaded 3D Human Pose and Shape Estimation from Point Clouds.
CoRR, 2023

HumanLiff: Layer-wise 3D Human Generation with Diffusion Model.
CoRR, 2023

Temporally-Adaptive Models for Efficient Video Understanding.
CoRR, 2023

Benchmarking and Analyzing Generative Data for Visual Recognition.
CoRR, 2023

DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering.
CoRR, 2023

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation.
CoRR, 2023

FunQA: Towards Surprising Video Comprehension.
CoRR, 2023

OpenOOD v1.5: Enhanced Benchmark for Out-of-Distribution Detection.
CoRR, 2023

MIMIC-IT: Multi-Modal In-Context Instruction Tuning.
CoRR, 2023

DeepFake-Adapter: Dual-Level Adapter for DeepFake Detection.
CoRR, 2023

Learning without Forgetting for Vision-Language Models.
CoRR, 2023

SAD: Segment Any RGBD.
CoRR, 2023

RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars.
CoRR, 2023

ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for Sparse View Synthesis.
CoRR, 2023

Otter: A Multi-Modal Model with In-Context Instruction Tuning.
CoRR, 2023

RoboBEV: Towards Robust Bird's Eye View Perception under Corruptions.
CoRR, 2023

SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling.
CoRR, 2023

Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need.
CoRR, 2023

Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation.
CoRR, 2023

Deep Class-Incremental Learning: A Survey.
CoRR, 2023

OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation.
CoRR, 2023

Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation.
Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023

HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image.
Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023

Efficient Video Portrait Reenactment via Grid-based Codebook.
Proceedings of the ACM SIGGRAPH 2023 Conference Proceedings, 2023

What Makes Good Examples for Visual In-Context Learning?
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

4D Panoptic Scene Graph Generation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

InsActor: Instruction-driven Physics-based Characters.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Towards Robust and Expressive Whole-body Human Pose and Shape Estimation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Segment Any Point Cloud Sequences by Distilling Vision Foundation Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Large Language Models are Visual Reasoning Coordinators.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Make-A-Volume: Leveraging Latent Diffusion Models for Cross-Modality 3D Brain MRI Synthesis.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Multi-Modal Generative AI with Foundation Models.
Proceedings of the 1st Workshop on Large Generative Models Meet Multimodal Applications, 2023

BiBench: Benchmarking and Analyzing Network Binarization.
Proceedings of the International Conference on Machine Learning, 2023

Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation.
Proceedings of the 25th International Conference on Multimodal Interaction, 2023

Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

DiffMimic: Efficient Motion Mimicking with Differentiable Physics.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Sparse Mixture-of-Experts are Domain Generalizable Learners.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

EVA3D: Compositional 3D Human Generation from 2D Image Collections.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Masked Frequency Modeling for Self-Supervised Visual Pre-Training.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

DeformToon3d: Deformable Neural Radiance Fields for 3D Toonification.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Deep Geometrized Cartoon Line Inbetweening.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Robo3D: Towards Robust and Reliable 3D Perception against Corruptions.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Rethinking Range View Representation for LiDAR Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SHERF: Generalizable Human NeRF from a Single Image.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Text2Performer: Text-Driven Human Video Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Panoptic Video Scene Graph Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Detecting and Grounding Multi-Modal Media Manipulation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

LaserMix for Semi-Supervised LiDAR Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Collaborative Diffusion for Multi-Modal Face Generation and Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-Based Generator.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

$F^{2}$-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Robust Video Portrait Reenactment via Personalized Representation Quantization.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
VToonify: Controllable High-Resolution Portrait Video Style Transfer.
ACM Trans. Graph., 2022

Text2Human: text-driven controllable human image generation.
ACM Trans. Graph., 2022

AvatarCLIP: zero-shot text-driven generation and animation of 3D avatars.
ACM Trans. Graph., 2022

Text2Light: Zero-Shot Text-Driven HDR Panorama Generation.
ACM Trans. Graph., 2022

Chasing the Tail in Monocular 3D Human Reconstruction With Prototype Memory.
IEEE Trans. Image Process., 2022

CARAFE++: Unified Content-Aware ReAssembly of FEatures.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Learning to Prompt for Vision-Language Models.
Int. J. Comput. Vis., 2022

Delving into Inter-Image Invariance for Unsupervised Visual Representations.
Int. J. Comput. Vis., 2022

Reference-based Image and Video Super-Resolution via $C^{2}$-Matching.
CoRR, 2022

TripleE: Easy Domain Generalization via Episodic Replay.
CoRR, 2022

On-Device Domain Generalization.
CoRR, 2022

StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3.
CoRR, 2022

Neural Prompt Search.
CoRR, 2022

Sparse Fusion Mixture-of-Experts are Domain Generalizable Learners.
CoRR, 2022

Robust Face Anti-Spoofing with Dual Probabilistic Modeling.
CoRR, 2022

SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance.
CoRR, 2022

X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation.
CoRR, 2022

Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy.
CoRR, 2022

LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network.
CoRR, 2022

Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers.
Proceedings of the SIGGRAPH Asia 2022 Conference Papers, 2022

OpenOOD: Benchmarking Generalized Out-of-Distribution Detection.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Audio-Driven Co-Speech Gesture Video Generation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

AnimeRun: 2D Animation Visual Correspondence from Open Source 3D Movies.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Free Lunch for Surgical Video Understanding by Distilling Self-supervisions.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2022, 2022

Benchmarking and Analyzing Point Cloud Classification under Corruptions.
Proceedings of the International Conference on Machine Learning, 2022

BiBERT: Accurate Fully Binarized BERT.
Proceedings of the Tenth International Conference on Learning Representations, 2022

TAda! Temporally-Adaptive Convolutions for Video Understanding.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis.
Proceedings of the Computer Vision - ECCV 2022, 2022

CelebV-HQ: A Large-Scale Video Facial Attributes Dataset.
Proceedings of the Computer Vision - ECCV 2022, 2022

Benchmarking Omni-Vision Representation Through the Lens of Visual Realms.
Proceedings of the Computer Vision - ECCV 2022, 2022

Panoptic Scene Graph Generation.
Proceedings of the Computer Vision - ECCV 2022, 2022

StyleSwap: Style-Based Generator Empowers Robust Face Swapping.
Proceedings of the Computer Vision - ECCV 2022, 2022

Mind the Gap in Distilling StyleGANs.
Proceedings of the Computer Vision - ECCV 2022, 2022

StyleLight: HDR Panorama Generation for Lighting Estimation and Editing.
Proceedings of the Computer Vision - ECCV 2022, 2022

Detecting and Recovering Sequential DeepFake Manipulation.
Proceedings of the Computer Vision - ECCV 2022, 2022

UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation.
Proceedings of the Computer Vision - ECCV 2022, 2022

X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation.
Proceedings of the Computer Vision - ECCV 2022, 2022

StyleGAN-Human: A Data-Centric Odyssey of Human Generation.
Proceedings of the Computer Vision - ECCV 2022, 2022

Relighting4D: Neural Relightable Human from Videos.
Proceedings of the Computer Vision - ECCV 2022, 2022

HuMMan: Multi-modal 4D Human Dataset for Versatile Sensing and Modeling.
Proceedings of the Computer Vision - ECCV 2022, 2022

Conditional Prompt Learning for Vision-Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Delving Deep into the Generalization of Vision Transformers under Distribution Shifts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Full-Range Virtual Try-On with Recurrent Tri-Level Transform.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Unsupervised Image-to-Image Translation with Generative Prior.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Balanced MSE for Imbalanced Visual Regression.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Versatile Multi-Modal Pre-Training for Human-Centric Perception.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

TCTrack: Temporal Contexts for Aerial Tracking.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

SepFusion: Finding Optimal Fusion Structures for Visual Sound Separation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Visual Sound Localization in the Wild by Cross-Modal Interference Erasing.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Learning Diverse Fashion Collocation by Neural Graph Filtering.
IEEE Trans. Multim., 2021

Iterative human and automated identification of wildlife images.
Nat. Mach. Intell., 2021

Multi-View Partial (MVP) Point Cloud Challenge 2021 on Completion and Registration: Methods and Results.
CoRR, 2021

ForgeryNet - Face Forgery Analysis Challenge 2021: Methods and Results.
CoRR, 2021

Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion.
CoRR, 2021

Playing for 3D Human Recovery.
CoRR, 2021

Delving Deep into the Generalization of Vision Transformers under Distribution Shifts.
CoRR, 2021

CelebA-Spoof Challenge 2020 on Face Anti-Spoofing: Methods and Results.
CoRR, 2021

DeeperForensics Challenge 2020 on Real-World Face Forgery Detection: Methods and Results.
CoRR, 2021

Person-in-Context Synthesis with Compositional Structural Space.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

ShineOn: Illuminating Design Choices for Practical Video-based Virtual Clothing Try-on.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision Workshops, 2021

Unsupervised Object-Level Representation Learning from Scene Images.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Garment4D: Garment Reconstruction from Point Cloud Sequences.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Few-Shot Object Detection via Association and DIscrimination.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

MMFashion: An Open-Source Toolbox for Visual Fashion Analysis.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Speech2Talking-Face: Inferring and Driving a Face with Synchronized Audio-Visual Representation.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Long-tailed Recognition by Routing Diverse Distribution-Aware Experts.
Proceedings of the 9th International Conference on Learning Representations, 2021

Do 2D GANs Know 3D Shape? Unsupervised 3D Shape Reconstruction from 2D Image GANs.
Proceedings of the 9th International Conference on Learning Representations, 2021

Differentiable Dynamic Wirings for Neural Networks.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Incorporating Convolution Designs into Visual Transformers.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Semantically Coherent Out-of-Distribution Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

BlockPlanner: City Block Generation with Vectorized Graph Representation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Energy-Based Open-World Uncertainty Modeling for Confidence Calibration.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Unsupervised Domain Adaptive 3D Detection with Multi-Level Consistency.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Talk-to-Edit: Fine-Grained Facial Editing via Dialog.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Visually Informed Binaural Audio Generation without Binaural Audios.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Adversarial Robustness Under Long-Tailed Distribution.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Seesaw Loss for Long-Tailed Instance Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Variational Relational Point Completion Network.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Deep Animation Video Interpolation in the Wild.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

LiDAR-Based Panoptic Segmentation via Dynamic Shifting Network.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Robust Reference-Based Super-Resolution via C2-Matching.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

PTeacher: a Computer-Aided Personalized Pronunciation Training System with Exaggerated Audio-Visual Corrective Feedback.
Proceedings of the CHI '21: CHI Conference on Human Factors in Computing Systems, 2021

Monocular 3D Reconstruction of Interacting Hands via Collision-Aware Factorized Refinements.
Proceedings of the International Conference on 3D Vision, 2021

2020
ShineOn: Illuminating Design Choices for Practical Video-based Virtual Clothing Try-on.
CoRR, 2020

Person-in-Context Synthesis with Compositional Structural Space.
CoRR, 2020

Unsupervised Feature Learning by Cross-Level Discrimination between Instances and Groups.
CoRR, 2020

Unsupervised Human 3D Pose Representation with Viewpoint and Pose Disentanglement.
CoRR, 2020

Unsupervised Landmark Learning from Unpaired Data.
CoRR, 2020

Sensing, Understanding and Synthesizing Humans in an Open World.
Proceedings of the HuMA'20: 1st International Workshop on Human-centric Multimedia Analysis, 2020

Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation.
Proceedings of the Computer Vision - ECCV 2020, 2020

CelebA-Spoof: Large-Scale Face Anti-spoofing Dataset with Rich Annotations.
Proceedings of the Computer Vision - ECCV 2020, 2020

Knowledge Distillation Meets Self-supervision.
Proceedings of the Computer Vision - ECCV 2020, 2020

Distribution-Balanced Loss for Multi-label Classification in Long-Tailed Datasets.
Proceedings of the Computer Vision - ECCV 2020, 2020

Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement.
Proceedings of the Computer Vision - ECCV 2020, 2020

Placepedia: Comprehensive Place Understanding with Multi-faceted Annotations.
Proceedings of the Computer Vision - ECCV 2020, 2020

Rotate-and-Render: Unsupervised Photorealistic Face Rotation From Single-View Images.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Online Deep Clustering for Unsupervised Representation Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Self-Supervised Scene De-Occlusion.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

MaskGAN: Towards Diverse and Interactive Facial Image Manipulation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

When NAS Meets Robustness: In Search of Robust Architectures Against Adversarial Attacks.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Open Compound Domain Adaptation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Dynamic Graph CNN for Learning on Point Clouds.
ACM Trans. Graph., 2019

When NAS Meets Robustness: In Search of Robust Architectures against Adversarial Attacks.
CoRR, 2019

Learning to Synthesize Fashion Textures.
CoRR, 2019

Compound Domain Adaptation in an Open World.
CoRR, 2019

MMDetection: Open MMLab Detection Toolbox and Benchmark.
CoRR, 2019

Vision-Infused Deep Audio Inpainting.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

CARAFE: Content-Aware ReAssembly of FEatures.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Delving Deep Into Hybrid Annotations for 3D Human Recovery in the Wild.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Self-Supervised Learning via Conditional Motion Propagation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Hybrid Task Cascade for Instance Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Large-Scale Long-Tailed Recognition in an Open World.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

One-shot Face Reenactment.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

Talking Face Generation by Adversarially Disentangled Audio-Visual Representation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Instance-Level Facial Attributes Transfer with Geometry-Aware Flow.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

DPATCH: An Adversarial Patch Attack on Object Detectors.
Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the Thirty-Third AAAI Conference on Artificial Intelligence 2019 (AAAI-19), 2019

2018
Vision-Based Calibration of Dual RCM-Based Robot Arms in Human-Robot Collaborative Minimally Invasive Surgery.
IEEE Robotics Autom. Lett., 2018

Deep Learning Markov Random Field for Semantic Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Adaptive Affinity Field for Semantic Segmentation.
CoRR, 2018

Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition.
Proceedings of the Computer Vision - ECCV 2018, 2018

Adaptive Affinity Fields for Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2018, 2018

Mix-and-Match Tuning for Self-Supervised Semantic Segmentation.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Video Object Segmentation with Re-identification.
CoRR, 2017

Unconstrained Fashion Landmark Detection via Hierarchical Recurrent Transformer Networks.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Video Frame Synthesis Using Deep Voxel Flow.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Semantic Facial Expression Editing using Autoencoded Flow.
CoRR, 2016

Fashion Landmark Detection in the Wild.
Proceedings of the Computer Vision - ECCV 2016, 2016

DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Face Model Compression by Distilling Knowledge from Neurons.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Deep Learning Face Attributes in the Wild.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Semantic Image Segmentation via Deep Parsing Network.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

2014
Fast burst images denoising.
ACM Trans. Graph., 2014

