2025
MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention.
CoRR, March, 2025
Vid2Sim: Realistic and Interactive Simulation from Video for Urban Navigation.
CoRR, January, 2025
Joint Optimization for 4D Human-Scene Reconstruction in the Wild.
CoRR, January, 2025
MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
2024
Fast-Vid2Vid++: Spatial-Temporal Distillation for Real-Time Video-to-Video Synthesis.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024
HyperStyle3D: Text-Guided 3D Portrait Stylization via Hypernetworks.
IEEE Trans. Circuits Syst. Video Technol., October, 2024
VLG: General Video Recognition with Web Textual Knowledge.
Int. J. Comput. Vis., October, 2024
ReliTalk: Relightable Talking Portrait Generation from a Single Video.
Int. J. Comput. Vis., August, 2024
Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels.
CoRR, 2024
MetaUrban: A Simulation Platform for Embodied AI in Urban Spaces.
CoRR, 2024
Parameterization-Driven Neural Surface Reconstruction for Object-Oriented Editing in Neural Rendering.
Proceedings of the Computer Vision - ECCV 2024, 2024
CosmicMan: A Text-to-Image Foundation Model for Humans.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
PaintHuman: Towards High-Fidelity Text-to-3D Human Texturing via Denoised Score Distillation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Audio-Driven Dubbing for User Generated Contents via Style-Aware Semi-Parametric Synthesis.
IEEE Trans. Circuits Syst. Video Technol., March, 2023
Bi-directional Deformation for Parameterization of Neural Implicit Surfaces.
CoRR, 2023
Innovative Digital Storytelling with AIGC: Exploration and Discussion of Recent Advances.
CoRR, 2023
Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis.
CoRR, 2023
DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering.
CoRR, 2023
RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars.
CoRR, 2023
SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling.
CoRR, 2023
OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation.
CoRR, 2023
RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Filter-Recovery Network for Multi-Speaker Audio-Visual Speech Separation.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
MotionBERT: A Unified Perspective on Learning Human Motion Representations.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
3DHumanGAN: 3D-Aware Human Image Generation with 3D Pose Mapping.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
OrthoPlanes: A Novel Representation for Better 3D-Awareness of GANs.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Text2Performer: Text-Driven Human Video Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
CelebV-Text: A Large-Scale Facial Text-Video Dataset.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
MonoHuman: Animatable Human Neural Field from Monocular Video.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Text2Human: text-driven controllable human image generation.
ACM Trans. Graph., 2022
Everybody's Talkin': Let Me Talk as You Want.
IEEE Trans. Inf. Forensics Secur., 2022
3DHumanGAN: Towards Photo-Realistic 3D-Aware Human Image Generation.
CoRR, 2022
MotionBERT: Unified Pretraining for Human Motion Analysis.
CoRR, 2022
StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3.
CoRR, 2022
Submission to Generic Event Boundary Detection Challenge@CVPR 2022: Local Context Modeling and Global Boundary Decoding Approach.
CoRR, 2022
Generalizable Neural Performer: Learning Robust Radiance Fields for Human Novel View Synthesis.
CoRR, 2022
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation.
CoRR, 2022
EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model.
Proceedings of the SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference, Vancouver, BC, Canada, August 7, 2022
Audio-Driven Co-Speech Gesture Video Generation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis.
Proceedings of the Computer Vision - ECCV 2022, 2022
CelebV-HQ: A Large-Scale Video Facial Attributes Dataset.
Proceedings of the Computer Vision - ECCV 2022, 2022
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation.
Proceedings of the Computer Vision - ECCV 2022, 2022
StyleGAN-Human: A Data-Centric Odyssey of Human Generation.
Proceedings of the Computer Vision - ECCV 2022, 2022
Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing.
Proceedings of the Computer Vision - ECCV 2022, 2022
TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
MoCaNet: Motion Retargeting In-the-Wild via Canonicalization Networks.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
Talking Faces: Audio-to-Video Face Generation.
Proceedings of the Handbook of Digital Face Manipulation and Detection, 2022
DeepFakes Detection: the DeeperForensics Dataset and Challenge.
Proceedings of the Handbook of Digital Face Manipulation and Detection, 2022
2021
Everything's Talkin': Pareidolia Face Reenactment.
CoRR, 2021
DeeperForensics Challenge 2020 on Real-World Face Forgery Detection: Methods and Results.
CoRR, 2021
Procedural Block-Based USD Workflows in Conduit.
Proceedings of the SIGGRAPH 2021: Special Interest Group on Computer Graphics and Interactive Techniques Conference, 2021
Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
NJU MCG - Sensetime Team Submission to Pre-training for Video Understanding Challenge Track II.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
TAM: Temporal Adaptive Module for Video Recognition.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Focal Frequency Loss for Image Reconstruction and Synthesis.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Pareidolia Face Reenactment.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Audio-Driven Emotional Video Portraits.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
2020
Focal Frequency Loss for Generative Models.
CoRR, 2020
AOT: Appearance Optimal Transport Based Identity Swapping for Forgery Detection.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
MEAD: A Large-Scale Audio-Visual Dataset for Emotional Talking-Face Generation.
Proceedings of the Computer Vision - ECCV 2020, 2020
Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020
TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
DeeperForensics-1.0: A Large-Scale Dataset for Real-World Face Forgery Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Localin Reshuffle Net: Toward Naturally and Efficiently Facial Image Blending.
Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020
2019
Disentangling Content and Style via Unsupervised Geometry Distillation.
Proceedings of the Deep Generative Models for Highly Structured Data, 2019
FAB: A Robust Facial Landmark Detection Framework for Motion-Blurred Videos.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Aggregation via Separation: Boosting Facial Landmark Detector With Semi-Supervised Style Translation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Make a Face: Towards Arbitrary High Fidelity Face Manipulation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
TransGaGa: Geometry-Aware Unsupervised Image-To-Image Translation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
2018
ReenactGAN: Learning to Reenact Faces via Boundary Transfer.
Proceedings of the Computer Vision - ECCV 2018, 2018
Look at Boundary: A Boundary-Aware Face Alignment Algorithm.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
2010
Stability analysis in dynamic social networks.
Proceedings of the 2010 Spring Simulation Multiconference, 2010