2025

Cosmos World Foundation Model Platform for Physical AI.

[DOI]

Niket Agarwal

Arslan Ali

Prithvijit Chattopadhyay

Vasanth Rao Naik Sabavat

CoRR, January, 2025

2024

Meshtron: High-Fidelity, Artist-Like 3D Mesh Generation at Scale.

[DOI]

CoRR, 2024

Edify 3D: Scalable High-Quality 3D Asset Generation.

[DOI]

CoRR, 2024

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models.

[DOI]

CoRR, 2024

One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation.

[DOI]

CoRR, 2024

EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation.

[DOI]

CoRR, 2024

Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling.

[DOI]

CoRR, 2024

Wolf: Captioning Everything with a World Summarization Framework.

[DOI]

CoRR, 2024

Condition-Aware Neural Network for Controlled Image Generation.

[DOI]

CoRR, 2024

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models.

[DOI]

CoRR, 2024

ExpressiveSinger: Multilingual and Multi-Style Score-based Singing Voice Synthesis with Expressive Performance Control.

[DOI]

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Condition-Aware Neural Network for Controlled Image Generation.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation.

[DOI]

Proceedings of the International Conference on Machine Learning, 2023

ATT3D: Amortized Text-to-3D Object Synthesis.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SPACE: Speech-driven Portrait Animation with Controllable Expression.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning.

[DOI]

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

DiffCollage: Parallel Generation of Large Content with Diffusion Models.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Magic3D: High-Resolution Text-to-3D Content Creation.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Neuralangelo: High-Fidelity Neural Surface Reconstruction.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Learning to Relight Portrait Images via a Virtual Light Stage and Synthetic-to-Real Adaptation.

[DOI]

ACM Trans. Graph., 2022

LNS-Madam: Low-Precision Training in Logarithmic Number System Using Multiplicative Weight Update.

[DOI]

Jiawei Zhao

Steve Dai

Rangharajan Venkatesan

IEEE Trans. Computers, 2022

SPACEx: Speech-driven Portrait Animation with Controllable Expression.

[DOI]

CoRR, 2022

eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers.

[DOI]

CoRR, 2022

Implicit Warping for Animation with Image Sets.

[DOI]

Arun Mallya

Ting-Chun Wang

Ming-Yu Liu

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Implicit Neural Representations with Levels-of-Experts.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Generating Long Videos of Dynamic Scenes.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Multimodal Conditional Image Synthesis with Product-of-Experts GANs.

[DOI]

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

Generative Adversarial Networks for Image and Video Synthesis: Algorithms and Applications.

[DOI]

Proc. IEEE, 2021

Domain Stylization: A Fast Covariance Matching Framework Towards Domain Adaptation.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., 2021

Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update.

[DOI]

Jiawei Zhao

Steve Dai

Rangharajan Venkatesan

CoRR, 2021

Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds.

[DOI]

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing.

[DOI]

Ting-Chun Wang

Arun Mallya

Ming-Yu Liu

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., 2020

Guest Editorial: Generative Adversarial Networks for Computer Vision.

[DOI]

Int. J. Comput. Vis., 2020

UFO²: A Unified Framework towards Omni-supervised Object Detection.

[DOI]

CoRR, 2020

Style Example-Guided Text Generation using Generative Adversarial Transformers.

[DOI]

Kuo-Hao Zeng

Mohammad Shoeybi

Ming-Yu Liu

CoRR, 2020

SymGAN: Orientation Estimation without Annotation for Symmetric Objects.

[DOI]

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Learning compositional functions via multiplicative weight updates.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

On the distance between two neural networks and the stability of learning.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder.

[DOI]

Kuniaki Saito

Kate Saenko

Ming-Yu Liu

Proceedings of the Computer Vision - ECCV 2020, 2020

UFO²: A Unified Framework Towards Omni-supervised Object Detection.

[DOI]

Proceedings of the Computer Vision - ECCV 2020, 2020

World-Consistent Video-to-Video Synthesis.

[DOI]

Proceedings of the Computer Vision - ECCV 2020, 2020

UNAS: Differentiable Architecture Search Meets Reinforcement Learning.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning to Generate Multiple Style Transfer Outputs for an Input Sentence.

[DOI]

Proceedings of the Fourth Workshop on Neural Generation and Translation, 2020

2019

Boosting segmentation with weak supervision from image-to-image translation.

[DOI]

CoRR, 2019

Few-shot Video-to-Video Synthesis.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Dancing to Music.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

PointFlow: 3D Point Cloud Generation With Continuous Normalizing Flows.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Meta-Sim: Learning to Generate Synthetic Datasets.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Neural Turtle Graphics for Modeling City Road Layouts.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Few-Shot Unsupervised Image-to-Image Translation.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

STEP: Spatio-Temporal Progressive Learning for Video Action Detection.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Semantic Image Synthesis With Spatially-Adaptive Normalization.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Unsupervised Stylish Image Description Generation via Domain Layer Norm.

[DOI]

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Video-to-Video Synthesis.

[DOI]

CoRR, 2018

Domain Stylization: A Strong, Simple Baseline for Synthetic to Real Image Domain Adaptation.

[DOI]

CoRR, 2018

Video-to-Video Synthesis.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Context-aware Synthesis and Placement of Object Instances.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Reblur2Deblur: Deblurring videos via self-supervised learning.

[DOI]

Proceedings of the 2018 IEEE International Conference on Computational Photography, 2018

A Closed-Form Solution to Photorealistic Image Stylization.

[DOI]

Proceedings of the Computer Vision - ECCV 2018, 2018

Superpixel Sampling Networks.

[DOI]

Proceedings of the Computer Vision - ECCV 2018, 2018

Multimodal Unsupervised Image-to-Image Translation.

[DOI]

Proceedings of the Computer Vision - ECCV 2018, 2018

High-Resolution Image Synthesis and Semantic Manipulation With Conditional GANs.

[DOI]

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

MoCoGAN: Decomposing Motion and Content for Video Generation.

[DOI]

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Learning Superpixels With Segmentation-Aware Affinity Loss.

[DOI]

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume.

[DOI]

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

The 2018 NVIDIA AI City Challenge.

[DOI]

Pranamesh Chakraborty

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

Localization-Aware Active Learning for Object Detection.

[DOI]

Proceedings of the Computer Vision - ACCV 2018, 2018

Learning Binary Residual Representations for Domain-Specific Video Streaming.

[DOI]

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Detecting Adversarial Attacks on Neural Network Policies with Visual Foresight.

[DOI]

CoRR, 2017

Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Video.

[DOI]

CoRR, 2017

Attentional Network for Visual Object Detection.

[DOI]

Kota Hara

Ming-Yu Liu

Oncel Tuzel

Amir-massoud Farahmand

CoRR, 2017

Unsupervised Image-to-Image Translation Networks.

[DOI]

Ming-Yu Liu

Thomas M. Breuel

Jan Kautz

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Tactics of Adversarial Attack on Deep Reinforcement Learning Agents.

[DOI]

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

CASENet: Deep Category-Aware Semantic Edge Detection.

[DOI]

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Videos.

[DOI]

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016

Automatic Learning to Remove Multipath Distortions in Time-of-Flight Range Images for a Robotic Arm Setup.

[DOI]

Kilho Son

Ming-Yu Liu

Yuichi Taguchi

CoRR, 2016

Unsupervised network pretraining via encoding human design.

[DOI]

Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, 2016

Coupled Generative Adversarial Networks.

[DOI]

Ming-Yu Liu

Oncel Tuzel

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Learning to remove multipath distortions in Time-of-Flight range images for a robotic arm setup.

[DOI]

Kilho Son

Ming-Yu Liu

Yuichi Taguchi

Proceedings of the 2016 IEEE International Conference on Robotics and Automation, 2016

Gaussian Conditional Random Field Network for Semantic Segmentation.

[DOI]

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Deep Gaussian Conditional Random Field Network: A Model-Based Deep Network for Discriminative Denoising.

[DOI]

Raviteja Vemulapalli

Oncel Tuzel

Ming-Yu Liu

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

R-CNN for Small Object Detection.

[DOI]

Proceedings of the Computer Vision - ACCV 2016, 2016

2015

Unsupervised Deep Network Pretraining via Human Design.

[DOI]

CoRR, 2015

Layered Interpretation of Street View Images.

[DOI]

Proceedings of the Robotics: Science and Systems XI, Sapienza University of Rome, 2015

2014

Entropy-Rate Clustering: Cluster Analysis via Maximizing a Submodular Function Subject to a Matroid Constraint.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., 2014

Recursive Context Propagation Network for Semantic Scene Labeling.

[DOI]

Abhishek Sharma

Oncel Tuzel

Ming-Yu Liu

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Learning to Rank 3D Features.

[DOI]

Proceedings of the Computer Vision - ECCV 2014, 2014

2013

Joint Geodesic Upsampling of Depth Images.

[DOI]

Ming-Yu Liu

Oncel Tuzel

Yuichi Taguchi

Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Model-Based Vehicle Pose Estimation and Tracking in Videos Using Random Forests.

[DOI]

Proceedings of the 2013 International Conference on 3D Vision, 2013

2012

Discrete Optimization Methods for Segmentation and Matching.

[DOI]

Ming-Yu Liu

PhD thesis, 2012

Fast object localization and pose estimation in heavy clutter for robotic bin picking.

[DOI]

Int. J. Robotics Res., 2012

Voting-based pose estimation for robotic assembly using a 3D sensor.

[DOI]

Proceedings of the IEEE International Conference on Robotics and Automation, 2012

A Grassmann manifold-based domain adaptation approach.

[DOI]

Proceedings of the 21st International Conference on Pattern Recognition, 2012

Classification and Pose Estimation of Vehicles in Videos by 3D Modeling within Discrete-Continuous Optimization.

[DOI]

Proceedings of the 2012 Second International Conference on 3D Imaging, 2012

2011

Entropy rate superpixel segmentation.

[DOI]

Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010

Pose estimation in heavy clutter using a multi-flash camera.

[DOI]

Proceedings of the IEEE International Conference on Robotics and Automation, 2010

Fast directional chamfer matching.

[DOI]

Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010