2025
Cosmos World Foundation Model Platform for Physical AI.
CoRR, January, 2025
2024
3D Gaussian Ray Tracing: Fast Tracing of Particle Scenes.
ACM Trans. Graph., December, 2024
fVDB: A Deep-Learning Framework for Sparse, Large Scale, and High Performance Spatial Intelligence.
ACM Trans. Graph., July, 2024
InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models.
CoRR, 2024
Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos.
CoRR, 2024
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models.
CoRR, 2024
ReMatching Dynamic Reconstruction Flow.
CoRR, 2024
SCube: Instant Large-Scale Scene Reconstruction using VoxSplats.
CoRR, 2024
OmniRe: Omni Urban Scene Reconstruction.
CoRR, 2024
Wolf: Captioning Everything with a World Summarization Framework.
CoRR, 2024
L4GM: Large 4D Gaussian Reconstruction Model.
CoRR, 2024
RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting.
CoRR, 2024
Can Feedback Enhance Semantic Grounding in Large Vision-Language Models?
CoRR, 2024
Augmented Reality based Simulated Data (ARSim) with multi-view consistency for AV perception networks.
CoRR, 2024
SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes.
Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024
SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024
DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
L4GM: Large 4D Gaussian Reconstruction Model.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
SCube: Instant Large-Scale Scene Reconstruction using VoxSplats.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Trajeglish: Traffic Modeling as Next-Token Prediction.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Transferring Labels to Solve Annotation Mismatches Across Object Detection Datasets.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Reasoning Paths with Reference Objects Elicit Quantitative Spatial Reasoning in Large Vision-Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis.
Proceedings of the Computer Vision - ECCV 2024, 2024
Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering.
Proceedings of the Computer Vision - ECCV 2024, 2024
NeRF-XL: Scaling NeRFs with Multiple GPUs.
Proceedings of the Computer Vision - ECCV 2024, 2024
Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
Adaptive Shells for Efficient Neural Radiance Field Rendering.
ACM Trans. Graph., December, 2023
Learning Physically Simulated Tennis Skills from Broadcast Videos.
ACM Trans. Graph., August, 2023
Flexible Isosurface Extraction for Gradient-Based Mesh Optimization.
ACM Trans. Graph., August, 2023
Bridging the Sim2Real gap with CARE: Supervised Detection Adaptation with Conditional Alignment and Reweighting.
Trans. Mach. Learn. Res., 2023
Trajeglish: Learning the Language of Driving Scenarios.
CoRR, 2023
XCube (X³): Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies.
CoRR, 2023
Neural Fields meet Explicit Geometric Representation for Inverse Rendering of Urban Scenes.
CoRR, 2023
Bridging the Sim2Real gap with CARE: Supervised Detection Adaptation with Conditional Alignment and Reweighting.
CoRR, 2023
Compact Neural Graphics Primitives with Learned Hash Probing.
Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023
Synthesizing Physical Character-Scene Interactions.
Proceedings of the ACM SIGGRAPH 2023 Conference Proceedings, 2023
Learning Human Dynamics in Autonomous Driving Scenarios.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
ATT3D: Amortized Text-to-3D Object Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
End-to-end 3D Tracking with Decoupled Queries.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
DreamTeacher: Pretraining Image Backbones with Deep Generative Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Towards Viewpoint Robustness in Bird's Eye View Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Neural LiDAR Fields for Novel View Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Neural Fields Meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Magic3D: High-Resolution Text-to-3D Content Creation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
VoxFormer: Sparse Voxel Transformer for Camera-Based 3D Semantic Scene Completion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Neural Kernel Surface Reconstruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Align Your Latents: High-Resolution Video Synthesis with Latent Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Neural Brushstroke Engine: Learning a Latent Style Space of Interactive Drawing Tools.
ACM Trans. Graph., 2022
ASE: large-scale reusable adversarial skill embeddings for physically simulated characters.
ACM Trans. Graph., 2022
Improving Semantic Segmentation in Transformers using Hierarchical Inter-Level Attention.
CoRR, 2022
M²BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation.
CoRR, 2022
Causal Scene BERT: Improving object detection by searching for challenging groups of data.
CoRR, 2022
Federated Learning with Heterogeneous Architectures using Graph HyperNetworks.
CoRR, 2022
BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations.
CoRR, 2022
PADL: Language-Directed Physics-Based Character Control.
Proceedings of the SIGGRAPH Asia 2022 Conference Papers, 2022
Variable Bitrate Neural Fields.
Proceedings of the SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference, Vancouver, BC, Canada, August 7, 2022
Learning Smooth Neural Functions via Lipschitz Regularization.
Proceedings of the SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference, Vancouver, BC, Canada, August 7, 2022
LION: Latent Point Diffusion Models for 3D Shape Generation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Optimizing Data Collection for Machine Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Low-Budget Active Learning via Wasserstein Distance: An Integer Programming Approach.
Proceedings of the Tenth International Conference on Learning Representations, 2022
Domain Adversarial Training: A Game Perspective.
Proceedings of the Tenth International Conference on Learning Representations, 2022
Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion.
Proceedings of the Computer Vision - ECCV 2022, 2022
MvDeCor: Multi-view Dense Correspondence Learning for Fine-Grained 3D Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022
Neural Fields as Learnable Kernels for 3D Reconstruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Extracting Triangular 3D Models, Materials, and Lighting From Images.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
How Much More Data Do I Need? Estimating Requirements for Downstream Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Polymorphic-GAN: Generating Aligned Samples across Multiple Domains with Learned Morph Maps.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
AUV-Net: Learning Aligned UV Maps for Texture Transfer and Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Frame Averaging for Equivariant Shape Space Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
XDGAN: Multi-Modal 3D Shape Generation in 2D Space.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022
2021
The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines.
IEEE Trans. Pattern Anal. Mach. Intell., 2021
Hierarchical Neural Implicit Pose Network for Animation and Motion Retargeting.
CoRR, 2021
NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation.
CoRR, 2021
NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation.
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021
Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
ATISS: Autoregressive Transformers for Indoor Scene Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
EditGAN: High-Precision Semantic Image Editing.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Scalable Neural Data Server: A Data Recommender for Transfer Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection.
Proceedings of the 38th International Conference on Machine Learning, 2021
f-Domain Adversarial Learning: Theory and Algorithms.
Proceedings of the 38th International Conference on Machine Learning, 2021
Personalized Federated Learning with First Order Model Optimization.
Proceedings of the 9th International Conference on Learning Representations, 2021
Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering.
Proceedings of the 9th International Conference on Learning Representations, 2021
Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration.
Proceedings of the 9th International Conference on Learning Representations, 2021
Emergent Road Rules In Multi-Agent Driving Environments.
Proceedings of the 9th International Conference on Learning Representations, 2021
gradSim: Differentiable simulation for system identification and visuomotor control.
Proceedings of the 9th International Conference on Learning Representations, 2021
Causal BERT: Improving object detection by searching for challenging groups.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021
3DStyleNet: Creating 3D Shapes with Geometric and Texture Style Variations.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Physics-based Human Motion Estimation and Synthesis from Videos.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
DatasetGAN: Efficient Labeled Data Factory With Minimal Human Effort.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Neural Geometric Level of Detail: Real-Time Rendering With Implicit 3D Shapes.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Neural Parts: Learning Expressive 3D Shape Abstractions With Invertible Neural Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Semantic Segmentation With Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
DriveGAN: Towards a Controllable High-Quality Neural Simulation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
2020
Nonlinear color triads for approximation, learning and direct manipulation of color distributions.
ACM Trans. Graph., 2020
UniCon: Universal Neural Controller For Physics-based Character Motion.
CoRR, 2020
Learning Deformable Tetrahedral Meshes for 3D Reconstruction.
CoRR, 2020
Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration.
CoRR, 2020
The efficacy of Neural Planning Metrics: A meta-analysis of PKL on nuScenes.
CoRR, 2020
Fed-Sim: Federated Simulation for Medical Imaging.
CoRR, 2020
ScribbleBox: Interactive Annotation Framework for Video Object Segmentation.
CoRR, 2020
Learning to Generate Diverse Dance Motions with Transformer.
CoRR, 2020
Variational Amodal Object Completion.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Learning Deformable Tetrahedral Meshes for 3D Reconstruction.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Federated Simulation for Medical Imaging.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020
Efficient and Information-Preserving Future Frame Prediction and Beyond.
Proceedings of the 8th International Conference on Learning Representations, 2020
A Theoretical Analysis of the Number of Shots in Few-Shot Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020
Interactive Annotation of 3D Object Geometry Using 2D Scribbles.
Proceedings of the Computer Vision - ECCV 2020, 2020
Implementing Planning KL-Divergence.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020
Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D.
Proceedings of the Computer Vision - ECCV 2020, 2020
Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid.
Proceedings of the Computer Vision - ECCV 2020, 2020
Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation.
Proceedings of the Computer Vision - ECCV 2020, 2020
Expressive Telepresence via Modular Codec Avatars.
Proceedings of the Computer Vision - ECCV 2020, 2020
ScribbleBox: Interactive Annotation Framework for Video Object Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020
Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Learning to Evaluate Perception Models Using Planner-Centric Metrics.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Learning to Simulate Dynamic Environments With GameGAN.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Auto-Tuning Structured Light by Optical Stochastic Gradient Descent.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
Semantic Understanding of Scenes Through the ADE20K Dataset.
Int. J. Comput. Vis., 2019
The Shmoop Corpus: A Dataset of Stories with Loosely Aligned Summaries.
CoRR, 2019
Kaolin: A PyTorch Library for Accelerating 3D Deep Learning Research.
CoRR, 2019
CrevNet: Conditionally Reversible Video Prediction.
CoRR, 2019
Mimicking the In-Camera Color Pipeline for Camera-Aware Object Compositing.
CoRR, 2019
ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning.
CoRR, 2019
Identifying Clinical Terms in Free-Text Notes Using Ontology-Guided Machine Learning.
Proceedings of the Research in Computational Molecular Biology, 2019
Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis.
Proceedings of the 36th International Conference on Machine Learning, 2019
Neural Graph Evolution: Towards Efficient Automatic Robot Design.
Proceedings of the 7th International Conference on Learning Representations, 2019
Visual Reasoning by Progressive Module Networks.
Proceedings of the 7th International Conference on Learning Representations, 2019
DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Video Face Clustering With Unknown Number of Clusters.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Gated-SCNN: Gated Shape CNNs for Semantic Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Learning to Caption Images Through a Lifetime by Asking Questions.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Meta-Sim: Learning to Generate Synthetic Datasets.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Neural Turtle Graphics for Modeling City Road Layouts.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Object Instance Annotation With Deep Extreme Level Set Evolution.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Action Recognition From Single Timestamp Supervision in Untrimmed Videos.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Fast Interactive Object Annotation With Curve-GCN.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Synthesizing Environment-Aware Activities via Activity Sketches.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
DARNet: Deep Active Ray Network for Building Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Devil Is in the Edges: Learning Semantic Boundaries From Noisy Annotations.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Color Builder: A Direct Manipulation Interface for Versatile Color Theme Authoring.
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019
2018
3D Object Proposals Using Stereo Imagery for Accurate Object Class Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2018
Lifelong Learning for Image Captioning by Asking Natural Language Questions.
CoRR, 2018
Color Sails: Discrete-Continuous Palettes for Deep Color Exploration.
CoRR, 2018
Progressive Reasoning by Module Composition.
CoRR, 2018
Scaling Egocentric Vision: The EPIC-KITCHENS Dataset.
CoRR, 2018
A Neural Compositional Paradigm for Image Captioning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
Pose Estimation for Objects with Rotational Symmetry.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018
NerveNet: Learning Structured Policy with Graph Neural Networks.
Proceedings of the 6th International Conference on Learning Representations, 2018
Scaling Egocentric Vision: The EPIC-KITCHENS Dataset.
Proceedings of the Computer Vision - ECCV 2018, 2018
Now You Shake Me: Towards Automatic 4D Cinema.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
MovieGraphs: Towards Understanding Human-Centric Situations From Videos.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
VirtualHome: Simulating Household Activities via Programs.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
Learning to Act Properly: Predicting and Explaining Affordances From Images.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
SurfConv: Bridging 3D and 2D Convolution for RGBD Images.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
A Face-to-Face Neural Conversation Model.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
Efficient Interactive Annotation of Segmentation Datasets With Polygon-RNN++.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
VSE++: Improving Visual-Semantic Embeddings with Hard Negatives.
Proceedings of the British Machine Vision Conference 2018, 2018
2017
Teaching Machines to Describe Images via Natural Language Feedback.
CoRR, 2017
VSE++: Improved Visual-Semantic Embeddings.
CoRR, 2017
Teaching Machines to Describe Images with Natural Language Feedback.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017
Find your way by observing the sun and other semantic cues.
Proceedings of the 2017 IEEE International Conference on Robotics and Automation, 2017
Song From PI: A Musically Plausible Network for Pop Music Generation.
Proceedings of the 5th International Conference on Learning Representations, 2017
Be Your Own Prada: Fashion Synthesis with Structural Coherence.
Proceedings of the IEEE International Conference on Computer Vision, 2017
Open Vocabulary Scene Parsing.
Proceedings of the IEEE International Conference on Computer Vision, 2017
TorontoCity: Seeing the World with a Million Eyes.
Proceedings of the IEEE International Conference on Computer Vision, 2017
3D Graph Neural Networks for RGBD Semantic Segmentation.
Proceedings of the IEEE International Conference on Computer Vision, 2017
SGN: Sequential Grouping Networks for Instance Segmentation.
Proceedings of the IEEE International Conference on Computer Vision, 2017
Situation Recognition with Graph Neural Networks.
Proceedings of the IEEE International Conference on Computer Vision, 2017
Towards Diverse and Natural Image Descriptions via a Conditional GAN.
Proceedings of the IEEE International Conference on Computer Vision, 2017
Scene Parsing through ADE20K Dataset.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017
Sports Field Localization via Deep Structured Models.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017
Annotating Object Instances with a Polygon-RNN.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017
2016
Human-Machine CRFs for Identifying Bottlenecks in Scene Understanding.
IEEE Trans. Pattern Anal. Mach. Intell., 2016
Semantic Understanding of Scenes through the ADE20K Dataset.
CoRR, 2016
Efficient Summarization with Read-Again and Copy Mechanism.
CoRR, 2016
Order-Embeddings of Images and Language.
Proceedings of the 4th International Conference on Learning Representations, 2016
Soccer Field Localization from a Single Image.
CoRR, 2016
Proximal Deep Structured Models.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016
HouseCraft: Building Houses from Rental Ads and Street Views.
Proceedings of the Computer Vision - ECCV 2016, 2016
Instance-Level Segmentation for Autonomous Driving with Deep Densely Connected MRFs.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016
MovieQA: Understanding Stories in Movies through Question-Answering.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016
HD Maps: Fine-Grained Road Segmentation by Parsing Ground and Aerial Images.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016
Monocular 3D Object Detection for Autonomous Driving.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016
2015
A Framework for Symmetric Part Detection in Cluttered Scenes.
Symmetry, 2015
Instance-Level Segmentation with Deep Densely Connected MRFs.
CoRR, 2015
Generating Multi-Sentence Lingual Descriptions of Indoor Scenes.
CoRR, 2015
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015
3D Object Proposals for Accurate Object Class Detection.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015
Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015
Monocular Object Instance Segmentation and Depth Ordering with CNNs.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015
Lost Shopping! Monocular Localization in Large Indoor Spaces.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015
Enhancing Road Maps by Parsing Aerial Images Around the World.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015
Learning to Combine Mid-Level Cues for Object Proposal Generation.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015
Predicting Deep Zero-Shot Convolutional Neural Networks Using Textual Descriptions.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015
segDeepM: Exploiting segmentation and context in deep neural networks for object detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015
Real-time coarse-to-fine topologically preserving segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015
Holistic 3D scene understanding from a single geo-tagged image.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015
Neuroaesthetics in fashion: Modeling the perception of fashionability.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015
Rent3D: Floor-plan priors for monocular layout estimation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015
Generating Multi-sentence Natural Language Descriptions of Indoor Scenes.
Proceedings of the British Machine Vision Conference 2015, 2015
2014
Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding.
CoRR, 2014
Learning a Hierarchical Compositional Shape Vocabulary for Multi-class Object Representation.
CoRR, 2014
The Role of Context for Object Detection and Semantic Segmentation in the Wild.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014
Visual Semantic Search: Retrieving Videos via Complex Textual Queries.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014
What Are You Talking About? Text-to-Image Coreference.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014
Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014
Beat the MTurkers: Automatic Image Labeling from Weak 3D Supervision.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014
A High Performance CRF Model for Clothes Parsing.
Proceedings of the Computer Vision - ACCV 2014, 2014
Multi-cue Mid-level Grouping.
Proceedings of the Computer Vision - ACCV 2014, 2014
2013
Box in the Box: Joint 3D Layout and Object Reasoning from Single Images.
Proceedings of the IEEE International Conference on Computer Vision, 2013
Holistic Scene Understanding for 3D Object Detection with RGBD Cameras.
Proceedings of the IEEE International Conference on Computer Vision, 2013
Detecting Curved Symmetric Parts Using a Deformable Disc Model.
Proceedings of the IEEE International Conference on Computer Vision, 2013
Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013
A Sentence Is Worth a Thousand Pixels.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013
Bottom-Up Segmentation for Top-Down Detection.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013
2012
Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, 2012
Unsupervised Disambiguation of Image Captions.
Proceedings of the First Joint Conference on Lexical and Computational Semantics, 2012
3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012
Superedge grouping for object localization by combining appearance and shape information.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012
Describing the scene as a whole: Joint object detection, scene classification and semantic segmentation.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012
Learning Categorical Shape from Captioned Images.
Proceedings of the Ninth Conference on Computer and Robot Vision, 2012
2011
A probabilistic model for recursive factorized image features.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011
2010
A Coarse-to-Fine Taxonomy of Constellations for Fast Multi-class Object Detection.
Proceedings of the Computer Vision - ECCV 2010, 2010
Proceedings of the Cognitive Systems, 2010
2009
Evaluating multi-class learning strategies in a generative hierarchical framework for object detection.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009
A bottom-up and top-down optimization framework for learning a compositional hierarchy of object classes.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009
Optimization Framework for Learning a Hierarchical Shape Vocabulary for Object Class Detection.
Proceedings of the British Machine Vision Conference, 2009
2008
Selecting features for object detection using an AdaBoost-compatible evaluation function.
Pattern Recognit. Lett., 2008
Similarity-based cross-layered hierarchical representation for object categorization.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008
2007
Learning Hierarchical Representations of Object Categories for Robot Vision.
Proceedings of the Robotics Research - The 13th International Symposium, 2007
Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007
2006
Combining Reconstructive and Discriminative Subspace Methods for Robust Classification and Regression by Subsampling.
IEEE Trans. Pattern Anal. Mach. Intell., 2006
Hierarchical Statistical Learning of Generic Parts of Object Structure.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006
2003
Robust LDA Classification by Subsampling.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2003