Sanja Fidler

Orcid: 0000-0003-1040-3260

Affiliations:
  • NVIDIA, Toronto, Canada
  • University of Toronto, Canada


According to our database, Sanja Fidler authored at least 249 papers between 2003 and 2024.

Bibliography

2024
3D Gaussian Ray Tracing: Fast Tracing of Particle Scenes.
ACM Trans. Graph., December, 2024

fVDB: A Deep-Learning Framework for Sparse, Large Scale, and High Performance Spatial Intelligence.
ACM Trans. Graph., July, 2024

ReMatching Dynamic Reconstruction Flow.
CoRR, 2024

SCube: Instant Large-Scale Scene Reconstruction using VoxSplats.
CoRR, 2024

OmniRe: Omni Urban Scene Reconstruction.
CoRR, 2024

Wolf: Captioning Everything with a World Summarization Framework.
CoRR, 2024

DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features.
CoRR, 2024

L4GM: Large 4D Gaussian Reconstruction Model.
CoRR, 2024

RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting.
CoRR, 2024

Can Feedback Enhance Semantic Grounding in Large Vision-Language Models?
CoRR, 2024

Augmented Reality based Simulated Data (ARSim) with multi-view consistency for AV perception networks.
CoRR, 2024

SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes.
Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024

SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

Align Your Steps: Optimizing Sampling Schedules in Diffusion Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Trajeglish: Traffic Modeling as Next-Token Prediction.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Transferring Labels to Solve Annotation Mismatches Across Object Detection Datasets.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Reasoning Paths with Reference Objects Elicit Quantitative Spatial Reasoning in Large Vision-Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis.
Proceedings of the Computer Vision - ECCV 2024, 2024

Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering.
Proceedings of the Computer Vision - ECCV 2024, 2024

NeRF-XL: Scaling NeRFs with Multiple GPUs.
Proceedings of the Computer Vision - ECCV 2024, 2024

Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Adaptive Shells for Efficient Neural Radiance Field Rendering.
ACM Trans. Graph., December, 2023

Learning Physically Simulated Tennis Skills from Broadcast Videos.
ACM Trans. Graph., August, 2023

Flexible Isosurface Extraction for Gradient-Based Mesh Optimization.
ACM Trans. Graph., August, 2023

Bridging the Sim2Real gap with CARE: Supervised Detection Adaptation with Conditional Alignment and Reweighting.
Trans. Mach. Learn. Res., 2023

Trajeglish: Learning the Language of Driving Scenarios.
CoRR, 2023

XCube (X³): Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies.
CoRR, 2023

Neural Fields meet Explicit Geometric Representation for Inverse Rendering of Urban Scenes.
CoRR, 2023

Bridging the Sim2Real gap with CARE: Supervised Detection Adaptation with Conditional Alignment and Reweighting.
CoRR, 2023

Compact Neural Graphics Primitives with Learned Hash Probing.
Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023

Synthesizing Physical Character-Scene Interactions.
Proceedings of the ACM SIGGRAPH 2023 Conference Proceedings, 2023

Learning Human Dynamics in Autonomous Driving Scenarios.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ATT3D: Amortized Text-to-3D Object Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

End-to-end 3D Tracking with Decoupled Queries.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DreamTeacher: Pretraining Image Backbones with Deep Generative Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Towards Viewpoint Robustness in Bird's Eye View Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Neural LiDAR Fields for Novel View Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Neural Fields Meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Magic3D: High-Resolution Text-to-3D Content Creation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

VoxFormer: Sparse Voxel Transformer for Camera-Based 3D Semantic Scene Completion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Neural Kernel Surface Reconstruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Align Your Latents: High-Resolution Video Synthesis with Latent Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Neural Brushstroke Engine: Learning a Latent Style Space of Interactive Drawing Tools.
ACM Trans. Graph., 2022

ASE: large-scale reusable adversarial skill embeddings for physically simulated characters.
ACM Trans. Graph., 2022

Improving Semantic Segmentation in Transformers using Hierarchical Inter-Level Attention.
CoRR, 2022

M²BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation.
CoRR, 2022

Causal Scene BERT: Improving object detection by searching for challenging groups of data.
CoRR, 2022

Federated Learning with Heterogeneous Architectures using Graph HyperNetworks.
CoRR, 2022

BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations.
CoRR, 2022

PADL: Language-Directed Physics-Based Character Control.
Proceedings of the SIGGRAPH Asia 2022 Conference Papers, 2022

Variable Bitrate Neural Fields.
Proceedings of the SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference, Vancouver, BC, Canada, August 7, 2022

Learning Smooth Neural Functions via Lipschitz Regularization.
Proceedings of the SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference, Vancouver, BC, Canada, August 7, 2022

LION: Latent Point Diffusion Models for 3D Shape Generation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Optimizing Data Collection for Machine Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Low-Budget Active Learning via Wasserstein Distance: An Integer Programming Approach.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Domain Adversarial Training: A Game Perspective.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion.
Proceedings of the Computer Vision - ECCV 2022, 2022

MvDeCor: Multi-view Dense Correspondence Learning for Fine-Grained 3D Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

Neural Fields as Learnable Kernels for 3D Reconstruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Extracting Triangular 3D Models, Materials, and Lighting From Images.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

How Much More Data Do I Need? Estimating Requirements for Downstream Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Polymorphic-GAN: Generating Aligned Samples across Multiple Domains with Learned Morph Maps.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

AUV-Net: Learning Aligned UV Maps for Texture Transfer and Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Frame Averaging for Equivariant Shape Space Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

XDGAN: Multi-Modal 3D Shape Generation in 2D Space.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021
The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Hierarchical Neural Implicit Pose Network for Animation and Motion Retargeting.
CoRR, 2021

NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation.
CoRR, 2021

NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation.
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

ATISS: Autoregressive Transformers for Indoor Scene Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

EditGAN: High-Precision Semantic Image Editing.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Scalable Neural Data Server: A Data Recommender for Transfer Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection.
Proceedings of the 38th International Conference on Machine Learning, 2021

f-Domain Adversarial Learning: Theory and Algorithms.
Proceedings of the 38th International Conference on Machine Learning, 2021

Personalized Federated Learning with First Order Model Optimization.
Proceedings of the 9th International Conference on Learning Representations, 2021

Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering.
Proceedings of the 9th International Conference on Learning Representations, 2021

Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration.
Proceedings of the 9th International Conference on Learning Representations, 2021

Emergent Road Rules In Multi-Agent Driving Environments.
Proceedings of the 9th International Conference on Learning Representations, 2021

gradSim: Differentiable simulation for system identification and visuomotor control.
Proceedings of the 9th International Conference on Learning Representations, 2021

Causal BERT: Improving object detection by searching for challenging groups.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

3DStyleNet: Creating 3D Shapes with Geometric and Texture Style Variations.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Physics-based Human Motion Estimation and Synthesis from Videos.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

DatasetGAN: Efficient Labeled Data Factory With Minimal Human Effort.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Neural Geometric Level of Detail: Real-Time Rendering With Implicit 3D Shapes.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Neural Parts: Learning Expressive 3D Shape Abstractions With Invertible Neural Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Semantic Segmentation With Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

DriveGAN: Towards a Controllable High-Quality Neural Simulation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Nonlinear color triads for approximation, learning and direct manipulation of color distributions.
ACM Trans. Graph., 2020

UniCon: Universal Neural Controller For Physics-based Character Motion.
CoRR, 2020

Learning Deformable Tetrahedral Meshes for 3D Reconstruction.
CoRR, 2020

Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration.
CoRR, 2020

The efficacy of Neural Planning Metrics: A meta-analysis of PKL on nuScenes.
CoRR, 2020

Fed-Sim: Federated Simulation for Medical Imaging.
CoRR, 2020

ScribbleBox: Interactive Annotation Framework for Video Object Segmentation.
CoRR, 2020

Learning to Generate Diverse Dance Motions with Transformer.
CoRR, 2020

Variational Amodal Object Completion.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Learning Deformable Tetrahedral Meshes for 3D Reconstruction.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Federated Simulation for Medical Imaging.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

Efficient and Information-Preserving Future Frame Prediction and Beyond.
Proceedings of the 8th International Conference on Learning Representations, 2020

A Theoretical Analysis of the Number of Shots in Few-Shot Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

Interactive Annotation of 3D Object Geometry Using 2D Scribbles.
Proceedings of the Computer Vision - ECCV 2020, 2020

Implementing Planning KL-Divergence.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D.
Proceedings of the Computer Vision - ECCV 2020, 2020

Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid.
Proceedings of the Computer Vision - ECCV 2020, 2020

Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Expressive Telepresence via Modular Codec Avatars.
Proceedings of the Computer Vision - ECCV 2020, 2020

ScribbleBox: Interactive Annotation Framework for Video Object Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning to Evaluate Perception Models Using Planner-Centric Metrics.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning to Simulate Dynamic Environments With GameGAN.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Auto-Tuning Structured Light by Optical Stochastic Gradient Descent.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Semantic Understanding of Scenes Through the ADE20K Dataset.
Int. J. Comput. Vis., 2019

The Shmoop Corpus: A Dataset of Stories with Loosely Aligned Summaries.
CoRR, 2019

Kaolin: A PyTorch Library for Accelerating 3D Deep Learning Research.
CoRR, 2019

CrevNet: Conditionally Reversible Video Prediction.
CoRR, 2019

Mimicking the In-Camera Color Pipeline for Camera-Aware Object Compositing.
CoRR, 2019

ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning.
CoRR, 2019

Identifying Clinical Terms in Free-Text Notes Using Ontology-Guided Machine Learning.
Proceedings of the Research in Computational Molecular Biology, 2019

Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis.
Proceedings of the 36th International Conference on Machine Learning, 2019

Neural Graph Evolution: Towards Efficient Automatic Robot Design.
Proceedings of the 7th International Conference on Learning Representations, 2019

Visual Reasoning by Progressive Module Networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Video Face Clustering With Unknown Number of Clusters.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Gated-SCNN: Gated Shape CNNs for Semantic Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Learning to Caption Images Through a Lifetime by Asking Questions.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Meta-Sim: Learning to Generate Synthetic Datasets.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Neural Turtle Graphics for Modeling City Road Layouts.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Object Instance Annotation With Deep Extreme Level Set Evolution.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Creative Flow+ Dataset.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Action Recognition From Single Timestamp Supervision in Untrimmed Videos.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Fast Interactive Object Annotation With Curve-GCN.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Synthesizing Environment-Aware Activities via Activity Sketches.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

DARNet: Deep Active Ray Network for Building Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Devil Is in the Edges: Learning Semantic Boundaries From Noisy Annotations.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Color Builder: A Direct Manipulation Interface for Versatile Color Theme Authoring.
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019

2018
3D Object Proposals Using Stereo Imagery for Accurate Object Class Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Lifelong Learning for Image Captioning by Asking Natural Language Questions.
CoRR, 2018

Color Sails: Discrete-Continuous Palettes for Deep Color Exploration.
CoRR, 2018

Progressive Reasoning by Module Composition.
CoRR, 2018

Scaling Egocentric Vision: The EPIC-KITCHENS Dataset.
CoRR, 2018

A Neural Compositional Paradigm for Image Captioning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Pose Estimation for Objects with Rotational Symmetry.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

NerveNet: Learning Structured Policy with Graph Neural Networks.
Proceedings of the 6th International Conference on Learning Representations, 2018

Scaling Egocentric Vision: The EPIC-KITCHENS Dataset.
Proceedings of the Computer Vision - ECCV 2018, 2018

Now You Shake Me: Towards Automatic 4D Cinema.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

MovieGraphs: Towards Understanding Human-Centric Situations From Videos.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

VirtualHome: Simulating Household Activities via Programs.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Learning to Act Properly: Predicting and Explaining Affordances From Images.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

SurfConv: Bridging 3D and 2D Convolution for RGBD Images.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

A Face-to-Face Neural Conversation Model.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Efficient Interactive Annotation of Segmentation Datasets With Polygon-RNN++.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

VSE++: Improving Visual-Semantic Embeddings with Hard Negatives.
Proceedings of the British Machine Vision Conference 2018, 2018

2017
Teaching Machines to Describe Images via Natural Language Feedback.
CoRR, 2017

VSE++: Improved Visual-Semantic Embeddings.
CoRR, 2017

Teaching Machines to Describe Images with Natural Language Feedback.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Find your way by observing the sun and other semantic cues.
Proceedings of the 2017 IEEE International Conference on Robotics and Automation, 2017

Song From PI: A Musically Plausible Network for Pop Music Generation.
Proceedings of the 5th International Conference on Learning Representations, 2017

Be Your Own Prada: Fashion Synthesis with Structural Coherence.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Open Vocabulary Scene Parsing.
Proceedings of the IEEE International Conference on Computer Vision, 2017

TorontoCity: Seeing the World with a Million Eyes.
Proceedings of the IEEE International Conference on Computer Vision, 2017

3D Graph Neural Networks for RGBD Semantic Segmentation.
Proceedings of the IEEE International Conference on Computer Vision, 2017

SGN: Sequential Grouping Networks for Instance Segmentation.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Situation Recognition with Graph Neural Networks.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Towards Diverse and Natural Image Descriptions via a Conditional GAN.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Scene Parsing through ADE20K Dataset.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Sports Field Localization via Deep Structured Models.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Annotating Object Instances with a Polygon-RNN.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Human-Machine CRFs for Identifying Bottlenecks in Scene Understanding.
IEEE Trans. Pattern Anal. Mach. Intell., 2016

Semantic Understanding of Scenes through the ADE20K Dataset.
CoRR, 2016

Efficient Summarization with Read-Again and Copy Mechanism.
CoRR, 2016

Order-Embeddings of Images and Language.
Proceedings of the 4th International Conference on Learning Representations, 2016

Soccer Field Localization from a Single Image.
CoRR, 2016

Proximal Deep Structured Models.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

HouseCraft: Building Houses from Rental Ads and Street Views.
Proceedings of the Computer Vision - ECCV 2016, 2016

Instance-Level Segmentation for Autonomous Driving with Deep Densely Connected MRFs.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

MovieQA: Understanding Stories in Movies through Question-Answering.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

HD Maps: Fine-Grained Road Segmentation by Parsing Ground and Aerial Images.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Monocular 3D Object Detection for Autonomous Driving.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
A Framework for Symmetric Part Detection in Cluttered Scenes.
Symmetry, 2015

Instance-Level Segmentation with Deep Densely Connected MRFs.
CoRR, 2015

Generating Multi-Sentence Lingual Descriptions of Indoor Scenes.
CoRR, 2015

Skip-Thought Vectors.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

3D Object Proposals for Accurate Object Class Detection.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Monocular Object Instance Segmentation and Depth Ordering with CNNs.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Lost Shopping! Monocular Localization in Large Indoor Spaces.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Enhancing Road Maps by Parsing Aerial Images Around the World.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Learning to Combine Mid-Level Cues for Object Proposal Generation.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Predicting Deep Zero-Shot Convolutional Neural Networks Using Textual Descriptions.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

segDeepM: Exploiting segmentation and context in deep neural networks for object detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Real-time coarse-to-fine topologically preserving segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Holistic 3D scene understanding from a single geo-tagged image.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Neuroaesthetics in fashion: Modeling the perception of fashionability.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Rent3D: Floor-plan priors for monocular layout estimation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Generating Multi-sentence Natural Language Descriptions of Indoor Scenes.
Proceedings of the British Machine Vision Conference 2015, 2015

2014
Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding.
CoRR, 2014

Learning a Hierarchical Compositional Shape Vocabulary for Multi-class Object Representation.
CoRR, 2014

The Role of Context for Object Detection and Semantic Segmentation in the Wild.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Visual Semantic Search: Retrieving Videos via Complex Textual Queries.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

What Are You Talking About? Text-to-Image Coreference.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Beat the MTurkers: Automatic Image Labeling from Weak 3D Supervision.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

A High Performance CRF Model for Clothes Parsing.
Proceedings of the Computer Vision - ACCV 2014, 2014

Multi-cue Mid-level Grouping.
Proceedings of the Computer Vision - ACCV 2014, 2014

2013
Box in the Box: Joint 3D Layout and Object Reasoning from Single Images.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Holistic Scene Understanding for 3D Object Detection with RGBD Cameras.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Detecting Curved Symmetric Parts Using a Deformable Disc Model.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

A Sentence Is Worth a Thousand Pixels.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Bottom-Up Segmentation for Top-Down Detection.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
Unsupervised Disambiguation of Image Captions.
Proceedings of the First Joint Conference on Lexical and Computational Semantics, 2012

3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Superedge grouping for object localization by combining appearance and shape information.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Describing the scene as a whole: Joint object detection, scene classification and semantic segmentation.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Learning Categorical Shape from Captioned Images.
Proceedings of the Ninth Conference on Computer and Robot Vision, 2012

2011
A probabilistic model for recursive factorized image features.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010
A Coarse-to-Fine Taxonomy of Constellations for Fast Multi-class Object Detection.
Proceedings of the Computer Vision - ECCV 2010, 2010

Categorical Perception.
Proceedings of the Cognitive Systems, 2010

2009
Evaluating multi-class learning strategies in a generative hierarchical framework for object detection.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

A bottom-up and top-down optimization framework for learning a compositional hierarchy of object classes.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009

Optimization Framework for Learning a Hierarchical Shape Vocabulary for Object Class Detection.
Proceedings of the British Machine Vision Conference, 2009

2008
Selecting features for object detection using an AdaBoost-compatible evaluation function.
Pattern Recognit. Lett., 2008

Similarity-based cross-layered hierarchical representation for object categorization.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

2007
Learning Hierarchical Representations of Object Categories for Robot Vision.
Proceedings of the Robotics Research - The 13th International Symposium, 2007

Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

2006
Combining Reconstructive and Discriminative Subspace Methods for Robust Classification and Regression by Subsampling.
IEEE Trans. Pattern Anal. Mach. Intell., 2006

Hierarchical Statistical Learning of Generic Parts of Object Structure.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006

2003
Robust LDA Classification by Subsampling.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2003

