Silvio Savarese

Affiliations:
  • Stanford University, Department of Computer Science, Stanford, CA, USA
  • University of Michigan, Department of Electrical and Computer Engineering, Ann Arbor, MI, USA (2008 - 2013)
  • University of Illinois, Urbana-Champaign, IL, USA (2005 - 2008)
  • California Institute of Technology, Pasadena, CA, USA (PhD 2005)


According to our database, Silvio Savarese authored at least 328 papers between 2001 and 2024.

Bibliography

2024
Action-conditional implicit visual dynamics for deformable object manipulation.
Int. J. Robotics Res., 2024

Sample-efficient safety assurances using conformal prediction.
Int. J. Robotics Res., 2024

Asynchronous Tool Usage for Real-Time Agents.
CoRR, 2024

PRACT: Optimizing Principled Reasoning and Acting of LLM Agent.
CoRR, 2024

xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs.
CoRR, 2024

Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts.
CoRR, 2024

GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation.
CoRR, 2024

SFR-RAG: Towards Contextually Faithful LLMs.
CoRR, 2024

xLAM: A Family of Large Action Models to Empower AI Agent Systems.
CoRR, 2024

xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations.
CoRR, 2024

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models.
CoRR, 2024

Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents.
CoRR, 2024

Enabling High Data Throughput Reinforcement Learning on GPUs: A Domain Agnostic Framework for Data-Driven Scientific Research.
CoRR, 2024

Shared Imagination: LLMs Hallucinate Alike.
CoRR, 2024

INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness.
CoRR, 2024

APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets.
CoRR, 2024

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens.
CoRR, 2024

MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases.
CoRR, 2024

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments.
CoRR, 2024

BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation.
CoRR, 2024

AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System.
CoRR, 2024

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning.
CoRR, 2024

Text2Data: Low-Resource Data Generation with Textual Control.
CoRR, 2024

Editing Arbitrary Propositions in LLMs without Subject Labels.
CoRR, 2024

Online Distribution Shift Detection via Recency Prediction.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Unified Training of Universal Time Series Forecasting Transformers.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

How Do Transformers Learn In-Context Beyond Simple Functions? A Case Study on Learning with Representations.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-Modal Reasoning.
Proceedings of the Computer Vision - ECCV 2024, 2024

DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

ULIP-2: Towards Scalable Multimodal Pre-Training for 3D Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

HIVE: Harnessing Human Feedback for Instructional Visual Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Causal Layering via Conditional Entropy.
Proceedings of the Causal Learning and Reasoning, 2024

2023
How Trustworthy are Performance Evaluations for Basic Vision Tasks?
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

JRDB: A Dataset and Benchmark of Egocentric Robot Visual Perception of Humans in Built Environments.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Merlion: End-to-End Machine Learning for Time Series.
J. Mach. Learn. Res., 2023

X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning.
CoRR, 2023

Nothing Stands Still: A Spatiotemporal Benchmark on 3D Point Cloud Registration Under Large Geometric and Temporal Change.
CoRR, 2023

XGen-7B Technical Report.
CoRR, 2023

BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents.
CoRR, 2023

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization.
CoRR, 2023

REX: Rapid Exploration and eXploitation for AI Agents.
CoRR, 2023

An Extensible Multimodal Multi-task Object Dataset with Materials.
CoRR, 2023

ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding.
CoRR, 2023

CodeGen2: Lessons for Training LLMs on Programming and Natural Languages.
CoRR, 2023

AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges.
CoRR, 2023

Enhancing Performance on Seen and Unseen Dialogue Scenarios using Retrieval-Augmented End-to-End Task-Oriented System.
Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue, 2023

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Modeling Dynamic Environments with Scene Graph Memory.
Proceedings of the International Conference on Machine Learning, 2023

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models.
Proceedings of the International Conference on Machine Learning, 2023

An Extensible Multi-modal Multi-task Object Dataset with Materials.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Masked Unsupervised Self-training for Label-free Image Classification.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Long Document Summarization with Top-down and Bottom-up Inference.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

Procedure-Aware Pretraining for Instructional Video Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Best-k Search Algorithm for Neural Text Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

LAVIS: A One-stop Library for Language-Vision Intelligence.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

2022
ULIP: Learning Unified Representation of Language, Image and Point Cloud for 3D Understanding.
CoRR, 2022

Online Distribution Shift Detection via Recency Prediction.
CoRR, 2022

Retrospectives on the Embodied AI Workshop.
CoRR, 2022

LAVIS: A Library for Language-Vision Intelligence.
CoRR, 2022

Minkowski Tracker: A Sparse Spatio-Temporal R-CNN for Joint Object Detection and Tracking.
CoRR, 2022

Masked Unsupervised Self-training for Zero-shot Image Classification.
CoRR, 2022

OmniXAI: A Library for Explainable AI.
CoRR, 2022

A Conversational Paradigm for Program Synthesis.
CoRR, 2022

Local calibration: metrics and recalibration.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation.
Proceedings of the Robotics: Science and Systems XVIII, New York City, NY, USA, June 27, 2022

CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Generating Procedural 3D materials from Images using Neural Networks.
Proceedings of the IVSP 2022: 4th International Conference on Image, Video and Signal Processing, Singapore, March 18, 2022

Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

JRDB-Act: A Large-scale Dataset for Spatio-temporal Action, Social Group and Activity Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022


2021
Biological data annotation via a human-augmenting AI-based labeling system.
npj Digit. Medicine, 2021

Merlion: A Machine Learning Library for Time Series.
CoRR, 2021

JRDB-Act: A Large-scale Multi-modal Dataset for Spatio-temporal Action, Social Group and Activity Detection.
CoRR, 2021

Neural Architecture Search From Fréchet Task Distance.
CoRR, 2021

Localized Calibration: Metrics and Recalibration.
CoRR, 2021

Embodied Intelligence via Learning and Evolution.
CoRR, 2021

Discovering Generalizable Skills via Automated Generation of Diverse Tasks.
Proceedings of the Robotics: Science and Systems XVII, Virtual Event, July 12-16, 2021

Generalization Through Hand-Eye Coordination: An Action Space for Learning Spatially-Invariant Visuomotor Control.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

iGibson 1.0: A Simulation Environment for Interactive Tasks in Large Realistic Scenes.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Probabilistic Visual Navigation with Bidirectional Image Prediction.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Deep Affordance Foresight: Planning Through What Can Be Done in the Future.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

ReLMoGen: Integrating Motion Generation in Reinforcement Learning for Mobile Manipulation.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

Learning Multi-Arm Manipulation Through Collaborative Teleoperation.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

Robot Navigation in Constrained Pedestrian Environments using Reinforcement Learning.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

Semantic and Geometric Modeling with Neural Message Passing in 3D Scene Graphs for Hierarchical Mechanical Search.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

LASER: Learning a Latent Action Space for Efficient Reinforcement Learning.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

Adaptive Procedural Task Generation for Hard-Exploration Problems.
Proceedings of the 9th International Conference on Learning Representations, 2021

TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Topological Planning With Transformers for Vision-and-Language Navigation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Error-Aware Imitation Learning from Teleoperation Data for Mobile Manipulation.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

Co-GAIL: Learning Diverse Strategies for Human-Robot Collaboration.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

What Matters in Learning from Offline Human Demonstrations for Robot Manipulation.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

2020
Linear Artificial Forces for Human Dynamics in Complex Contexts.
Proceedings of the Neural Approaches to Dynamics of Signal Exchanges, 2020

Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks.
IEEE Trans. Robotics, 2020

Interactive Gibson Benchmark: A Benchmark for Interactive Navigation in Cluttered Environments.
IEEE Robotics Autom. Lett., 2020

Improving Social Awareness Through DANTE: Deep Affinity Network for Clustering Conversational Interactants.
Proc. ACM Hum. Comput. Interact., 2020

Learning task-oriented grasping for tool manipulation from simulated self-supervision.
Int. J. Robotics Res., 2020

Human-in-the-Loop Imitation Learning using Remote Teleoperation.
CoRR, 2020

iGibson, a Simulation Environment for Interactive Tasks in Large Realistic Scenes.
CoRR, 2020

Robust Policies via Mid-Level Visual Representations: An Experimental Study in Manipulation and Navigation.
CoRR, 2020

Privacy Preserving Recalibration under Domain Shift.
CoRR, 2020

ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation.
CoRR, 2020

How Trustworthy are the Existing Performance Evaluations for Basic Vision Tasks?
CoRR, 2020

Probabilistic Visual Navigation with Bidirectional Image Prediction.
CoRR, 2020

Learning to Generalize Across Long-Horizon Tasks from Human Demonstrations.
CoRR, 2020

GTI: Learning to Generalize across Long-Horizon Tasks from Human Demonstrations.
Proceedings of the Robotics: Science and Systems XVI, 2020

JRMOT: A Real-Time 3D Multi-Object Tracker and a New Large-Scale Dataset.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Multimodal Sensor Fusion with Differentiable Filters.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Visuomotor Mechanical Search: Learning to Retrieve Target Objects in Clutter.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Localizing Against Drawn Maps via Spline-Based Registration.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

6-PACK: Category-level 6D Pose Tracker with Anchor-Based Keypoints.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

KETO: Learning Keypoint Representations for Tool Manipulation.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

IRIS: Implicit Reinforcement without Interaction at Scale for Learning Control from Offline Robot Manipulation Data.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

Which Tasks Should Be Learned Together in Multi-task Learning?
Proceedings of the 37th International Conference on Machine Learning, 2020

Goal-Aware Prediction: Learning to Model What Matters.
Proceedings of the 37th International Conference on Machine Learning, 2020

Generative Sparse Detection Networks for 3D Single-Shot Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

Robust Policies via Mid-Level Visual Representations: An Experimental Study in Manipulation and Navigation.
Proceedings of the 4th Conference on Robot Learning, 2020

2019
Deep Visual MPC-Policy Learning for Navigation.
IEEE Robotics Autom. Lett., 2019

VUNet: Dynamic Scene View Synthesis for Traversability Estimation Using an RGB Camera.
IEEE Robotics Autom. Lett., 2019

Leveraging Pretrained Image Classifiers for Language-Based Segmentation.
CoRR, 2019

Interactive Gibson: A Benchmark for Interactive Navigation in Cluttered Environments.
CoRR, 2019

JRDB: A Dataset and Benchmark for Visual Perception for Navigation in Human Environments.
CoRR, 2019

Causal Induction from Visual Observations for Goal Directed Tasks.
CoRR, 2019

SURREAL-System: Fully-Integrated Stack for Distributed Deep Reinforcement Learning.
CoRR, 2019

DANTE: Deep Affinity Network for Clustering Conversational Interactants.
CoRR, 2019

Machine Vision for Natural Gas Methane Emissions Detection Using an Infrared Camera.
CoRR, 2019

A Behavioral Approach to Visual Navigation with Graph Localization Networks.
Proceedings of the Robotics: Science and Systems XV, 2019

Regression Planning Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Social-BiGAT: Multimodal Trajectory Forecasting using Bicycle-GAN and Graph Attention Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Cracking open the DNN black-box: Video Analytics with DNNs across the Camera-Cloud Boundary.
Proceedings of the 2019 Workshop on Hot Topics in Video Analytics and Intelligent Edges, 2019

Variable Impedance Control in End-Effector Space: An Action Space for Reinforcement Learning in Contact-Rich Tasks.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Scaling Robot Supervision to Hundreds of Hours with RoboTurk: Robotic Manipulation Dataset through Human Reasoning and Dexterity.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Continuous Relaxation of Symbolic Planner for One-Shot Imitation Learning.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Taskonomy: Disentangling Task Transfer Learning.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Deep Local Trajectory Replanning and Control for Robot Navigation.
Proceedings of the International Conference on Robotics and Automation, 2019

Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks.
Proceedings of the International Conference on Robotics and Automation, 2019

Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter.
Proceedings of the International Conference on Robotics and Automation, 2019

Situational Fusion of Visual Representation for Visual Navigation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Time-Varying Interaction Estimation Using Ensemble Methods.
Proceedings of the IEEE Data Science Workshop, 2019

DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

TopNet: Structural Point Cloud Decoder.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning to Navigate Using Mid-Level Visual Priors.
Proceedings of the 3rd Annual Conference on Robot Learning, 2019

HRL4IN: Hierarchical Reinforcement Learning for Interactive Navigation with Mobile Manipulators.
Proceedings of the 3rd Annual Conference on Robot Learning, 2019

AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers.
Proceedings of the 3rd Annual Conference on Robot Learning, 2019

Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation.
Proceedings of the 3rd Annual Conference on Robot Learning, 2019

2018
Watch-n-Patch: Unsupervised Learning of Actions and Relations.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Long-term path prediction in urban scenarios using circular distributions.
Image Vis. Comput., 2018

Mid-Level Visual Representations Improve Generalization and Sample Efficiency for Learning Active Tasks.
CoRR, 2018

Coupled Recurrent Network (CRN).
CoRR, 2018

GONet++: Traversability Estimation via Dynamic Scene View Synthesis.
CoRR, 2018

SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints.
CoRR, 2018

DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Recurrent Autoregressive Networks for Online Multi-object Tracking.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Generalizing to Unseen Domains via Adversarial Data Augmentation.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

GONet: A Semi-Supervised Deep Learning Approach For Traversability Estimation.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

Neural Task Programming: Learning to Generalize Across Hierarchical Tasks.
Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018

Multi-Task Domain Adaptation for Deep Learning of Instance Grasping from Simulation.
Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018

Active Learning for Convolutional Neural Networks: A Core-Set Approach.
Proceedings of the 6th International Conference on Learning Representations, 2018

Behavioral Indoor Navigation With Natural Language Directions.
Proceedings of the Companion of the 2018 ACM/IEEE International Conference on Human-Robot Interaction, 2018

Translating Navigation Instructions in Natural Language to a High-Level Plan for Behavioral Robot Navigation.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

CAR-Net: Clairvoyant Attentive Recurrent Network.
Proceedings of the Computer Vision - ECCV 2018, 2018

Gibson Env: Real-World Perception for Embodied Agents.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Adversarial Feature Augmentation for Unsupervised Domain Adaptation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Deep Learning Under Privileged Information Using Heteroscedastic Dropout.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Social GAN: Socially Acceptable Trajectories With Generative Adversarial Networks.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Demo2Vec: Reasoning Object Affordances From Online Videos.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

ROBOTURK: A Crowdsourcing Platform for Robotic Skill Learning through Imitation.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018

SURREAL: Open-Source Reinforcement Learning Framework and Robot Manipulation Benchmark.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018

Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings.
Proceedings of the Computer Vision - ACCV 2018, 2018

2017
Recurrent Autoregressive Networks for Online Multi-Object Tracking.
CoRR, 2017

Large-Scale 3D Shape Reconstruction and Segmentation from ShapeNet Core55.
CoRR, 2017

To Go or Not To Go? A Near Unsupervised Learning Approach For Robot Navigation.
CoRR, 2017

A Geometric Approach to Active Learning for Convolutional Neural Networks.
CoRR, 2017

Weakly Supervised Generative Adversarial Networks for 3D Reconstruction.
CoRR, 2017

Joint 2D-3D-Semantic Data for Indoor Scene Understanding.
CoRR, 2017

Subcategory-Aware Convolutional Neural Networks for Object Proposals and Detection.
Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision, 2017

AdaPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems.
Proceedings of the Robotics Research, The 18th International Symposium, 2017

Adversarially Robust Policy Learning: Active construction of physically-plausible perturbations.
Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017

Unsupervised camera localization in crowded spaces.
Proceedings of the 2017 IEEE International Conference on Robotics and Automation, 2017

Lattice Long Short-Term Memory for Human Action Recognition.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Tracking the Untrackable: Learning to Track Multiple Cues with Long-Term Dependencies.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Feedback Networks.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Deep View Morphing.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Social Scene Understanding: End-to-End Multi-person Action Localization and Collective Activity Recognition.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

image2mass: Estimating the Mass of an Object from Its Image.
Proceedings of the 1st Annual Conference on Robot Learning, CoRL 2017, Mountain View, 2017

SEGCloud: Semantic Segmentation of 3D Point Clouds.
Proceedings of the 2017 International Conference on 3D Vision, 2017

Scene Semantic Reconstruction from Egocentric RGB-D-Thermal Videos.
Proceedings of the 2017 International Conference on 3D Vision, 2017

Weakly Supervised 3D Reconstruction with Adversarial Constraint.
Proceedings of the 2017 International Conference on 3D Vision, 2017

The Group and Crowd Analysis Interdisciplinary Challenge.
Proceedings of the Group and Crowd Behavior for Computer Vision, 1st Edition, 2017

Learning to Predict Human Behavior in Crowded Scenes.
Proceedings of the Group and Crowd Behavior for Computer Vision, 1st Edition, 2017

2016
Robust real-time tracking combining 3D shape, color, and motion.
Int. J. Robotics Res., 2016

Feedback Networks.
CoRR, 2016

Human Centred Object Co-Segmentation.
CoRR, 2016

Unsupervised Semantic Action Discovery from Video Collections.
CoRR, 2016

Unsupervised Transductive Domain Adaptation.
CoRR, 2016

Forecasting Social Navigation in Crowded Complex Scenes.
CoRR, 2016

A Probabilistic Framework for Real-time 3D Segmentation using Spatial, Temporal, and Semantic Cues.
Proceedings of the Robotics: Science and Systems XII, University of Michigan, Ann Arbor, Michigan, USA, June 18, 2016

Learning Transferrable Representations for Unsupervised Domain Adaptation.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Universal Correspondence Network.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Watch-Bot: Unsupervised learning for reminding humans of forgotten actions.
Proceedings of the 2016 IEEE International Conference on Robotics and Automation, 2016

Robust single-view instance recognition.
Proceedings of the 2016 IEEE International Conference on Robotics and Automation, 2016

Point-based path prediction from polar histograms.
Proceedings of the 19th International Conference on Information Fusion, 2016

Generic 3D Representation via Pose Estimation and Matching.
Proceedings of the Computer Vision - ECCV 2016, 2016

ObjectNet3D: A Large Scale Database for 3D Object Recognition.
Proceedings of the Computer Vision - ECCV 2016, 2016

Learning Social Etiquette: Human Trajectory Understanding In Crowded Scenes.
Proceedings of the Computer Vision - ECCV 2016, 2016

Pose Estimation Errors, the Ultimate Diagnosis.
Proceedings of the Computer Vision - ECCV 2016, 2016

Learning to Track at 100 FPS with Deep Regression Networks.
Proceedings of the Computer Vision - ECCV 2016, 2016

3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction.
Proceedings of the Computer Vision - ECCV 2016, 2016

Knowledge Transfer for Scene-Specific Motion Prediction.
Proceedings of the Computer Vision - ECCV 2016, 2016

Deep Metric Learning via Lifted Structured Feature Embedding.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Structural-RNN: Deep Learning on Spatio-Temporal Graphs.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

3D Semantic Parsing of Large-Scale Indoor Spaces.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Social LSTM: Human Trajectory Prediction in Crowded Spaces.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Automatic Extrinsic Calibration of Vision and Lidar by Maximizing Mutual Information.
J. Field Robotics, 2015

Automated Progress Monitoring Using Unordered Daily Construction Photographs and IFC-Based Building Information Models.
J. Comput. Civ. Eng., 2015

Indoor Scene Understanding with Geometric and Semantic Contexts.
Int. J. Comput. Vis., 2015

Deep Learning for Single-View Instance Recognition.
CoRR, 2015

ShapeNet: An Information-Rich 3D Model Repository.
CoRR, 2015

Semantic Cross-View Matching.
Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop, 2015

Learning to Track: Online Multi-object Tracking by Decision Making.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Unsupervised Semantic Parsing of Video Collections.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Action Recognition by Hierarchical Mid-Level Action Elements.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Data-driven 3D Voxel Patterns for object category recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Watch-n-patch: Unsupervised understanding of actions and relations.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

A coarse-to-fine model for 3D pose estimation and sub-category recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Enriching object detection with 2D-3D registration and continuous viewpoint estimation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
Model-Based Object Recognition.
Computer Vision, A Reference Guide, 2014

Shape from Specularities.
Computer Vision, A Reference Guide, 2014

Relating Things and Stuff via Object Property Interactions.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Understanding Collective Activities of People from Videos.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Shrinkage Optimized Directed Information using Pictorial Structures for Action Recognition.
CoRR, 2014

Beyond PASCAL: A benchmark for 3D object detection in the wild.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2014

Understanding the 3D layout of a cluttered room from multiple images.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2014

Combining 3D Shape, Color, and Motion for Robust Anytime Tracking.
Proceedings of the Robotics: Science and Systems X, 2014

Toward mutual information based place recognition.
Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014

Structured Recurrent Temporal Restricted Boltzmann Machines.
Proceedings of the 31st International Conference on Machine Learning, 2014

Monocular Multiview Object Tracking with 3D Aspect Parts.
Proceedings of the Computer Vision - ECCV 2014, 2014

A Hierarchical Representation for Future Action Prediction.
Proceedings of the Computer Vision - ECCV 2014, 2014

Discovering Groups of People in Images.
Proceedings of the Computer Vision - ECCV 2014, 2014

Learning an Image-Based Motion Context for Multiple People Tracking.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
A General Framework for Tracking Multiple People from a Moving Camera.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Hierarchical classification of images by sparse approximation.
Image Vis. Comput., 2013

Object detection, shape recovery, and 3D modelling by depth-encoded hough voting.
Comput. Vis. Image Underst., 2013

Label transfer exploiting three-dimensional structure for semantic segmentation.
Proceedings of the 6th International Conference on Computer Vision / Computer Graphics Collaboration Techniques and Applications, 2013

A Bayesian Approach to Tracking Learning Detection.
Proceedings of the Image Analysis and Processing - ICIAP 2013, 2013

Layout Estimation of Highly Cluttered Indoor Scenes Using Geometric and Semantic Cues.
Proceedings of the Image Analysis and Processing - ICIAP 2013, 2013

Object Detection by 3D Aspectlets and Occlusion Reasoning.
Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, 2013

Breaking the Chain: Liberation from the Temporal Markov Assumption for Tracking Human Poses.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Find the Best Path: An Efficient and Accurate Classifier for Image Hierarchies.
Proceedings of the IEEE International Conference on Computer Vision, 2013

3D Scene Understanding by Voxel-CRF.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Weakly Supervised Learning of Mid-Level Features with Beta-Bernoulli Process Restricted Boltzmann Machines.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Understanding Indoor Scenes Using 3D Geometric Phrases.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Dense Object Reconstruction with Semantic Priors.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

EVA: An efficient vision architecture for mobile systems.
Proceedings of the International Conference on Compilers, 2013

Free your Camera: 3D Indoor Scene Understanding from Arbitrary Camera Motion.
Proceedings of the British Machine Vision Conference, 2013

2012
Multimodal Video Indexing and Retrieval Using Directed Information.
IEEE Trans. Multim., 2012

Efficient and Exact MAP-MRF Inference using Branch and Bound.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

Object Detection using Geometrical Context Feedback.
Int. J. Comput. Vis., 2012

Toward mutual information based automatic registration of 3D point clouds.
Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012

MVSS: Michigan Visual Sonification System.
Proceedings of the 2012 IEEE International Conference on Emerging Signal Processing Applications, 2012

Relating Things and Stuff by High-Order Potential Modeling.
Proceedings of the Computer Vision - ECCV 2012. Workshops and Demonstrations, 2012

A Unified Framework for Multi-target Tracking and Collective Activity Recognition.
Proceedings of the Computer Vision - ECCV 2012, 2012

Object Co-detection.
Proceedings of the Computer Vision - ECCV 2012, 2012

Estimating the aspect layout of object categories.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

An efficient branch-and-bound algorithm for optimal human pose estimation.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Mobile object detection through client-server based vote transfer.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Semantic structure from motion with points, regions, and objects.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Automatic Targetless Extrinsic Calibration of a 3D Lidar and Camera by Maximizing Mutual Information.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011
Representations and Techniques for 3D Object Recognition and Scene Interpretation.
Synthesis Lectures on Artificial Intelligence and Machine Learning, Morgan & Claypool Publishers, ISBN: 978-3-031-01557-1, 2011

Toward coherent object detection and scene layout understanding.
Image Vis. Comput., 2011

Semantic Structure from Motion: A Novel Framework for Joint Object Recognition and 3D Reconstruction.
Proceedings of the Outdoor and Large-Scale Real-World Scene Analysis - 15th International Workshop on Theoretical Foundations of Computer Vision, Dagstuhl Castle, Germany, June 26, 2011

MEVBench: A mobile computer vision benchmarking suite.
Proceedings of the 2011 IEEE International Symposium on Workload Characterization, 2011

Visually bootstrapped generalized ICP.
Proceedings of the IEEE International Conference on Robotics and Automation, 2011

Deformable part models revisited: A performance evaluation for object category pose estimation.
Proceedings of the IEEE International Conference on Computer Vision Workshops, 2011

Monitoring changes of 3D building elements from unordered photo collections.
Proceedings of the IEEE International Conference on Computer Vision Workshops, 2011

Detecting and tracking people using an RGB-D camera via multiple detector fusion.
Proceedings of the IEEE International Conference on Computer Vision Workshops, 2011

Semantic structure from motion with object and point interactions.
Proceedings of the IEEE International Conference on Computer Vision Workshops, 2011

Articulated part-based model for joint object detection and pose estimation.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Robust object pose estimation via statistical manifold modeling.
Proceedings of the IEEE International Conference on Computer Vision, 2011

EFFEX: an embedded processor for computer vision based feature extraction.
Proceedings of the 48th Design Automation Conference, 2011

Cross-view action recognition via view knowledge transfer.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Recognizing human actions by attributes.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Learning context for collective activity recognition.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Semantic structure from motion.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Hierarchical Classification of Images by Sparse Approximation.
Proceedings of the British Machine Vision Conference, 2011

Toward Automatic 3D Generic Object Modeling from One Single Image.
Proceedings of the International Conference on 3D Imaging, 2011

2010
Multi-view Object Categorization and Pose Estimation.
Proceedings of the Computer Vision: Detection, Recognition and Reconstruction, 2010

Toward automated generation of parametric BIMs based on hybrid video and laser scanning data.
Adv. Eng. Informatics, 2010

Depth-Encoded Hough Voting for Joint Object Detection and Shape Recovery.
Proceedings of the Computer Vision - ECCV 2010, 2010

Multiple Target Tracking in World Coordinate with Single, Minimally Calibrated Camera.
Proceedings of the Computer Vision, 2010

Object Detection with Geometrical Context Feedback Loop.
Proceedings of the British Machine Vision Conference, 2010

2009
Application of D4AR - A 4-Dimensional augmented reality model for automating construction progress monitoring data collection, processing and communication.
J. Inf. Technol. Constr., 2009

Special issue on 3D representation for object and scene recognition.
Comput. Vis. Image Underst., 2009

What are they doing? : Collective activity classification using spatio-temporal relationship among people.
Proceedings of the 12th IEEE International Conference on Computer Vision Workshops, 2009

Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

Video scene categorization by 3D hierarchical histogram matching.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

A multi-view probabilistic model for 3D object classes.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

Unsupervised Object Pose Classification from Short Video Sequences.
Proceedings of the British Machine Vision Conference, 2009

2008
View Synthesis for Recognizing Unseen Poses of Object Classes.
Proceedings of the Computer Vision, 2008

2007
3D Reconstruction by Shadow Carving: Theory and Practical Evaluation.
Int. J. Comput. Vis., 2007

3D generic object categorization, localization and pose estimation.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

Detecting Specular Surfaces on Natural Images.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

2006
Discriminative Object Class Models of Appearance and Shape by Correlatons.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006

Carving from Ray-Tracing Constraints: IRT-Carving.
Proceedings of the 3rd International Symposium on 3D Data Processing, 2006

2005
Local Shape from Mirror Reflections.
Int. J. Comput. Vis., 2005

2004
Recovering Local Shape of a Mirror Surface from Reflection of a Regular Grid.
Proceedings of the Computer Vision, 2004

What do reflections tell us about the shape of a mirror?
Proceedings of the 1st Symposium on Applied Perception in Graphics and Visualization, 2004

2002
Local Analysis for 3D Reconstruction of Specular Surfaces - Part II.
Proceedings of the Computer Vision, 2002

Implementation of a Shadow Carving System for Shape Capture.
Proceedings of the 1st International Symposium on 3D Data Processing Visualization and Transmission (3DPVT 2002), 2002

Second Order Local Analysis for 3D Reconstruction of Specular Surfaces.
Proceedings of the 1st International Symposium on 3D Data Processing Visualization and Transmission (3DPVT 2002), 2002

2001
Shadow Carving.
Proceedings of the Eighth International Conference On Computer Vision (ICCV-01), Vancouver, British Columbia, Canada, July 7-14, 2001, 2001

Local Analysis for 3D Reconstruction of Specular Surfaces.
Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), 2001

