Aniruddha Kembhavi

Orcid: 0000-0002-7608-7443

Affiliations:
  • AI2, Allen Institute for Artificial Intelligence, Seattle, US


According to our database, Aniruddha Kembhavi authored at least 95 papers between 2008 and 2024.

Bibliography

2024
Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models.
Trans. Mach. Learn. Res., 2024

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models.
CoRR, 2024

FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning.
CoRR, 2024

PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators.
CoRR, 2024

CodeNav: Beyond tool-use to using real-world codebases with LLM agents.
CoRR, 2024

Task Me Anything.
CoRR, 2024

Preserving Identity with Variational Score for General-purpose 3D Editing.
CoRR, 2024

Open X-Embodiment: Robotic Learning Datasets and RT-X Models : Open X-Embodiment Collaboration.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Selective Visual Representations Improve Convergence and Generalization for Embodied AI.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Iterated Learning Improves Compositionality in Large Vision-Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Holodeck: Language Guided Generation of 3D Embodied AI Environments.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Seeing the Unseen: Visual Common Sense for Semantic Placement.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MIMIC: Masked Image Modeling with Image Correspondences.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Promptable Behaviors: Personalizing Multi-Objective Rewards from Human Preferences.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
FLUID: A Unified Evaluation Framework for Flexible Sequential Data.
Trans. Mach. Learn. Res., 2023

Harmonic Mobile Manipulation.
CoRR, 2023

Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World.
CoRR, 2023

Zooming Out on Zooming In: Advancing Super-Resolution for Remote Sensing.
CoRR, 2023

MIMIC: Masked Image Modeling with Image Correspondences.
CoRR, 2023

Neural Priming for Sample-Efficient Adaptation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

OBJECT 3DIT: Language-guided 3D-aware Image Editing.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SugarCrepe: Fixing Hackable Benchmarks for Vision-Language Compositionality.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Objaverse-XL: A Universe of 10M+ 3D Objects.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Neural Radiance Field Codebooks.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

UNIFIED-IO: A Unified Model for Vision, Language, and Multi-modal Tasks.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Scene Graph Contrastive Learning for Embodied Navigation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

I can't believe there's no images! : Learning Visual Tasks Using Only Language Supervision.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

EXCALIBUR: Encouraging and Evaluating Embodied Exploration.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Visual Programming: Compositional visual reasoning without training.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Objaverse: A Universe of Annotated 3D Objects.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Phone2Proc: Bringing Robust Robots into Our Chaotic World.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Benchmarking Progress to Infant-Level Physical Reasoning in AI.
Trans. Mach. Learn. Res., 2022

Phone2Proc: Bringing Robust Robots Into Our Chaotic World.
CoRR, 2022

A General Purpose Supervisory Signal for Embodied Agents.
CoRR, 2022

Satlas: A Large-Scale, Multi-Task Dataset for Remote Sensing Image Understanding.
CoRR, 2022

I Can't Believe There's No Images! Learning Visual Tasks Using only Language Data.
CoRR, 2022

Retrospectives on the Embodied AI Workshop.
CoRR, 2022

ProcTHOR: Large-Scale Embodied AI Using Procedural Generation.
CoRR, 2022

GRIT: General Robust Image Task Benchmark.
CoRR, 2022

ASC me to Do Anything: Multi-task Training for Embodied AI.
CoRR, 2022

Ask4Help: Learning to Leverage an Expert for Embodied Tasks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

🏘️ ProcTHOR: Large-Scale Embodied AI Using Procedural Generation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Webly Supervised Concept Expansion for General Purpose Vision Models.
Proceedings of the Computer Vision - ECCV 2022, 2022

Object Manipulation via Visual Target Localization.
Proceedings of the Computer Vision - ECCV 2022, 2022

Towards General Purpose Vision Systems: An End-to-End Task-Agnostic Vision-Language Architecture.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

What do navigation agents learn about their environment?
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Simple but Effective: CLIP Embeddings for Embodied AI.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Container: Context Aggregation Network.
CoRR, 2021

Towards General Purpose Vision Systems.
CoRR, 2021

Bridging the Imitation Gap by Adaptive Insubordination.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Container: Context Aggregation Networks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning Generalizable Visual Representations via Interactive Gameplay.
Proceedings of the 9th International Conference on Learning Representations, 2021

GridToPix: Training Embodied Agents with Minimal Supervision.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

RobustNav: Towards Benchmarking Robustness in Embodied Navigation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Visual Room Rearrangement.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Visual Semantic Role Labeling for Video Understanding.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ManipulaTHOR: A Framework for Visual Object Manipulation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
AllenAct: A Framework for Embodied AI Research.
CoRR, 2020

Bridging the Imitation Gap by Adaptive Insubordination.
CoRR, 2020

In the Wild: From ML Models to Pragmatic ML Systems.
CoRR, 2020

ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects.
CoRR, 2020

Supermasks in Superposition.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Learning About Objects by Learning to Interact with Them.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Feel The Music: Automatically Generating A Dance For An Input Song.
Proceedings of the Eleventh International Conference on Computational Creativity, 2020

X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Grounded Situation Recognition.
Proceedings of the Computer Vision - ECCV 2020, 2020

A Cordial Sync: Going Beyond Marginal Policies for Multi-agent Embodied Tasks.
Proceedings of the Computer Vision - ECCV 2020, 2020

What's Hidden in a Randomly Weighted Neural Network?
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Artificial Agents Learn Flexible Visual Representations by Playing a Hiding Game.
CoRR, 2019

ELASTIC: Improving CNNs With Dynamic Scaling Policies.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Two Body Problem: Collaborative Visual Task Completion.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
ELASTIC: Improving CNNs with Instance Specific Scaling Policies.
CoRR, 2018

Imagine This! Scripts to Compositions to Videos.
Proceedings of the Computer Vision - ECCV 2018, 2018

IQA: Visual Question Answering in Interactive Environments.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Structured Set Matching Networks for One-Shot Part Labeling.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
C-VQA: A Compositional Split of the Visual Question Answering (VQA) v1.0 Dataset.
CoRR, 2017

Bidirectional Attention Flow for Machine Comprehension.
Proceedings of the 5th International Conference on Learning Representations, 2017

Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Semantic Parsing to Probabilistic Programs for Situated Question Answering.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

A Diagram is Worth a Dozen Images.
Proceedings of the Computer Vision - ECCV 2016, 2016

2011
Vehicle Detection Using Partial Least Squares.
IEEE Trans. Pattern Anal. Mach. Intell., 2011

2010
Recognizing Objects And Reasoning About Their Interactions.
PhD thesis, 2010

Why Did the Person Cross the Road (There)? Scene Understanding Using Probabilistic Logic Models and Common Sense Reasoning.
Proceedings of the Computer Vision - ECCV 2010, 2010

2009
Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2009

Motion segmentation and activity representation in crowds.
Int. J. Imaging Syst. Technol., 2009

Human detection using partial least squares analysis.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

Incremental Multiple Kernel Learning for object recognition.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

2008
Tracking Down Under: Following the Satin Bowerbird.
Proceedings of the 9th IEEE Workshop on Applications of Computer Vision (WACV 2008), 2008

