Alexander Toshev

ORCID: 0000-0003-0925-638X

According to our database, Alexander Toshev authored at least 66 papers between 2006 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2024

DSplats: 3D Generation by Denoising Splats-Based Multiview Diffusion Models.

,

,

,

Federico Semeraro

,

,

,

Alexander Toshev

CoRR, 2024

From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons.

,

,

,

Aleksei Timofeev

,

,

,

,

,

Alexander Toshev

CoRR, 2024

World-consistent Video Diffusion with Explicit 3D Modeling.

,

,

Miguel Ángel Bautista

,

,

Alexander Toshev

,

Joshua M. Susskind

,

CoRR, 2024

Multimodal Autoregressive Pre-training of Large Vision Encoders.

,

,

,

,

,

David Haldimann

,

,

Victor Guilherme Turrisi da Costa

,

,

,

Alexander T. Toshev

,

,

,

,

Joshua M. Susskind

,

Alaaeldin El-Nouby

CoRR, 2024

On the Modeling Capabilities of Large Language Models for Sequential Decision Making.

Martin Klissarov

,

,

Alexander Toshev

,

CoRR, 2024

DataComp-LM: In search of the next generation of training sets for language models.

,

,

Georgios Smyrnis

,

,

,

Samir Yitzhak Gadre

,

,

Etash Kumar Guha

,

,

,

,

,

Niklas Muennighoff

,

Reinhard Heckel

,

,

,

Suchin Gururangan

,

Mitchell Wortsman

,

,

,

Marianna Nezhurina

,

,

,

,

,

,

,

,

,

,

Gabriel Ilharco

,

,

Kalyani Marathe

,

,

,

Khyathi Raghavi Chandu

,

,

Igor Vasiljevic

,

,

,

,

,

,

Luke Zettlemoyer

,

,

Alaaeldin El-Nouby

,

Hadi Pouransari

,

Alexander Toshev

,

,

Dirk Groeneveld

,

,

,

,

,

Alexandros G. Dimakis

,

,

,

,

Vaishaal Shankar

CoRR, 2024

Grounding Multimodal Large Language Models in Actions.

,

,

,

,

,

Alexander Toshev

CoRR, 2024

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training.

Brandon McKinzie

,

,

Jean-Philippe Fauconnier

,

,

,

,

,

,

,

,

,

,

Karanjeet Singh

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

Alexander Toshev

,

CoRR, 2024

Scalable Pre-training of Large Autoregressive Image Models.

Alaaeldin El-Nouby

,

,

,

Miguel Ángel Bautista

,

Vaishaal Shankar

,

Alexander T. Toshev

,

Joshua M. Susskind

,

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Large Language Models as Generalizable Policies for Embodied Tasks.

,

,

,

,

,

,

Natalie Mackraz

,

,

Alexander T. Toshev

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Data Filtering Networks.

,

Albin Madappally Jose

,

,

,

Alexander T. Toshev

,

Vaishaal Shankar

Proceedings of the Twelfth International Conference on Learning Representations, 2024

MM1: Methods, Analysis and Insights from Multimodal LLM Pre-training.

Brandon McKinzie

,

,

Jean-Philippe Fauconnier

,

,

,

,

,

,

,

,

,

Karanjeet Singh

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

Alexander Toshev

,

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

Large Language Models as Generalizable Policies for Embodied Tasks.

,

,

,

,

,

Katherine Metcalf

,

Natalie Mackraz

,

,

Alexander Toshev

CoRR, 2023

Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts.

Erik A. Daxberger

,

,

,

,

,

,

Michael Emmersberger

,

,

Alexander Toshev

,

CoRR, 2023

Principles and Guidelines for Evaluating Social Robot Navigation Algorithms.

CoRR, 2023

Value function estimation using conditional diffusion models for control.

,

,

Miguel Ángel Bautista

,

,

Alexander Toshev

,

Joshua M. Susskind

CoRR, 2023

On Robustness in Multimodal Learning.

Brandon McKinzie

,

Joseph Yitan Cheng

,

Vaishaal Shankar

,

,

Jonathon Shlens

,

Alexander Toshev

CoRR, 2023

STAIR: Learning Sparse Text and Image Representation in Grounded Tokens.

,

,

,

,

,

Albin Madappally Jose

,

Alexander Toshev

,

Jonathon Shlens

,

,

CoRR, 2023

Robustness in Multimodal Learning under Train-Test Modality Mismatch.

Brandon McKinzie

,

Vaishaal Shankar

,

Joseph Yitan Cheng

,

,

Jonathon Shlens

,

Alexander T. Toshev

Proceedings of the International Conference on Machine Learning, 2023

Perceptual Grouping in Contrastive Vision-Language Models.

Kanchana Ranasinghe

,

Brandon McKinzie

,

,

,

Alexander Toshev

,

Jonathon Shlens

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation.

,

Brandon McKinzie

,

,

Vaishaal Shankar

,

Alexander Toshev

Proceedings on "I Can't Believe It's Not Better: Failure Modes in the Age of Foundation Models" at NeurIPS 2023 Workshops, 2023

STAIR: Learning Sparse Text and Image Representation in Grounded Tokens.

,

,

,

,

,

Albin Madappally Jose

,

Alexander Toshev

,

,

Jonathon Shlens

,

,

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022

Socially CompliAnt Navigation Dataset (SCAND): A Large-Scale Dataset of Demonstrations for Social Navigation.

,

,

,

Garrett Warnell

,

,

Alexander Toshev

,

,

,

IEEE Robotics Autom. Lett., 2022

Perceptual Grouping in Vision-Language Models.

Kanchana Ranasinghe

,

Brandon McKinzie

,

,

,

Alexander Toshev

,

Jonathon Shlens

CoRR, 2022

Retrospectives on the Embodied AI Workshop.

CoRR, 2022

Gesture2Path: Imitation Learning for Gesture-aware Navigation.

,

Tsang-Wei Edward Lee

,

,

Anthony G. Francis

,

,

,

Alexander Toshev

,

CoRR, 2022

A Protocol for Validating Social Navigation Policies.

,

Tsang-Wei Edward Lee

,

,

,

Anthony G. Francis

,

Alexander Toshev

CoRR, 2022

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances.

CoRR, 2022

GAUDI: A Neural Architect for Immersive 3D Scene Generation.

Miguel Ángel Bautista

,

,

,

,

Alexander Toshev

,

,

,

,

,

Daniel Ulbricht

,

,

Joshua M. Susskind

Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning.

,

,

,

,

Alexander Toshev

,

,

Proceedings of the Tenth International Conference on Learning Representations, 2022

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances.

Proceedings of the Conference on Robot Learning, 2022

2021

ReLMoGen: Integrating Motion Generation in Reinforcement Learning for Mobile Manipulation.

,

,

Roberto Martín-Martín

,

,

Alexander Toshev

,

Silvio Savarese

Proceedings of the IEEE International Conference on Robotics and Automation, 2021

2020

Interactive Gibson Benchmark: A Benchmark for Interactive Navigation in Cluttered Environments.

,

William B. Shen

,

,

,

,

Alexander Toshev

,

Roberto Martín-Martín

,

Silvio Savarese

IEEE Robotics Autom. Lett., 2020

ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation.

,

,

Roberto Martín-Martín

,

,

Alexander Toshev

,

Silvio Savarese

CoRR, 2020

ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects.

,

,

Aniruddha Kembhavi

,

Oleksandr Maksymets

,

Roozbeh Mottaghi

,

,

Alexander Toshev

,

CoRR, 2020

Adversarial Generative Grammars for Human Activity Prediction.

A. J. Piergiovanni

,

Anelia Angelova

,

Alexander Toshev

,

Michael S. Ryoo

Proceedings of the Computer Vision - ECCV 2020, 2020

Learning Object-conditioned Exploration using Distributed Soft Actor Critic.

,

,

,

,

Alexander Toshev

Proceedings of the 4th Conference on Robot Learning, 2020

Modeling Long-horizon Tasks as Sequential Interaction Landscapes.

,

,

Alexander Toshev

,

Proceedings of the 4th Conference on Robot Learning, 2020

2019

Interactive Gibson: A Benchmark for Interactive Navigation in Cluttered Environments.

,

William B. Shen

,

,

,

,

Alexander Toshev

,

Roberto Martín-Martín

,

Silvio Savarese

CoRR, 2019

Long Range Neural Navigation Policies for the Real World.

,

Alexander Toshev

,

,

Tsang-Wei Edward Lee

Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Visual Representations for Semantic Target Driven Navigation.

Arsalan Mousavian

,

Alexander Toshev

,

,

,

,

Proceedings of the International Conference on Robotics and Automation, 2019

Evolving Space-Time Neural Architectures for Videos.

A. J. Piergiovanni

,

Anelia Angelova

,

Alexander Toshev

,

Michael S. Ryoo

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks.

,

Alexander Toshev

,

,

Silvio Savarese

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Self-supervisory Signals for Object Discovery and Detection.

,

Alexander Toshev

,

CoRR, 2018

Visual Representations for Semantic Target Driven Navigation.

Arsalan Mousavian

,

Alexander Toshev

,

,

,

CoRR, 2018

Sim2Real Viewpoint Invariant Visual Servoing by Recurrent Control.

Fereshteh Sadeghi

,

Alexander Toshev

,

,

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge.

,

Alexander Toshev

,

,

IEEE Trans. Pattern Anal. Mach. Intell., 2017

Sim2Real View Invariant Visual Servoing by Recurrent Control.

Fereshteh Sadeghi

,

Alexander Toshev

,

,

CoRR, 2017

No Fuss Distance Metric Learning Using Proxies.

Yair Movshovitz-Attias

,

Alexander Toshev

,

Thomas K. Leung

,

,

Proceedings of the IEEE International Conference on Computer Vision, 2017

Towards Accurate Multi-person Pose Estimation in the Wild.

George Papandreou

,

,

,

Alexander Toshev

,

Jonathan Tompson

,

,

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016

The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition.

Jonathan Krause

,

,

,

,

Alexander Toshev

,

,

,

Proceedings of the Computer Vision - ECCV 2016, 2016

Chained Predictions Using Convolutional Neural Networks.

Georgia Gkioxari

,

Alexander Toshev

,

Proceedings of the Computer Vision - ECCV 2016, 2016

Generation and Comprehension of Unambiguous Object Descriptions.

,

,

Alexander Toshev

,

,

,

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015

Pose Embeddings: A Deep Architecture for Learning to Match Human Poses.

,

Caroline Pantofaru

,

,

,

George Toderici

,

Alexander Toshev

,

CoRR, 2015

Show and tell: A neural image caption generator.

,

Alexander Toshev

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014

Deep Convolutional Ranking for Multilabel Image Annotation.

,

,

,

Alexander Toshev

,

Proceedings of the 2nd International Conference on Learning Representations, 2014

DeepPose: Human Pose Estimation via Deep Neural Networks.

Alexander Toshev

,

Christian Szegedy

Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Scalable Object Detection Using Deep Neural Networks.

,

Christian Szegedy

,

Alexander Toshev

,

Dragomir Anguelov

Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013

Deep Neural Networks for Object Detection.

Christian Szegedy

,

Alexander Toshev

,

Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

2012

Shape-Based Object Detection via Boundary Structure Segmentation.

Alexander Toshev

,

,

Kostas Daniilidis

Int. J. Comput. Vis., 2012

2010

Cascaded Models for Articulated Pose Estimation.

,

Alexander Toshev

,

Proceedings of the Computer Vision - ECCV 2010, 2010

Object detection via boundary structure segmentation.

Alexander Toshev

,

,

Kostas Daniilidis

Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Detecting and parsing architecture at city scale from range data.

Alexander Toshev

,

Philippos Mordohai

,

Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009

Shape-based object recognition in videos using 3D synthetic object models.

Alexander Toshev

,

,

Kostas Daniilidis

Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

2007

Image Matching via Saliency Region Correspondences.

Alexander Toshev

,

,

Kostas Daniilidis

Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

2006

An APRIORI-based Method for Frequent Composite Event Discovery in Videos.

Alexander Toshev

,

François Brémond

,

Monique Thonnat

Proceedings of the 2006 IEEE International Conference on Computer Vision Systems, 2006
