Marcus Rohrbach

ORCID: 0000-0001-5908-7751

Affiliations:

TU Darmstadt, Germany

According to our database¹, Marcus Rohrbach authored at least 88 papers between 2009 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Timeline

Legend: Book, In proceedings, Article, PhD thesis, Dataset, Other

Links

Online presence:

on rohrbach.vision
on scholar.google.com

Bibliography

2024

Simple Token-Level Confidence Improves Caption Correctness.

,

Spencer Whitehead

,

Joseph E. Gonzalez

,

,

,

Marcus Rohrbach

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Efficient Pre-training for Localized Instruction Generation of Procedural Videos.

,

Davide Moltisanti

,

Laura Sevilla-Lara

,

Marcus Rohrbach

,

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

Improving Selective Visual Question Answering by Learning from Your Peers.

Corentin Dancette

,

Spencer Whitehead

,

Rishabh Maheshwary

,

Ramakrishna Vedantam

,

,

,

,

Marcus Rohrbach

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly.

Spencer Whitehead

,

,

,

Joseph Gonzalez

,

,

,

Marcus Rohrbach

Proceedings of the Computer Vision - ECCV 2022, 2022

CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition.

Shreyank N. Gowda

,

Laura Sevilla-Lara

,

,

Marcus Rohrbach

Proceedings of the Computer Vision - ECCV 2022, 2022

Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition.

Shreyank N. Gowda

,

Marcus Rohrbach

,

,

Laura Sevilla-Lara

Proceedings of the Computer Vision - ECCV 2022, 2022

FLAVA: A Foundational Language And Vision Alignment Model.

Amanpreet Singh

,

,

Vedanuj Goswami

,

Guillaume Couairon

,

Wojciech Galuba

,

Marcus Rohrbach

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning To Recognize Procedural Activities with Distant Supervision.

,

,

Gedas Bertasius

,

Marcus Rohrbach

,

,

Lorenzo Torresani

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting.

,

,

,

,

Joseph E. Gonzalez

,

Marcus Rohrbach

,

Proceedings of the 9th International Conference on Learning Representations, 2021

A New Split for Evaluating True Zero-Shot Action Recognition.

Shreyank N. Gowda

,

Laura Sevilla-Lara

,

,

,

Marcus Rohrbach

Proceedings of the Pattern Recognition - 43rd DAGM German Conference, DAGM GCPR 2021, Bonn, Germany, September 28, 2021

KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA.

,

,

,

,

Marcus Rohrbach

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

SMART Frame Selection for Action Recognition.

Shreyank N. Gowda

,

Marcus Rohrbach

,

Laura Sevilla-Lara

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Decoupling Representation and Classifier for Long-Tailed Recognition.

,

,

Marcus Rohrbach

,

,

,

,

Yannis Kalantidis

Proceedings of the 8th International Conference on Learning Representations, 2020

Uncertainty-guided Continual Learning with Bayesian Neural Networks.

,

Mohamed Elhoseiny

,

,

Marcus Rohrbach

Proceedings of the 8th International Conference on Learning Representations, 2020

TextCaps: A Dataset for Image Captioning with Reading Comprehension.

Oleksii Sidorov

,

,

Marcus Rohrbach

,

Amanpreet Singh

Proceedings of the Computer Vision - ECCV 2020, 2020

Learning to Generate Grounded Visual Captions Without Localization Supervision.

,

Yannis Kalantidis

,

Ghassan AlRegib

,

,

Marcus Rohrbach

,

Proceedings of the Computer Vision - ECCV 2020, 2020

Adversarial Continual Learning.

,

Franziska Meier

,

Roberto Calandra

,

,

Marcus Rohrbach

Proceedings of the Computer Vision - ECCV 2020, 2020

12-in-1: Multi-Task Vision and Language Representation Learning.

,

Vedanuj Goswami

,

Marcus Rohrbach

,

,

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

In Defense of Grid Features for Visual Question Answering.

,

,

Marcus Rohrbach

,

Erik G. Learned-Miller

,

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA.

,

Amanpreet Singh

,

,

Marcus Rohrbach

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Learning to Generate Grounded Image Captions without Localization Supervision.

,

Yannis Kalantidis

,

Ghassan AlRegib

,

,

Marcus Rohrbach

,

CoRR, 2019

Continual Learning with Tiny Episodic Memories.

Arslan Chaudhry

,

Marcus Rohrbach

,

Mohamed Elhoseiny

,

Thalaiyasingam Ajanthan

,

Puneet Kumar Dokania

,

Philip H. S. Torr

,

Marc'Aurelio Ranzato

CoRR, 2019

CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog.

,

José M. F. Moura

,

,

,

Marcus Rohrbach

Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Probabilistic Neural Symbolic Models for Interpretable Visual Question Answering.

Ramakrishna Vedantam

,

,

,

Marcus Rohrbach

,

,

Proceedings of the 36th International Conference on Machine Learning, 2019

Efficient Lifelong Learning with A-GEM.

Arslan Chaudhry

,

Marc'Aurelio Ranzato

,

Marcus Rohrbach

,

Mohamed Elhoseiny

Proceedings of the 7th International Conference on Learning Representations, 2019

Selfless Sequential Learning.

,

Marcus Rohrbach

,

Tinne Tuytelaars

Proceedings of the 7th International Conference on Learning Representations, 2019

Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution.

,

,

,

,

Yannis Kalantidis

,

Marcus Rohrbach

,

,

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Grounded Video Description.

,

Yannis Kalantidis

,

,

,

Marcus Rohrbach

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Towards VQA Models That Can Read.

Amanpreet Singh

,

Vivek Natarajan

,

,

,

,

,

,

Marcus Rohrbach

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition.

,

,

Yannis Kalantidis

,

Laura Sevilla-Lara

,

Marcus Rohrbach

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Cycle-Consistency for Robust Visual Question Answering.

,

,

Marcus Rohrbach

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Adversarial Inference for Multi-Sentence Video Description.

,

Marcus Rohrbach

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Uncertainty-Guided Continual Learning in Bayesian Neural Networks - Extended Abstract.

,

Mohamed Elhoseiny

,

,

Marcus Rohrbach

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Graph-Based Global Reasoning Networks.

,

Marcus Rohrbach

,

,

,

,

Yannis Kalantidis

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication.

,

,

,

Marcus Rohrbach

,

Byoung-Tak Zhang

,

,

,

Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Large-Scale Visual Relationship Understanding.

,

Yannis Kalantidis

,

Marcus Rohrbach

,

,

,

Mohamed Elhoseiny

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Pythia v0.1: the Winning Entry to the VQA Challenge 2018.

,

Vivek Natarajan

,

,

Marcus Rohrbach

,

,

CoRR, 2018

A Dataset for Telling the Stories of Social Media Videos.

,

,

Marcus Rohrbach

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Visual Coreference Resolution in Visual Dialog Using Neural Module Networks.

,

José M. F. Moura

,

,

,

Marcus Rohrbach

Proceedings of the Computer Vision - ECCV 2018, 2018

Memory Aware Synapses: Learning What (not) to Forget.

,

Francesca Babiloni

,

Mohamed Elhoseiny

,

Marcus Rohrbach

,

Tinne Tuytelaars

Proceedings of the Computer Vision - ECCV 2018, 2018

Multimodal Explanations: Justifying Decisions and Pointing to the Evidence.

,

Lisa Anne Hendricks

,

,

,

,

,

Marcus Rohrbach

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Exploring the Challenges Towards Lifelong Fact Learning.

Mohamed Elhoseiny

,

Francesca Babiloni

,

,

Marcus Rohrbach

,

,

Tinne Tuytelaars

Proceedings of the Computer Vision - ACCV 2018, 2018

2017

Long-Term Recurrent Convolutional Networks for Visual Recognition and Description.

,

Lisa Anne Hendricks

,

Marcus Rohrbach

,

Subhashini Venugopalan

,

Sergio Guadarrama

,

,

IEEE Trans. Pattern Anal. Mach. Intell., 2017

Movie Description.

,

,

Marcus Rohrbach

,

,

Christopher Joseph Pal

,

Hugo Larochelle

,

Aaron C. Courville

,

Int. J. Comput. Vis., 2017

Ask Your Neurons: A Deep Learning Approach to Visual Question Answering.

Mateusz Malinowski

,

Marcus Rohrbach

,

Int. J. Comput. Vis., 2017

Attentive Explanations: Justifying Decisions and Pointing to the Evidence (Extended Abstract).

,

Lisa Anne Hendricks

,

,

,

,

,

Marcus Rohrbach

CoRR, 2017

Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training.

Rakshith Shetty

,

Marcus Rohrbach

,

Lisa Anne Hendricks

,

,

Proceedings of the IEEE International Conference on Computer Vision, 2017

Learning to Reason: End-to-End Module Networks for Visual Question Answering.

,

,

Marcus Rohrbach

,

,

Proceedings of the IEEE International Conference on Computer Vision, 2017

Captioning Images with Diverse Objects.

Subhashini Venugopalan

,

Lisa Anne Hendricks

,

Marcus Rohrbach

,

Raymond J. Mooney

,

,

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Generating Descriptions with Grounded and Co-referenced People.

,

Marcus Rohrbach

,

,

,

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Modeling Relationships in Referential Expressions with Compositional Modular Networks.

,

Marcus Rohrbach

,

,

,

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016

Recognizing Fine-Grained and Composite Activities Using Hand-Centric Features and Script Data.

Marcus Rohrbach

,

,

Michaela Regneri

,

,

Mykhaylo Andriluka

,

,

Int. J. Comput. Vis., 2016

Attributes as Semantic Units between Natural Language and Visual Recognition.

Marcus Rohrbach

CoRR, 2016

Attentive Explanations: Justifying Decisions and Pointing to the Evidence.

,

Lisa Anne Hendricks

,

,

,

,

Marcus Rohrbach

CoRR, 2016

Utilizing Large Scale Vision and Text Datasets for Image Segmentation from Referring Expressions.

,

Marcus Rohrbach

,

Subhashini Venugopalan

,

CoRR, 2016

Learning to Compose Neural Networks for Question Answering.

,

Marcus Rohrbach

,

,

Proceedings of the NAACL HLT 2016, 2016

Multimodal Video Description.

Vasili Ramanishka

,

,

,

Subhashini Venugopalan

,

Lisa Anne Hendricks

,

Marcus Rohrbach

,

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding.

,

,

,

,

,

Marcus Rohrbach

Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Grounding of Textual Phrases in Images by Reconstruction.

,

Marcus Rohrbach

,

,

,

Proceedings of the Computer Vision - ECCV 2016, 2016

Segmentation from Natural Language Expressions.

,

Marcus Rohrbach

,

Proceedings of the Computer Vision - ECCV 2016, 2016

Generating Visual Explanations.

Lisa Anne Hendricks

,

,

Marcus Rohrbach

,

,

,

Proceedings of the Computer Vision - ECCV 2016, 2016

Natural Language Object Retrieval.

,

,

Marcus Rohrbach

,

,

,

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data.

Lisa Anne Hendricks

,

Subhashini Venugopalan

,

Marcus Rohrbach

,

Raymond J. Mooney

,

,

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Neural Module Networks.

,

Marcus Rohrbach

,

,

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Commonsense in Parts: Mining Part-Whole Relations from the Web and Image Tags.

,

Charles Hariman

,

,

,

Marcus Rohrbach

,

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015

A Multi-scale Multiple Instance Video Description Network.

,

Subhashini Venugopalan

,

Vasili Ramanishka

,

Marcus Rohrbach

,

CoRR, 2015

Deep Compositional Question Answering with Neural Module Networks.

,

Marcus Rohrbach

,

,

CoRR, 2015

Translating Videos to Natural Language Using Deep Recurrent Neural Networks.

Subhashini Venugopalan

,

,

,

Marcus Rohrbach

,

Raymond J. Mooney

,

Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Sequence to Sequence - Video to Text.

Subhashini Venugopalan

,

Marcus Rohrbach

,

Jeffrey Donahue

,

Raymond J. Mooney

,

,

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Spatial Semantic Regularisation for Large Scale Object Detection.

,

Marcus Rohrbach

,

,

,

,

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images.

Mateusz Malinowski

,

Marcus Rohrbach

,

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

The Long-Short Story of Movie Description.

,

Marcus Rohrbach

,

Proceedings of the Pattern Recognition - 37th German Conference, 2015

A dataset for Movie Description.

,

Marcus Rohrbach

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014

Combining visual recognition and computational linguistics : linguistic knowledge for visual recognition and natural language descriptions of visual content.

Marcus Rohrbach

PhD thesis, 2014

Coherent Multi-Sentence Video Description with Variable Level of Detail.

,

Marcus Rohrbach

,

,

Annemarie Friedrich

,

,

Mykhaylo Andriluka

,

,

CoRR, 2014

Coherent Multi-sentence Video Description with Variable Level of Detail.

,

Marcus Rohrbach

,

,

Annemarie Friedrich

,

,

Proceedings of the Pattern Recognition - 36th German Conference, 2014

2013

Grounding Action Descriptions in Videos.

Michaela Regneri

,

Marcus Rohrbach

,

Dominikus Wetzel

,

,

,

Trans. Assoc. Comput. Linguistics, 2013

Transfer Learning in a Transductive Setting.

Marcus Rohrbach

,

,

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Translating Video Content to Natural Language Descriptions.

Marcus Rohrbach

,

,

,

,

,

Proceedings of the IEEE International Conference on Computer Vision, 2013

Multi-view Pictorial Structures for 3D Human Pose Estimation.

,

Mykhaylo Andriluka

,

Marcus Rohrbach

,

Proceedings of the British Machine Vision Conference, 2013

2012

3D Object Detection with Multiple Kinects.

,

Marcus Rohrbach

,

Proceedings of the Computer Vision - ECCV 2012. Workshops and Demonstrations, 2012

Script Data for Attribute-Based Recognition of Composite Activities.

Marcus Rohrbach

,

Michaela Regneri

,

Mykhaylo Andriluka

,

,

,

Proceedings of the Computer Vision - ECCV 2012, 2012

A database for fine grained activity detection of cooking activities.

Marcus Rohrbach

,

,

Mykhaylo Andriluka

,

Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011

The Benefits of Dense Stereo for Pedestrian Detection.

Christoph Gustav Keller

,

Markus Enzweiler

,

Marcus Rohrbach

,

David Fernández Llorca

,

Christoph Schnörr

,

Dariu M. Gavrila

IEEE Trans. Intell. Transp. Syst., 2011

Evaluating knowledge transfer and zero-shot learning in a large-scale setting.

Marcus Rohrbach

,

,

Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010

Combining Language Sources and Robust Semantic Relatedness for Attribute-Based Knowledge Transfer.

Marcus Rohrbach

,

,

György Szarvas

,

Proceedings of the Trends and Topics in Computer Vision, 2010

What helps where - and why? Semantic relatedness for knowledge transfer.

Marcus Rohrbach

,

,

György Szarvas

,

,

Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009

High-Level Fusion of Depth and Intensity for Pedestrian Classification.

Marcus Rohrbach

,

Markus Enzweiler

,

Dariu M. Gavrila

Proceedings of the Pattern Recognition, 2009
