Mohamed Elhoseiny

Vikas Chandra

CoRR, 2024

Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models.

CoRR, 2024

How Well Can Vision Language Models See Image Details?

CoRR, 2024

Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling.

CoRR, 2024

MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis.

CoRR, 2024

InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding.

CoRR, 2024

VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding.

Xiang Li

Jian Ding

CoRR, 2024

iMotion-LLM: Motion Prediction Instruction Tuning.

CoRR, 2024

Kestrel: Point Grounding Multimodal LLM for Part-Aware 3D Vision-Language Understanding.

CoRR, 2024

MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens.

CoRR, 2024

A Hybrid Graph Network for Complex Activity Detection in Video.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Multimodal Representation and Retrieval [MRR 2024].

Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

No Culture Left Behind: ArtELingo-28, a Benchmark of WikiArt with Captions in 28 Languages.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Uni3DL: A Unified Model for 3D Vision-Language Understanding.

Proceedings of the Computer Vision - ECCV 2024, 2024

Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations.

Proceedings of the Computer Vision - ECCV 2024, 2024

MEERKAT: Audio-Visual Large Language Model for Grounding in Space and Time.

Proceedings of the Computer Vision - ECCV 2024, 2024

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos.

Proceedings of the Computer Vision - ECCV 2024, 2024

Overcoming Generic Knowledge Loss with Selective Parameter Update.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ShapeWalk: Compositional Shape Editing Through Language-Guided Chains.

Habib Slim

Chamuditha Jayanga Galappaththige

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

AI Art Neural Constellation: Revealing the Collective and Contrastive State of AI-Generated and Human Art.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Adversarial Text to Continuous Image Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ImageCaptioner2: Image Captioner for Image Captioning Bias Amplification Assessment.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Uni3DL: Unified Model for 3D and Language Understanding.

CoRR, 2023

StoryGPT-V: Large Language Models as Consistent Story Visualizers.

Xiaoqian Shen

Raghuraman Krishnamoorthi

CoRR, 2023

Label Delay in Continual Learning.

CoRR, 2023

ToddlerDiffusion: Flash Interpretable Controllable Diffusion Model.

CoRR, 2023

3DCoMPaT++: An improved Large-scale 3D Vision Dataset for Compositional Recognition.

CoRR, 2023

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning.

Vikas Chandra

Yunyang Xiong

CoRR, 2023

Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations.

CoRR, 2023

Overcoming General Knowledge Loss with Selective Parameter Finetuning.

CoRR, 2023

Exploring Open-Vocabulary Semantic Segmentation without Human Labels.

CoRR, 2023

ImageCaptioner²: Image Captioner for Image Captioning Bias Amplification Assessment.

CoRR, 2023

Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions.

CoRR, 2023

Guiding Online Reinforcement Learning with Action-Free Offline Pretraining.

CoRR, 2023

SLAMB: Accelerated Large Batch Training with Sparse Communication.

Proceedings of the International Conference on Machine Learning, 2023

Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning.

Deyao Zhu

Li Erran Li

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Continual Zero-Shot Learning through Semantically Guided Generative Random Walks.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

OxfordTVG-HIC: Can Machine Make Humorous Captions from Images?

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

FishNet: A Large-scale Dataset and Benchmark for Fish Recognition, Detection, and Functional Trait Prediction.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Exploring Open-Vocabulary Semantic Segmentation from CLIP Vision Encoder Distillation Only.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MoStGAN-V: Video Generation with Temporal Motion Styles.

Xiaoqian Shen

Xiang Li

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MammalNet: A Large-Scale Video Benchmark for Mammal Recognition and Behavior Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

A Simple Baseline that Questions the Use of Pretrained-Models in Continual Learning.

CoRR, 2022

Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction.

CoRR, 2022

Efficiently Disentangle Causal Representations.

CoRR, 2022

3DRefTransformer: Fine-Grained Object Identification in Real-World Scenes Using Natural Language.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding.

Eslam Mohamed Bakr

Yasmeen Alsaedy

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Creative Walk Adversarial Networks: Novel Art Generation with Probabilistic Random Walk Deviation from Style Norms.

Proceedings of the 13th International Conference on Computational Creativity, Bozen-Bolzano, Italy, June 27, 2022

ArtELingo: A Million Emotion Annotations of WikiArt with Emphasis on Diversity over Language and Culture.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification.

Proceedings of the Computer Vision - ECCV 2022, 2022

Social-Implicit: Rethinking Trajectory Prediction Evaluation and The Effectiveness of Implicit Maximum Likelihood Estimation.

Proceedings of the Computer Vision - ECCV 2022, 2022

3D CoMPaT: Composition of Materials on Parts of 3D Things.

Proceedings of the Computer Vision - ECCV 2022, 2022

StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2.

Sergey Tulyakov

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Domain-Aware Continual Zero-Shot Learning.

Kai Yi

CoRR, 2021

RelTransformer: Balancing the Visual Relationship Detection from Local Context, Scene and Memory.

CoRR, 2021

Imaginative Walks: Generative Random Walk Deviation Loss for Improved Unseen Learning Representation.

CoRR, 2021

VisualGPT: Data-efficient Image Captioning by Balancing Visual Input and Linguistic Knowledge from Pretraining.

CoRR, 2021

CIZSL++: Creativity Inspired Generative Zero-Shot Learning.

Kai Yi

Mohamed Elfeki

CoRR, 2021

HalentNet: Multimodal Trajectory Forecasting with Hallucinative Intents.

Proceedings of the 9th International Conference on Learning Representations, 2021

Class Normalization for (Continual)? Generalized Zero-Shot Learning.

Proceedings of the 9th International Conference on Learning Representations, 2021

Aligning Latent and Image Spaces to Connect the Unconnectable.

Grigorii Sotnikov

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Exploring Long Tail Visual Relationship Recognition with Large Vocabulary.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Wölfflin's Affective Generative Analysis for Visual Art.

Divyansh Jha

Hanna H. Chang

Proceedings of the Twelfth International Conference on Computational Creativity, 2021

Adversarial Generation of Continuous Images.

Savva Ignatyev

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ArtEmis: Affective Language for Visual Art.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Motion Forecasting with Unlikelihood Training in Continuous Space.

Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

Semi-Supervised Few-Shot Learning with Prototypical Random Walks.

Proceedings of the AAAI Workshop on Meta-Learning and MetaDL Challenge, 2021

2020

Normalization Matters in Zero-Shot Learning.

CoRR, 2020

Inner Ensemble Nets.

Abduallah A. Mohamed

Muhammed Mohaimin Sadiq

Ehab AlBadawy

Christian G. Claudel

CoRR, 2020

Efficient long-distance relation extraction with DG-SpanBERT.

CoRR, 2020

Long-tail Visual Relationship Recognition with a Visiolinguistic Hubless Loss.

CoRR, 2020

Temporal Positive-unlabeled Learning for Biomedical Hypothesis Generation via Risk Estimation.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Compositional Language Continual Learning.

Proceedings of the 8th International Conference on Learning Representations, 2020

Uncertainty-guided Continual Learning with Bayesian Neural Networks.

Proceedings of the 8th International Conference on Learning Representations, 2020

ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes.

Proceedings of the Computer Vision - ECCV 2020, 2020

Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Semi-Supervised Few-Shot Learning with Local and Global Consistency.

CoRR, 2019

Continual Learning with Tiny Episodic Memories.

Arslan Chaudhry

Marcus Rohrbach

Thalaiyasingam Ajanthan

Puneet Kumar Dokania

Philip H. S. Torr

Marc'Aurelio Ranzato

CoRR, 2019

Video Object Segmentation using Teacher-Student Adaptation in a Human Robot Interaction (HRI) Setting.

Proceedings of the International Conference on Robotics and Automation, 2019

GDPP: Learning Diverse Generations using Determinantal Point Processes.

Proceedings of the 36th International Conference on Machine Learning, 2019

Efficient Lifelong Learning with A-GEM.

Proceedings of the 7th International Conference on Learning Representations, 2019

Creativity Inspired Zero-Shot Learning.

Mohamed Elfeki

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Uncertainty-Guided Continual Learning in Bayesian Neural Networks - Extended Abstract.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Large-Scale Visual Relationship Understanding.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

GDPP: Learning Diverse Generations Using Determinantal Point Process.

CoRR, 2018

Video Segmentation using Teacher-Student Adaptation in a Human Robot Interaction (HRI) Setting.

CoRR, 2018

Choose Your Neuron: Incorporating Domain Knowledge Through Neuron-Importance.

Ramprasaath R. Selvaraju

Prithvijit Chattopadhyay

Proceedings of the Computer Vision - ECCV 2018, 2018

DesIGN: Design Inspiration from Generative Networks.

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Memory Aware Synapses: Learning What (not) to Forget.

Proceedings of the Computer Vision - ECCV 2018, 2018

A Generative Adversarial Approach for Zero-Shot Learning From Noisy Texts.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Exploring the Challenges Towards Lifelong Fact Learning.

Proceedings of the Computer Vision - ACCV 2018, 2018

The Shape of Art History in the Eyes of the Machine.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Write a Classifier: Predicting Visual Classifiers from Unstructured Text.

IEEE Trans. Pattern Anal. Mach. Intell., 2017

Imagine it for me: Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts.

CoRR, 2017

Overlapping Cover Local Regression Machines.

CoRR, 2017

CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms.

Proceedings of the Eighth International Conference on Computational Creativity, 2017

Relationship Proposal Networks.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Link the Head to the "Beak": Zero Shot Learning from Noisy Text Description at Part Precision.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Sherlock: Scalable Fact Learning in Images.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Text to multi-level MindMaps - A novel method for hierarchical visual abstraction of natural language text.

Multim. Tools Appl., 2016

Write a Classifier: Predicting Visual Classifiers from Unstructured Text Descriptions.

CoRR, 2016

Digging Deep into the Layers of CNNs: In Search of How CNNs Achieve View Invariance.

Proceedings of the 4th International Conference on Learning Representations, 2016

Joint object recognition and pose estimation using a nonlinear view-invariant latent generative model.

Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, 2016

A Comparative Analysis and Study of Multiview CNN Models for Joint Object Categorization and Pose Estimation.

Proceedings of the 33rd International Conference on Machine Learning, 2016

SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition.

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Automatic Annotation of Structured Facts in Images.

Proceedings of the 5th Workshop on Vision and Language, 2016

Zero-Shot Event Detection by Multimodal Distributional Semantic Embedding of Videos.

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015

Generalized Twin Gaussian processes using Sharma-Mittal divergence.

Mach. Learn., 2015

Tell and Predict: Kernel Classifier Prediction for Unseen Visual Classes from Unstructured Text Descriptions.

CoRR, 2015

Convolutional Models for Joint Object Categorization and Pose Estimation.

CoRR, 2015

Sherlock: Modeling Structured Knowledge in Images.

CoRR, 2015

Weather classification with deep convolutional neural networks.

Sheng Huang

Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Learning Hypergraph-regularized Attribute Predictors.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Overlapping Domain Cover for Scalable and Accurate Regression Kernel Machines.

Proceedings of the British Machine Vision Conference 2015, 2015

Visual Classifier Prediction by Distributional Semantic Embedding of Text Descriptions.

Proceedings of the Fourth Workshop on Vision and Language, 2015

2014

Text to Multi-level MindMaps: A New Way for Interactive Visualization and Summarization of Natural Language Text.

CoRR, 2014

SRI-Sarnoff AURORA System at TRECVID 2014 Multimedia Event Detection and Recounting.

Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

Improving non-negative matrix factorization via ranking its bases.

Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

2013

GPU-Framework for Teamwork Action Recognition.

Hossam El Deen Mostafa Faheem

Taymour Nazmy

Eman Shaaban

CoRR, 2013

Low-bitrate benefits of JPEG compression on sift recognition.

Proceedings of the IEEE International Conference on Image Processing, 2013

Write a Classifier: Zero-Shot Learning Using Purely Textual Descriptions.

Proceedings of the IEEE International Conference on Computer Vision, 2013

MultiClass Object Classification in Video Surveillance Systems - Experimental Study.

Amr Bakry

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012

English2MindMap: An Automated System for MindMap Generation from English Text.
