Mohamed Elhoseiny

ORCID: 0000-0001-9659-1551

According to our database, Mohamed Elhoseiny authored at least 133 papers between 2012 and 2024.

Bibliography

2024
ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions.
Trans. Mach. Learn. Res., 2024

AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?
CoRR, 2024

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding.
CoRR, 2024

Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models.
CoRR, 2024

How Well Can Vision Language Models See Image Details?
CoRR, 2024

Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling.
CoRR, 2024

MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis.
CoRR, 2024

InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding.
CoRR, 2024

VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding.
CoRR, 2024

iMotion-LLM: Motion Prediction Instruction Tuning.
CoRR, 2024

Kestrel: Point Grounding Multimodal LLM for Part-Aware 3D Vision-Language Understanding.
CoRR, 2024

MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens.
CoRR, 2024

A Hybrid Graph Network for Complex Activity Detection in Video.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Multimodal Representation and Retrieval [MRR 2024].
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

No Culture Left Behind: ArtELingo-28, a Benchmark of WikiArt with Captions in 28 Languages.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Uni3DL: A Unified Model for 3D Vision-Language Understanding.
Proceedings of the Computer Vision - ECCV 2024, 2024

Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations.
Proceedings of the Computer Vision - ECCV 2024, 2024

MEERKAT: Audio-Visual Large Language Model for Grounding in Space and Time.
Proceedings of the Computer Vision - ECCV 2024, 2024

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos.
Proceedings of the Computer Vision - ECCV 2024, 2024

Overcoming Generic Knowledge Loss with Selective Parameter Update.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ShapeWalk: Compositional Shape Editing Through Language-Guided Chains.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

AI Art Neural Constellation: Revealing the Collective and Contrastive State of AI-Generated and Human Art.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Adversarial Text to Continuous Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ImageCaptioner2: Image Captioner for Image Captioning Bias Amplification Assessment.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Uni3DL: Unified Model for 3D and Language Understanding.
CoRR, 2023

StoryGPT-V: Large Language Models as Consistent Story Visualizers.
CoRR, 2023

Label Delay in Continual Learning.
CoRR, 2023

ToddlerDiffusion: Flash Interpretable Controllable Diffusion Model.
CoRR, 2023

3DCoMPaT++: An improved Large-scale 3D Vision Dataset for Compositional Recognition.
CoRR, 2023

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning.
CoRR, 2023

Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations.
CoRR, 2023

Overcoming General Knowledge Loss with Selective Parameter Finetuning.
CoRR, 2023

Exploring Open-Vocabulary Semantic Segmentation without Human Labels.
CoRR, 2023

ImageCaptioner2: Image Captioner for Image Captioning Bias Amplification Assessment.
CoRR, 2023

Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions.
CoRR, 2023

Guiding Online Reinforcement Learning with Action-Free Offline Pretraining.
CoRR, 2023

SLAMB: Accelerated Large Batch Training with Sparse Communication.
Proceedings of the International Conference on Machine Learning, 2023

Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Continual Zero-Shot Learning through Semantically Guided Generative Random Walks.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

OxfordTVG-HIC: Can Machine Make Humorous Captions from Images?
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

FishNet: A Large-scale Dataset and Benchmark for Fish Recognition, Detection, and Functional Trait Prediction.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Exploring Open-Vocabulary Semantic Segmentation from CLIP Vision Encoder Distillation Only.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MoStGAN-V: Video Generation with Temporal Motion Styles.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MammalNet: A Large-Scale Video Benchmark for Mammal Recognition and Behavior Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
A Simple Baseline that Questions the Use of Pretrained-Models in Continual Learning.
CoRR, 2022

Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction.
CoRR, 2022

Efficiently Disentangle Causal Representations.
CoRR, 2022

3DRefTransformer: Fine-Grained Object Identification in Real-World Scenes Using Natural Language.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Creative Walk Adversarial Networks: Novel Art Generation with Probabilistic Random Walk Deviation from Style Norms.
Proceedings of the 13th International Conference on Computational Creativity, Bozen-Bolzano, Italy, June 27, 2022

ArtELingo: A Million Emotion Annotations of WikiArt with Emphasis on Diversity over Language and Culture.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification.
Proceedings of the Computer Vision - ECCV 2022, 2022

Social-Implicit: Rethinking Trajectory Prediction Evaluation and The Effectiveness of Implicit Maximum Likelihood Estimation.
Proceedings of the Computer Vision - ECCV 2022, 2022

3D CoMPaT: Composition of Materials on Parts of 3D Things.
Proceedings of the Computer Vision - ECCV 2022, 2022

StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Domain-Aware Continual Zero-Shot Learning.
CoRR, 2021

RelTransformer: Balancing the Visual Relationship Detection from Local Context, Scene and Memory.
CoRR, 2021

Imaginative Walks: Generative Random Walk Deviation Loss for Improved Unseen Learning Representation.
CoRR, 2021

VisualGPT: Data-efficient Image Captioning by Balancing Visual Input and Linguistic Knowledge from Pretraining.
CoRR, 2021

CIZSL++: Creativity Inspired Generative Zero-Shot Learning.
CoRR, 2021

HalentNet: Multimodal Trajectory Forecasting with Hallucinative Intents.
Proceedings of the 9th International Conference on Learning Representations, 2021

Class Normalization for (Continual)? Generalized Zero-Shot Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

Aligning Latent and Image Spaces to Connect the Unconnectable.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Exploring Long Tail Visual Relationship Recognition with Large Vocabulary.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Wölfflin's Affective Generative Analysis for Visual Art.
Proceedings of the Twelfth International Conference on Computational Creativity, 2021

Adversarial Generation of Continuous Images.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ArtEmis: Affective Language for Visual Art.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Motion Forecasting with Unlikelihood Training in Continuous Space.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

Semi-Supervised Few-Shot Learning with Prototypical Random Walks.
Proceedings of the AAAI Workshop on Meta-Learning and MetaDL Challenge, 2021

2020
Normalization Matters in Zero-Shot Learning.
CoRR, 2020

Inner Ensemble Nets.
CoRR, 2020

Efficient long-distance relation extraction with DG-SpanBERT.
CoRR, 2020

Long-tail Visual Relationship Recognition with a Visiolinguistic Hubless Loss.
CoRR, 2020

Temporal Positive-unlabeled Learning for Biomedical Hypothesis Generation via Risk Estimation.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Compositional Language Continual Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

Uncertainty-guided Continual Learning with Bayesian Neural Networks.
Proceedings of the 8th International Conference on Learning Representations, 2020

ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes.
Proceedings of the Computer Vision - ECCV 2020, 2020

Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Semi-Supervised Few-Shot Learning with Local and Global Consistency.
CoRR, 2019

Continual Learning with Tiny Episodic Memories.
CoRR, 2019

Video Object Segmentation using Teacher-Student Adaptation in a Human Robot Interaction (HRI) Setting.
Proceedings of the International Conference on Robotics and Automation, 2019

GDPP: Learning Diverse Generations using Determinantal Point Processes.
Proceedings of the 36th International Conference on Machine Learning, 2019

Efficient Lifelong Learning with A-GEM.
Proceedings of the 7th International Conference on Learning Representations, 2019

Creativity Inspired Zero-Shot Learning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Uncertainty-Guided Continual Learning in Bayesian Neural Networks - Extended Abstract.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Large-Scale Visual Relationship Understanding.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
GDPP: Learning Diverse Generations Using Determinantal Point Process.
CoRR, 2018

Video Segmentation using Teacher-Student Adaptation in a Human Robot Interaction (HRI) Setting.
CoRR, 2018

Choose Your Neuron: Incorporating Domain Knowledge Through Neuron-Importance.
Proceedings of the Computer Vision - ECCV 2018, 2018

DesIGN: Design Inspiration from Generative Networks.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Memory Aware Synapses: Learning What (not) to Forget.
Proceedings of the Computer Vision - ECCV 2018, 2018

A Generative Adversarial Approach for Zero-Shot Learning From Noisy Texts.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Exploring the Challenges Towards Lifelong Fact Learning.
Proceedings of the Computer Vision - ACCV 2018, 2018

The Shape of Art History in the Eyes of the Machine.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Write a Classifier: Predicting Visual Classifiers from Unstructured Text.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Imagine it for me: Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts.
CoRR, 2017

Overlapping Cover Local Regression Machines.
CoRR, 2017

CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms.
Proceedings of the Eighth International Conference on Computational Creativity, 2017

Relationship Proposal Networks.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Link the Head to the "Beak": Zero Shot Learning from Noisy Text Description at Part Precision.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Sherlock: Scalable Fact Learning in Images.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Text to multi-level MindMaps - A novel method for hierarchical visual abstraction of natural language text.
Multim. Tools Appl., 2016

Write a Classifier: Predicting Visual Classifiers from Unstructured Text Descriptions.
CoRR, 2016

Digging Deep into the Layers of CNNs: In Search of How CNNs Achieve View Invariance.
Proceedings of the 4th International Conference on Learning Representations, 2016

Joint object recognition and pose estimation using a nonlinear view-invariant latent generative model.
Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, 2016

A Comparative Analysis and Study of Multiview CNN Models for Joint Object Categorization and Pose Estimation.
Proceedings of the 33rd International Conference on Machine Learning, 2016

SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Automatic Annotation of Structured Facts in Images.
Proceedings of the 5th Workshop on Vision and Language, 2016

Zero-Shot Event Detection by Multimodal Distributional Semantic Embedding of Videos.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Generalized Twin Gaussian processes using Sharma-Mittal divergence.
Mach. Learn., 2015

Tell and Predict: Kernel Classifier Prediction for Unseen Visual Classes from Unstructured Text Descriptions.
CoRR, 2015

Convolutional Models for Joint Object Categorization and Pose Estimation.
CoRR, 2015

Sherlock: Modeling Structured Knowledge in Images.
CoRR, 2015

Weather classification with deep convolutional neural networks.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Learning Hypergraph-regularized Attribute Predictors.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Overlapping Domain Cover for Scalable and Accurate Regression Kernel Machines.
Proceedings of the British Machine Vision Conference 2015, 2015

Visual Classifier Prediction by Distributional Semantic Embedding of Text Descriptions.
Proceedings of the Fourth Workshop on Vision and Language, 2015

2014
Text to Multi-level MindMaps: A New Way for Interactive Visualization and Summarization of Natural Language Text.
CoRR, 2014

SRI-Sarnoff AURORA System at TRECVID 2014 Multimedia Event Detection and Recounting.
Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

Improving non-negative matrix factorization via ranking its bases.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

2013
GPU-Framework for Teamwork Action Recognition.
CoRR, 2013

Low-bitrate benefits of JPEG compression on sift recognition.
Proceedings of the IEEE International Conference on Image Processing, 2013

Write a Classifier: Zero-Shot Learning Using Purely Textual Descriptions.
Proceedings of the IEEE International Conference on Computer Vision, 2013

MultiClass Object Classification in Video Surveillance Systems - Experimental Study.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
English2MindMap: An Automated System for MindMap Generation from English Text.
Proceedings of the 2012 IEEE International Symposium on Multimedia, 2012

