Rao Muhammad Anwer

Mohammed Irfan Kurpath

CoRR, 2024

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages.

Henok Biadglign Ademtew

Feno Heriniaina Rabevohitra

Mike Zhang

Mahardika Krisna Ihsani

Fadillah Adamsyah Maani

Amirpouya Ghasemaghaei

Johan S. Obando-Ceron

CoRR, 2024

CAMEL-Bench: A Comprehensive Arabic LMM Benchmark.

Sara Ghaboura

Ahmed Heakl

Omkar Thawakar

Ali Husain Salem Abdulla Alharthi

CoRR, 2024

AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning.

Muhammad Awais

Ali Husain Salem Abdulla Alharthi

Amandeep Kumar

Hisham Cholakkal

CoRR, 2024

Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking.

Ayesha Ishaq

Mehaboobathunnisa Sahul Hameed

CoRR, 2024

CDChat: A Large Multimodal Model for Remote Sensing Change Description.

CoRR, 2024

BOrg: A Brain Organoid-Based Mitosis Dataset for Automatic Analysis of Brain Diseases.

Muhammad Awais

Bidisha Bhattacharya

Orly Reiner

CoRR, 2024

Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation.

CoRR, 2024

Multi-modal Generation via Cross-Modal In-Context Learning.

CoRR, 2024

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT.

CoRR, 2024

PALO: A Polyglot Large Multimodal Model for 5B People.

CoRR, 2024

DB-SAM: Delving into High Quality Universal Medical Image Segmentation.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

BAPLe: Backdoor Attacks on Medical Foundational Models Using Prompt Learning.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Bidirectional Reciprocative Information Communication for Few-Shot Semantic Segmentation.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Modulate Your Spectrum in Self-Supervised Learning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

BiMediX: Bilingual Medical Mixture of Experts LLM.

Sara Pieri

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

CONDA: Condensed Deep Association Learning for Co-salient Object Detection.

Proceedings of the Computer Vision - ECCV 2024, 2024

Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning.

Proceedings of the Computer Vision - ECCV 2024, 2024

Continual Learning and Unknown Object Discovery in 3D Scenes via Self-distillation.

Proceedings of the Computer Vision - ECCV 2024, 2024

Composed Video Retrieval via Enriched Context and Discriminative Embeddings.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

GLaMM: Pixel Grounding Large Multimodal Model.

Omkar Chakradhar Thawakar

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

XrayGPT: Chest Radiographs Summarization using Large Medical Vision-Language Models.

Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, 2024

Semi-supervised Open-World Object Detection.

Abhishek Singh Gehlot

Hisham Cholakkal

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Transformers in Remote Sensing: A Survey.

Abdulaziz Amer Aleissaee

Remote. Sens., April, 2023

SipMaskv2: Enhanced Fast Image and Video Instance Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Foundational Models Defining a New Era in Vision: A Survey and Outlook.

CoRR, 2023

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

Omkar Thawakar

CoRR, 2023

DFormer: Diffusion-guided Transformer for Universal Image Segmentation.

CoRR, 2023

LEAPS: End-to-End One-Step Person Search With Learnable Proposals.

CoRR, 2023

SAT: Scale-Augmented Transformer for Person Search.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Surface-Biased Multi-Level Context 3D Object Detection.

Sultan Abu Ghazal

Jean Lahoud

Proceedings of the 18th International Joint Conference on Computer Vision, 2023

3D Indoor Instance Segmentation in an Open-World.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

A Spatial-Temporal Deformable Attention Based Framework for Breast Lesion Detection in Videos.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Cross-Modulated Few-Shot Image Generation for Colorectal Tissue Classification.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Generative Multiplane Neural Radiance for 3D-Aware Image Generation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Arabic Mini-ClimateGPT: A Climate Change and Sustainability Tailored Arabic LLM.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Person Image Synthesis via Denoising Diffusion Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

PSM-PS: Part-Based Signal Modulation for Person Search.

Reem Abdalla Sharif

Mustansar Fiaz

Proceedings of the Computer Analysis of Images and Patterns, 2023

SA2-Net: Scale-aware Attention Network for Microscopic Image Segmentation.

Proceedings of the 34th British Machine Vision Conference 2023, 2023

2022

Multi-scale Feature Aggregation for Crowd Counting.

CoRR, 2022

3D Vision with Transformers: A Survey.

CoRR, 2022

An Investigation into Whitening Loss for Self-supervised Learning.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection.

Proceedings of the 4th ACM International Conference on Multimedia in Asia, 2022

On the Robustness of 3D Object Detectors.

Proceedings of the 4th ACM International Conference on Multimedia in Asia, 2022

Learning a Dynamic Cross-Modal Network for Multispectral Pedestrian Detection.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer.

Proceedings of the Computer Vision - ECCV 2022, 2022

EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications.

Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Class-Agnostic Object Detection with Multi-modal Transformer.

Vineeth N. Balasubramanian

Proceedings of the Computer Vision - ECCV 2022, 2022

DoodleFormer: Creative Sketch Drawing with Transformers.

Proceedings of the Computer Vision - ECCV 2022, 2022

Spatio-temporal Relation Modeling for Few-shot Action Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Energy-based Latent Aligner for Incremental Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

PSTR: End-to-End One-Step Person Search With Transformers.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

PS-ARM: An End-to-End Attention-Aware Relation Mixer Network for Person Search.

Proceedings of the Computer Vision - ACCV 2022, 2022

2021

Mask-Guided Attention Network and Occlusion-Sensitive Hard Example Mining for Occluded Pedestrian Detection.

IEEE Trans. Image Process., 2021

Compact Deep Color Features for Remote Sensing Scene Classification.

Jorma Laaksonen

Neural Process. Lett., 2021

Multi-modal Transformers Excel at Class-agnostic Object Detection.

Vineeth N. Balasubramanian

CoRR, 2021

PSC-Net: learning part spatial co-occurrence for occluded pedestrian detection.

Sci. China Inf. Sci., 2021

Handwriting Transformers.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains.

Ling Shao

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020

PSC-Net: Learning Part Spatial Co-occurence for Occluded Pedestrian Detection.

CoRR, 2020

Count- and Similarity-Aware R-CNN for Pedestrian Detection.

Proceedings of the Computer Vision - ECCV 2020, 2020

SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation.

Proceedings of the Computer Vision - ECCV 2020, 2020

D2Det: Towards High Quality Object Detection and Instance Segmentation.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Deep Contextual Attention for Human-Object Interaction Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Learning Rich Features at High-Speed for Single-Shot Object Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Mask-Guided Attention Network for Occluded Pedestrian Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Enriched Feature Guided Refinement Network for Object Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Efficient Featurized Image Pyramid Network for Single Shot Detector.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Multi-stream Convolutional Networks for Indoor Scene Recognition.

Proceedings of the Computer Analysis of Images and Patterns, 2019

2018

Scale coding bag of deep features for human attribute and action recognition.

Mach. Vis. Appl., 2018

Bottom-Up Attention Guidance for Recurrent Image Recognition.

Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Two-Stream Part-Based Deep Representation for Human Attribute Recognition.

Jorma Laaksonen

Proceedings of the 2018 International Conference on Biometrics, 2018

2017

Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification.

CoRR, 2017

Top-Down Deep Appearance Attention for Action Recognition.

Proceedings of the Image Analysis - 20th Scandinavian Conference, 2017

TEX-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition.

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

2016

Combining Holistic and Part-based Deep Representations for Computational Painting Categorization.

Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

2015

Recognizing Actions Through Action-Specific Person Detection.

IEEE Trans. Image Process., 2015

Compact color-texture description for texture classification.

Pattern Recognit. Lett., 2015

PicSOM Experiments in TRECVID 2015.

Proceedings of the 2015 TREC Video Retrieval Evaluation, 2015

Deep Semantic Pyramids for Human Attributes and Action Recognition.

Proceedings of the Image Analysis - 19th Scandinavian Conference, 2015

2014

Semantic Pyramids for Gender and Action Recognition.

IEEE Trans. Image Process., 2014

PicSOM Experiments in TRECVID 2014.

Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

2013

Color for Object Detection and Action Recognition.

PhD thesis, 2013

Coloring Action Recognition in Still Images.

Int. J. Comput. Vis., 2013

2012

Color attributes for object detection.

Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011

Opponent Colors for Human Detection.

David Vázquez

Antonio M. López

Proceedings of the Pattern Recognition and Image Analysis - 5th Iberian Conference, 2011

Color Contribution to Part-Based Person Detection in Different Types of Scenarios.
