Fahad Shahbaz Khan

Orcid: 0000-0002-4263-3143

Affiliations:
  • Linköping University, Sweden

According to our database, Fahad Shahbaz Khan authored at least 305 papers between 2009 and 2025.

Bibliography

2025
Image colorization: A survey and dataset.
Inf. Fusion, 2025

2024
Understanding Whitening Loss in Self-Supervised Learning.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

UNETR++: Delving Into Efficient and Accurate 3D Medical Image Segmentation.
IEEE Trans. Medical Imaging, September, 2024

Guidance Through Surrogate: Toward a Generic Diagnostic Attack.
IEEE Trans. Neural Networks Learn. Syst., February, 2024

MineGAN++: Mining Generative Models for Efficient Knowledge Transfer to Limited Data Domains.
Int. J. Comput. Vis., February, 2024

Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection.
IEEE Trans. Pattern Anal. Mach. Intell., January, 2024

Robust Perception and Precise Segmentation for Scribble-Supervised RGB-D Saliency Detection.
IEEE Trans. Pattern Anal. Mach. Intell., January, 2024

CT-VOS: Cutout prediction and tagging for self-supervised video object segmentation.
Comput. Vis. Image Underst., January, 2024

Effectiveness assessment of recent large vision-language models.
Vis. Intell., 2024

Remote Sensing Change Detection With Transformers Trained From Scratch.
IEEE Trans. Geosci. Remote. Sens., 2024

ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection.
IEEE Trans. Geosci. Remote. Sens., 2024

Guided-attention and gated-aggregation network for medical image segmentation.
Pattern Recognit., 2024

Visual attention methods in deep learning: An in-depth survey.
Inf. Fusion, 2024

Lightning fast video anomaly detection via multi-scale adversarial distillation.
Comput. Vis. Image Underst., 2024

CAMEL-Bench: A Comprehensive Arabic LMM Benchmark.
CoRR, 2024

How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?
CoRR, 2024

Frontiers in Intelligent Colonoscopy.
CoRR, 2024

Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking.
CoRR, 2024

AgriCLIP: Adapting CLIP for Agriculture and Livestock via Domain-Specialized Cross-Model Alignment.
CoRR, 2024

CDChat: A Large Multimodal Model for Remote Sensing Change Description.
CoRR, 2024

Efficient Localized Adaptation of Neural Weather Forecasting: A Case Study in the MENA Region.
CoRR, 2024

iSeg: An Iterative Refinement-based Framework for Training-free Segmentation.
CoRR, 2024

Connecting Dreams with Visual Brainstorming Instruction.
CoRR, 2024

GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model.
CoRR, 2024

Open-Vocabulary Temporal Action Localization using Multimodal Guidance.
CoRR, 2024

VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs.
CoRR, 2024

VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding.
CoRR, 2024

Towards Evaluating the Robustness of Visual State Space Models.
CoRR, 2024

On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models.
CoRR, 2024

Multi-Granularity Language-Guided Multi-Object Tracking.
CoRR, 2024

Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation.
CoRR, 2024

Dual Hyperspectral Mamba for Efficient Spectral Compressive Imaging.
CoRR, 2024

How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs.
CoRR, 2024

Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration.
CoRR, 2024

Efficient Video Object Segmentation via Modulated Cross-Attention Memory.
CoRR, 2024

VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding.
CoRR, 2024

AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation.
CoRR, 2024

ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes.
CoRR, 2024

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT.
CoRR, 2024

PALO: A Polyglot Large Multimodal Model for 5B People.
CoRR, 2024

Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding.
CoRR, 2024

Learnable weight initialization for volumetric medical image segmentation.
Artif. Intell. Medicine, 2024

Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

DB-SAM: Delving into High Quality Universal Medical Image Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

BAPLe: Backdoor Attacks on Medical Foundational Models Using Prompt Learning.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

Language Guided Domain Generalized Medical Image Segmentation.
Proceedings of the IEEE International Symposium on Biomedical Imaging, 2024

Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Bidirectional Reciprocative Information Communication for Few-Shot Semantic Segmentation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Modulate Your Spectrum in Self-Supervised Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Sentence-level Prompts Benefit Composed Image Retrieval.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

BiMediX: Bilingual Medical Mixture of Experts LLM.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Learning Camouflaged Object Detection from Noisy Pseudo Label.
Proceedings of the Computer Vision - ECCV 2024, 2024

CONDA: Condensed Deep Association Learning for Co-salient Object Detection.
Proceedings of the Computer Vision - ECCV 2024, 2024

Continual Learning and Unknown Object Discovery in 3D Scenes via Self-distillation.
Proceedings of the Computer Vision - ECCV 2024, 2024

SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Composed Video Retrieval via Enriched Context and Discriminative Embeddings.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

GLaMM: Pixel Grounding Large Multimodal Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

GeoChat: Grounded Large Vision-Language Model for Remote Sensing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Visual-Augmented Dynamic Semantic Prototype for Generative Zero-Shot Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Cross-Modal Self-Training: Aligning Images and Pointclouds to learn Classification without Labels.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Progressive Semantic-Guided Vision Transformer for Zero-Shot Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

XrayGPT: Chest Radiographs Summarization using Large Medical Vision-Language Models.
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, 2024

S3A: Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Semi-supervised Open-World Object Detection.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Generative Multi-Label Zero-Shot Learning.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Guest Editorial Introduction to the Special Section on Transformer Models in Vision.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges.
Mach. Intell. Res., October, 2023

Transformers in medical imaging: A survey.
Medical Image Anal., August, 2023

CyTran: A cycle-consistent transformer with multi-level consistency for non-contrast to contrast CT translation.
Neurocomputing, June, 2023

Stylized Adversarial Defense.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

Visual Object Tracking With Discriminative Filters and Siamese Networks: A Survey and Outlook.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

Transformers in Remote Sensing: A Survey.
Remote. Sens., April, 2023

SipMaskv2: Enhanced Fast Image and Video Instance Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

SSMTL++: Revisiting self-supervised multi-task learning for video anomaly detection.
Comput. Vis. Image Underst., March, 2023

Learning Enriched Features for Fast Image Restoration and Enhancement.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models.
CoRR, 2023

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models.
CoRR, 2023

Enhancing Novel Object Detection via Cooperative Foundational Models.
CoRR, 2023

Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization.
CoRR, 2023

Videoprompter: an ensemble of foundational models for zero-shot video understanding.
CoRR, 2023

Improving Underwater Visual Tracking With a Large Scale Dataset and Image Enhancement.
CoRR, 2023

Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment.
CoRR, 2023

Foundational Models Defining a New Era in Vision: A Survey and Outlook.
CoRR, 2023

PromptIR: Prompting for All-in-One Blind Image Restoration.
CoRR, 2023

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.
CoRR, 2023

Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models.
CoRR, 2023

DFormer: Diffusion-guided Transformer for Universal Image Segmentation.
CoRR, 2023

Video Instance Segmentation in an Open-World.
CoRR, 2023

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing.
CoRR, 2023

LEAPS: End-to-End One-Step Person Search With Learnable Proposals.
CoRR, 2023

Multimodal Multi-Head Convolutional Attention with Various Kernel Sizes for Medical Image Super-Resolution.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

SAT: Scale-Augmented Transformer for Person Search.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Salient Mask-Guided Vision Transformer for Fine-Grained Classification.
Proceedings of the 18th International Joint Conference on Computer Vision, 2023

Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

PromptIR: Prompting for All-in-One Image Restoration.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Cal-DETR: Calibrated Detection Transformer.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

3D Indoor Instance Segmentation in an Open-World.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Accelerated MRI Reconstruction via Dynamic Deformable Alignment Based Transformer.
Proceedings of the Machine Learning in Medical Imaging - 14th International Workshop, 2023

Distilling Local Texture Features for Colorectal Tissue Classification in Low Data Regimes.
Proceedings of the Machine Learning in Medical Imaging - 14th International Workshop, 2023

3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

A Spatial-Temporal Deformable Attention Based Framework for Breast Lesion Detection in Videos.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Cross-Modulated Few-Shot Image Generation for Colorectal Tissue Classification.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Frequency Domain Adversarial Training for Robust Volumetric Medical Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Boosting Adversarial Transferability using Dynamic Cues.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Generative Multiplane Neural Radiance for 3D-Aware Image Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Self-regulating Prompts: Foundational Model Adaptation without Forgetting.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

3D Instance Segmentation via Enhanced Spatial and Semantic Supervision.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Arabic Mini-ClimateGPT: A Climate Change and Sustainability Tailored Arabic LLM.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Novel Category Discovery.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Fine-tuned CLIP Models are Efficient Video Learners.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

D³Former: Debiased Dual Distilled Transformer for Incremental Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

3D-Aware Multi-Class Image-to-Image Translation with NeRFs.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MaPLe: Multi-modal Prompt Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Burstormer: Burst Image Restoration and Enhancement Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Person Image Synthesis via Denoising Diffusion Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Fast Video Instance Segmentation via Recurrent Encoder-Based Transformers.
Proceedings of the Computer Analysis of Images and Patterns, 2023

Unsupervised Landmark Discovery Using Consistency-Guided Bottleneck.
Proceedings of the 34th British Machine Vision Conference 2023, 2023

2022
Transformers in Vision: A Survey.
ACM Comput. Surv., January, 2022

Distilled Siamese Networks for Visual Tracking.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Incremental Object Detection via Meta-Learning.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

A Background-Agnostic Framework With Adversarial Training for Abnormal Event Detection in Video.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Towards Partial Supervision for Generic Object Counting in Natural Scenes.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

From Handcrafted to Deep Features for Pedestrian Detection: A Survey.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Guidance Through Surrogate: Towards a Generic Diagnostic Attack.
CoRR, 2022

Lightning Fast Video Anomaly Detection via Adversarial Knowledge Distillation.
CoRR, 2022

CLIP model is an Efficient Continual Learner.
CoRR, 2022

AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility.
CoRR, 2022

Multi-scale Feature Aggregation for Crowd Counting.
CoRR, 2022

3D Vision with Transformers: A Survey.
CoRR, 2022

Self-Supervised Video Object Segmentation via Cutout Prediction and Tagging.
CoRR, 2022

COCOA: Context-Conditional Adaptation for Recognizing Unseen Classes in Unseen Domains.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

An Investigation into Whitening Loss for Self-supervised Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection.
Proceedings of the 4th ACM International Conference on Multimedia in Asia, 2022

On the Robustness of 3D Object Detectors.
Proceedings of the 4th ACM International Conference on Multimedia in Asia, 2022

Learning a Dynamic Cross-Modal Network for Multispectral Pedestrian Detection.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

SepTr: Separable Transformer for Audio Spectrogram Processing.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

On Improving Adversarial Transferability of Vision Transformers.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer.
Proceedings of the Computer Vision - ECCV 2022, 2022

OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Class-Agnostic Object Detection with Multi-modal Transformer.
Proceedings of the Computer Vision - ECCV 2022, 2022

Dense Gaussian Processes for Few-Shot Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

DoodleFormer: Creative Sketch Drawing with Transformers.
Proceedings of the Computer Vision - ECCV 2022, 2022

Restormer: Efficient Transformer for High-Resolution Image Restoration.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Spatio-temporal Relation Modeling for Few-shot Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Self-Supervised Predictive Convolutional Attentive Block for Anomaly Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Self-supervised Video Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Adaptive Feature Consolidation Network for Burst Super-Resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Energy-based Latent Aligner for Incremental Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

OW-DETR: Open-world Detection Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Burst Image Restoration and Enhancement.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

PSTR: End-to-End One-Step Person Search With Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

UBnormal: New Benchmark for Supervised Open-Set Video Anomaly Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

Self-distilled Vision Transformer for Domain Generalization.
Proceedings of the Computer Vision - ACCV 2022, 2022

PS-ARM: An End-to-End Attention-Aware Relation Mixer Network for Person Search.
Proceedings of the Computer Vision - ACCV 2022, 2022

2021
Mask-Guided Attention Network and Occlusion-Sensitive Hard Example Mining for Occluded Pedestrian Detection.
IEEE Trans. Image Process., 2021

Compact Deep Color Features for Remote Sensing Scene Classification.
Neural Process. Lett., 2021

Airline ticket price and demand prediction: A survey.
J. King Saud Univ. Comput. Inf. Sci., 2021

Learning digital camera pipeline for extreme low-light imaging.
Neurocomputing, 2021

Multi-modal Transformers Excel at Class-agnostic Object Detection.
CoRR, 2021

CyTran: Cycle-Consistent Transformers for Non-Contrast to Contrast CT Translation.
CoRR, 2021

Context-Conditional Adaptation for Recognizing Unseen Classes in Unseen Domains.
CoRR, 2021

MineGAN++: Mining Generative Models for Efficient Knowledge Transfer to Limited Data Domains.
CoRR, 2021

Deep Gaussian Processes for Few-Shot Segmentation.
CoRR, 2021

Low Light Image Enhancement via Global and Local Context Modeling.
CoRR, 2021

PSC-Net: learning part spatial co-occurrence for occluded pedestrian detection.
Sci. China Inf. Sci., 2021

Intriguing Properties of Vision Transformers.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

The Ninth Visual Object Tracking VOT2021 Challenge Results.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Orthogonal Projection Loss.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

On Generating Transferable Targeted Perturbations.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Discriminative Region-based Multi-Label Zero-Shot Learning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

D2-Net: Weakly-Supervised Action Localization via Discriminative Embeddings and Denoised Activations.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Handwriting Transformers.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Multi-Stage Progressive Image Restoration.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Towards Open World Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Learning To Fuse Asymmetric Feature Maps in Siamese Trackers.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Anomaly Detection in Video via Self-Supervised and Multi-Task Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Meta-learning the Learning Trends Shared Across Tasks.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Self-supervised Knowledge Distillation for Few-shot Learning.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Mode-Guided Feature Augmentation for Domain Generalization.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Rich Semantics Improve Few-Shot Learning.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020
Confidence Propagation through CNNs for Guided Sparse Depth Regression.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

GPS-level accurate camera localization with HorizonNet.
J. Field Robotics, 2020

A Scene-Agnostic Framework with Adversarial Training for Abnormal Event Detection in Video.
CoRR, 2020

Incremental Object Detection via Meta-Learning.
CoRR, 2020

PSC-Net: Learning Part Spatial Co-occurrence for Occluded Pedestrian Detection.
CoRR, 2020

Filling the Gaps in Atrous Convolution: Semantic Segmentation With a Better Context.
IEEE Access, 2020

Learning Enriched Features for Real Image Restoration and Enhancement.
Proceedings of the Computer Vision - ECCV 2020, 2020

Count- and Similarity-Aware R-CNN for Pedestrian Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

Fixing Localization Errors to Improve Image Classification.
Proceedings of the Computer Vision - ECCV 2020, 2020

Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification.
Proceedings of the Computer Vision - ECCV 2020, 2020

SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020

CycleISP: Real Image Restoration via Improved Data Synthesis.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning Human-Object Interaction Detection Using Interaction Points.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Semi-Supervised Learning for Few-Shot Image-to-Image Translation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

MineGAN: Effective Knowledge Transfer From GANs to Target Domains With Few Images.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning Fast and Robust Target Models for Video Object Segmentation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

iTAML: An Incremental Task-Agnostic Meta-learning Approach.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

A Self-supervised Approach for Adversarial Robustness.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

AnimalWeb: A Large-Scale Hierarchical Dataset of Annotated Animal Faces.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

D2Det: Towards High Quality Object Detection and Instance Segmentation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Any-Shot Object Detection.
Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020

Synthesizing the Unseen for Zero-Shot Object Detection.
Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020

Fine-Grained Recognition: Accounting for Subtle Differences between Similar Classes.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Synthetic Data Generation for End-to-End Thermal Infrared Tracking.
IEEE Trans. Image Process., 2019

Deep motion and appearance cues for visual tracking.
Pattern Recognit. Lett., 2019

Random Path Selection for Incremental Learning.
CoRR, 2019

Discriminative Online Learning for Fast Video Object Segmentation.
CoRR, 2019

Random Path Selection for Continual Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Cross-Domain Transferability of Adversarial Perturbations.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Multi-Modal Fusion for End-to-End RGB-T Tracking.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

The Seventh Visual Object Tracking VOT2019 Challenge Results.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Learning the Model Update for Siamese Trackers.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Deep Contextual Attention for Human-Object Interaction Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Learning Rich Features at High-Speed for Single-Shot Object Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Mask-Guided Attention Network for Occluded Pedestrian Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Enriched Feature Guided Refinement Network for Object Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

3C-Net: Category Count and Center Loss for Weakly-Supervised Action Localization.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

iSAID: A Large-scale Dataset for Instance Segmentation in Aerial Images.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Efficient Featurized Image Pyramid Network for Single Shot Detector.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Out-Of-Distribution Detection for Generalized Zero-Shot Action Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

A Generative Appearance Model for End-To-End Video Object Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Object-Centric Auto-Encoders and Dummy Anomalies for Abnormal Event Detection in Video.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

ATOM: Accurate Tracking by Overlap Maximization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Object Counting and Instance Segmentation With Image-Level Supervision.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Multi-stream Convolutional Networks for Indoor Scene Recognition.
Proceedings of the Computer Analysis of Images and Patterns, 2019

2018
Beyond Eleven Color Names for Image Understanding.
Mach. Vis. Appl., 2018

Scale coding bag of deep features for human attribute and action recognition.
Mach. Vis. Appl., 2018

Countering Bias in Tracking Evaluations.
Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018), 2018

HorizonNet for visual terrain navigation.
Proceedings of the IEEE International Conference on Image Processing, 2018

Two-Stream Part-Based Deep Representation for Human Attribute Recognition.
Proceedings of the 2018 International Conference on Biometrics, 2018

The Sixth Visual Object Tracking VOT2018 Challenge Results.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

On the Optimization of Advanced DCF-Trackers.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Unveiling the Power of Deep Tracking.
Proceedings of the Computer Vision - ECCV 2018, 2018

Density Adaptive Point Set Registration.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Propagating Confidences through CNNs for Sparse Data Regression.
Proceedings of the British Machine Vision Conference 2018, 2018

Combining Local and Global Models for Robust Re-detection.
Proceedings of the 15th IEEE International Conference on Advanced Video and Signal Based Surveillance, 2018

2017
Discriminative Scale Space Tracking.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

DCCO: Towards Deformable Continuous Convolution Operators.
CoRR, 2017

Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification.
CoRR, 2017

Top-Down Deep Appearance Attention for Action Recognition.
Proceedings of the Image Analysis - 20th Scandinavian Conference, 2017

TEX-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition.
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

The Visual Object Tracking VOT2017 Challenge Results.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

ECO: Efficient Convolution Operators for Tracking.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Deep Projective 3D Semantic Segmentation.
Proceedings of the Computer Analysis of Images and Patterns, 2017

DCCO: Towards Deformable Continuous Convolution Operators for Visual Tracking.
Proceedings of the Computer Analysis of Images and Patterns, 2017

Ellipse Detection for Visual Cyclists Analysis "In the Wild".
Proceedings of the Computer Analysis of Images and Patterns, 2017

2016
Combining Holistic and Part-based Deep Representations for Computational Painting Categorization.
Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

Combining Visual Tracking and Person Detection for Long Term Tracking on a UAV.
Proceedings of the Advances in Visual Computing - 12th International Symposium, 2016

Deep motion features for visual tracking.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Aligning the dissimilar: A probabilistic method for feature-based point set registration.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

The Visual Object Tracking VOT2016 Challenge Results.
Proceedings of the Computer Vision - ECCV 2016 Workshops, 2016

Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking.
Proceedings of the Computer Vision - ECCV 2016, 2016

A Probabilistic Framework for Color-Based Point Set Registration.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Recognizing Actions Through Action-Specific Person Detection.
IEEE Trans. Image Process., 2015

Compact color-texture description for texture classification.
Pattern Recognit. Lett., 2015

Deep Semantic Pyramids for Human Attributes and Action Recognition.
Proceedings of the Image Analysis - 19th Scandinavian Conference, 2015

Coloring Channel Representations for Visual Tracking.
Proceedings of the Image Analysis - 19th Scandinavian Conference, 2015

Convolutional Features for Correlation Filter Based Visual Tracking.
Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop, 2015

Learning Spatially Regularized Correlation Filters for Visual Tracking.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

An Overview of Color Name Applications in Computer Vision.
Proceedings of the Computational Color Imaging - 5th International Workshop, 2015

2014
Semantic Pyramids for Gender and Action Recognition.
IEEE Trans. Image Process., 2014

Painting-91: a large scale database for computational painting categorization.
Mach. Vis. Appl., 2014

Scale Coding Bag-of-Words for Action Recognition.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

A Low-Level Active Vision Framework for Collaborative Unmanned Aircraft Systems.
Proceedings of the Computer Vision - ECCV 2014 Workshops, 2014

Adaptive Color Attributes for Real-Time Visual Tracking.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Accurate Scale Estimation for Robust Visual Tracking.
Proceedings of the British Machine Vision Conference, 2014

2013
Interactive Visual and Semantic Image Retrieval.
Proceedings of the Multimodal Interaction in Image and Video Applications, 2013

Coloring Action Recognition in Still Images.
Int. J. Comput. Vis., 2013

Discriminative Color Descriptors.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Fusing Color and Shape for Bag-of-Words Based Object Recognition.
Proceedings of the Computational Color Imaging - 4th International Workshop, 2013

Evaluating the Impact of Color on Texture Recognition.
Proceedings of the Computer Analysis of Images and Patterns, 2013

2012
Discriminative compact pyramids for object and scene recognition.
Pattern Recognit., 2012

Modulating Shape Features by Color Attention for Object Recognition.
Int. J. Comput. Vis., 2012

Color attributes for object detection.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
Portmanteau Vocabularies for Multi-Cue Image Representation.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

2010
The Impact of Color on Bag-of-Words Based Object Recognition.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

2009
Top-down color attention for object recognition.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

