Qi Wu

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-Training Framework.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models.

Gengze Zhou

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Context-I2W: Mapping Images to Context-Dependent Words for Accurate Zero-Shot Composed Image Retrieval.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Augmented Commonsense Knowledge for Remote Object Grounding.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

WebVLN: Vision-and-Language Navigation on Websites.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Data Hiding With Deep Learning: A Survey Unifying Digital Watermarking and Steganography.

IEEE Trans. Comput. Soc. Syst., December, 2023

Medical visual question answering: A survey.

Artif. Intell. Medicine, September, 2023

HOP+: History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation.

IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

Multi-Granularity Aggregation Transformer for Joint Video-Audio-Text Representation Learning.

IEEE Trans. Circuits Syst. Video Technol., June, 2023

A Proposal-Free One-Stage Framework for Referring Expression Comprehension and Generation via Dense Cross-Attention.

IEEE Trans. Multim., 2023

Rethinking and Improving Feature Pyramids for One-Stage Referring Expression Comprehension.

IEEE Trans. Image Process., 2023

Weakly-Supervised 3D Spatial Reasoning for Text-Based Visual Question Answering.

IEEE Trans. Image Process., 2023

Subject-Oriented Video Captioning.

CoRR, 2023

Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service.

CoRR, 2023

Improving Online Source-free Domain Adaptation for Object Detection by Unsupervised Data Acquisition.

CoRR, 2023

Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search.

CoRR, 2023

SwitchGPT: Adapting Large Language Models for Non-Text Outputs.

Xinyu Wang

Bohan Zhuang

CoRR, 2023

S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning.

CoRR, 2023

Likelihood-Based Text-to-Image Evaluation with Patch-Level Perceptual and Semantic Credit Assignment.

CoRR, 2023

AerialVLN: Vision-and-Language Navigation for UAVs.

CoRR, 2023

Attention Mechanisms in Medical Image Segmentation: A Survey.

CoRR, 2023

S4M: Generating Radiology Reports by A Single Model for Multiple Body Parts.

CoRR, 2023

LoRA: A Logical Reasoning Augmented Dataset for Visual Question Answering.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Mind the Gap: Improving Success Rate of Vision-and-Language Navigation by Revisiting Oracle Success Routes.

Chongyang Zhao

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Multi-modal Adapter for Medical Vision-and-Language Learning.

Proceedings of the Machine Learning in Medical Imaging - 14th International Workshop, 2023

BHSD: A 3D Multi-class Brain Hemorrhage Segmentation Dataset.

Proceedings of the Machine Learning in Medical Imaging - 14th International Workshop, 2023

PLMVQA: Applying Pseudo Labels for Medical Visual Question Answering with Limited Data.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023 Workshops, 2023

MedIM: Boost Medical Image Representation via Radiology Report-Guided Masking.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Unpaired Cross-Modal Interaction Learning for COVID-19 Segmentation on Limited CT Images.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Scaling Data Generation in Vision-and-Language Navigation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ShapeScaffolder: Structure-Aware 3D Shape Generation from Text.

Xi Tian

Yong-Liang Yang

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation.

Yanyuan Qiao

Zheng Yu

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

March in Chat: Interactive Prompting for Remote Embodied Referring Expression.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

AerialVLN: Vision-and-Language Navigation for UAVs.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Identity-Consistent Aggregation for Video Object Detection.

Chaorui Deng

Da Chen

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Memory-efficient Temporal Moment Localization in Long Videos.

Edison Marrese-Taylor

Basura Fernando

Hiroya Takamura

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

S³C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning to Dub Movies via Hierarchical Prosody Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Digging out Discrimination Information from Generated Samples for Robust Visual Question Answering.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022

Visual Question Answering - From Theory to Application

Advances in Computer Vision and Pattern Recognition, Springer, ISBN: 978-981-19-0963-4, 2022

Robust Learning From Noisy Web Images Via Data Purification for Fine-Grained Recognition.

IEEE Trans. Multim., 2022

Co-LDL: A Co-Training-Based Label Distribution Learning Method for Tackling Label Noise.

IEEE Trans. Multim., 2022

Show, Price and Negotiate: A Negotiator With Online Value Look-Ahead.

IEEE Trans. Multim., 2022

Structured Multimodal Attentions for TextVQA.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Visual Grounding Via Accumulated Attention.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering.

CoRR, 2022

ClusTR: Exploring Efficient Self-attention via Clustering for Vision Transformers.

CoRR, 2022

Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information.

CoRR, 2022

HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation.

CoRR, 2022

ForeSI: Success-Aware Visual Navigation Agent.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

Learning Distinct and Representative Modes for Image Captioning.

Qi Chen

Chaorui Deng

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Diagnosing Vision-and-Language Navigation: What Really Matters.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

UniMiSS: Universal Medical Self-supervised Learning via Breaking Dimensionality Barrier.

Proceedings of the Computer Vision - ECCV 2022, 2022

A Simple and Robust Correlation Filtering Method for Text-Based Person Search.

Proceedings of the Computer Vision - ECCV 2022, 2022

HOP: History-and-Order Aware Pretraining for Vision-and-Language Navigation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Maintaining Reasoning Consistency in Compositional Visual Question Answering.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

V2C: Visual Voice Cloning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Enhancing Person Synthesis in Complex Scenes via Intrinsic and Contextual Structure Modeling.

Xi Tian

Yongliang Yang

Proceedings of the 33rd British Machine Vision Conference 2022, 2022

Program Generation from Diverse Video Demonstrations.

Proceedings of the 33rd British Machine Vision Conference 2022, 2022

Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions.

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Learning the Dynamics of Visual Relational Reasoning via Reinforced Path Routing.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Referring Expression Comprehension: A Survey of Methods and Datasets.

Yanyuan Qiao

Chaorui Deng

IEEE Trans. Multim., 2021

Learning Dual Encoding Model for Adaptive Visual Understanding in Visual Dialogue.

IEEE Trans. Image Process., 2021

Language-Guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning.

IEEE Trans. Circuits Syst. Video Technol., 2021

Image editing with varying intensities of processing.

Comput. Vis. Image Underst., 2021

LocFormer: Enabling Transformers to Perform Temporal Moment Localization on Long Untrimmed Videos With a Feature Sampling Approach.

Edison Marrese-Taylor

Basura Fernando

Hiroya Takamura

CoRR, 2021

Unified 2D and 3D Pre-training for Medical Image classification and Segmentation.

CoRR, 2021

Memory Regulation and Alignment toward Generalizer RGB-Infrared Person.

CoRR, 2021

Data Hiding with Deep Learning: A Survey Unifying Digital Watermarking and Steganography.

CoRR, 2021

Know What and Know Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation.

CoRR, 2021

Learning for Visual Navigation by Imagining the Success.

CoRR, 2021

Multi-intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline.

CoRR, 2021

Optimistic Agent: Accurate Graph-Based Value Estimation for More Successful Visual Navigation.

Ehsan Abbasnejad

Javen Shi

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Debiased Visual Question Answering from Feature and Sample Perspectives.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

R-GAN: Exploring Human-like Way for Reasonable Text-to-Image Synthesis via Generative Adversarial Networks.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Neighbor-view Enhanced Model for Vision and Language Navigation.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Chop Chop BERT: Visual Question Answering by Chopping VisualBERT's Heads.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

The Road to Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Jo-SRC: A Contrastive Approach for Combating Noisy Labels.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Towards Accurate Text-Based Image Captioning With Content Diversity Exploration.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

VLN BERT: A Recurrent Vision-and-Language BERT for Navigation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Sketch, Ground, and Refine: Top-Down Dense Video Captioning.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Confidence-aware Non-repetitive Multimodal Transformers for TextCaps.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

How to Train Your Agent to Read and Write.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Reasoning on the Relation: Enhancing Visual Representation for Visual Question Answering and Cross-Modal Retrieval.

IEEE Trans. Multim., 2020

Scripted Video Generation With a Bottom-Up Generative Adversarial Network.

IEEE Trans. Image Process., 2020

Image and Sentence Matching via Semantic Concepts and Order Learning.

IEEE Trans. Pattern Anal. Mach. Intell., 2020

Semantics for Robotic Mapping, Perception and Interaction: A Survey.

Found. Trends Robotics, 2020

A Recurrent Vision-and-Language BERT for Navigation.

CoRR, 2020

CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation.

CoRR, 2020

Data-driven Meta-set Based Fine-Grained Visual Classification.

CoRR, 2020

Utilising Prior Knowledge for Visual Navigation: Distil and Adapt.

Ehsan Abbasnejad

Javen Shi

CoRR, 2020

Language and Visual Entity Relationship Graph for Agent Navigation.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Data-driven Meta-set Based Fine-Grained Visual Recognition.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Cascade Reasoning Network for Text-based Visual Question Answering.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Visual-Semantic Graph Matching for Visual Grounding.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Medical Data Inquiry Using a Question Answering Model.

Proceedings of the 17th IEEE International Symposium on Biomedical Imaging, 2020

Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Sub-Instruction Aware Vision-and-Language Navigation.

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Soft Expert Reward Learning for Vision-and-Language Navigation.

Hu Wang

Chunhua Shen

Proceedings of the Computer Vision - ECCV 2020, 2020

Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering.

Proceedings of the Computer Vision - ECCV 2020, 2020

Object-and-Action Aware Model for Visual Language Navigation.

Proceedings of the Computer Vision - ECCV 2020, 2020

Length-Controllable Image Captioning.

Proceedings of the Computer Vision - ECCV 2020, 2020

REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Intelligent Home 3D: Automatic 3D-House Design From Linguistic Descriptions Only.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Say As You Wish: Fine-Grained Control of Image Caption Generation With Abstract Scene Graphs.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Gold Seeker: Information Gain From Policy Distributions for Goal-Oriented Vision-and-Language Reasoning.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

AIML at VQA-Med 2020: Knowledge Inference via a Skeleton-based Sentence Mapping Approach for Medical Domain Visual Question Answering.

Proceedings of the Working Notes of CLEF 2020, 2020

Modular Graph Attention Network for Complex Visual Relational Reasoning.

Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020

Overcoming Language Priors in VQA via Decomposed Linguistic Representations.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Attend and Imagine: Multi-Label Image Classification With Visual Attention and Recurrent Neural Networks.

IEEE Trans. Multim., 2019

Heritage image annotation via collective knowledge.

Pattern Recognit., 2019

Medical image classification using synergic deep learning.

Medical Image Anal., 2019

Integrating Temporal and Spatial Attentions for VATEX Video Captioning Challenge 2019.

CoRR, 2019

Show, Price and Negotiate: A Hierarchical Attention Recurrent Visual Negotiator.

CoRR, 2019

RERERE: Remote Embodied Referring Expressions in Real indoor Environments.

CoRR, 2019

An Attribute-Based High-Level Image Representation for Scene Classification.

Wenhua Liu

Yidong Li

IEEE Access, 2019

Watch, Reason and Code: Learning to Represent Videos Using Program.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Mind Your Neighbours: Image Annotation With Metadata Neighbourhood Graph Co-Attention Networks.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

What's to Know? Uncertainty as a Guide to Asking Goal-Oriented Questions.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Multilabel Image Classification With Regional Latent Semantic Dependencies.

IEEE Trans. Multim., 2018

Image Captioning and Visual Question Answering Based on Attributes and External Knowledge.

IEEE Trans. Pattern Anal. Mach. Intell., 2018

FVQA: Fact-Based Visual Question Answering.

IEEE Trans. Pattern Anal. Mach. Intell., 2018

An Active Information Seeking Model for Goal-oriented Vision-and-Language Tasks.

CoRR, 2018

Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks.

CoRR, 2018

Skin Lesion Classification in Dermoscopy Images Using Synergic Deep Learning.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2018, 2018

Goal-Oriented Visual Question Generation via Intermediate Rewards.

Proceedings of the Computer Vision - ECCV 2018, 2018

Parallel Attention: A Unified Framework for Visual Object Discovery Through Dialogs and Queries.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Are You Talking to Me? Reasoned Visual Dialog Generation Through Adversarial Learning.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Learning Semantic Concepts and Order for Image and Sentence Matching.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Visual Question Answering With Memory-Augmented Networks.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Connecting Language and Vision to Actions.

Peter Anderson

Abhishek Das

Proceedings of ACL 2018, Melbourne, Australia, July 15-20, 2018, Tutorial Abstracts, 2018

HCVRD: A Benchmark for Large-Scale Human-Centered Visual Relationship Detection.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Kill Two Birds With One Stone: Weakly-Supervised Neural Network for Image Annotation and Tag Refinement.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Visual Question Answering: A Tutorial.

Damien Teney

Anton van den Hengel

IEEE Signal Process. Mag., 2017

Visual question answering: A survey of methods and datasets.

Comput. Vis. Image Underst., 2017

Learning Semantic Concepts and Order for Image and Sentence Matching.

Yan Huang

Liang Wang

CoRR, 2017

Asking the Difficult Questions: Goal-Oriented Visual Question Generation via Intermediate Rewards.

CoRR, 2017

Care about you: towards large-scale human-centric visual relationship detection.

CoRR, 2017

Classification of Medical Images and Illustrations in the Biomedical Literature Using Synergic Deep Learning.

CoRR, 2017

Explicit Knowledge-based Reasoning for Visual Question Answering.

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Historical Image Annotation by Exploring the Tag Relevance.

Proceedings of the 4th IAPR Asian Conference on Pattern Recognition, 2017

2016

Multi-Label Image Classification with Regional Latent Semantic Dependencies.

CoRR, 2016

Image Captioning and Visual Question Answering Based on Attributes and Their Related External Knowledge.

CoRR, 2016

Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources.

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

What Value Do Explicit High Level Concepts Have in Vision to Language Problems?

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015

Modelling visual objects regardless of depictive style.

PhD thesis, 2015

Cross-depiction problem: Recognition and synthesis of photographs and artwork.

Comput. Vis. Media, 2015

Image Captioning with an Intermediate Attributes Layer.

CoRR, 2015

The Cross-Depiction Problem: Computer Vision Algorithms for Recognising Objects in Artwork and in Photographs.

CoRR, 2015

Beyond Photo-Domain Object Recognition: Benchmarks for the Cross-Depiction Problem.

Hongping Cai

Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop, 2015

2014

Learning Graphs to Model Visual Objects across Different Depictive Styles.

Hongping Cai

Proceedings of the Computer Vision - ECCV 2014, 2014

2013

Modelling Visual Objects Invariant to Depictive Style.

Proceedings of the British Machine Vision Conference, 2013

2012

Prime Shapes in Natural Images.
