Xing Xu

ORCID: 0000-0001-5685-3123

Affiliations:
  • University of Electronic Science and Technology of China, School of Computer Science and Engineering, Center for Future Media, Chengdu, China
  • Kyushu University, Japan (PhD 2015)


According to our database, Xing Xu authored at least 222 papers between 2013 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
AnoOnly: Semi-supervised anomaly detection with the only loss on anomalies.
Expert Syst. Appl., 2025

2024
Fuzzy Multimodal Graph Reasoning for Human-Centric Instructional Video Grounding.
IEEE Trans. Fuzzy Syst., September, 2024

CR-FPN: channel relation feature pyramid network for object detection.
Wirel. Networks, July, 2024

Representation separation adversarial networks for cross-modal retrieval.
Wirel. Networks, July, 2024

Cross-Modal Attention Preservation with Self-Contrastive Learning for Composed Query-Based Image Retrieval.
ACM Trans. Multim. Comput. Commun. Appl., June, 2024

Complex Relation Embedding for Scene Graph Generation.
IEEE Trans. Neural Networks Learn. Syst., June, 2024

SDN: Semantic Decoupling Network for Temporal Language Grounding.
IEEE Trans. Neural Networks Learn. Syst., May, 2024

Multi-Grained Attention Network With Mutual Exclusion for Composed Query-Based Image Retrieval.
IEEE Trans. Circuits Syst. Video Technol., April, 2024

Relation-Aggregated Cross-Graph Correlation Learning for Fine-Grained Image-Text Retrieval.
IEEE Trans. Neural Networks Learn. Syst., February, 2024

Learning Relationship-Enhanced Semantic Graph for Fine-Grained Image-Text Matching.
IEEE Trans. Cybern., February, 2024

Ecarnet: enhanced clue-ambiguity reasoning network for multimodal fake news detection.
Multim. Syst., February, 2024

Runge-Kutta Guided Feature Augmentation for Few-Sample Learning.
IEEE Trans. Multim., 2024

Zero-Shot Video Moment Retrieval With Angular Reconstructive Text Embeddings.
IEEE Trans. Multim., 2024

Boosting Adversarial Training with Hardness-Guided Attack Strategy.
IEEE Trans. Multim., 2024

Semantics Disentangling for Cross-Modal Retrieval.
IEEE Trans. Image Process., 2024

Coreset Learning-Based Sparse Black-Box Adversarial Attack for Video Recognition.
IEEE Trans. Inf. Forensics Secur., 2024

VQ-Flow: Taming Normalizing Flows for Multi-Class Anomaly Detection via Hierarchical Vector Quantization.
CoRR, 2024

Multi-Scale Temporal Difference Transformer for Video-Text Retrieval.
CoRR, 2024

UGNCL: Uncertainty-Guided Noisy Correspondence Learning for Efficient Cross-Modal Matching.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Unsupervised Cross-Domain Image Retrieval with Semantic-Attended Mixture-of-Experts.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Enhanced Experts with Uncertainty-Aware Routing for Multimodal Sentiment Analysis.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Counterfactually Augmented Event Matching for De-biased Temporal Sentence Grounding.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

PTAN: Principal Token-aware Adjacent Network for Compositional Temporal Grounding.
Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024

Temporal Self-Paced Proposal Learning for Weakly-Supervised Video Moment Retrieval and Highlight Detection.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Domain Prompt Learning Framework for Real Image Dehazing.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Diverse Embedding Modeling with Adaptive Noise Filter for Text-based Person Retrieval.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

T-SciQ: Teaching Multimodal Chain-of-Thought Reasoning via Large Language Model Signals for Science Question Answering.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Adaptive Uncertainty-Based Learning for Text-Based Person Retrieval.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Composition-Aware Image Steganography Through Adversarial Self-Generated Supervision.
IEEE Trans. Neural Networks Learn. Syst., November, 2023

Less is Better: Exponential Loss for Cross-Modal Matching.
IEEE Trans. Circuits Syst. Video Technol., September, 2023

Hypercomplex context guided interaction modeling for scene graph generation.
Pattern Recognit., September, 2023

Category Alignment Adversarial Learning for Cross-Modal Retrieval.
IEEE Trans. Knowl. Data Eng., May, 2023

Quaternion Relation Embedding for Scene Graph Generation.
IEEE Trans. Multim., 2023

OMGH: Online Manifold-Guided Hashing for Flexible Cross-Modal Retrieval.
IEEE Trans. Multim., 2023

Self-Supervised Fine-Grained Cycle-Separation Network (FSCN) for Visual-Audio Separation.
IEEE Trans. Multim., 2023

Quaternion Representation Learning for cross-modal matching.
Knowl. Based Syst., 2023

TFUN: Trilinear Fusion Network for Ternary Image-Text Retrieval.
Inf. Fusion, 2023

IRPSM-net: Information retention pyramid stereo matching network.
Int. J. Comput. Sci. Math., 2023

BatchNorm-based Weakly Supervised Video Anomaly Detection.
CoRR, 2023

MSFlow: Multi-Scale Flow-based Framework for Unsupervised Anomaly Detection.
CoRR, 2023

MoCoSA: Momentum Contrast for Knowledge Graph Completion with Structure-Augmented Pre-trained Language Models.
CoRR, 2023

AnoOnly: Semi-Supervised Anomaly Detection without Loss on Normal Data.
CoRR, 2023

Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

DCEL: Deep Cross-modal Evidential Learning for Text-Based Person Retrieval.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Faster Video Moment Retrieval with Point-Level Supervision.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Taking a Part for the Whole: An Archetype-agnostic Framework for Voice-Face Association.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Joint Searching and Grounding: Multi-Granularity Video Content Retrieval.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Zero-shot Sketch-based Image Retrieval with Adaptive Balanced Discriminability and Generalizability.
Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, 2023

Multi-granularity Separation Network for Text-Based Person Retrieval with Bidirectional Refinement Regularization.
Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, 2023

Region-Aware Semantic Consistency for Unsupervised Domain-Adaptive Semantic Segmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Information Selection-based Domain Adaptation from Black-box Predictors.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Label-Semantic-Enhanced Online Hashing for Efficient Cross-modal Retrieval.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Progressive Event Alignment Network for Partial Relevant Video Retrieval.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Special Issue on Synthetic Media on the Web.
World Wide Web, 2022

Mind the Remainder: Taylor's Theorem View on Recurrent Neural Networks.
IEEE Trans. Neural Networks Learn. Syst., 2022

Cross-Modal Dynamic Networks for Video Moment Retrieval With Text Query.
IEEE Trans. Multim., 2022

View-Invariant Human Action Recognition Via View Transformation Network (VTN).
IEEE Trans. Multim., 2022

Semantic-Aligned Attention With Refining Feature Embedding for Few-Shot Image Classification.
IEEE Trans. Intell. Transp. Syst., 2022

Cognitive Memory-Guided AutoEncoder for Effective Intrusion Detection in Internet of Things.
IEEE Trans. Ind. Informatics, 2022

Med-BERT: A Pretraining Framework for Medical Records Named Entity Recognition.
IEEE Trans. Ind. Informatics, 2022

Learning Cross-Modal Common Representations by Private-Shared Subspaces Separation.
IEEE Trans. Cybern., 2022

Flow-Edge Guided Unsupervised Video Object Segmentation.
IEEE Trans. Circuits Syst. Video Technol., 2022

Action-Centric Relation Transformer Network for Video Question Answering.
IEEE Trans. Circuits Syst. Video Technol., 2022

Modeling Two-Stream Correspondence for Visual Sound Separation.
IEEE Trans. Circuits Syst. Video Technol., 2022

Joint Feature Synthesis and Embedding: Adversarial Cross-Modal Retrieval Revisited.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Universal Weighting Metric Learning for Cross-Modal Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Semantic guided knowledge graph for large-scale zero-shot learning.
J. Vis. Commun. Image Represent., 2022

Comprehensive Framework of Early and Late Fusion for Image-Sentence Retrieval.
IEEE Multim., 2022

Query-based black-box attack against medical image segmentation model.
Future Gener. Comput. Syst., 2022

Semantic Enhanced Knowledge Graph for Large-Scale Zero-Shot Learning.
CoRR, 2022

Thunder: Thumbnail based Fast Lightweight Image Denoising Network.
CoRR, 2022

I-WKNN: Fast-speed and high-accuracy WIFI positioning for intelligent sports stadiums.
Comput. Electr. Eng., 2022

Language-enhanced object reasoning networks for video moment retrieval with text query.
Comput. Electr. Eng., 2022

Learning discriminative representations via variational self-distillation for cross-view geo-localization.
Comput. Electr. Eng., 2022

Structure-Aware Semantic-Aligned Network for Universal Cross-Domain Retrieval.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Multimodal Disentanglement Variational AutoEncoders for Zero-Shot Cross-Modal Retrieval.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Prototype-based Selective Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Point to Rectangle Matching for Image Text Retrieval.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Rethinking Open-World Object Detection in Autonomous Driving Scenarios.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

ARRA: Absolute-Relative Ranking Attack against Image Retrieval.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

DHHN: Dual Hierarchical Hybrid Network for Weakly-Supervised Audio-Visual Video Parsing.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Selective Hypergraph Convolutional Networks for Skeleton-based Action Recognition.
Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

Accelerated Sign Hunter: A Sign-based Black-box Attack via Branch-Prune Strategy and Stabilized Hierarchical Search.
Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

X-HRNet: Towards Lightweight Human Pose Estimation with Spatially Unidimensional Self-Attention.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Instance-Level Semantic Alignment for Zero-Shot Cross-Modal Retrieval.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

GTLR: Graph-Based Transformer with Language Reconstruction for Video Paragraph Grounding.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Detach and Enhance: Learning Disentangled Cross-modal Latent Representation for Efficient Face-Voice Association and Matching.
Proceedings of the IEEE International Conference on Data Mining, 2022

Semi-supervised Video Paragraph Grounding with Contrastive Encoder.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

TVT: Three-Way Vision Transformer through Multi-Modal Hypersphere Learning for Zero-Shot Sketch-Based Image Retrieval.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Cross-Modal Hybrid Feature Fusion for Image-Sentence Matching.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Zero-shot Cross-modal Retrieval by Assembling AutoEncoder and Generative Adversarial Network.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Radial Graph Convolutional Network for Visual Question Generation.
IEEE Trans. Neural Networks Learn. Syst., 2021

Interclass-Relativity-Adaptive Metric Learning for Cross-Modal Matching and Beyond.
IEEE Trans. Multim., 2021

Exploiting Subspace Relation in Semantic Labels for Cross-Modal Hashing.
IEEE Trans. Knowl. Data Eng., 2021

Adversarial Attack Against Urban Scene Segmentation for Autonomous Vehicles.
IEEE Trans. Ind. Informatics, 2021

Deep Fuzzy Hashing Network for Efficient Image Retrieval.
IEEE Trans. Fuzzy Syst., 2021

Few-shot prototype alignment regularization network for document image layout segementation.
Pattern Recognit., 2021

5G-Network-Enabled Smart Ambulance: Architecture, Application, and Evaluation.
IEEE Netw., 2021

Toward Effective Intrusion Detection Using Log-Cosh Conditional Variational Autoencoder.
IEEE Internet Things J., 2021

Adaptive Square Attack: Fooling Autonomous Cars With Adversarial Traffic Signs.
IEEE Internet Things J., 2021

Heterogeneous data fusion for predicting mild cognitive impairment conversion.
Inf. Fusion, 2021

A Cognitive Memory-Augmented Network for Visual Anomaly Detection.
IEEE CAA J. Autom. Sinica, 2021

I-WKNN: Fast-Speed and High-Accuracy WIFI Positioning for Intelligent Stadiums.
CoRR, 2021

Hybrid Fusion with Intra- and Cross-Modality Attention for Image-Recipe Retrieval.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

Semantic Enhanced Cross-modal GAN for Zero-shot Learning.
Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

Extracting Useful Knowledge from Noisy Web Images via Data Purification for Fine-Grained Recognition.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Video Representation Learning with Graph Contrastive Augmentation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Meta Self-Paced Learning for Cross-Modal Matching.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Disentangled Representation Learning and Enhancement Network for Single Image De-Raining.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Relationship-Preserving Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Vision-guided Music Source Separation via a Fine-grained Cycle-Separation Network.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

CAA: Candidate-Aware Aggregation for Temporal Action Detection.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Learning Hierarchal Channel Attention for Fine-grained Visual Classification.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Multi-scale Dynamic Network for Temporal Action Detection.
Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

Cross-Modal Image-Recipe Retrieval via Intra- and Inter-Modality Hybrid Fusion.
Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

PoseGTAC: Graph Transformer Encoder-Decoder with Atrous Convolution for 3D Human Pose Estimation.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Feature Space Targeted Attacks by Statistic Alignment.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Graph Convolutional Hourglass Networks for Skeleton-Based Action Recognition.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Efficient Online Label Consistent Hashing for Large-Scale Cross-Modal Retrieval.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Attention-Based Relation Reasoning Network for Video-Text Retrieval.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Combine Early and Late Fusion Together: A Hybrid Fusion Framework for Image-Text Matching.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Multimodal Transformer Networks with Latent Interaction for Audio-Visual Event Localization.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

From General to Specific: Informative Scene Graph Generation via Balance Adjustment.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Partial Feature Selection and Alignment for Multi-Source Domain Adaptation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Enhancing Audio-Visual Association with Self-Supervised Curriculum Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Cross-Modal Attention With Semantic Consistence for Image-Text Matching.
IEEE Trans. Neural Networks Learn. Syst., 2020

Temporal Reasoning Graph for Activity Recognition.
IEEE Trans. Image Process., 2020

A Context Knowledge Map Guided Coarse-to-Fine Action Recognition.
IEEE Trans. Image Process., 2020

Ternary Adversarial Networks With Self-Supervision for Zero-Shot Cross-Modal Retrieval.
IEEE Trans. Cybern., 2020

A Novel Vehicle Tracking ID Switches Algorithm for Driving Recording Sensors.
Sensors, 2020

Similarity preserving feature generating networks for zero-shot learning.
Neurocomputing, 2020

Question-Led object attention for visual question answering.
Neurocomputing, 2020

Unified Binary Generative Adversarial Network for Image Retrieval and Compression.
Int. J. Comput. Vis., 2020

Cognitive visual anomaly detection with constrained latent representations for industrial inspection robot.
Appl. Soft Comput., 2020

Correlated Features Synthesis and Alignment for Zero-shot Cross-modal Retrieval.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

3D Self-Attention for Unsupervised Video Quantization.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Scene graph generation via multi-relation classification and cross-modal attention coordinator.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Graph-based variational auto-encoder for generalized zero-shot learning.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Self-supervised adversarial learning for cross-modal retrieval.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Multi-level expression guided attention network for referring expression comprehension.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Temporal Denoising Mask Synthesis Network for Learning Blind Video Temporal Consistency.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Learning Optimization-based Adversarial Perturbations for Attacking Sequential Recognition Models.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Hearing like Seeing: Improving Voice-Face Interactions and Associations via Adversarial Deep Semantic Matching Network.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

CC-LSTM: Cross and Conditional Long-Short Time Memory for Video Captioning.
Proceedings of the Pattern Recognition. ICPR International Workshops and Challenges, 2020

Ocean: A Dual Learning Approach For Generalized Zero-Shot Sketch-Based Image Retrieval.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Fooled by Imagination: Adversarial Attack to Image Captioning Via Perturbation in Complex Domain.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

What Machines See Is Not What They Get: Fooling Scene Text Recognition Models With Adversarial Text Images.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Universal Weighting Metric Learning for Cross-Modal Matching.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning Cross-Aligned Latent Embeddings for Zero-Shot Cross-Modal Retrieval.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Deep adversarial metric learning for cross-modal retrieval.
World Wide Web, 2019

Fusion by synthesizing: A multi-view deep neural network for zero-shot recognition.
Signal Process., 2019

Word-to-region attention network for visual question answering.
Multim. Tools Appl., 2019

Learning one-to-many stylised Chinese character transformation and generation by generative adversarial networks.
IET Image Process., 2019

Cooperative Cross-Stream Network for Discriminative Action Representation.
CoRR, 2019

Generative Reconstructive Hashing for Incomplete Video Analysis.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Learnable Aggregating Net with Diversity Learning for Video Question Answering.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Sequence-To-Sequence Domain Adaptation Network for Robust Text Image Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning to create multi-stylized Chinese character fonts by generative adversarial networks.
Proceedings of the ACM Turing Celebration Conference - China, 2019

Template-Based Math Word Problem Solvers with Recursive Neural Networks.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Deliberate Attention Networks for Image Captioning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Perceptual Pyramid Adversarial Networks for Text-to-Image Synthesis.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Recognition and Detection of Two-Person Interactive Actions Using Automatically Selected Skeleton Features.
IEEE Trans. Hum. Mach. Syst., 2018

One-shot learning based pattern transition map for action early recognition.
Signal Process., 2018

Zero-shot learning via discriminative representation extraction.
Pattern Recognit. Lett., 2018

Semantic binary coding for visual recognition via joint concept-attribute modelling.
Multim. Tools Appl., 2018

FDCNet: filtering deep convolutional network for marine organism classification.
Multim. Tools Appl., 2018

Domain Invariant Subspace Learning for Cross-Modal Retrieval.
Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

Cumulative Nets for Edge Detection.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Pseudo Transfer with Marginalized Corrupted Attribute for Zero-shot Learning.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Modal-adversarial Semantic Learning Network for Extendable Cross-modal Retrieval.
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

Dual Learning for Visual Question Generation.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Domain separation network for cross-modal retrieval.
Proceedings of the 10th International Conference on Internet Multimedia Computing and Service, 2018

Index and Retrieve Multimedia Data: Cross-Modal Hashing by Learning Subspace Relation.
Proceedings of the Database Systems for Advanced Applications, 2018

Deep Region Hashing for Generic Instance Search from Images.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Binary Generative Adversarial Networks for Image Retrieval.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Video Captioning With Attention-Based LSTM and Semantic Consistency.
IEEE Trans. Multim., 2017

Learning Discriminative Binary Codes for Large-scale Cross-modal Retrieval.
IEEE Trans. Image Process., 2017

Large-scale image retrieval with supervised sparse hashing.
Neurocomputing, 2017

Kernel based latent semantic sparse hashing for large-scale retrieval from heterogeneous data sources.
Neurocomputing, 2017

Exploiting score distribution for heterogenous feature fusion in image classification.
Neurocomputing, 2017

Supervised hashing with adaptive discrete optimization for multimedia retrieval.
Neurocomputing, 2017

Deep Region Hashing for Efficient Large-scale Instance Search from Images.
CoRR, 2017

Wound intensity correction and segmentation with convolutional neural networks.
Concurr. Comput. Pract. Exp., 2017

Non-Linear Matrix Completion for Social Image Tagging.
IEEE Access, 2017

Spatial Verification via Compact Words for Mobile Instance Search.
Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

A System for Spatiotemporal Anomaly Localization in Surveillance Videos.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Adversarial Cross-Modal Retrieval.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Transductive Visual-Semantic Embedding for Zero-shot Learning.
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

Attribute hashing for zero-shot image retrieval.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Asymmetric sparse hashing.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Unsupervised cross-modal retrieval through adversarial learning.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Exploiting Concept Correlation with Attributes for Semantic Binary Representation Learning.
Proceedings of the Internet Multimedia Computing and Service, 2017

Matrix Tri-Factorization with Manifold Regularizations for Zero-Shot Learning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Jointly Learning Attentions with Semantic Cross-Modal Correlation for Visual Question Answering.
Proceedings of the Databases Theory and Applications, 2017

2016
Learning multi-task local metrics for image annotation.
Multim. Tools Appl., 2016

Underwater image enhancement method using weighted guided trigonometric filtering and artificial light correction.
J. Vis. Commun. Image Represent., 2016

Learning unified binary codes for cross-modal retrieval via latent semantic hashing.
Neurocomputing, 2016

Combining multi-representation for multimedia event detection using co-training.
Neurocomputing, 2016

Bidirectional Long-Short Term Memory for Video Description.
CoRR, 2016

Cross-modal Retrieval with Label Completion.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Attention-based LSTM with Semantic Consistency for Videos Captioning.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Bidirectional Long-Short Term Memory for Video Description.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Discriminant Cross-modal Hashing.
Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

Underwater image descattering and quality assessment.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Multi-cue Information Fusion for Two-Layer Activity Recognition.
Proceedings of the Computer Vision - ACCV 2016 Workshops, 2016

2015
Semi-supervised Coupled Dictionary Learning for Cross-modal Retrieval in Internet Images and Texts.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Coupled dictionary learning and feature mapping for cross-modal retrieval.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Query expansion with pairwise learning in object retrieval challenge.
Proceedings of the 21st Korea-Japan Joint Workshop on Frontiers of Computer Vision, 2015

2014
Tag completion with defective tag assignments via image-tag re-weighting.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

MLIA at ImageCLEF 2014 Scalable Concept Image Annotation Challenge.
Proceedings of the Working Notes for CLEF 2014 Conference, 2014

Exploring Image Specific Structured Loss for Image Annotation with Incomplete Labelling.
Proceedings of the Computer Vision - ACCV 2014, 2014

2013
Latent topic model for image annotation by modeling topic correlation.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Image Annotation by Learning Label-Specific Distance Metrics.
Proceedings of the Image Analysis and Processing - ICIAP 2013, 2013

Correlated topic model for image annotation.
Proceedings of the 19th Korea-Japan Joint Workshop on Frontiers of Computer Vision, Incheon, Korea (South), January 30, 2013

