Lianli Gao

Orcid: 0000-0002-2522-6394

According to our database, Lianli Gao authored at least 228 papers between 2011 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Towards faster yet accurate video prediction for resource-constrained platforms.
Neurocomputing, 2025

2024
Overcoming Data Deficiency for Multi-Person Pose Estimation.
IEEE Trans. Neural Networks Learn. Syst., August, 2024

SPT: Spatial Pyramid Transformer for Image Captioning.
IEEE Trans. Circuits Syst. Video Technol., June, 2024

Dual-Branch Hybrid Learning Network for Unbiased Scene Graph Generation.
IEEE Trans. Circuits Syst. Video Technol., March, 2024

Utilizing Greedy Nature for Multimodal Conditional Image Synthesis in Transformers.
IEEE Trans. Multim., 2024

Exploring Spatial Frequency Information for Enhanced Video Prediction Quality.
IEEE Trans. Multim., 2024

Memory-Based Augmentation Network for Video Captioning.
IEEE Trans. Multim., 2024

ReSParser: Fully Convolutional Multiple Human Parsing With Representative Sets.
IEEE Trans. Multim., 2024

DMH-CL: Dynamic Model Hardness Based Curriculum Learning for Complex Pose Estimation.
IEEE Trans. Multim., 2024

CPI-Parser: Integrating Causal Properties Into Multiple Human Parsing.
IEEE Trans. Image Process., 2024

SeMv-3D: Towards Semantic and Mutil-view Consistency simultaneously for General Text-to-3D Generation with Triplane Priors.
CoRR, 2024

One-step Noisy Label Mitigation.
CoRR, 2024

MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct.
CoRR, 2024

Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection.
CoRR, 2024

Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization.
CoRR, 2024

RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception.
CoRR, 2024

EchoReel: Enhancing Action Generation of Existing Video Diffusion Models.
CoRR, 2024

CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model.
CoRR, 2024

MPT: Multi-grained Prompt Tuning for Text-Video Retrieval.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

SI-BiViT: Binarizing Vision Transformers with Spatial Interaction.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

MagicVFX: Visual Effects Synthesis in Just Minutes.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Effective and Efficient Few-shot Fine-tuning for Vision Transformers.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Attribute Vision Transformer for UAV-Human Re-Identification.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

BFD: Binarized Frequency-enhanced Distillation for Vision Transformer.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Training-Free Semantic Video Composition via Pre-trained Diffusion Model.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

RRE: A Relevance Relation Extraction Framework for Cross-domain Recommender System at Alipay.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

DePT: Decoupled Prompt Tuning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ProS: Prompting-to-Simulate Generalized Knowledge for Universal Cross-Domain Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

F³-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
KE-RCNN: Unifying Knowledge-Based Reasoning Into Part-Level Attribute Parsing.
IEEE Trans. Cybern., November, 2023

Adaptive Fine-Grained Predicates Learning for Scene Graph Generation.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Semisupervised Network Embedding With Differentiable Deep Quantization.
IEEE Trans. Neural Networks Learn. Syst., August, 2023

Complementarity-Aware Space Learning for Video-Text Retrieval.
IEEE Trans. Circuits Syst. Video Technol., August, 2023

Learning visual question answering on controlled semantic noisy labels.
Pattern Recognit., June, 2023

Transferable and differentiable discrete network embedding for multi-domains with hierarchical knowledge distillation.
Inf. Sci., June, 2023

Label-Guided Generative Adversarial Network for Realistic Image Synthesis.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Heterogeneous Knowledge Network for Visual Dialog.
IEEE Trans. Circuits Syst. Video Technol., February, 2023

AMANet: Adaptive Multi-Path Aggregation for Learning Human 2D-3D Correspondences.
IEEE Trans. Multim., 2023

From External to Internal: Structuring Image for Text-to-Image Attributes Manipulation.
IEEE Trans. Multim., 2023

Revisiting Multi-Codebook Quantization.
IEEE Trans. Image Process., 2023

From Global to Local: Multi-Scale Out-of-Distribution Detection.
IEEE Trans. Image Process., 2023

Toward a Unified Transformer-Based Framework for Scene Graph Generation and Human-Object Interaction Detection.
IEEE Trans. Image Process., 2023

State-Aware Compositional Learning Toward Unbiased Training for Scene Graph Generation.
IEEE Trans. Image Process., 2023

End-to-end Image Captioning via Visual Region Aggregation and Dual-level Collaboration.
Int. J. Softw. Informatics, 2023

Context-based Transfer and Efficient Iterative Learning for Unbiased Scene Graph Generation.
CoRR, 2023

Make-A-Storyboard: A General Framework for Storyboard with Disentangled and Merged Control.
CoRR, 2023

F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis.
CoRR, 2023

Towards Redundancy-Free Sub-networks in Continual Learning.
CoRR, 2023

MotionZero: Exploiting Motion Priors for Zero-shot Text-to-Video Generation.
CoRR, 2023

Towards a Unified Transformer-based Framework for Scene Graph Generation and Human-object Interaction Detection.
CoRR, 2023

Less is More: On the Feature Redundancy of Pretrained Models When Transferring to Few-shot Tasks.
CoRR, 2023

CIParsing: Unifying Causality Properties into Multiple Human Parsing.
CoRR, 2023

Informative Scene Graph Generation via Debiasing.
CoRR, 2023

Boosting Adversarial Attacks by Leveraging Decision Boundary Information.
CoRR, 2023

Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

MobileVidFactory: Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Precise Target-Oriented Attack against Deep Hashing-based Retrieval.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Depth-Aware Sparse Transformer for Video-Language Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

CUCL: Codebook for Unsupervised Continual Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

A Closer Look at Few-shot Classification Again.
Proceedings of the International Conference on Machine Learning, 2023

End-To-End Part-Level Action Parsing With Transformer.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

DETA: Denoised Task Adaptation for Few-Shot Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Part-Aware Transformer for Generalizable Person Re-identification.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Prototype-Based Embedding Network for Scene Graph Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
AgeGAN++: Face Aging and Rejuvenation With Dual Conditional GANs.
IEEE Trans. Multim., 2022

Push & Pull: Transferable Adversarial Examples With Attentive Attack.
IEEE Trans. Multim., 2022

Video Question Answering With Prior Knowledge and Object-Sensitive Learning.
IEEE Trans. Image Process., 2022

Continual Referring Expression Comprehension via Dual Modular Memorization.
IEEE Trans. Image Process., 2022

Hierarchical Representation Network With Auxiliary Tasks for Video Captioning and Video Question Answering.
IEEE Trans. Image Process., 2022

Learning Cross-Modal Common Representations by Private-Shared Subspaces Separation.
IEEE Trans. Cybern., 2022

Relation Regularized Scene Graph Generation.
IEEE Trans. Cybern., 2022

Progressive Meta-Learning With Curriculum.
IEEE Trans. Circuits Syst. Video Technol., 2022

Action-Centric Relation Transformer Network for Video Question Answering.
IEEE Trans. Circuits Syst. Video Technol., 2022

KTN: Knowledge Transfer Network for Learning Multiperson 2D-3D Correspondences.
IEEE Trans. Circuits Syst. Video Technol., 2022

Text-instance graph: Exploring the relational semantics for text-based visual question answering.
Pattern Recognit., 2022

Visual Commonsense-aware Representation Network for Video Captioning.
CoRR, 2022

RepParser: End-to-End Multiple Human Parsing with Representative Parts.
CoRR, 2022

KTN: Knowledge Transfer Network for Learning Multi-person 2D-3D Correspondences.
CoRR, 2022

Practical No-box Adversarial Attacks with Training-free Hybrid Image Transformation.
CoRR, 2022

A Lower Bound of Hash Codes' Performance.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Natural Color Fool: Towards Boosting Black-box Unrestricted Attacks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Free-Lunch for Cross-Domain Few-Shot Learning: Style-Aware Episodic Training with Robust Contrastive Learning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Progressive Tree-Structured Prototype Network for End-to-End Image Captioning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Dynamic Scene Graph Generation via Temporal Prior Inference.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Skeleton-based Action Recognition via Adaptive Cross-Form Learning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Class Gradient Projection For Continual Learning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

S2 Transformer for Image Captioning.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Learning to Generate Scene Graph from Head to Tail.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

MKE-GCN: Multi-Modal Knowledge Embedded Graph Convolutional Network for Skeleton-Based Action Recognition in the Wild.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Support-Set Based Multi-Modal Representation Enhancement for Video Captioning.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Multi-Scale Graph Attention Network for Scene Graph Generation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Context Gating with Multi-Level Ranking Learning for Visual Dialog.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Beyond ImageNet Attack: Towards Crafting Adversarial Examples for Black-box Domains.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Frequency Domain Model Augmentation for Adversarial Attack.
Proceedings of the Computer Vision - ECCV 2022, 2022

Towards Open-Vocabulary Scene Graph Generation with Prompt-Based Finetuning.
Proceedings of the Computer Vision - ECCV 2022, 2022

Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Fine-Grained Predicates Learning for Scene Graph Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Rich Visual Knowledge-Based Augmentation Network for Visual Question Answering.
IEEE Trans. Neural Networks Learn. Syst., 2021

Large Factor Image Super-Resolution With Cascaded Convolutional Neural Networks.
IEEE Trans. Multim., 2021

Deep Hash-based Relevance-aware Data Quality Assessment for Image Dark Data.
Trans. Data Sci., 2021

Foreground-Background Parallel Compression With Residual Encoding for Surveillance Video.
IEEE Trans. Circuits Syst. Video Technol., 2021

GuessWhich? Visual dialog with attentive memory network.
Pattern Recognit., 2021

Lightweight dynamic conditional GAN with pyramid attention for text-to-image synthesis.
Pattern Recognit., 2021

Generalized pyramid co-attention with learnable aggregation net for video question answering.
Pattern Recognit., 2021

Technical Report: Disentangled Action Parsing Networks for Accurate Part-level Action Parsing.
CoRR, 2021

Fast Gradient Non-sign Methods.
CoRR, 2021

Adversarial Attacks on ML Defense Models Competition.
CoRR, 2021

Unsupervised Domain-adaptive Hash for Networks.
CoRR, 2021

Semi-supervised Network Embedding with Differentiable Deep Quantisation.
CoRR, 2021

Semantic Compositional Learning for Low-shot Scene Graph Generation.
CoRR, 2021

Staircase Sign Method for Boosting Adversarial Attacks.
CoRR, 2021

Curriculum-Based Meta-learning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Conceptual and Syntactical Cross-modal Alignment with Cross-level Consistency for Image-Text Matching.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

A System for Interactive and Intelligent AD Auxiliary Screening.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Semantic-aware Transfer with Instance-adaptive Parsing for Crowded Scenes Pose Estimation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Fully Functional Image Manipulation Using Scene Graphs in A Bounding-Box Free Way.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Camera-Agnostic Person Re-Identification via Adversarial Disentangling Learning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Exploring Contextual-Aware Representation and Linguistic-Diverse Expression for Visual Dialog.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

PoseGTAC: Graph Transformer Encoder-Decoder with Atrous Convolution for 3D Human Pose Estimation.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Towards Unsupervised Deformable-Instances Image-to-Image Translation.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Feature Space Targeted Attacks by Statistic Alignment.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

SKANet: Structured Knowledge-Aware Network for Visual Dialog.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Hierarchical Representation Network With Auxiliary Tasks For Video Captioning.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Exploiting Scene Graphs for Human-Object Interaction Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

From General to Specific: Informative Scene Graph Generation via Balance Adjustment.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

RSGNet: Relation based Skeleton Graph Network for Crowded Scenes Pose Estimation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
A framework for image dark data assessment.
World Wide Web, 2020

Play and rewind: Context-aware video temporal action proposals.
Pattern Recognit., 2020

Hierarchical LSTMs with Adaptive Attention for Visual Captioning.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Traffic sign detection and recognition based on pyramidal convolutional networks.
Neural Comput. Appl., 2020

Understanding and improving ontology reasoning efficiency through learning and ranking.
Inf. Syst., 2020

Unsupervised urban scene segmentation via domain adaptation.
Neurocomputing, 2020

Fused GRU with semantic-temporal attention for video captioning.
Neurocomputing, 2020

Question-Led object attention for visual question answering.
Neurocomputing, 2020

Unified Binary Generative Adversarial Network for Image Retrieval and Compression.
Int. J. Comput. Vis., 2020

Patch-wise++ Perturbation for Adversarial Targeted Attacks.
CoRR, 2020

Cognitive visual anomaly detection with constrained latent representations for industrial inspection robot.
Appl. Soft Comput., 2020

Correlated Features Synthesis and Alignment for Zero-shot Cross-modal Retrieval.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

3D Self-Attention for Unsupervised Video Quantization.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

EvoGAN: an evolutionary GAN for face aging and rejuvenation.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Temporal Denoising Mask Synthesis Network for Learning Blind Video Temporal Consistency.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

KTN: Knowledge Transfer Network for Multi-person DensePose Estimation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

One-shot Scene Graph Generation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Lab2Pix: Label-Adaptive Generative Adversarial Network for Unsupervised Image Synthesis.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Label-Attended Hashing for Multi-Label Image Retrieval.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Learning from the Scene and Borrowing from the Rich: Tackling the Long Tail in Scene Graph Generation.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Bottom-up and Top-down: Bidirectional Additive Net for Edge Detection.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Patch-Wise Attack for Fooling Deep Neural Network.
Proceedings of the Computer Vision - ECCV 2020, 2020

Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

What Machines See Is Not What They Get: Fooling Scene Text Recognition Models With Adversarial Text Images.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning Cross-Aligned Latent Embeddings for Zero-Shot Cross-Modal Retrieval.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

SNEQ: Semi-Supervised Attributed Network Embedding with Attention-Based Quantisation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Deep adversarial metric learning for cross-modal retrieval.
World Wide Web, 2019

Residual attention-based LSTM for video captioning.
World Wide Web, 2019

Exploiting long-term temporal dynamics for video captioning.
World Wide Web, 2019

From Deterministic to Generative: Multimodal Stochastic RNNs for Video Captioning.
IEEE Trans. Neural Networks Learn. Syst., 2019

Fusion by synthesizing: A multi-view deep neural network for zero-shot recognition.
Signal Process., 2019

Synchronization-based clustering on evolving data stream.
Inf. Sci., 2019

One Network for Multi-Domains: Domain Adaptive Hashing with Intersectant Generative Adversarial Network.
CoRR, 2019

Difficulty-Controllable Multi-hop Question Generation from Knowledge Graphs.
Proceedings of the Semantic Web - ISWC 2019, 2019

Learnable Aggregating Net with Diversity Learning for Video Question Answering.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Adaptive Multi-Path Aggregation for Human DensePose Estimation in the Wild.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Deep Recurrent Quantization for Generating Sequential Binary Codes.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

One Network for Multi-Domains: Domain Adaptive Hashing with Intersectant Generative Adversarial Networks.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Beyond Product Quantization: Deep Progressive Quantization for Image Retrieval.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Matching User with Item Set: Collaborative Bundle Recommendation with Deep Attention Network.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Social Relation Recognition From Videos via Multi-Scale Spatial-Temporal Reasoning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Template-Based Math Word Problem Solvers with Recursive Neural Networks.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Beyond RNNs: Positional Self-Attention with Co-Attention for Video Question Answering.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Structured Two-Stream Attention Network for Video Question Answering.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Deliberate Attention Networks for Image Captioning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Perceptual Pyramid Adversarial Networks for Text-to-Image Synthesis.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Two-Stream 3-D convNet Fusion for Action Recognition in Videos With Arbitrary Size and Length.
IEEE Trans. Multim., 2018

Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder.
IEEE Trans. Image Process., 2018

Quantization-based hashing: a general framework for scalable image and video retrieval.
Pattern Recognit., 2018

Multiple hierarchical deep hashing for large scale image retrieval.
Multim. Tools Appl., 2018

Deep appearance and motion learning for egocentric activity recognition.
Neurocomputing, 2018

Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks.
CoRR, 2018

Small Object Detection Using Deep Feature Pyramid Networks.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Cumulative Nets for Edge Detection.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Examine before You Answer: Multi-task Learning with Adaptive-attentions for Multiple-choice VQA.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

From Pixels to Objects: Cubic Visual Attention for Visual Question Answering.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Dual Conditional GANs for Face Aging and Rejuvenation.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Coarse-to-fine Image Co-segmentation with Intra and Inter Rank Constraints.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Unpaired Image-to-Image Translation from Shared Deep Space.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

MathDQN: Solving Arithmetic Word Problems via Deep Reinforcement Learning.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Deep Region Hashing for Generic Instance Search from Images.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Binary Generative Adversarial Networks for Image Retrieval.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Processing Long Queries Against Short Text: Top-k Advertisement Matching in News Stream Applications.
ACM Trans. Inf. Syst., 2017

Video Captioning With Attention-Based LSTM and Semantic Consistency.
IEEE Trans. Multim., 2017

Beyond Frame-level CNN: Saliency-Aware 3-D CNN With LSTM for Video Action Recognition.
IEEE Signal Process. Lett., 2017

Learning in high-dimensional multimedia data: the state of the art.
Multim. Syst., 2017

Large-scale image retrieval with supervised sparse hashing.
Neurocomputing, 2017

Kernel based latent semantic sparse hashing for large-scale retrieval from heterogeneous data sources.
Neurocomputing, 2017

Exploiting score distribution for heterogenous feature fusion in image classification.
Neurocomputing, 2017

Real-time social media retrieval with spatial, temporal and social constraints.
Neurocomputing, 2017

From Deterministic to Generative: Multi-Modal Stochastic RNNs for Video Captioning.
CoRR, 2017

Deep Region Hashing for Efficient Large-scale Instance Search from Images.
CoRR, 2017

Deep Discrete Hashing with Self-supervised Pairwise Labels.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2017

Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Sharp and Real Image Super-Resolution Using Generative Adversarial Network.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

Movie Fill in the Blank with Adaptive Temporal Attention and Description Update.
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

Jointly Learning Attentions with Semantic Cross-Modal Correlation for Visual Question Answering.
Proceedings of the Databases Theory and Applications, 2017

Event Video Mashup: From Hundreds of Videos to Minutes of Skeleton.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Feature aggregating hashing for image copy detection.
World Wide Web, 2016

Optimized Graph Learning Using Partial Tags and Multiple Features for Image and Video Annotation.
IEEE Trans. Image Process., 2016

Deep and fast: Deep learning hashing with semi-supervised graph construction.
Image Vis. Comput., 2016

Self-representation nearest neighbor search for classification.
Neurocomputing, 2016

Spatial and temporal scoring for egocentric video summarization.
Neurocomputing, 2016

Joint Graph Learning and Video Segmentation via Multiple Cues and Topology Calibration.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Attention-based LSTM with Semantic Consistency for Videos Captioning.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Graph-without-cut: An Ideal Graph Learning for Image Segmentation.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Semantic annotation and reasoning for sensor data streams.
PhD thesis, 2015

Supervised Hashing with Pseudo Labels for Scalable Multimedia Retrieval.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Exploring Viewable Angle Information in Georeferenced Video Search.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Scalable Multimedia Retrieval by Deep Learning Hashing with Relative Similarity Learning.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Zero-shot Image Categorization by Image Correlation Exploration.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Optimal graph learning with partial tags and multiple features for image and video annotation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
Corrigendum to "A web-based semantic tagging and activity recognition system for species' accelerometry data" [Ecol. Inf. 13(2013) 47-56].
Ecol. Informatics, 2014

Estimating Fire Weather Indices via Semantic Reasoning over Wireless Sensor Network Data Streams.
CoRR, 2014

2013
A Web-based semantic tagging and activity recognition system for species' accelerometry data.
Ecol. Informatics, 2013

Semantic-based detection of segment outliers and unusual events for wireless sensor networks.
Proceedings of the 18th International Conference on Information Quality, 2013

2011
Publishing, Linking and Annotating Events via Interactive Timelines: an Earth Sciences Case Study.
Proceedings of the Workshop on Detection, 2011

