Zhengjun Zha

ORCID: 0000-0003-2510-8993

According to our database, Zhengjun Zha authored at least 482 papers between 2007 and 2024.

Bibliography

2024
Inductive State-Relabeling Adversarial Active Learning With Heuristic Clique Rescaling.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Event-Driven Heterogeneous Network for Video Deraining.
Int. J. Comput. Vis., December, 2024

Context-Aware Proposal-Boundary Network With Structural Consistency for Audiovisual Event Localization.
IEEE Trans. Neural Networks Learn. Syst., November, 2024

Towards Generalized UAV Object Detection: A Novel Perspective from Frequency Domain Disentanglement.
Int. J. Comput. Vis., November, 2024

Exert Diversity and Mitigate Bias: Domain Generalizable Person Re-identification with a Comprehensive Benchmark.
Int. J. Comput. Vis., November, 2024

Hue Guidance Network for Single Image Reflection Removal.
IEEE Trans. Neural Networks Learn. Syst., October, 2024

Extraordinarily Time- and Memory-Efficient Large-Scale Canonical Correlation Analysis in Fourier Domain: From Shallow to Deep.
IEEE Trans. Neural Networks Learn. Syst., October, 2024

Graph Representation Learning for Large-Scale Neuronal Morphological Analysis.
IEEE Trans. Neural Networks Learn. Syst., April, 2024

On Exploring Multiplicity of Primitives and Attributes for Texture Recognition in the Wild.
IEEE Trans. Pattern Anal. Mach. Intell., January, 2024

Downstream-Pretext Domain Knowledge Traceback for Active Learning.
IEEE Trans. Multim., 2024

Vision-and-Language Navigation via Latent Semantic Alignment Learning.
IEEE Trans. Multim., 2024

Unleashing Knowledge Potential of Source Hypothesis for Source-Free Domain Adaptation.
IEEE Trans. Multim., 2024

DDOD: Dive Deeper into the Disentanglement of Object Detector.
IEEE Trans. Multim., 2024

Prototype-Augmented Self-Supervised Generative Network for Generalized Zero-Shot Learning.
IEEE Trans. Image Process., 2024

Event-Based Optical Flow via Transforming Into Motion-Dependent View.
IEEE Trans. Image Process., 2024

A Closer Look at the Reflection Formulation in Single Image Reflection Removal.
IEEE Trans. Image Process., 2024

Adaptive Texture and Spectrum Clue Mining for Generalizable Face Forgery Detection.
IEEE Trans. Inf. Forensics Secur., 2024

SMART: Syntax-Calibrated Multi-Aspect Relation Transformer for Change Captioning.
IEEE Trans. Pattern Anal. Mach. Intell., 2024

Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation.
Int. J. Comput. Vis., 2024

EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting.
CoRR, 2024

Visual-Geometric Collaborative Guidance for Affordance Learning.
CoRR, 2024

MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling.
CoRR, 2024

ForgeryGPT: Multimodal Large Language Model For Explainable Image Forgery Detection and Localization.
CoRR, 2024

LoTLIP: Improving Language-Image Pre-training for Long Text Understanding.
CoRR, 2024

Grounding 3D Scene Affordance From Egocentric Interactions.
CoRR, 2024

DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion.
CoRR, 2024

QMambaBSR: Burst Image Super-Resolution with Query State Space Model.
CoRR, 2024

FC3DNet: A Fully Connected Encoder-Decoder for Efficient Demoiréing.
CoRR, 2024

Towards Realistic Data Generation for Real-World Super-Resolution.
CoRR, 2024

FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining.
CoRR, 2024

EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views.
CoRR, 2024

ViViD: Video Virtual Try-on using Diffusion Models.
CoRR, 2024

Efficient Real-world Image Super-Resolution Via Adaptive Directional Gradient Convolution.
CoRR, 2024

MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results.
CoRR, 2024

Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey.
CoRR, 2024

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report.
CoRR, 2024

Hierarchical Information Enhancement Network for Cascade Prediction in Social Networks.
CoRR, 2024

Multi-perspective Memory Enhanced Network for Identifying Key Nodes in Social Networks.
CoRR, 2024

VisualCritic: Making LMMs Perceive Visual Quality Like Humans.
CoRR, 2024

RelationVLM: Making Large Vision-Language Models Understand Visual Relations.
CoRR, 2024

Event-based Asynchronous HDR Imaging by Temporal Incident Light Modulation.
CoRR, 2024

SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation.
CoRR, 2024

ESCNet: Entity-enhanced and Stance Checking Network for Multi-modal Fact-Checking.
Proceedings of the ACM on Web Conference 2024, 2024

MGAW: An Effective Method for Geo-localization in Adverse Weather.
Proceedings of the 2nd Workshop on UAVs in Multimedia: Capturing the World from a New Perspective, 2024

I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

Cross-Modal Semantic Alignment Learning for Text-Based Person Search.
Proceedings of the MultiMedia Modeling - 30th International Conference, 2024

MLP Embedded Inverse Tone Mapping.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

UniDense: Unleashing Diffusion Models with Meta-Routers for Universal Few-Shot Dense Prediction.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

CLaM: An Open-Source Library for Performance Evaluation of Text-driven Human Motion Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Natural Language-centered Inference Network for Multi-modal Fake News Detection.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

CCM: Real-Time Controllable Visual Content Creation Using Text-to-Image Consistency Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

DreamClean: Restoring Clean Image Using Deep Diffusion Prior.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Event-Adapted Video Super-Resolution.
Proceedings of the Computer Vision - ECCV 2024, 2024

Motion Aware Event Representation-Driven Image Deblurring.
Proceedings of the Computer Vision - ECCV 2024, 2024

Noise-Assisted Prompt Learning for Image Forgery Detection and Localization.
Proceedings of the Computer Vision - ECCV 2024, 2024

Revisiting Single Image Reflection Removal in the Wild.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

LEMON: Learning 3D Human-Object Interaction Relation from 2D Images.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DemosaicFormer: Coarse-to-Fine Demosaicing Network for HybridEVS Camera.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024


MambaPupil: Bidirectional Selective Recurrent model for Event-based Eye tracking.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024


NTIRE 2024 Image Shadow Removal Challenge Report.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

HirFormer: Dynamic High Resolution Transformer for Large-Scale Image Shadow Removal.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SDCNet: Spatially-Adaptive Deformable Convolution Networks for HR NonHomogeneous Dehazing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Shadow Removal via Global Residual Free Unet and Shadow Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024


Prompt-Enhanced Multiple Instance Learning for Weakly Supervised Video Anomaly Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024


HomoFormer: Homogenized Transformer for Image Shadow Removal.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Context-aware Difference Distilling for Multi-change Captioning.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Learning Discriminative Noise Guidance for Image Forgery Detection and Localization.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Neuromorphic Event Signal-Driven Network for Video De-raining.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Image De-Raining Transformer.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Constructing Spatio-Temporal Graphs for Face Forgery Detection.
ACM Trans. Web, August, 2023

Continual Image Deraining With Hypergraph Convolutional Networks.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2023

Semantic and Relation Modulation for Audio-Visual Event Localization.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Entity-Enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Category-Stitch Learning for Union Domain Generalization.
ACM Trans. Multim. Comput. Commun. Appl., January, 2023

Synergy between Semantic Segmentation and Image Denoising via Alternate Boosting.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Learning Video-Text Aligned Representations for Video Captioning.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Deep Texton-Coherence Network for Camouflaged Object Detection.
IEEE Trans. Multim., 2023

Domain Generalization Via Encoding and Resampling in a Unified Latent Space.
IEEE Trans. Multim., 2023

Location-Free Camouflage Generation Network.
IEEE Trans. Multim., 2023

I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models.
CoRR, 2023

CCM: Adding Conditional Controls to Text-to-Image Consistency Models.
CoRR, 2023

Decoupling Degradation and Content Processing for Adverse Weather Image Restoration.
CoRR, 2023

Deep Spiking-UNet for Image Processing.
CoRR, 2023

Knowledge-Enhanced Hierarchical Information Correlation Learning for Multi-Modal Rumor Detection.
CoRR, 2023

DreamTime: An Improved Optimization Strategy for Text-to-3D Content Creation.
CoRR, 2023

DreamWaltz: Make a Scene with Complex 3D Animatable Avatars.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Fusion-Based Low-Light Image Enhancement.
Proceedings of the MultiMedia Modeling - 29th International Conference, 2023

Hierarchical Semantic Enhancement Network for Multimodal Fake News Detection.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

ECENet: Explainable and Context-Enhanced Network for Muti-modal Fact verification.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Learning Semantics-Grounded Vocabulary Representation for Video-Text Retrieval.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Alleviating Spatial Misalignment and Motion Interference for UAV-based Video Recognition.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

MaTCR: Modality-Aligned Thought Chain Reasoning for Multimodal Task-Oriented Dialogue Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Random Shuffle Transformer for Image Restoration.
Proceedings of the International Conference on Machine Learning, 2023

Hierarchical Context Modeling Network for Landmark Recognition.
Proceedings of the IEEE International Conference on Data Mining, 2023

Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Grounding 3D Object Affordance from 2D Interactions in Images.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Spatial-Aware Token for Weakly Supervised Object Localization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Self-supervised Cross-view Representation Reconstruction for Change Captioning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Text-Driven Generative Domain Adaptation with Spectral Consistency Regularization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Cross-Representation Affinity Consistency for Sparsely Supervised Biomedical Instance Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Adaptive Frequency Filters As Efficient Global Token Mixers.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Self-Organizing Pathway Expansion for Non-Exemplar Class-Incremental Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Streaming Video Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Decoupling-and-Aggregating for Image Exposure Correction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Generalized UAV Object Detection via Frequency Domain Disentanglement.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

NTIRE 2023 Challenge on Efficient Super-Resolution: Methods and Results.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Edge-aware Regional Message Passing Controller for Image Forgery Localization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Neural Dependencies Emerging from Learning Massive Categories.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning to Dub Movies via Hierarchical Prosody Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Event-Guided Person Re-Identification via Sparse-Dense Complementary Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Exploring Tuning Characteristics of Ventral Stream's Neurons for Few-Shot Image Classification.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
A Model-Driven Deep Unfolding Method for JPEG Artifacts Removal.
IEEE Trans. Neural Networks Learn. Syst., 2022

Boundary-Aware Arbitrary-Shaped Scene Text Detector With Learnable Embedding Network.
IEEE Trans. Multim., 2022

Online Residual Quantization Via Streaming Data Correlation Preserving.
IEEE Trans. Multim., 2022

E-Commerce Storytelling Recommendation Using Attentional Domain-Transfer Network and Adversarial Pre-Training.
IEEE Trans. Multim., 2022

I²Transformer: Intra- and Inter-Relation Embedding Transformer for TV Show Captioning.
IEEE Trans. Image Process., 2022

Long Short-Term Relation Transformer With Global Gating for Video Captioning.
IEEE Trans. Image Process., 2022

Syntax-Guided Hierarchical Attention Network for Video Captioning.
IEEE Trans. Circuits Syst. Video Technol., 2022

Context-Aware Visual Policy Network for Fine-Grained Image Captioning.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Structure injected weight normalization for training deep networks.
Multim. Syst., 2022

Progressive Pan-Sharpening via Cross-Scale Collaboration Networks.
IEEE Geosci. Remote. Sens. Lett., 2022

Learning Degradation-Invariant Representation for Robust Real-World Person Re-Identification.
Int. J. Comput. Vis., 2022

Label Noise-Resistant Mean Teaching for Weakly Supervised Fake News Detection.
CoRR, 2022

FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization.
CoRR, 2022

Exploring Figure-Ground Assignment Mechanism in Perceptual Organization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Rank Diminishing in Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Stochastic Window Transformer for Image Restoration.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

AS-Net: Class-Aware Assistance and Suppression Network for Few-Shot Learning.
Proceedings of the MultiMedia Modeling - 28th International Conference, 2022

Lightweight Wavelet-Based Network for JPEG Artifacts Removal.
Proceedings of the MultiMedia Modeling - 28th International Conference, 2022

Long-Range Feature Dependencies Capturing for Low-Resolution Image Classification.
Proceedings of the MultiMedia Modeling - 28th International Conference, 2022

Single Image Shadow Detection via Complementary Mechanism.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Enhancement by Your Aesthetic: An Intelligible Unsupervised Personalized Enhancer for Low-Light Images.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Cross-modal Semantic Alignment Pre-training for Vision-and-Language Navigation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Learning Dual Convolutional Dictionaries for Image De-raining.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

JPEG Compression-aware Image Forgery Localization.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Exploring Fourier Prior for Single Image Rain Removal.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Event-driven Video Deblurring via Spatio-Temporal Relation-Aware Network.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Principled Knowledge Extrapolation with GANs.
Proceedings of the International Conference on Machine Learning, 2022

JPEG Artifacts Removal via Contrastive Representation Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

S2N: Suppression-Strengthen Network for Event-Based Recognition Under Variant Illuminations.
Proceedings of the Computer Vision - ECCV 2022, 2022

Bijective Mapping Network for Shadow Removal.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Few Shot Generative Model Adaption via Relaxed Spatial Structural Alignment.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Temporal Complementarity-Guided Reinforcement Learning for Image-to-Video Person Re-Identification.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Multi-grained Spatio-Temporal Features Perceived Network for Event-based Lip-Reading.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Weakly Supervised High-Fidelity Clothing Model Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Degradation-agnostic Correspondence from Resolution-asymmetric Stereo.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Automatic Relation-aware Graph Network Proliferation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Lifelong Unsupervised Domain Adaptive Person Re-identification with Coordinated Anti-forgetting and Adaptation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Self-Sustaining Representation Expansion for Non-Exemplar Class-Incremental Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Efficient Model-Driven Network for Shadow Removal.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Debiased Batch Normalization via Gaussian Process for Generalizable Person Re-identification.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

ProgressiveMotionSeg: Mutually Reinforced Framework for Event-Based Motion Segmentation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Modality-Adaptive Mixup and Invariant Decomposition for RGB-Infrared Person Re-identification.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Cross-Domain Object Representation via Robust Low-Rank Correlation Analysis.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Leveraging Deep Statistics for Underwater Image Enhancement.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Deep Coattention-Based Comparator for Relative Representation Learning in Person Re-Identification.
IEEE Trans. Neural Networks Learn. Syst., 2021

Visual Navigation With Multiple Goals Based on Deep Reinforcement Learning.
IEEE Trans. Neural Networks Learn. Syst., 2021

Laplacian Pyramid Neural Network for Dense Continuous-Value Regression for Complex Scenes.
IEEE Trans. Neural Networks Learn. Syst., 2021

One-Shot Texture Retrieval Using Global Grouping Metric.
IEEE Trans. Multim., 2021

R-Net: A Relationship Network for Efficient and Accurate Scene Text Detection.
IEEE Trans. Multim., 2021

Domain-Oriented Semantic Embedding for Zero-Shot Learning.
IEEE Trans. Multim., 2021

A Mutually Attentive Co-Training Framework for Semi-Supervised Recognition.
IEEE Trans. Multim., 2021

Learning and Fusing Multiple User Interest Representations for Micro-Video and Movie Recommendations.
IEEE Trans. Multim., 2021

Weakly Supervised Neuron Reconstruction From Optical Microscopy Images With Morphological Priors.
IEEE Trans. Medical Imaging, 2021

MKEL: Multiple Kernel Ensemble Learning via Unified Ensemble Loss for Image Classification.
ACM Trans. Intell. Syst. Technol., 2021

Structure-Guided Deep Video Inpainting.
IEEE Trans. Circuits Syst. Video Technol., 2021

SLiKER: Sparse loss induced kernel ensemble regression.
Pattern Recognit., 2021

PRRNet: Pixel-Region relation network for face forgery detection.
Pattern Recognit., 2021

Dense Residual Network: Enhancing global dense feature flow for character recognition.
Neural Networks, 2021

Local-binarized very deep residual network for visual categorization.
Neurocomputing, 2021

Human activity recognition by manifold regularization based dynamic graph convolutional networks.
Neurocomputing, 2021

Successive Graph Convolutional Network for Image De-raining.
Int. J. Comput. Vis., 2021

Weakly Supervised High-Fidelity Clothing Model Generation.
CoRR, 2021

Calibrated Feature Decomposition for Generalizable Person Re-Identification.
CoRR, 2021

Edge-featured Graph Neural Architecture Search.
CoRR, 2021

A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP.
CoRR, 2021

Multi-Modulation Network for Audio-Visual Event Localization.
CoRR, 2021

MViT: Mask Vision Transformer for Facial Expression Recognition in the wild.
CoRR, 2021

BoundarySqueeze: Image Segmentation as Boundary Squeezing.
CoRR, 2021

Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval.
CoRR, 2021

Rethinking Graph Neural Network Search from Message-passing.
CoRR, 2021

General-Purpose Speech Representation Learning through a Self-Supervised Multi-Granularity Framework.
CoRR, 2021

VAE^2: Preventing Posterior Collapse of Variational Video Predictions in the Wild.
CoRR, 2021

Low-Rank Subspaces in GANs.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

TDI TextSpotter: Taking Data Imbalance into Account in Scene Text Spotting.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Pose-Guided Feature Learning with Knowledge Distillation for Occluded Person Re-Identification.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Multifocal Attention-Based Cross-Scale Network for Image De-raining.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Cluster and Scatter: A Multi-grained Active Semi-supervised Learning Framework for Scalable Person Re-identification.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Disentangle Your Dense Object Detector.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

End-to-end Boundary Exploration for Weakly-supervised Semantic Segmentation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Language-Conditioned Region Proposal and Retrieval Network for Referring Expression Comprehension.
Proceedings of the MMPT@ICMR2021: Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia Understanding, 2021

A Decomposition-based Network for Non-uniform Illuminated Retinal Image Enhancement.
Proceedings of the 15th International Symposium on Medical Information and Communication Technology, 2021

Understanding Noise Injection in GANs.
Proceedings of the 38th International Conference on Machine Learning, 2021

Uncertainty Principles of Encoding GANs.
Proceedings of the 38th International Conference on Machine Learning, 2021

Adversarial Disentanglement and Correlation Network for Rgb-Infrared Person Re-Identification.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Self-Supervised Visual Representations Learning by Contrastive Mask Prediction.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Improving De-raining Generalization via Neural Reorganization.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Cross-Patch Graph Convolutional Network for Image Denoising.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Attack-Guided Perceptual Data Generation for Real-world Re-Identification.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Dual Priors for JPEG Compression Artifacts Removal.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Image De-Raining via Continual Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Group-aware Label Transfer for Domain Adaptive Person Re-identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Structured Multi-Level Interaction Network for Video Moment Localization via Language Query.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Light Field Super-Resolution With Zero-Shot Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Rethinking Graph Neural Architecture Search From Message-Passing.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Self-Promoted Prototype Refinement for Few-Shot Class-Incremental Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Training Spiking Neural Networks with Accumulated Spiking Flow.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Rain Streak Removal via Dual Graph Convolutional Network.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Robust Deep Co-Saliency Detection With Group Semantic and Pyramid Attention.
IEEE Trans. Neural Networks Learn. Syst., 2020

Adversarial Attribute-Text Embedding for Person Search With Natural Language Query.
IEEE Trans. Multim., 2020

Bidirectional Attention-Recognition Model for Fine-Grained Object Classification.
IEEE Trans. Multim., 2020

Neuronal Population Reconstruction From Ultra-Scale Optical Microscopy Images via Progressive Learning.
IEEE Trans. Medical Imaging, 2020

Learning Rich Part Hierarchies With Progressive Attention Networks for Fine-Grained Image Recognition.
IEEE Trans. Image Process., 2020

A Multi-Scale Spatial-Temporal Attention Model for Person Re-Identification in Videos.
IEEE Trans. Image Process., 2020

Multi-Objective Matrix Normalization for Fine-Grained Visual Recognition.
IEEE Trans. Image Process., 2020

Visual Object Tracking via Guessing and Matching.
IEEE Trans. Circuits Syst. Video Technol., 2020

Frank-Wolfe Network: An Interpretable Deep Structure for Non-Sparse Coding.
IEEE Trans. Circuits Syst. Video Technol., 2020

A generalized least-squares approach regularized with graph embedding for dimensionality reduction.
Pattern Recognit., 2020

Real-World Image Denoising with Deep Boosting.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Towards a new generation of artificial intelligence in China.
Nat. Mach. Intell., 2020

Temporal Attribute-Appearance Learning Network for Video-based Person Re-Identification.
CoRR, 2020

On Noise Injection in Generative Adversarial Networks.
CoRR, 2020

Self-Supervised Tuning for Few-Shot Segmentation.
CoRR, 2020

Fast Dense Residual Network: Enhancing Global Dense Feature Flow for Text Recognition.
CoRR, 2020

Deep Self-representative Concept Factorization Network for Representation Learning.
Proceedings of the 2020 SIAM International Conference on Data Mining, 2020

Learning Semantic-aware Normalization for Generative Adversarial Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Hierarchical Granularity Transfer Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Joint Sketch-Attribute Learning for Fine-Grained Face Synthesis.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Deep Palette-Based Color Decomposition for Image Recoloring with Aesthetic Suggestion.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

A Structured Graph Attention Network for Vehicle Re-Identification.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

ASTA-Net: Adaptive Spatio-Temporal Attention Network for Person Re-Identification in Videos.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Hierarchical Gumbel Attention Network for Text-based Person Search.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Structural Semantic Adversarial Active Learning for Image Captioning.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Nighttime Dehazing with a Synthetic Benchmark.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

March on Data Imperfections: Domain Division and Domain Generalization for Semantic Segmentation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Space-Time Video Super-Resolution Using Temporal Profiles.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Fine-grained Feature Alignment with Part Perspective Transformation for Vehicle ReID.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Dual Context-Aware Refinement Network for Person Search.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Transferrable Referring Expression Grounding with Concept Transfer and Context Inheritance.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Diverter-Guider Recurrent Network for Diverse Poems Generation from Image.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Semantic Image Analogy with a Conditional Single-Image GAN.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

DeepFacePencil: Creating Face Images from Freehand Sketches.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Memory-Augmented Relation Network for Few-Shot Learning.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Dual Path Interaction Network for Video Moment Localization.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Towards Neuron Segmentation from Macaque Brain Images: A Weakly Supervised Approach.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

Towards Semantically Scalable Image Coding using Semantic Map.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

Multi-Scale Group Transformer for Long Sequence Modeling in Speech Separation.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

JPEG Artifacts Removal via Compression Quality Ranker-Guided Networks.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Learning to Discretely Compose Reasoning Module Networks for Video Captioning.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Co-Saliency Spatio-Temporal Interaction Network for Person Re-Identification in Videos.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Multi-Scale Spatial-Temporal Integration Convolutional Tube for Human Action Recognition.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Stacked Convolutional Deep Encoding Network For Video-Text Retrieval.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Time-Sensitive Collaborative Interest Aware Model for Session-Based Recommendation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Convolutional Dictionary Pair Learning Network for Image Representation Learning.
Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August-8 September 2020, Santiago de Compostela, Spain, 2020

Spatiotemporal Fusion in 3D CNNs: A Probabilistic View.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Object Relational Graph With Teacher-Recommended Learning for Video Captioning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

State-Relabeling Adversarial Active Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Deep Structure-Revealed Network for Texture Recognition.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Self-Supervised Domain-Aware Generative Network for Generalized Zero-Shot Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Deep Degradation Prior for Low-Quality Image Classification.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Domain-Aware Visual Bias Eliminating for Generalized Zero-Shot Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Parsing-Based View-Aware Embedding Network for Vehicle Re-Identification.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Real-World Person Re-Identification via Degradation Invariance Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Iterative Context-Aware Graph Inference for Visual Dialog.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Posterior-Guided Neural Architecture Search.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

CircleNet for Hip Landmark Detection.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Spatiotemporal-Textual Co-Attention Network for Video Question Answering.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Convolutional Attention Networks for Scene Text Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Dense 3D-Convolutional Neural Network for Person Re-Identification in Videos.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Cross-Modality Feature Learning via Convolutional Autoencoder.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Learning Compact Appearance Representation for Video-Based Person Re-Identification.
IEEE Trans. Circuits Syst. Video Technol., 2019

A generalized multi-dictionary least squares framework regularized with multi-graph embeddings.
Pattern Recognit., 2019

Dynamically building diversified classifier pruning ensembles via canonical correlation analysis.
Multim. Tools Appl., 2019

Identity Preserve Transform: Understand What Activity Classification Models Have Learnt.
CoRR, 2019

LinesToFacePhoto: Face Photo Generation from Lines with Conditional Self-Attention Generative Adversarial Network.
CoRR, 2019

One-Shot Neural Architecture Search Through A Posteriori Distribution Guided Sampling.
CoRR, 2019

Referring Expression Grounding by Marginalizing Scene Graph Likelihood.
CoRR, 2019

Making History Matter: Gold-Critic Sequence Training for Visual Dialog.
CoRR, 2019

Manifold Alignment via Global and Local Structures Preserving PCA Framework.
IEEE Access, 2019

Abstract Reasoning with Distracting Features.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Learning Deep Bilinear Transformation for Fine-grained Image Representation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Adaptive Alignment Network for Person Re-identification.
Proceedings of the MultiMedia Modeling - 25th International Conference, 2019

Near-Duplicate Video Retrieval Through Toeplitz Kernel Partial Least Squares.
Proceedings of the MultiMedia Modeling - 25th International Conference, 2019

Robust Subspace Discovery by Block-diagonal Adaptive Locality-constrained Representation.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Question-Aware Tube-Switch Network for Video Question Answering.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Ground-Aware Point Cloud Semantic Segmentation for Autonomous Driving.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Domain-Specific Embedding Network for Zero-Shot Recognition.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Deep Adversarial Graph Attention Convolution Network for Text-Based Person Search.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

LinesToFacePhoto: Face Photo Generation From Lines With Conditional Self-Attention Generative Adversarial Networks.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Illumination-Invariant Person Re-Identification.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Hybrid Image Enhancement With Progressive Laplacian Enhancing Unit.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Hierarchical Global-Local Temporal Modeling for Video Captioning.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

DADNet: Dilated-Attention-Deformable ConvNet for Crowd Counting.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

BERT4SessRec: Content-Based Video Relevance Prediction with Bidirectional Encoder Representations from Transformer.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Progressive Retinex: Mutually Reinforced Illumination-Noise Perception Network for Low-Light Image Enhancement.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Cross-Fiber Spatial-Temporal Co-enhanced Networks for Video Action Recognition.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Fast and Accurate Electron Microscopy Image Registration with 3D Convolution.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Progressive Learning for Neuronal Population Reconstruction from Optical Microscopy Images.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Extract Bone Parts Without Human Prior: End-to-end Convolutional Neural Network for Pediatric Bone Age Assessment.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Instance Segmentation from Volumetric Biomedical Images Without Voxel-Wise Labeling.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Neural Network-Based Arithmetic Coding for Inter Prediction Information in HEVC.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

One-Shot Texture Retrieval with Global Context Metric.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Densely Supervised Hierarchical Policy-Value Network for Image Paragraph Generation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Exploring the Task Cooperation in Multi-goal Visual Navigation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Mutually Reinforced Spatio-Temporal Convolutional Tube for Human Action Recognition.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Structure-Aware Residual Pyramid Network for Monocular Depth Estimation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

MLTS: A Multi-Language Scene Text Spotter.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Multimodal Semantic Attention Network for Video Captioning.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Knowing User Better: Jointly Predicting Click-Through and Playtime for Micro-Video.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Semantic-Embedding and Shape-Aware U-Net for Ultrasound Eyeball Segmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Accurate Segmentation of Synaptic Cleft with Contour Growing Concatenated with a Convnet.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Adaptive Structure-Constrained Robust Latent Low-Rank Coding for Image Recovery.
Proceedings of the 2019 IEEE International Conference on Data Mining, 2019

Deep Multiple-Attribute-Perceived Network for Real-World Texture Recognition.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Making History Matter: History-Advantage Sequence Training for Visual Dialog.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Learning to Assemble Neural Module Tree Networks for Visual Grounding.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

JPEG Artifacts Reduction via Deep Convolutional Sparse Coding.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Context-Reinforced Semantic Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Adaptive Transfer Network for Cross-Domain Person Re-Identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Camera Lens Super-Resolution.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Inferential Machine Comprehension: Answering Questions by Recursively Deducing the Evidence Chain from Text.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Robust Deep Co-Saliency Detection with Group Semantic.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

A Two-Stream Mutual Attention Network for Semi-Supervised Biomedical Segmentation with Noisy Labels.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
A Fast Uyghur Text Detector for Complex Background Images.
IEEE Trans. Multim., 2018

CCLBR: Congestion Control-Based Load Balanced Routing in Unstructured P2P Systems.
IEEE Syst. J., 2018

Explainability by Parsing: Neural Module Tree Networks for Natural Language Visual Grounding.
CoRR, 2018

CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification.
CoRR, 2018

Multi-Level Deep Cascade Trees for Conversion Rate Prediction.
CoRR, 2018

Fully Point-wise Convolutional Neural Network for Modeling Statistical Regularities in Natural Images.
CoRR, 2018

A CNN-Based In-Loop Filter with CU Classification for HEVC.
Proceedings of the IEEE Visual Communications and Image Processing, 2018

Particle Swarm Programming-Based Interactive Content-Based Image Retrieval.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Collaborative Detection and Caption Network.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Temporal-Contextual Attention Network for Video-Based Person Re-identification.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Co-occurrent Structural Edge Detection for Color-Guided Depth Map Super-Resolution.
Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

A Feature-Adaptive Semi-Supervised Framework for Co-saliency Detection.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

LA-Net: Layout-Aware Dense Network for Monocular Depth Estimation.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Session details: Vision-3 (Applications in Multimedia).
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Session details: Vision-2 (Object & Scene Understanding).
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Connectionist Temporal Fusion for Sign Language Translation.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Context-Aware Visual Policy Network for Sequence-Level Image Captioning.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Attention and Language Ensemble for Scene Text Recognition with Convolutional Sequence Modeling.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Content-Based Video Relevance Prediction with Second-Order Relevance and Attention Modeling.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Temporal Hierarchical Attention at Category- and Item-Level for Micro-Video Click-Through Prediction.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Object Trajectory Proposal via Hierarchical Volume Grouping.
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

SketchPointNet: A Compact Network for Robust Sketch Recognition.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

3D CNN-Based Soma Segmentation from Brain Images at Single-Neuron Resolution.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Towards Human-Level License Plate Recognition.
Proceedings of the Computer Vision - ECCV 2018, 2018

Deep Residual Attention Network for Spectral Image Super-Resolution.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

CCNet: Cluster-Coordinated Net for Learning Multi-agent Communication Protocols with Reinforcement Learning.
Proceedings of The 10th Asian Conference on Machine Learning, 2018

2017
Guest Editorial: Knowledge-Based Multimedia Computing.
Multim. Tools Appl., 2017

Improving triplet-wise training of convolutional neural network for vehicle re-identification.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Diversity-induced weighted classifier ensemble learning.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Adaptive Pooling in Multi-instance Learning for Web Video Annotation.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Progressive tone mapping of brain images at single-neuron resolution.
Proceedings of the 2017 IEEE Global Conference on Signal and Information Processing, 2017

2016
p-Laplacian Regularized Sparse Coding for Human Activity Recognition.
IEEE Trans. Ind. Electron., 2016

A Unified Scheme for Super-Resolution and Depth Estimation From Asymmetric Stereoscopic Video.
IEEE Trans. Circuits Syst. Video Technol., 2016

A robust vision inspection system for detecting surface defects of film capacitors.
Signal Process., 2016

Guest Editorial: Large-Scale Multimedia Content Analysis on Social Media.
Multim. Tools Appl., 2016

Social media analytics and learning.
Neurocomputing, 2016

Building Locally Discriminative Classifier Ensemble Through Classifier Fusion Among Nearest Neighbors.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Collaborative Q-Learning Based Routing Control in Unstructured P2P Networks.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Linear Distance Preserving Pseudo-Supervised and Unsupervised Hashing.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Multi-Scale Triplet CNN for Person Re-Identification.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Action recognition with novel high-level pose features.
Proceedings of the 2016 IEEE International Conference on Multimedia & Expo Workshops, 2016

Comparative Deep Learning of Hybrid Representations for Image Recommendations.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Corrections to "Exploiting Web Images for Semantic Video Indexing Via Robust Sample-Specific Loss".
IEEE Trans. Multim., 2015

Semantic-Based Location Recommendation With Multimodal Venue Semantics.
IEEE Trans. Multim., 2015

Robust Multiview Feature Learning for RGB-D Image Understanding.
ACM Trans. Intell. Syst. Technol., 2015

An Attribute-Assisted Reranking Model for Web Image Search.
IEEE Trans. Image Process., 2015

Guest editorial: selected papers from ICIMCS 2012.
Multim. Syst., 2015

Video Prefetching Strategy Based on Congestion Finding with Reinforcement Learning in P2P VOD Networks.
计算机科学 (Computer Science), 2015

Depth map super-resolution using stereo-vision-assisted model.
Neurocomputing, 2015

Learning Multi-view Deep Features for Small Object Retrieval in Surveillance Scenarios.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Sparse canonical correlation analysis for recognition.
Proceedings of the 7th International Conference on Internet Multimedia Computing and Service, 2015

Sparse principle motion component for one-shot gesture recognition.
Proceedings of the 7th International Conference on Internet Multimedia Computing and Service, 2015

The AdaBoost algorithm for vehicle detection based on CNN features.
Proceedings of the 7th International Conference on Internet Multimedia Computing and Service, 2015

2014
Attribute-Augmented Semantic Hierarchy: Towards a Unified Framework for Content-Based Image Retrieval.
ACM Trans. Multim. Comput. Commun. Appl., 2014

Introduction to the Special Issue Best Papers of ACM Multimedia 2013.
ACM Trans. Multim. Comput. Commun. Appl., 2014

Exploiting Web Images for Semantic Video Indexing Via Robust Sample-Specific Loss.
IEEE Trans. Multim., 2014

Adaptive Learning for Celebrity Identification With Video Context.
IEEE Trans. Multim., 2014

Product Aspect Ranking and Its Applications.
IEEE Trans. Knowl. Data Eng., 2014

Robust (Semi) Nonnegative Graph Embedding.
IEEE Trans. Image Process., 2014

Gradient-domain-based enhancement of multi-view depth video.
Inf. Sci., 2014

A novel segmentation based video-denoising method with noise level estimation.
Inf. Sci., 2014

Achieving dynamic load balancing through mobile agents in small world P2P networks.
Comput. Networks, 2014

Improving Color Constancy with Internet Photo Collections.
Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014

A Stereo-Vision-Assisted model for depth map super-resolution.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

2013
Hierarchical Organization of Collaboratively Constructed Content.
Proceedings of the People's Web Meets NLP, Collaboratively Constructed Language Resources, 2013

GPSView: A scenic driving route planner.
ACM Trans. Multim. Comput. Commun. Appl., 2013

Beyond Text QA: Multimedia Answer Generation by Harvesting Web Information.
IEEE Trans. Multim., 2013

Visual-Textual Joint Relevance Learning for Tag-Based Social Image Search.
IEEE Trans. Image Process., 2013

Detecting Group Activities With Multi-Camera Context.
IEEE Trans. Circuits Syst. Video Technol., 2013

Marginalized multi-layer multi-instance kernel for video concept detection.
Signal Process., 2013

Multimedia encyclopedia construction by mining web knowledge.
Signal Process., 2013

Interactive social group recommendation for Flickr photos.
Neurocomputing, 2013

Partial-Duplicate Image Retrieval via Saliency-Guided Visual Matching.
IEEE Multim., 2013

Robust Semantic Video Indexing by Harvesting Web Images.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

Attribute-augmented semantic hierarchy: towards bridging semantic gap and intention gap in image retrieval.
Proceedings of the ACM Multimedia Conference, 2013

Learning attribute-aware dictionary for image classification and search.
Proceedings of the International Conference on Multimedia Retrieval, 2013

Click-boosting random walk for image search reranking.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

A Pattern Matching Based Model for Implicit Opinion Question Identification.
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

2012
Oracle in Image Search: A Content-Based Approach to Performance Prediction.
ACM Trans. Inf. Syst., 2012

Interactive Video Indexing With Statistical Active Learning.
IEEE Trans. Multim., 2012

Event Driven Web Video Summarization by Tag Localization and Key-Shot Identification.
IEEE Trans. Multim., 2012

Difficulty Guided Image Retrieval Using Linear Multiple Feature Embedding.
IEEE Trans. Multim., 2012

Parallel Lasso for Large-Scale Video Concept Detection.
IEEE Trans. Multim., 2012

Mining Travel Patterns from Geotagged Photos.
ACM Trans. Intell. Syst. Technol., 2012

Semantic-Gap-Oriented Active Learning for Multilabel Image Annotation.
IEEE Trans. Image Process., 2012

k-Partite graph reinforcement and its application in multimedia information retrieval.
Inf. Sci., 2012

A comprehensive representation scheme for video semantic ontology and its applications in semantic concept detection.
Neurocomputing, 2012

Active learning for social image retrieval using Locally Regressive Optimal Design.
Neurocomputing, 2012

Multimedia Question Answering.
IEEE Multim., 2012

Topology Adaptation Based on Mobile Agent in Unstructured P2P Networks.
Proceedings of the Advances in Multimedia Information Processing - PCM 2012, 2012

Combining SIFT and Global Features for Web Image Classification.
Proceedings of the Advances in Multimedia Information Processing - PCM 2012, 2012

Video Browser Showdown by NUS.
Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

Attribute feedback.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Attribute feedback.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Attribute-assisted reranking for web image retrieval.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Visual query attributes suggestion.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Answering Opinion Questions on Products by Exploiting Hierarchical Organization of Consumer Reviews.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

Robust Non-negative Graph Embedding: Towards noisy data, unreliable graphs, and noisy labels.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Automatic labeling hierarchical topics.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

Text Mining in Multimedia.
Proceedings of the Mining Text Data, 2012

2011
Utilizing Related Samples to Enhance Interactive Concept-Based Video Search.
IEEE Trans. Multim., 2011

Less is More: Efficient 3-D Object Retrieval With Query View Selection.
IEEE Trans. Multim., 2011

Research and applications on georeferenced multimedia: a survey.
Multim. Tools Appl., 2011

Hierarchical organization of unstructured consumer reviews.
Proceedings of the 20th International Conference on World Wide Web, 2011

Multimedia answering: enriching text QA with media information.
Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011

Product comparison using comparative relations.
Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011

Optimizing multimodal reranking for web image search.
Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011

Mining Travel Patterns from GPS-Tagged Photos.
Proceedings of the Advances in Multimedia Modeling, 2011

Semi-automatic Flickr Group Suggestion.
Proceedings of the Advances in Multimedia Modeling, 2011

Integrating rich information for video recommendation with multi-task rank aggregation.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Learning "verb-object" concepts for semantic image annotation.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Learning concept bundles for video search with complex queries.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Difficulty guided image retrieval using linear multiview embedding.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Query expansion by spatial co-occurrence for image retrieval.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Locally regressive G-optimal design for image retrieval.
Proceedings of the 1st International Conference on Multimedia Retrieval, 2011

ShotTagger: tag location for internet videos.
Proceedings of the 1st International Conference on Multimedia Retrieval, 2011

Matching Content-based Saliency Regions for partial-duplicate image retrieval.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Domain-Assisted Product Aspect Hierarchy Generation: Towards Hierarchical Organization of Unstructured Consumer Reviews.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

Aspect Ranking: Identifying Important Product Aspects from Online Consumer Reviews.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

2010
Joint Learning of Labels and Distance Metric.
IEEE Trans. Syst. Man Cybern. Part B, 2010

Visual query suggestion: Towards capturing user intent in internet image search.
ACM Trans. Multim. Comput. Commun. Appl., 2010

TRECVID 2010 Known-item Search by NUS.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

Which Tags Are Related to Visual Content?
Proceedings of the Advances in Multimedia Modeling, 2010

Mediapedia: Mining Web Knowledge to Construct Multimedia Encyclopedia.
Proceedings of the Advances in Multimedia Modeling, 2010

Evaluation of histogram based interest point detector in web image classification and search.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Utilizing related samples to learn complex queries in interactive concept-based video search.
Proceedings of the 9th ACM International Conference on Image and Video Retrieval, 2010

2009
Graph-based semi-supervised learning with multiple labels.
J. Vis. Commun. Image Represent., 2009

Visual query suggestion.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Robust Distance Metric Learning with Auxiliary Knowledge.
Proceedings of the IJCAI 2009, 2009

An efficient sparse metric learning in high-dimensional space via l1-penalized log-determinant regularization.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

2008
Feature Detection and Correspondence for Camera Calibration.
Int. J. Inf. Acquis., 2008

MSRA at TRECVID 2008: High-Level Feature Extraction and Automatic Search.
Proceedings of the TRECVID 2008 workshop participants notebook papers, 2008

Graph-based semi-supervised learning with multi-label.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Optimized video scene segmentation.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Unbiased active learning for image retrieval.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Joint multi-label multi-instance learning for image classification.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

A joint appearance-spatial distance for kernel-based image categorization.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

2007
MSRA-USTC-SJTU at TRECVID 2007: High-Level Feature Extraction and Search.
Proceedings of the TRECVID 2007 workshop participants notebook papers, 2007

Refining video annotation by exploiting pairwise concurrent relation.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Building a comprehensive ontology to refine video concept detection.
Proceedings of the 9th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2007

