Jun Yu

ORCID: 0000-0003-1922-7283

Affiliations:
  • Hangzhou Dianzi University, School of Computer Science, China
  • University of North Carolina at Charlotte, NC, USA (former)
  • Xiamen University, School of Information Science and Technology, China (former)
  • Microsoft Research Asia (2012 - 2013)
  • Nanyang Technological University, School of Computer Engineering, Singapore (2009 - 2011)
  • Zhejiang University, College of Computer Science, China (PhD 2009)


According to our database, Jun Yu authored at least 253 papers between 2001 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Attribute Prototype-Guided Iterative Scene Graph for Explainable Radiology Report Generation.
IEEE Trans. Medical Imaging, December, 2024

Token-Mixer: Bind Image and Text in One Embedding Space for Medical Image Reporting.
IEEE Trans. Medical Imaging, November, 2024

Latent Semantic Consensus for Deterministic Geometric Model Fitting.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2024

Recurrent Appearance Flow for Occlusion-Free Virtual Try-On.
ACM Trans. Multim. Comput. Commun. Appl., August, 2024

Effective Video Summarization by Extracting Parameter-Free Motion Attention.
ACM Trans. Multim. Comput. Commun. Appl., July, 2024

Video Moment Retrieval With Noisy Labels.
IEEE Trans. Neural Networks Learn. Syst., May, 2024

Regularly Truncated M-Estimators for Learning With Noisy Labels.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2024

Transformer-based multimodal feature enhancement networks for multimodal depression detection integrating video, audio and remote photoplethysmograph signals.
Inf. Fusion, April, 2024

A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes.
IEEE Trans. Circuits Syst. Video Technol., March, 2024

Semantic Disentanglement Adversarial Hashing for Cross-Modal Retrieval.
IEEE Trans. Circuits Syst. Video Technol., March, 2024

PAINT: Photo-realistic Fashion Design Synthesis.
ACM Trans. Multim. Comput. Commun. Appl., February, 2024

Multi-Task Paired Masking With Alignment Modeling for Medical Vision-Language Pre-Training.
IEEE Trans. Multim., 2024

Semi-Supervised Medical Report Generation via Graph-Guided Hybrid Feature Consistency.
IEEE Trans. Multim., 2024

DSIS-DPR: Structured Instance Segmentation and Diffusion Prior Refinement for Dental Anatomy Learning.
IEEE Trans. Multim., 2024

FedSea: Federated Learning via Selective Feature Alignment for Non-IID Multimodal Data.
IEEE Trans. Multim., 2024

Multi-Granularity Contrastive Cross-Modal Collaborative Generation for End-to-End Long-Term Video Question Answering.
IEEE Trans. Image Process., 2024

Learning to Discover Knowledge: A Weakly-Supervised Partial Domain Adaptation Approach.
IEEE Trans. Image Process., 2024

FAFusion: Learning for Infrared and Visible Image Fusion via Frequency Awareness.
IEEE Trans. Instrum. Meas., 2024

MTDAN: A Lightweight Multi-Scale Temporal Difference Attention Networks for Automated Video Depression Detection.
IEEE Trans. Affect. Comput., 2024

Semantic-aware hyper-space deformable neural radiance fields for facial avatar reconstruction.
Pattern Recognit. Lett., 2024

GTADT: Gated tone-sensitive acne grading via augmented domain transfer.
Multim. Tools Appl., 2024

Multi2Human: Controllable human image generation with multimodal controls.
Neurocomputing, 2024

ZS-SRT: An efficient zero-shot super-resolution training method for Neural Radiance Fields.
Neurocomputing, 2024

Prompting Video-Language Foundation Models with Domain-specific Fine-grained Heuristics for Video Question Answering.
CoRR, 2024

Imp: Highly Capable Large Multimodal Models for Mobile Devices.
CoRR, 2024

SRGS: Super-Resolution 3D Gaussian Splatting.
CoRR, 2024

Advancing Incremental Few-Shot Semantic Segmentation via Semantic-Guided Relation Alignment and Adaptation.
Proceedings of the MultiMedia Modeling - 30th International Conference, 2024

MVPbev: Multi-view Perspective Image Generation from BEV with Test-time Controllability and Generalizability.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Latent Representation Reorganization for Face Privacy Protection.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Facial Identity Anonymization via Intrinsic and Extrinsic Attention Distraction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

GLOW: Global Layout Aware Attacks on Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Integrating Representation Subspace Mapping with Unimodal Auxiliary Loss for Attention-based Multimodal Emotion Recognition.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Multi-Domain Deep Learning from a Multi-View Perspective for Cross-Border E-commerce Search.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

BCLNet: Bilateral Consensus Learning for Two-View Correspondence Pruning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Graph Context Transformation Learning for Progressive Correspondence Pruning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
LPR: learning point-level temporal action localization through re-training.
Multim. Syst., October, 2023

MARN: Multi-level Attentional Reconstruction Networks for Weakly Supervised Video Temporal Grounding.
Neurocomputing, October, 2023

Concept Parser With Multimodal Graph Learning for Video Captioning.
IEEE Trans. Circuits Syst. Video Technol., September, 2023

BreastDM: A DCE-MRI dataset for breast tumor image segmentation and classification.
Comput. Biol. Medicine, September, 2023

An efficient multi-path structure with staged connection and multi-scale mechanism for text-to-image synthesis.
Multim. Syst., June, 2023

EGRA-NeRF: Edge-Guided Ray Allocation for Neural Radiance Fields.
Image Vis. Comput., June, 2023

Bilaterally Slimmable Transformer for Elastic and Efficient Visual Question Answering.
IEEE Trans. Multim., 2023

Joint Embedding of Deep Visual and Semantic Features for Medical Image Report Generation.
IEEE Trans. Multim., 2023

Dual-Level Adaptive and Discriminative Knowledge Transfer for Cross-Domain Recognition.
IEEE Trans. Multim., 2023

Electromagnetic Imaging Boosted Visual Object Recognition Under Difficult Visual Conditions.
IEEE Trans. Geosci. Remote. Sens., 2023

Import vertical characteristic of rain streak for single image deraining.
Multim. Syst., 2023

Multi-task Paired Masking with Alignment Modeling for Medical Vision-Language Pre-training.
CoRR, 2023

GLOW: Global Layout Aware Attacks for Object Detection.
CoRR, 2023

Contrastive Perturbation Network for Weakly Supervised Temporal Sentence Grounding.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

Follow-me: Deceiving Trackers with Fabricated Paths.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Prompting Large Language Models with Answer Heuristics for Knowledge-Based Visual Question Answering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

ShiftDDPMs: Exploring Conditional Diffusion Models by Shifting Diffusion Trajectories.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Knowledge-Constrained Answer Generation for Open-Ended Video Question Answering.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Fine-grained Image Classification via Multi-scale Selective Hierarchical Biquadratic Pooling.
ACM Trans. Multim. Comput. Commun. Appl., 2022

Domain-invariant Graph for Adaptive Semi-supervised Domain Adaptation.
ACM Trans. Multim. Comput. Commun. Appl., 2022

TaoHighlight: Commodity-Aware Multi-Modal Video Highlight Detection in E-Commerce.
IEEE Trans. Multim., 2022

An Individual-Difference-Aware Model for Cross-Person Gaze Estimation.
IEEE Trans. Image Process., 2022

Multiview Consensus Structure Discovery.
IEEE Trans. Cybern., 2022

Local-Global Graph Pooling via Mutual Information Maximization for Video-Paragraph Retrieval.
IEEE Trans. Circuits Syst. Video Technol., 2022

Towards Knowledge-Aware Video Captioning via Transitive Visual Relationship Detection.
IEEE Trans. Circuits Syst. Video Technol., 2022

Exploring Fine-Grained Cluster Structure Knowledge for Unsupervised Domain Adaptation.
IEEE Trans. Circuits Syst. Video Technol., 2022

Generalized Multi-View Collaborative Subspace Clustering.
IEEE Trans. Circuits Syst. Video Technol., 2022

QoS-Driven Resource Optimization for Intelligent Fog Radio Access Network: A Dynamic Power Allocation Perspective.
IEEE Trans. Cogn. Commun. Netw., 2022

Hierarchical Deep Click Feature Prediction for Fine-Grained Image Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Weakly supervised moment localization with natural language based on semantic reconstruction.
Image Vis. Comput., 2022

A contrastive triplet network for automatic chest X-ray reporting.
Neurocomputing, 2022

Interaction augmented transformer with decoupled decoding for video captioning.
Neurocomputing, 2022

Modeling long-term video semantic distribution for temporal action proposal generation.
Neurocomputing, 2022

Joint usage of global and local attentions in hourglass network for human pose estimation.
Neurocomputing, 2022

FDAM: full-dimension attention module for deep convolutional neural networks.
Int. J. Multim. Inf. Retr., 2022

Semisupervised image classification by mutual learning of multiple self-supervised models.
Int. J. Intell. Syst., 2022

Guest Editorial: Intelligent information processing and services in media convergence.
Int. J. Intell. Syst., 2022

Towards Efficient and Elastic Visual Question Answering with Doubly Slimmable Transformer.
CoRR, 2022

Hyper-relationship Learning Network for Scene Graph Generation.
CoRR, 2022

Complex-valued Reinforcement Learning Based Dynamic Beamforming Design for IRS Aided Time-Varying Downlink Channel.
Proceedings of the 95th IEEE Vehicular Technology Conference, 2022

Unsupervised Domain Adaptation Integrating Transformer and Mutual Information for Cross-Corpus Speech Emotion Recognition.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Delegate-based Utility Preserving Synthesis for Pedestrian Image Anonymization.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Triple Disentangling Network for Unsupervised Domain Adaptation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Group Correspondence: A Statistical Perspective for Incomplete Multi-View Clustering Augmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

ESCNet: Gaze Target Detection with the Understanding of 3D Scenes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Toward Multi-Modal Conditioned Fashion Image Translation.
IEEE Trans. Multim., 2021

Asymmetric Supervised Consistent and Specific Hashing for Cross-Modal Retrieval.
IEEE Trans. Image Process., 2021

Complementary, Heterogeneous and Adversarial Networks for Image-to-Image Translation.
IEEE Trans. Image Process., 2021

SPRNet: Single-Pixel Reconstruction for One-Stage Instance Segmentation.
IEEE Trans. Cybern., 2021

Toward Realistic Face Photo-Sketch Synthesis via Composition-Aided GANs.
IEEE Trans. Cybern., 2021

Long-Term Video Question Answering via Multimodal Hierarchical Memory Attentive Networks.
IEEE Trans. Circuits Syst. Video Technol., 2021

Coupled Knowledge Transfer for Visual Data Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2021

Distributed feedback network for single-image deraining.
Inf. Sci., 2021

Deep embedding of concept ontology for hierarchical fashion recognition.
Neurocomputing, 2021

Unnoticeable synthetic face replacement for image privacy protection.
Neurocomputing, 2021

Contrastive learning of graph encoder for accelerating pedestrian trajectory prediction training.
IET Image Process., 2021

The Story in Your Eyes: An Individual-difference-aware Model for Cross-person Gaze Estimation.
CoRR, 2021

Federated Learning Model Training Method Based on Data Features Perception Aggregation.
Proceedings of the 94th IEEE Vehicular Technology Conference, 2021

Effective De-identification Generative Adversarial Network for Face Anonymization.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Weakly Supervised Dense Video Captioning via Jointly Usage of Knowledge Distillation and Cross-modal Matching.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Learning Controlled Semantic Embedding for Cross-Modal Retrieval.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Deep Graph-neighbor Coherence Preserving Network for Unsupervised Cross-modal Hashing.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Proposal Complementary Action Detection.
ACM Trans. Multim. Comput. Commun. Appl., 2020

Spatial Pyramid-Enhanced NetVLAD With Weighted Triplet Loss for Place Recognition.
IEEE Trans. Neural Networks Learn. Syst., 2020

Compositional Attention Networks With Two-Stream Fusion for Video Question Answering.
IEEE Trans. Image Process., 2020

Constrained Discriminative Projection Learning for Image Classification.
IEEE Trans. Image Process., 2020

Multimodal Transformer With Multi-View Visual Representation for Image Captioning.
IEEE Trans. Circuits Syst. Video Technol., 2020

Guest Editorial Introduction to the Special Section on Representation Learning for Visual Content Understanding.
IEEE Trans. Circuits Syst. Video Technol., 2020

TVENet: Temporal variance embedding network for fine-grained action representation.
Pattern Recognit., 2020

Intra- and Inter-modal Multilinear Pooling with Multitask Learning for Video Grounding.
Neural Process. Lett., 2020

Fine-grained image classification with factorized deep user click feature.
Inf. Process. Manag., 2020

Incremental focal loss GANs.
Inf. Process. Manag., 2020

Fine-grained visual understanding and reasoning.
Neurocomputing, 2020

Style-adaptive photo aesthetic rating via convolutional neural networks and multi-task learning.
Neurocomputing, 2020

Multi-task Compositional Network for Visual Relationship Detection.
Int. J. Comput. Vis., 2020

Image Retrieval via Gated Multiscale NetVLAD for Social Media Applications.
IEEE Multim., 2020

Representation learning of image composition for aesthetic prediction.
Comput. Vis. Image Underst., 2020

Comprehensive Graph-conditional Similarity Preserving Network for Unsupervised Cross-modal Hashing.
CoRR, 2020

Learning Domain-invariant Graph for Adaptive Semi-supervised Domain Adaptation with Few Labeled Source Samples.
CoRR, 2020

Repulsive Mixture Models of Exponential Family PCA for Clustering.
CoRR, 2020

Detecting Communities in Heterogeneous Multi-Relational Networks: A Message Passing based Approach.
CoRR, 2020

Weakly-Supervised Multi-Level Attentional Reconstruction Network for Grounding Textual Queries in Videos.
CoRR, 2020

Discriminative Regions Erasing Strategy for Weakly-Supervised Temporal Action Localization.
Proceedings of the Pattern Recognition and Computer Vision - Third Chinese Conference, 2020

Relationship graph learning network for visual relationship detection.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Deep Multimodal Neural Architecture Search.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Diversified Bayesian Nonnegative Matrix Factorization.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Effective 3-D Shape Retrieval by Integrating Traditional Descriptors and Pointwise Convolution.
IEEE Trans. Multim., 2019

Long-Form Video Question Answering via Dynamic Hierarchical Reinforced Networks.
IEEE Trans. Image Process., 2019

Scalable Zero-Shot Learning via Binary Visual-Semantic Embeddings.
IEEE Trans. Image Process., 2019

Zero-Shot Learning via Robust Latent Representation and Manifold Regularization.
IEEE Trans. Image Process., 2019

Image Recognition by Predicted User Click Feature With Multidomain Multitask Transfer Deep Network.
IEEE Trans. Image Process., 2019

Multimodal Face-Pose Estimation With Multitask Manifold Deep Learning.
IEEE Trans. Ind. Informatics, 2019

Adapting Stochastic Block Models to Power-Law Degree Distributions.
IEEE Trans. Cybern., 2019

Deep Mixture of Diverse Experts for Large-Scale Visual Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Multimodal activity recognition with local block CNN and attention-based spatial weighted CNN.
J. Vis. Commun. Image Represent., 2019

Realization of a Novel Logarithmic Chaotic System and Its Characteristic Analysis.
Int. J. Bifurc. Chaos, 2019

End-to-end visual grounding via region proposal networks and bilinear pooling.
IET Comput. Vis., 2019

Multimodal Unified Attention Networks for Vision-and-Language Interactions.
CoRR, 2019

Single Pixel Reconstruction for One-stage Instance Segmentation.
CoRR, 2019

Video Dialog via Multi-Grained Convolutional Self-Attention Context Networks.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Multi-interaction Network with Object Relation for Video Question Answering.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

PCPCAD: Proposal Complementary Action Detector.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

On Exploring Undetermined Relationships for Visual Relationship Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Deep Modular Co-Attention Networks for Visual Question Answering.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Embedding Complementary Deep Networks for Image Classification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Dynamic Resource Allocation in High-Speed Railway Fog Radio Access Networks with Delay Constraint.
Proceedings of the Communications and Networking, 2019

ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
User-Click-Data-Based Fine-Grained Image Recognition via Weakly Supervised Metric Learning.
ACM Trans. Multim. Comput. Commun. Appl., 2018

Beyond Bilinear: Generalized Multimodal Factorized High-Order Pooling for Visual Question Answering.
IEEE Trans. Neural Networks Learn. Syst., 2018

Embedding Visual Hierarchy With Deep Networks for Large-Scale Visual Recognition.
IEEE Trans. Image Process., 2018

Local Deep-Feature Alignment for Unsupervised Dimension Reduction.
IEEE Trans. Image Process., 2018

Leveraging Content Sensitiveness and User Trustworthiness to Recommend Fine-Grained Privacy Settings for Social Image Sharing.
IEEE Trans. Inf. Forensics Secur., 2018

Multitask Autoencoder Model for Recovering Human Poses.
IEEE Trans. Ind. Electron., 2018

Unsupervised image segmentation via Stacked Denoising Auto-encoder and hierarchical patch indexing.
Signal Process., 2018

Face biometric quality assessment via light CNN.
Pattern Recognit. Lett., 2018

Integrating multi-level deep learning and concept ontology for large-scale visual recognition.
Pattern Recognit., 2018

Blind image quality prediction by exploiting multi-level deep representations.
Pattern Recognit., 2018

Machine learning for big visual analysis.
Mach. Vis. Appl., 2018

Click data guided query modeling with click propagation and sparse coding.
Multim. Tools Appl., 2018

Textually Guided Ranking Network for Attentional Image Retweet Modeling.
CoRR, 2018

Learning Decorrelated Hashing Codes for Multimodal Retrieval.
CoRR, 2018

Deformable Point Cloud Recognition Using Intrinsic Function and Deep Learning.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Comprehensive Distance-Preserving Autoencoders for Cross-Modal Retrieval.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Deep Learning for Multimedia: Science or Technology?
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Ontology-Driven Hierarchical Deep Learning for Fashion Recognition.
Proceedings of the IEEE 1st Conference on Multimedia Information Processing and Retrieval, 2018

Open-Ended Long-form Video Question Answering via Adaptive Hierarchical Reinforced Networks.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Traffic Signals Timing Cycle Length Learning: Using Taxi Gps Trajectories.
Proceedings of the 2018 International Conference on Machine Learning and Cybernetics, 2018

Deep Point Convolutional Approach for 3D Model Retrieval.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Deep click feature based query merging for robust image recognition.
Proceedings of the 10th International Conference on Internet Multimedia Computing and Service, 2018

FishEyeRecNet: A Multi-context Collaborative Deep Network for Fisheye Image Rectification.
Proceedings of the Computer Vision - ECCV 2018, 2018

2017
HD-MTL: Hierarchical Deep Multi-Task Learning for Large-Scale Visual Recognition.
IEEE Trans. Image Process., 2017

iPrivacy: Image Privacy Protection by Identifying Sensitive Objects via Deep Multi-Task Learning.
IEEE Trans. Inf. Forensics Secur., 2017

Coupled Deep Autoencoder for Single Image Super-Resolution.
IEEE Trans. Cybern., 2017

Deep Multimodal Distance Metric Learning Using Click Constraints for Image Ranking.
IEEE Trans. Cybern., 2017

Constrained Low-Rank Learning Using Least Squares-Based Regularization.
IEEE Trans. Cybern., 2017

Diversified dictionaries for multi-instance learning.
Pattern Recognit., 2017

Three-dimensional image-based human pose recovery with hypergraph regularized autoencoders.
Multim. Tools Appl., 2017

Machine learning and signal processing for big multimedia analysis.
Neurocomputing, 2017

Exemplar-based 3D human pose estimation with sparse spectral embedding.
Neurocomputing, 2017

DeepSim: Deep similarity for image quality assessment.
Neurocomputing, 2017

Multi-modal Face Pose Estimation with Multi-task Manifold Deep Learning.
CoRR, 2017

Composition-aided Sketch-realistic Portrait Generation.
CoRR, 2017

Beyond Bilinear: Generalized Multi-modal Factorized High-order Pooling for Visual Question Answering.
CoRR, 2017

Embedding Visual Hierarchy with Deep Networks for Large-Scale Visual Recognition.
CoRR, 2017

Deep Mixture of Diverse Experts for Large-Scale Visual Recognition.
CoRR, 2017

Improving Stochastic Block Models by Incorporating Power-Law Degree Characteristic.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Privacy Setting Recommendation for Image Sharing.
Proceedings of the 16th IEEE International Conference on Machine Learning and Applications, 2017

Deep Mixture of Experts with Diverse Task Spaces.
Proceedings of the 16th IEEE International Conference on Machine Learning and Applications, 2017

Fine-grained image recognition via weakly supervised click data guided bilinear CNN model.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Convolutional neural networks for intestinal hemorrhage detection in wireless capsule endoscopy images.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Query Modeling for Click Data Based Image Recognition Using Graph Based Propagation and Sparse Coding.
Proceedings of the Internet Multimedia Computing and Service, 2017

Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering.
Proceedings of the IEEE International Conference on Computer Vision, 2017

2016
Biologically inspired image quality assessment.
Signal Process., 2016

Data-driven facial animation via semi-supervised local patch alignment.
Pattern Recognit., 2016

Realtime and robust object matching with a large number of templates.
Multim. Tools Appl., 2016

Towards robust subspace recovery via sparsity-constrained latent low-rank representation.
J. Vis. Commun. Image Represent., 2016

Boosting video popularity through keyword suggestion and recommendation systems.
Neurocomputing, 2016

Recent developments on deep big vision.
Neurocomputing, 2016

Photo aesthetic quality assessment via label distribution learning.
Proceedings of the 2016 IEEE International Conference on Systems, Man, and Cybernetics, 2016

Data-driven facial animation via hypergraph learning.
Proceedings of the 2016 IEEE International Conference on Systems, Man, and Cybernetics, 2016

Deep Similarity Feature Learning for Person Re-identification.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Multi-modal Image Re-ranking with Autoencoders and Click Semantics.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Deep Neural Networks with Relativity Learning for facial expression recognition.
Proceedings of the 2016 IEEE International Conference on Multimedia & Expo Workshops, 2016

Deep Neural Network Boosted Large Scale Image Recognition Using User Click Data.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2016

Weakly Supervised Hand Pose Recovery with Domain Adaptation by Low-Rank Alignment.
Proceedings of the IEEE International Conference on Data Mining Workshops, 2016

2015
Multimodal Deep Autoencoder for Human Pose Recovery.
IEEE Trans. Image Process., 2015

Image-Based Three-Dimensional Human Pose Recovery by Multiview Locality-Sensitive Sparse Retrieval.
IEEE Trans. Ind. Electron., 2015

Semiautomated Extraction of Street Light Poles From Mobile LiDAR Point-Clouds.
IEEE Trans. Geosci. Remote. Sens., 2015

Learning to Rank Using User Clicks and Visual Features for Image Retrieval.
IEEE Trans. Cybern., 2015

Machine learning and signal processing for human pose recovery and behavior analysis.
Signal Process., 2015

Semantic embedding for indoor scene recognition by weighted hypergraph learning.
Signal Process., 2015

Low-rank matrix factorization with multiple Hypergraph regularizer.
Pattern Recognit., 2015

Multi-view ensemble manifold regularization for 3D object recognition.
Inf. Sci., 2015

l2,1 Norm regularized fisher criterion for optimal feature selection.
Neurocomputing, 2015

Human pose recovery by supervised spectral embedding.
Neurocomputing, 2015

Hessian Regularized Sparse Coding for Human Action Recognition.
Proceedings of the MultiMedia Modeling - 21st International Conference, 2015

Supervised Spectral Embedding for Human Pose Estimation.
Proceedings of the Intelligence Science and Big Data Engineering. Image and Video Data Engineering, 2015

Hypergraph Regularized Autoencoder for 3D Human Pose Recovery.
Proceedings of the Computer Vision - CCF Chinese Conference, 2015

2014
Exploiting Click Constraints and Multi-view Features for Image Re-ranking.
IEEE Trans. Multim., 2014

Click Prediction for Web Image Reranking Using Multimodal Sparse Coding.
IEEE Trans. Image Process., 2014

High-Order Distance-Based Multiview Stochastic Learning in Image Classification.
IEEE Trans. Cybern., 2014

Image clustering based on sparse patch alignment framework.
Pattern Recognit., 2014

Pairwise Three-Dimensional Shape Context for Partial Object Matching and Retrieval on Mobile Laser Scanning Data.
IEEE Geosci. Remote. Sens. Lett., 2014

Automated Detection of Road Manhole and Sewer Well Covers From Mobile LiDAR Point Clouds.
IEEE Geosci. Remote. Sens. Lett., 2014

Semantic preserving distance metric learning and applications.
Inf. Sci., 2014

Image clustering by hyper-graph regularized non-negative matrix factorization.
Neurocomputing, 2014

Genetic algorithm for spanning tree construction in P2P distributed interactive applications.
Neurocomputing, 2014

Semantic-based intelligent data clean framework for big data.
Proceedings of the IEEE International Conference on Security, 2014

Structured action classification with hypergraph regularization.
Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics, 2014

Sparse Manifold Learning and Its Applications in Image Classification.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2014

2013
Pairwise constraints based multiview features fusion for scene classification.
Pattern Recognit., 2013

High-level attributes modeling for indoor scenes classification.
Neurocomputing, 2013

Automatic cartoon matching in computer-assisted animation production.
Neurocomputing, 2013

Skeleton correspondence construction and its applications in animation style reusing.
Neurocomputing, 2013

Multi-view hypergraph learning by patch alignment framework.
Neurocomputing, 2013

Image-Based 3D Human Pose Recovery with Locality Sensitive Sparse Retrieval.
Proceedings of the IEEE International Conference on Systems, 2013

2012
On Combining Multiple Features for Cartoon Character Retrieval and Clip Synthesis.
IEEE Trans. Syst. Man Cybern. Part B, 2012

Semisupervised Multiview Distance Metric Learning for Cartoon Synthesis.
IEEE Trans. Image Process., 2012

Adaptive Hypergraph Learning and its Application in Image Classification.
IEEE Trans. Image Process., 2012

Interactive cartoon reusing by transfer learning.
Signal Process., 2012

Image classification by multimodal subspace learning.
Pattern Recognit. Lett., 2012

Semi-supervised distance metric learning based on local linear regression for data clustering.
Neurocomputing, 2012

Graph based transductive learning for cartoon correspondence construction.
Neurocomputing, 2012

Transductive Cartoon Retrieval by Multiple Hypergraph Learning.
Proceedings of the Neural Information Processing - 19th International Conference, 2012

2011
Complex Object Correspondence Construction in Two-Dimensional Animation.
IEEE Trans. Image Process., 2011

Cartoon synthesis using constrained spreading activation network.
Multim. Tools Appl., 2011

Semi-automatic cartoon generation by motion planning.
Multim. Syst., 2011

Fuzzy Diffusion Distance Learning for Cartoon Similarity Estimation.
J. Comput. Sci. Technol., 2011

Stroke Correspondence Construction Using Manifold Learning.
Comput. Graph. Forum, 2011

2010
Recognizing Cartoon Image Gestures for Retrieval and Interactive Cartoon Clip Synthesis.
IEEE Trans. Circuits Syst. Video Technol., 2010

Transductive graph based cartoon synthesis.
Comput. Animat. Virtual Worlds, 2010

2008
Perspective-aware cartoon clips synthesis.
Comput. Animat. Virtual Worlds, 2008

2007
Adaptive control in cartoon data reusing.
Comput. Animat. Virtual Worlds, 2007

2001
Spatiotemporal segmentation for compact video representation.
Signal Process. Image Commun., 2001

