Rongrong Ji

Orcid: 0000-0001-9163-2932

Affiliations:
  • Xiamen University, Xiamen, China
  • Columbia University, New York, NY, USA


According to our database, Rongrong Ji authored at least 670 papers between 2006 and 2025.

Bibliography

2025
M3ixup: A multi-modal data augmentation approach for image captioning.
Pattern Recognit., 2025

2024
Training-Free Transformer Architecture Search With Zero-Cost Proxy Guided Evolution.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2024

Uncovering the Over-Smoothing Challenge in Image Super-Resolution: Entropy-Based Quantification and Contrastive Optimization.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2024

Toward Unified Token Learning for Vision-Language Tracking.
IEEE Trans. Circuits Syst. Video Technol., April, 2024

Positive-Sample-Free Object Tracking via a Soft Constraint.
IEEE Trans. Circuits Syst. Video Technol., March, 2024

Transformer Tracking via Frequency Fusion.
IEEE Trans. Circuits Syst. Video Technol., February, 2024

Shadow-aware dynamic convolution for shadow removal.
Pattern Recognit., February, 2024

A closer look at branch classifiers of multi-exit architectures.
Comput. Vis. Image Underst., February, 2024

Towards Language-Guided Visual Recognition via Dynamic Convolutions.
Int. J. Comput. Vis., January, 2024

A Survivor in the Era of Large-Scale Pretraining: An Empirical Study of One-Stage Referring Expression Comprehension.
IEEE Trans. Multim., 2024

HODN: Disentangling Human-Object Feature for HOI Detection.
IEEE Trans. Multim., 2024

Bilateral Knowledge Interaction Network for Referring Image Segmentation.
IEEE Trans. Multim., 2024

Weakly-Supervised RGBD Video Object Segmentation.
IEEE Trans. Image Process., 2024

Defense Against Adversarial Attacks Using Topology Aligning Adversarial Training.
IEEE Trans. Inf. Forensics Secur., 2024

An efficient blur kernel estimation method for blind image Super-Resolution.
Pattern Recognit., 2024

Deep hybrid transformer network for robust modulation classification in wireless communications.
Knowl. Based Syst., 2024

DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion.
CoRR, 2024

Boosting CLIP Adaptation for Image Quality Assessment via Meta-Prompt Learning and Gradient Regularization.
CoRR, 2024

PartFormer: Awakening Latent Diverse Representation from Vision Transformer for Object Re-Identification.
CoRR, 2024

I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing.
CoRR, 2024

TraDiffusion: Trajectory-Based Training-Free Image Generation.
CoRR, 2024

CamoTeacher: Dual-Rotation Consistency Learning for Semi-Supervised Camouflaged Object Detection.
CoRR, 2024

Beyond Inter-Item Relations: Dynamic Adaptive Mixture-of-Experts for LLM-Based Sequential Recommendation.
CoRR, 2024

VITA: Towards Open-Source Interactive Omni Multimodal LLM.
CoRR, 2024

EasyInv: Toward Fast and Better DDIM Inversion.
CoRR, 2024

Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation.
CoRR, 2024

ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models.
CoRR, 2024

Move and Act: Enhanced Object Manipulation and Background Integrity for Image Editing.
CoRR, 2024

Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model.
CoRR, 2024

INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model.
CoRR, 2024

Routing Experts: Learning to Route Dynamic Experts in Multi-modal Large Language Models.
CoRR, 2024

Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model.
CoRR, 2024

AnySR: Realizing Image Super-Resolution as Any-Scale, Any-Resource.
CoRR, 2024

Oracle Bone Inscriptions Multi-modal Dataset.
CoRR, 2024

HRSAM: Efficiently Segment Anything in High-Resolution Images.
CoRR, 2024

HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection.
CoRR, 2024

Local Manifold Learning for No-Reference Image Quality Assessment.
CoRR, 2024

UIO-LLMs: Unbiased Incremental Optimization for Long-Context LLMs.
CoRR, 2024

Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text.
CoRR, 2024

Depth-Guided Semi-Supervised Instance Segmentation.
CoRR, 2024

Evaluating and Analyzing Relationship Hallucinations in LVLMs.
CoRR, 2024

VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models.
CoRR, 2024

Image Captioning via Dynamic Path Customization.
CoRR, 2024

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis.
CoRR, 2024

Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion.
CoRR, 2024

Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference.
CoRR, 2024

ObjectAdd: Adding Objects into Image via a Training-Free Diffusion Modification Fashion.
CoRR, 2024

CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method.
CoRR, 2024

Multi-Modal Prompt Learning on Blind Image Quality Assessment.
CoRR, 2024

NeRF-DetS: Enhancing Multi-View 3D Object Detection with Sampling-adaptive Network of Continuous NeRF-based Representation.
CoRR, 2024

ConCLVD: Controllable Chinese Landscape Video Generation via Diffusion Model.
CoRR, 2024

Rethinking 3D Dense Caption and Visual Grounding in A Unified Framework through Prompt-based Localization.
CoRR, 2024

DiffusionFace: Towards a Comprehensive Dataset for Diffusion-Based Face Forgery Analysis.
CoRR, 2024

Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models.
CoRR, 2024

DMAD: Dual Memory Bank for Real-World Anomaly Detection.
CoRR, 2024

Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers.
CoRR, 2024

DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation.
CoRR, 2024

Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models.
CoRR, 2024

Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling.
CoRR, 2024

EBFT: Effective and Block-Wise Fine-Tuning for Sparse LLMs.
CoRR, 2024

Unified-Width Adaptive Dynamic Network for All-In-One Image Restoration.
CoRR, 2024

Feature Denoising Diffusion Model for Blind Image Quality Assessment.
CoRR, 2024

Cross-Modality Perturbation Synergy Attack for Person Re-identification.
CoRR, 2024

Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation.
CoRR, 2024

StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Adaptive Selection based Referring Image Segmentation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

3D-GRES: Generalized 3D Referring Expression Segmentation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Multimodal Inplace Prompt Tuning for Open-set Object Detection.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Deep Instruction Tuning for Segment Anything Model.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Prompting to Adapt Foundational Segmentation Models.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Cantor: Inspiring Multimodal Chain-of-Thought of MLLM.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

QueryMatch: A Query-based Contrastive Learning Framework for Weakly Supervised Visual Grounding.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

ERQ: Error Reduction for Post-Training Quantization of Vision Transformers.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Outlier-aware Slicing for Post-Training Quantization in Vision Transformer.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Integrating Global Context Contrast and Local Sensitivity for Blind Image Quality Assessment.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Adaptive Feature Selection for No-Reference Image Quality Assessment by Mitigating Semantic Noise Sensitivity.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

CaM: Cache Merging for Memory-efficient LLMs Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

AffineQuant: Affine Transformation Quantization for Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Exploring Target Representations for Masked Autoencoders.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

GreedyAgent: Crafting Efficient Agents for Meta-learning from Learning Curves via Greedy Algorithm Selection.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2024

Functionally Similar Multi-Label Knowledge Distillation.
Proceedings of the IEEE International Conference on Acoustics, 2024

Code Membership Inference for Detecting Unauthorized Data Use in Code Pre-trained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

AnyTrans: Translate AnyText in the Image with Large Scale Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

TF-FAS: Twofold-Element Fine-Grained Semantic Guidance for Generalizable Face Anti-spoofing.
Proceedings of the Computer Vision - ECCV 2024, 2024

Multi-branch Collaborative Learning Network for 3D Visual Grounding.
Proceedings of the Computer Vision - ECCV 2024, 2024

AccDiffusion: An Accurate Method for Higher-Resolution Image Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Enhancing Tampered Text Detection Through Frequency Feature Fusion and Decomposition.
Proceedings of the Computer Vision - ECCV 2024, 2024

DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

GraCo: Granularity-Controllable Interactive Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

UniPTS: A Unified Framework for Proficient Post-Training Sparsity.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Aligning and Prompting Everything All at Once for Universal Visual Perception.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

PortraitBooth: A Versatile Portrait Model for Fast Identity-Preserved Personalization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

FocSAM: Delving Deeply into Focused Objects in Segmenting Anything.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MMAPS: End-to-End Multi-Grained Multi-Modal Attribute-Aware Product Summarization.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Learning Image Demoiréing from Unpaired Real Data.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Toward Open-Set Human Object Interaction Detection.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Lottery Jackpots Exist in Pre-Trained Models.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Super Vision Transformer.
Int. J. Comput. Vis., December, 2023

Pruning Networks With Cross-Layer Ranking & k-Reciprocal Nearest Filters.
IEEE Trans. Neural Networks Learn. Syst., November, 2023

Distilling a Powerful Student Model via Online Knowledge Distillation.
IEEE Trans. Neural Networks Learn. Syst., November, 2023

Carrying Out CNN Channel Pruning in a White Box.
IEEE Trans. Neural Networks Learn. Syst., October, 2023

Prioritized Subnet Sampling for Resource-Adaptive Supernet Training.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2023

Training Compact CNNs for Image Classification Using Dynamic-Coded Filter Fusion.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2023

Towards local visual modeling for image captioning.
Pattern Recognit., June, 2023

SiMaN: Sign-to-Magnitude Network Binarization.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

Generating Hypergraph-Based High-Order Representations of Whole-Slide Histopathological Images for Survival Prediction.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning.
Int. J. Comput. Vis., May, 2023

Leveraging Local and Global Cues for Visual Tracking via Parallel Interaction Network.
IEEE Trans. Circuits Syst. Video Technol., April, 2023

Robust Tracking via Uncertainty-Aware Semantic Consistency.
IEEE Trans. Circuits Syst. Video Technol., April, 2023

1xN Pattern for Pruning Convolutional Neural Networks.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

SiamBAN: Target-Aware Tracking With Siamese Box Adaptive Network.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

Dynamic Support Network for Few-Shot Class Incremental Learning.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

HGNN+: General Hypergraph Neural Networks.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

A Real-Time Global Inference Network for One-Stage Referring Expression Comprehension.
IEEE Trans. Neural Networks Learn. Syst., 2023

Fast Monocular Depth Estimation via Side Prediction Aggregation with Continuous Spatial Refinement.
IEEE Trans. Multim., 2023

Knowing What it is: Semantic-Enhanced Dual Attention Transformer.
IEEE Trans. Multim., 2023

Learning Efficient GANs for Image Translation via Differentiable Masks and Co-Attention Distillation.
IEEE Trans. Multim., 2023

Multi-Branch Distance-Sensitive Self-Attention Network for Image Captioning.
IEEE Trans. Multim., 2023

Semantically Consistent Visual Representation for Adversarial Robustness.
IEEE Trans. Inf. Forensics Secur., 2023

Adaptive Feature Selection for No-Reference Image Quality Assessment using Contrastive Mitigating Semantic Noise Sensitivity.
CoRR, 2023

Boosting the Cross-Architecture Generalization of Dataset Distillation through an Empirical Study.
CoRR, 2023

Less is More: Learning Reference Knowledge Using No-Reference Image Quality Assessment.
CoRR, 2023

X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation.
CoRR, 2023

I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization.
CoRR, 2023

NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning.
CoRR, 2023

JM3D & JM3D-LLM: Elevating 3D Representation with Joint Multi-modal Cues.
CoRR, 2023

Towards Unified Token Learning for Vision-Language Tracking.
CoRR, 2023

DLIP: Distilling Language-Image Pre-training.
CoRR, 2023

M3PS: End-to-End Multi-Grained Multi-Modal Attribute-Aware Product Summarization in E-commerce.
CoRR, 2023

Continual Face Forgery Detection via Historical Distribution Preserving.
CoRR, 2023

Towards General Visual-Linguistic Face Forgery Detection.
CoRR, 2023

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer.
CoRR, 2023

Approximated Prompt Tuning for Vision-Language Pre-trained Models.
CoRR, 2023

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models.
CoRR, 2023

Spatial Re-parameterization for N:M Sparsity.
CoRR, 2023

Adapting Pre-trained Language Models to Vision-Language Tasks via Dynamic Visual Prompting.
CoRR, 2023

CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models.
CoRR, 2023

MultiQuant: A Novel Multi-Branch Topology Method for Arbitrary Bit-width Network Quantization.
CoRR, 2023

Distribution-Flexible Subset Quantization for Post-Quantizing Super-Resolution Networks.
CoRR, 2023

Latent Feature Relation Consistency for Adversarial Robustness.
CoRR, 2023

CAT: Collaborative Adversarial Training.
CoRR, 2023

Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-Identification.
CoRR, 2023

Towards End-to-end Semi-supervised Learning for One-stage Object Detection.
CoRR, 2023

Towards Efficient Visual Adaption via Structural Re-parameterization.
CoRR, 2023

Spectral Aware Softmax for Visible-Infrared Person Re-Identification.
CoRR, 2023

Exploring Invariant Representation for Visible-Infrared Person Re-Identification.
CoRR, 2023

Unsupervised Domain Adaptation on Person Re-Identification via Dual-level Asymmetric Mutual Learning.
CoRR, 2023

Self-supervised Graph Representation Learning for Black Market Account Detection.
Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 2023

Two-Stage Deep Learning Segmentation for Tiny Brain Regions.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Discover and Align Taxonomic Context Priors for Open-world Semi-Supervised Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

CAPro: Webly Supervised Learning with Cross-modality Aligned Prototypes.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Improving Adversarial Robustness via Information Bottleneck Distillation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Semi-Supervised Panoptic Narrative Grounding.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Beat: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Learning Occlusion Disentanglement with Fine-grained Localization for Occluded Person Re-identification.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Improving Human-Object Interaction Detection via Virtual Image Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

PixelFace+: Towards Controllable Face Generation and Manipulation with Text Descriptions and Segmentation Masks.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

EALink: An Efficient and Accurate Pre-Trained Framework for Issue-Commit Link Recovery.
Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 2023

RefBERT: A Two-Stage Pre-trained Framework for Automatic Rename Refactoring.
Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 2023

Interactive Object Placement with Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

Bi-directional Masks for Efficient N:M Sparse Training.
Proceedings of the International Conference on Machine Learning, 2023

Real-Time Image Demoiréing on Mobile Devices.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

InterFormer: Real-time Interactive Image Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Pseudo-label Alignment for Semi-supervised Instance Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DiffRate: Differentiable Compression Rate for Efficient Vision Transformers.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SMMix: Self-Motivated Image Mixing for Vision Transformers.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Category-aware Allocation Transformer for Weakly Supervised Object Localization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Automatic Network Pruning via Hilbert-Schmidt Independence Criterion Lasso under Information Bottleneck Principle.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DistilPose: Tokenized Pose Regression with Heatmap Distillation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Meta Architecture for Point Cloud Analysis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Clover: Towards A Unified Video-Language Alignment and Fusion Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Discriminator-Cooperated Feature Map Distillation for GAN Compression.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

You Only Segment Once: Towards Real-Time Panoptic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

OMPQ: Orthogonal Mixed Precision Quantization.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

CF-ViT: A General Coarse-to-Fine Method for Vision Transformer.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Attention-Based Neural Architecture Search for Person Re-Identification.
IEEE Trans. Neural Networks Learn. Syst., 2022

Network Pruning Using Adaptive Exemplar Filters.
IEEE Trans. Neural Networks Learn. Syst., 2022

Filter Sketch for Network Pruning.
IEEE Trans. Neural Networks Learn. Syst., 2022

Knowledge-Driven Generative Adversarial Network for Text-to-Image Synthesis.
IEEE Trans. Multim., 2022

Towards Lightweight Transformer Via Group-Wise Transformation for Vision-and-Language Tasks.
IEEE Trans. Image Process., 2022

Knowing What to Learn: A Metric-Oriented Focal Mechanism for Image Captioning.
IEEE Trans. Image Process., 2022

Disentangling Task-Oriented Representations for Unsupervised Domain Adaptation.
IEEE Trans. Image Process., 2022

Plenty is Plague: Fine-Grained Learning for Visual Question Answering.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Fast Class-Wise Updating for Online Hashing.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Towards Robust Adversarial Training via Dual-label Supervised and Geometry Constraint.
Int. J. Softw. Informatics, 2022

Exploring Content Relationships for Distilling Efficient GANs.
CoRR, 2022

Shadow Removal by High-Quality Shadow Synthesis.
CoRR, 2022

Meta Architecture for Point Cloud Analysis.
CoRR, 2022

Exploiting the Partly Scratch-off Lottery Ticket for Quantization-Aware Training.
CoRR, 2022

LAB-Net: LAB Color-Space Oriented Lightweight Network for Shadow Removal.
CoRR, 2022

CycleTrans: Learning Neutral yet Discriminative Features for Visible-Infrared Person Re-Identification.
CoRR, 2022

Clover: Towards A Unified Video-Language Alignment and Fusion Model.
CoRR, 2022

Super Vision Transformer.
CoRR, 2022

Shadow-Aware Dynamic Convolution for Shadow Removal.
CoRR, 2022

What Goes beyond Multi-modal Fusion in One-stage Referring Expression Comprehension: An Empirical Study.
CoRR, 2022

End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation.
CoRR, 2022

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation.
CoRR, 2022

Global2Local: A Joint-Hierarchical Attention for Video Captioning.
CoRR, 2022

Factored Attention and Embedding for Unstructured-view Topic-related Ultrasound Report Generation.
CoRR, 2022

Differentiated Relevances Embedding for Group-based Referring Expression Comprehension.
CoRR, 2022

Coarse-to-Fine Vision Transformer.
CoRR, 2022

Optimizing Gradient-driven Criteria in Network Sparsity: Gradient is All You Need.
CoRR, 2022

What Hinders Perceptual Quality of PSNR-oriented Methods?
CoRR, 2022

Deepwalk-aware graph convolutional networks.
Sci. China Inf. Sci., 2022

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learning Best Combination for Efficient N:M Sparsity.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Dynamic Prototype Mask for Occluded Person Re-Identification.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Towards Open-Ended Text-to-Face Generation, Combination and Manipulation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Learning Dynamic Prior Knowledge for Text-to-Face Pixel Synthesis.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and Editability.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Searching Lightweight Neural Network for Image Signal Processing.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Visual Tempo Contrastive Learning for Few-Shot Action Recognition.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

MDNet: Motion Distinction Network for Effective Action Recognition.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

SeqTR: A Simple Yet Universal Network for Visual Grounding.
Proceedings of the Computer Vision - ECCV 2022, 2022

Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks.
Proceedings of the Computer Vision - ECCV 2022, 2022

Fine-grained Data Distribution Alignment for Post-Training Quantization.
Proceedings of the Computer Vision - ECCV 2022, 2022

Black-Box Dissector: Towards Erasing-Based Hard-Label Model Stealing Attack.
Proceedings of the Computer Vision - ECCV 2022, 2022

ECO-TR: Efficient Correspondences Finding via Coarse-to-Fine Refinement.
Proceedings of the Computer Vision - ECCV 2022, 2022

An Information Theoretic Approach for Attention-Driven Face Forgery Detection.
Proceedings of the Computer Vision - ECCV 2022, 2022

The Tenth Visual Object Tracking VOT2022 Challenge Results.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Privacy-Preserving Face Recognition with Learnable Privacy Budgets in Frequency Domain.
Proceedings of the Computer Vision - ECCV 2022, 2022

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation.
Proceedings of the Computer Vision - ECCV 2022, 2022

ARM: Any-Time Super-Resolution Method.
Proceedings of the Computer Vision - ECCV 2022, 2022

Training-free Transformer Architecture Search.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Neural Architecture Search with Representation Mutual Information.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DIFNet: Boosting Visual Information Flow for Image Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Active Teacher for Semi-Supervised Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Boosting Crowd Counting via Multifaceted Attention.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Dual Contrastive Learning for General Face Forgery Detection.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Learning to Learn Transferable Attack.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Aggregating Global and Local Visual Representation for Vehicle Re-IDentification.
IEEE Trans. Multim., 2021

Uncovering Media Bias via Social Network Learning.
ACM Trans. Intell. Syst. Technol., 2021

Beyond Universal Person Re-Identification Attack.
IEEE Trans. Inf. Forensics Secur., 2021

Bio-Inspired Deep Attribute Learning Towards Facial Aesthetic Prediction.
IEEE Trans. Affect. Comput., 2021

Joint segmentation and detection of COVID-19 via a sequential region generation network.
Pattern Recognit., 2021

Evolving Fully Automated Machine Learning via Life-Long Knowledge Anchors.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

MIGO-NAS: Towards Fast and Generalizable Neural Architecture Search.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Winning Solutions and Post-Challenge Analyses of the ChaLearn AutoDL Challenge 2019.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Cauchy loss induced block diagonal representation for robust multi-view subspace clustering.
Neurocomputing, 2021

Real-time semantic segmentation via sequential knowledge distillation.
Neurocomputing, 2021

Binarized Neural Architecture Search for Efficient Object Recognition.
Int. J. Comput. Vis., 2021

Towards Language-guided Visual Recognition via Dynamic Convolutions.
CoRR, 2021

OMPQ: Orthogonal Mixed Precision Quantization.
CoRR, 2021

Prioritized Subnet Sampling for Resource-Adaptive Supernet Training.
CoRR, 2021

Fine-grained Data Distribution Alignment for Post-Training Quantization.
CoRR, 2021

An Information Theory-inspired Strategy for Automatic Network Pruning.
CoRR, 2021

Training Compact CNNs for Image Classification using Dynamic-coded Filter Fusion.
CoRR, 2021

GuidedMix-Net: Learning to Improve Pseudo Masks Using Labeled Images as Reference.
CoRR, 2021

You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient.
CoRR, 2021

1×N Block Pattern for Network Sparsity.
CoRR, 2021

ISTR: End-to-End Instance Segmentation with Transformers.
CoRR, 2021

Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack.
CoRR, 2021

Carrying out CNN Channel Pruning in a White Box.
CoRR, 2021

DisCo: Remedy Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning.
CoRR, 2021

Lottery Jackpots Exist in Pre-trained Models.
CoRR, 2021

Learnable Expansion-and-Compression Network for Few-shot Class-Incremental Learning.
CoRR, 2021

Distilling a Powerful Student Model via Online Knowledge Distillation.
CoRR, 2021

On Evolving Attention Towards Domain Adaptation.
CoRR, 2021

DeeperForensics Challenge 2020 on Real-World Face Forgery Detection: Methods and Results.
CoRR, 2021

SiMaN: Sign-to-Magnitude Network Binarization.
CoRR, 2021

Aurora Guard: Reliable Face Anti-Spoofing via Mobile Lighting System.
CoRR, 2021

Non-Parametric Adaptive Network Pruning.
CoRR, 2021

Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

CDP: Towards Optimal Filter Pruning via Class-wise Discriminative Power.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Long-Range Feature Propagating for Natural Image Matting.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Show, Read and Reason: Table Structure Recognition with Flexible Context Aggregator.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

RecycleNet: An Overlapped Text Instance Recovery Approach.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

E2Net: Excitative-Expansile Learning for Weakly Supervised Object Localization.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Towards Robustness Against Natural Language Word Substitutions.
Proceedings of the 9th International Conference on Learning Representations, 2021

The Ninth Visual Object Tracking VOT2021 Challenge Results.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

A Dual-stream Framework for 3D Mask Face Presentation Attack Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

EC-DARTS: Inducing Equalized and Consistent Optimization into DARTS.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

TRAR: Routing the Attention Spans in Transformer for Visual Question Answering.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

ReCU: Reviving the Dead Weights in Binary Neural Networks.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Parallel Detection-and-Segmentation Learning for Weakly Supervised Instance Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Aha! Adaptive History-driven Attack for Decision-based Black-box Models.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Seminar Learning for Click-Level Weakly Supervised Semantic Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Occlude Them All: Occlusion-Aware Attention Network for Occluded Person Re-ID.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Architecture Disentanglement for Deep Neural Networks.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Removing the Background by Adding the Background: Towards Background Robust Self-Supervised Video Representation Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Image-to-Image Translation via Hierarchical Style Disentanglement.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Beyond Max-Margin: Class Margin Equilibrium for Few-Shot Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Towards Compact CNNs via Collaborative Compression.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Domain General Face Forgery Detection by Learning to Weight.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Dual-level Collaborative Transformer for Image Captioning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Local Relation Learning for Face Forgery Detection.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Dual Distribution Alignment Network for Generalizable Person Re-Identification.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
A New Transfer Function for Volume Visualization of Aortic Stent and Its Application to Virtual Endoscopy.
ACM Trans. Multim. Comput. Commun. Appl., 2020

Toward Compact ConvNets via Structure-Sparsity Regularized Filter Pruning.
IEEE Trans. Neural Networks Learn. Syst., 2020

Fine-Grained Spatial Alignment Model for Person Re-Identification With Focal Triplet Loss.
IEEE Trans. Image Process., 2020

Category-Aware Spatial Constraint for Weakly Supervised Detection.
IEEE Trans. Image Process., 2020

Similarity-Preserving Linkage Hashing for Online Image Retrieval.
IEEE Trans. Image Process., 2020

Every node counts: Self-ensembling graph convolutional networks for semi-supervised learning.
Pattern Recognit., 2020

Semi-Supervised Adversarial Monocular Depth Estimation.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Link-aware semi-supervised hypergraph.
Inf. Sci., 2020

Hadamard Matrix Guided Online Hashing.
Int. J. Comput. Vis., 2020

Learning Efficient GANs using Differentiable Masks and co-Attention Distillation.
CoRR, 2020

PAMS: Quantized Super-Resolution via Parameterized Max Scale.
CoRR, 2020

Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion.
CoRR, 2020

Learning Task-oriented Disentangled Representations for Unsupervised Domain Adaptation.
CoRR, 2020

Dual Distribution Alignment Network for Generalizable Person Re-Identification.
CoRR, 2020

Architecture Disentanglement for Deep Neural Networks.
CoRR, 2020

ASFD: Automatic and Scalable Face Detector.
CoRR, 2020

Distribution Distillation Loss: Generic Approach for Improving Face Recognition from Hard Samples.
CoRR, 2020

Filter Sketch for Network Pruning.
CoRR, 2020

UWSOD: Toward Fully-Supervised-Level Capacity Weakly Supervised Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Rotated Binary Neural Network.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

K-armed Bandit based Multi-Modal Network Architecture Search for Visual Question Answering.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Exploring Language Prior for Mode-Sensitive Visual Attention Modeling.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Cascade Grouped Attention Network for Referring Expression Segmentation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Attacking Image Captioning Towards Accuracy-Preserving Target Words Removal.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Dual Channel Hypergraph Collaborative Filtering.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Channel Pruning via Automatic Structure Search.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Polynomial Universal Adversarial Perturbations for Person Re-Identification.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Multiple Expert Brainstorming for Domain Adaptive Person Re-Identification.
Proceedings of the Computer Vision - ECCV 2020, 2020

Enabling Deep Residual Networks for Weakly Supervised Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

PAMS: Quantized Super-Resolution via Parameterized Max Scale.
Proceedings of the Computer Vision - ECCV 2020, 2020

Interpretable Neural Network Decoupling.
Proceedings of the Computer Vision - ECCV 2020, 2020

Improving Face Recognition from Hard Samples via Distribution Distillation Loss.
Proceedings of the Computer Vision - ECCV 2020, 2020

API-Net: Robust Generative Classifier via a Single Discriminator.
Proceedings of the Computer Vision - ECCV 2020, 2020

SSCGAN: Facial Attribute Editing via Style Skip Connections.
Proceedings of the Computer Vision - ECCV 2020, 2020

Anti-bandit Neural Architecture Search for Model Defense.
Proceedings of the Computer Vision - ECCV 2020, 2020

Cogradient Descent for Bilinear Optimization.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Rethinking Performance Estimation in Neural Architecture Search.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

AD-Cluster: Augmented Discriminative Clustering for Domain Adaptive Person Re-Identification.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Noise-Aware Fully Webly Supervised Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Filter Grafting for Deep Neural Networks.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

HRank: Filter Pruning Using High-Rank Feature Map.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Projection & Probability-Driven Black-Box Attack.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Siamese Box Adaptive Network for Visual Tracking.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

One-Shot Adversarial Attacks on Visual Tracking With Dual Attention.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Salience-Guided Cascaded Suppression Network for Person Re-Identification.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Asymmetric Co-Teaching for Unsupervised Cross-Domain Person Re-Identification.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Revisiting Image Aesthetic Assessment via Self-Supervised Feature Learning.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Fast Learning of Temporal Action Proposal via Dense Boundary Generator.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Binarized Neural Architecture Search.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Face Sketch Synthesis by Multidomain Adversarial Learning.
IEEE Trans. Neural Networks Learn. Syst., 2019

Cross-Modality Microblog Sentiment Prediction via Bi-Layer Multimodal Hypergraph Learning.
IEEE Trans. Multim., 2019

Deep Manifold Structure Transfer for Action Recognition.
IEEE Trans. Image Process., 2019

Correntropy-Induced Robust Low-Rank Hypergraph.
IEEE Trans. Image Process., 2019

Exploring High-Order Correlations for Industry Anomaly Detection.
IEEE Trans. Ind. Electron., 2019

Ordinal Constraint Binary Coding for Approximate Nearest Neighbor Search.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Holistic CNN Compression via Low-Rank Decomposition with Knowledge Transfer.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Font generation based on least squares conditional generative adversarial nets.
Multim. Tools Appl., 2019

Do Hotel Responses Matter?: A Comprehensive Perspective on Investigating Online Reviews.
Inf. Resour. Manag. J., 2019

Universal Adversarial Perturbations Against Person Re-Identification.
CoRR, 2019

Hadamard Codebook Based Deep Hashing.
CoRR, 2019

Semantic-aware Image Deblurring.
CoRR, 2019

Scene-based Factored Attention for Image Captioning.
CoRR, 2019

Dynamic Neural Network Decoupling.
CoRR, 2019

Dynamic Distribution Pruning for Efficient Network Architecture Search.
CoRR, 2019

Supervised Online Hashing via Similarity Distribution Learning.
CoRR, 2019

Attribute Guided Unpaired Image-to-Image Translation with Semi-supervised Learning.
CoRR, 2019

Aurora Guard: Real-Time Face Anti-Spoofing via Light Reflection.
CoRR, 2019

Towards Compact ConvNets via Structure-Sparsity Regularized Filter Pruning.
CoRR, 2019

Social Media Based Topic Modeling for Smart Campus: A Deep Topical Correlation Analysis Method.
IEEE Access, 2019

FreeAnchor: Learning to Match Anchors for Visual Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Information Competing Process for Learning Diversified Representations.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Variational Structured Semantic Inference for Diverse Image Captioning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Session details: Brave New Idea.
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019

Multi-scale Features for Weakly Supervised Lesion Detection of Cerebral Hemorrhage with Collaborative Learning.
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019, 2019

Multi-modal Multi-layer Fusion Network with Average Binary Center Loss for Face Anti-spoofing.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

A Part Power Set Model for Scale-Free Person Retrieval.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Hypergraph Induced Convolutional Manifold Networks.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Generalized Zero-Shot Vehicle Detection in Remote Sensing Imagery via Coarse-to-Fine Framework.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Colloquial Image Captioning.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Multi-scale Gem Pooling with N-Pair Center Loss for Fine-Grained Image Search.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Visual-Textual Sentiment Analysis in Product Reviews.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Multinomial Distribution Learning for Effective Neural Architecture Search.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Structured Modeling of Joint Deep Feature and Prediction Refinement for Salient Object Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Universal Perturbation Attack Against Image Retrieval.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Bayesian Optimized 1-Bit CNNs.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Scoot: A Perceptual Metric for Facial Sketches.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Universal Adversarial Perturbation via Prior Driven Uncertainty Approximation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Towards Cross-modality Topic Modelling via Deep Topical Correlation Analysis.
Proceedings of the IEEE International Conference on Acoustics, 2019

Learning Similarity-specific Dictionary for Zero-shot Fine-grained Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

DSNET: Accelerate Indoor Scene Semantic Segmentation.
Proceedings of the IEEE International Conference on Acoustics, 2019

Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Circulant Binary Convolutional Networks: Enhancing the Performance of 1-Bit DCNNs With Circulant Back Propagation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Towards Optimal Structured CNN Pruning via Generative Adversarial Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Towards Visual Feature Translation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Dynamic Capsule Attention for Visual Question Answering.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Free VQA Models from Knowledge Inertia by Pairwise Inconformity Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Towards Optimal Fine Grained Retrieval via Decorrelated Centralized Loss with Normalize-Scale Layer.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

PVRNet: Point-View Relation Neural Network for 3D Shape Recognition.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Learning Neural Bag-of-Matrix-Summarization with Riemannian Network.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Towards Optimal Discrete Online Hashing with Balanced Similarity.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Hypergraph Neural Networks.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Weakly Supervised Object Detection via Object-Specific Pixel Gradient.
IEEE Trans. Neural Networks Learn. Syst., 2018

Predicting Microblog Sentiments via Weakly Supervised Multimodal Deep Learning.
IEEE Trans. Multim., 2018

A Stacked Sparse Autoencoder-Based Detector for Automatic Identification of Neuromagnetic High Frequency Oscillations in Epilepsy.
IEEE Trans. Medical Imaging, 2018

Inductive Multi-Hypergraph Learning and Its Application on View-Based 3D Object Classification.
IEEE Trans. Image Process., 2018

Action-Attending Graphic Neural Network.
IEEE Trans. Image Process., 2018

Body Structure Aware Deep Crowd Counting.
IEEE Trans. Image Process., 2018

Image Quality Assessment for Color Correction Based on Color Contrast Similarity and Color Value Difference.
IEEE Trans. Circuits Syst. Video Technol., 2018

Face sketch aging via aging oriented principal component analysis.
Pattern Recognit. Lett., 2018

AAM Based Face Sketch Synthesis.
Neural Process. Lett., 2018

Less is More: Unified Model for Unsupervised Multi-Domain Image-to-Image Translation.
CoRR, 2018

Face Sketch Synthesis Style Similarity: A New Structure Co-occurrence Texture Measure.
CoRR, 2018

Topically-informed bilingually-constrained recursive autoencoders for statistical machine translation.
Commun. Inf. Syst., 2018

Surface Saliency Detection Based on Curvature Co-Occurrence Histograms.
IEEE Access, 2018

Context-Aware Phrase Representation for Statistical Machine Translation.
Proceedings of the PRICAI 2018: Trends in Artificial Intelligence, 2018

Topic-Guided Automatical Human-Simulated Tweeting System.
Proceedings of the PRICAI 2018: Trends in Artificial Intelligence, 2018

PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Dense Auto-Encoder Hashing for Robust Cross-Modality Retrieval.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Supervised Online Hashing via Hadamard Codebook Learning.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Session details: Multimedia-2 (Social & Emotional Multimedia).
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Centralized Ranking Loss with Weakly Supervised Localization for Fine-Grained Object Retrieval.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Robust Face Sketch Synthesis via Generative Adversarial Fusion of Priors and Parametric Sigmoid.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Accelerating Convolutional Networks via Global & Dynamic Filter Pruning.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Cross-Modality Person Re-Identification with Generative Adversarial Training.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Depth-assisted RefineNet for Indoor Semantic Segmentation.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Towards Compact Visual Descriptor via Deep Fisher Network with Binary Embedding.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Gamma Mixture Models for Outlier Removal.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Modulated Convolutional Networks.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Generative Adversarial Learning Towards Fast Weakly Supervised Detection.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

GroupCap: Group-Based Image Captioning With Structured Relevance and Diversity Constraints.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Asynchronous Bidirectional Decoding for Neural Machine Translation.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Output Constraint Transfer for Kernelized Correlation Filter in Tracking.
IEEE Trans. Syst. Man Cybern. Syst., 2017

Continuous Probability Distribution Prediction of Image Emotions via Multitask Shared Sparse Regression.
IEEE Trans. Multim., 2017

Mobile Social Multimedia Analytics in the Big Data Era: An Introduction to the Special Issue.
ACM Trans. Intell. Syst. Technol., 2017

Learning-Based Shadow Recognition and Removal From Monochromatic Natural Images.
IEEE Trans. Image Process., 2017

Toward Optimal Manifold Hashing via Discrete Locally Linear Embedding.
IEEE Trans. Image Process., 2017

Exploring Coherent Motion Patterns via Structured Trajectory Learning for Crowd Mood Modeling.
IEEE Trans. Circuits Syst. Video Technol., 2017

Weakly supervised vehicle detection in satellite images via multi-instance discriminative learning.
Pattern Recognit., 2017

Learning high-dimensional multimedia data.
Multim. Syst., 2017

Special issue on "visual semantic analysis with weak supervision".
Multim. Syst., 2017

Deep Spatio-temporal Manifold Network for Action Recognition.
CoRR, 2017

More Than An Answer: Neural Pivot Network for Visual Question Answering.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

StructCap: Structured Semantic Embedding for Image Captioning.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Deep-based fisher vector for mobile visual search.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Sensitive Information Detection on Cyber-Space.
Proceedings of the Image and Graphics - 9th International Conference, 2017

Optimization Algorithm Toward Deep Features Based Camera Pose Estimation.
Proceedings of the Image and Graphics - 9th International Conference, 2017

Cross-Modality Binary Code Learning via Fusion Similarity Hashing.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Lattice-Based Recurrent Neural Network Encoders for Neural Machine Translation.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Ordinal Constrained Binary Code Learning for Nearest Neighbor Search.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

ESPACE: Accelerating Convolutional Neural Networks via Eliminating Spatial and Channel Redundancy.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Image Categorization by Learning a Propagated Graphlet Path.
IEEE Trans. Neural Networks Learn. Syst., 2016

Joint Depth and Semantic Inference from a Single Image via Elastic Conditional Random Field.
Pattern Recognit., 2016

Towards perceptual video cropping with curve fitting.
Multim. Tools Appl., 2016

On application-unbiased benchmarking of web videos from a social network perspective.
Multim. Tools Appl., 2016

Visual sentiment topic model based microblog image sentiment analysis.
Multim. Tools Appl., 2016

Fast verification via statistical geometric for mobile visual search.
Multim. Syst., 2016

Spectral-spatial co-clustering of hyperspectral image data based on bipartite graph.
Multim. Syst., 2016

Decomposed human localization from social photo album.
Multim. Syst., 2016

Special issue: When social media meets physical world.
Multim. Syst., 2016

A cross-media public sentiment analysis system for microblog.
Multim. Syst., 2016

Discriminative local collaborative representation for online object tracking.
Knowl. Based Syst., 2016

Special issue on weakly supervised learning.
J. Vis. Commun. Image Represent., 2016

Local consistent hierarchical Hough Match for image re-ranking.
J. Vis. Commun. Image Represent., 2016

A novel features ranking metric with application to scalable visual and bioinformatics data classification.
Neurocomputing, 2016

Advanced learning for large-scale heterogeneous computing.
Neurocomputing, 2016

Learning for medical imaging.
Neurocomputing, 2016

3D object retrieval with multi-feature collaboration and bipartite graph matching.
Neurocomputing, 2016

Detection based object labeling of 3D point cloud for indoor scenes.
Neurocomputing, 2016

The distributed system for inverted multi-index visual retrieval.
Neurocomputing, 2016

Masked face detection via a modified LeNet.
Neurocomputing, 2016

Multimodal learning for view-based 3D object classification.
Neurocomputing, 2016

Web video topics discovery and structuralization with social network.
Neurocomputing, 2016

Dynamic programming based optimized product quantization for approximate nearest neighbor search.
Neurocomputing, 2016

Bounding Multiple Gaussians Uncertainty with Application to Object Tracking.
Int. J. Comput. Vis., 2016

Face recognition by decision fusion of two-dimensional linear discriminant analysis and local binary pattern.
Frontiers Comput. Sci., 2016

Survey of visual sentiment prediction for social media analysis.
Frontiers Comput. Sci., 2016

Predicting Personalized Emotion Perceptions of Social Images.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Supervised Matrix Factorization for Cross-Modality Hashing.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Towards Convolutional Neural Networks Compression via Global Error Reconstruction.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Crowd video retrieval via deep attribute-embedding graph ranking.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

Towards Building Abstraction by Using Line Segment Descriptor.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2016

Variational Neural Discourse Relation Recognizer.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Search-Based Depth Estimation via Coupled Dictionary Learning with Large-Margin Structure Inference.
Proceedings of the Computer Vision - ECCV 2016, 2016

A spatial-temporal visual mid-level ontology for GIF sentiment analysis.
Proceedings of the IEEE Congress on Evolutionary Computation, 2016

Towards Optimal Binary Code Learning via Ordinal Embedding.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Towards Domain Adaptive Vehicle Detection in Satellite Image by Supervised Super-Resolution Transfer.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

3D Object Retrieval with Multimodal Views.
Proceedings of the 9th Eurographics Workshop on 3D Object Retrieval, 2016

2015
Learning a Probabilistic Topology Discovering Model for Scene Categorization.
IEEE Trans. Neural Networks Learn. Syst., 2015

Probabilistic Skimlets Fusion for Summarizing Multiple Consumer Landmark Videos.
IEEE Trans. Multim., 2015

When Location Meets Social Multimedia: A Survey on Vision-Based Recognition and Mining for Geo-Social Multimedia Analytics.
ACM Trans. Intell. Syst. Technol., 2015

On-Device Mobile Landmark Recognition Using Binarized Descriptor with Multifeature Fusion.
ACM Trans. Intell. Syst. Technol., 2015

Spatial-Aware Object-Level Saliency Prediction by Learning Graphlet Hierarchies.
IEEE Trans. Ind. Electron., 2015

Social Attribute-Aware Force Model: Exploiting Richness of Interaction for Abnormal Crowd Detection.
IEEE Trans. Circuits Syst. Video Technol., 2015

High-capacity reversible watermarking scheme of 2D-vector data.
Signal Image Video Process., 2015

Sparse auto-encoder based feature learning for human body detection in depth image.
Signal Process., 2015

Signal processing and learning methods for 3D semantic analysis.
Signal Process., 2015

Robust infrared target tracking based on particle filter with embedded saliency detection.
Inf. Sci., 2015

Localizing web videos using social images.
Inf. Sci., 2015

Learning for visual semantic understanding in big data.
Neurocomputing, 2015

Feature learning based on SAE-PCA network for human gesture recognition in RGBD images.
Neurocomputing, 2015

Learning for 3D understanding.
Neurocomputing, 2015

Video (GIF) Sentiment Analysis using Large-Scale Mid-Level Ontology.
CoRR, 2015

A Cross-media Sentiment Analytics Platform For Microblog.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Modeling Inter- and Intra-Part Deformations for Object Structure Parsing.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Multimodal hypergraph learning for microblog sentiment prediction.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

An effective eye states detection method based on the projection of the gray interval distribution.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Sentiment analysis of Chinese micro-blog based on multi-modal correlation model.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Top Rank Supervised Binary Coding for Visual Search.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Forward stereo obstacle detection with Weighted Hough Transform and local temporal correlation.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Interactive on-device Mobile Landmark Recognition with compact binary codes.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Rank Preserving Hashing for Rapid Image Search.
Proceedings of the 2015 Data Compression Conference, 2015

Understanding image structure via hierarchical shape parsing.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Towards 3D object detection with bimodal deep Boltzmann machines over RGBD imagery.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Cross-Modality Sentiment Analysis for Social Multimedia.
Proceedings of the 2015 IEEE International Conference on Multimedia Big Data, BigMM 2015, 2015

Low-Rank Similarity Metric Learning in High Dimensions.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015


2014
Representative Discovery of Structure Cues for Weakly-Supervised Image Segmentation.
IEEE Trans. Multim., 2014

Towards Mobile Document Image Retrieval for Digital Library.
IEEE Trans. Multim., 2014

Weakly Supervised Multi-Graph Learning for Robust Image Reranking.
IEEE Trans. Multim., 2014

Learning High-Level Feature by Deep Belief Networks for 3-D Model Retrieval and Recognition.
IEEE Trans. Multim., 2014

Actively Learning Human Gaze Shifting Paths for Semantics-Aware Photo Cropping.
IEEE Trans. Image Process., 2014

Toward Statistical Modeling of Saccadic Eye-Movement and Visual Saliency.
IEEE Trans. Image Process., 2014

Learning-Based Bipartite Graph Matching for View-Based 3D Model Retrieval.
IEEE Trans. Image Process., 2014

Spatiotemporal Grid Flow for Video Retargeting.
IEEE Trans. Image Process., 2014

Mining Compact Bag-of-Patterns for Low Bit Rate Mobile Visual Search.
IEEE Trans. Image Process., 2014

Weakly Supervised Visual Dictionary Learning by Harnessing Image Attributes.
IEEE Trans. Image Process., 2014

Hyperspectral Image Classification Through Bilayer Graph-Based Learning.
IEEE Trans. Image Process., 2014

3-D Object Retrieval With Hausdorff Distance Learning.
IEEE Trans. Ind. Electron., 2014

Spectral-Spatial Constraint Hyperspectral Image Classification.
IEEE Trans. Geosci. Remote. Sens., 2014

Symbiotic Tracker Ensemble Toward A Unified Tracking Framework.
IEEE Trans. Circuits Syst. Video Technol., 2014

Improved and Promising Identification of Human MicroRNAs by Incorporating a High-Quality Negative Set.
IEEE ACM Trans. Comput. Biol. Bioinform., 2014

Visual tracking via weakly supervised learning from multiple imperfect oracles.
Pattern Recognit., 2014

Where should I stand? Learning based human position recommendation for mobile photographing.
Multim. Tools Appl., 2014

Online semi-supervised compressive coding for robust visual tracking.
J. Vis. Commun. Image Represent., 2014

Structured partial least squares for simultaneous object tracking and segmentation.
Neurocomputing, 2014

Robust tracking via patch-based appearance model and local background estimation.
Neurocomputing, 2014

Single/cross-camera multiple-person tracking by graph matching.
Neurocomputing, 2014

Online MIL tracking with instance-level semi-supervised learning.
Neurocomputing, 2014

Large-Scale Geosocial Multimedia [Guest editorial].
IEEE Multim., 2014

Discriminative Orthogonal Nonnegative matrix factorization with flexibility for data representation.
Expert Syst. Appl., 2014

Efficient semantic image segmentation with multi-class ranking prior.
Comput. Vis. Image Underst., 2014

Pursuing Detector Efficiency for Simple Scene Pedestrian Detection.
Proceedings of the MultiMedia Modeling - 20th Anniversary International Conference, 2014

Hacking Chinese Touclick CAPTCHA by Multi-Scale Corner Structure Model with Fast Pattern Matching.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Robust nonnegative matrix factorization via L1 norm regularization by multiplicative updating rules.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Microblog Sentiment Analysis Based on Cross-media Bag-of-words Model.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2014

RGBD Salient Object Detection: A Benchmark and Algorithms.
Proceedings of the Computer Vision - ECCV 2014, 2014

2013
Nonlinear scrambling-based reversible watermarking for 2D-vector maps.
Vis. Comput., 2013

Image retrieval with query-adaptive hashing.
ACM Trans. Multim. Comput. Commun. Appl., 2013

Learning to Distribute Vocabulary Indexing for Scalable Visual Search.
IEEE Trans. Multim., 2013

Learning from mobile contexts to minimize the mobile location search latency.
Signal Process. Image Commun., 2013

Weakly supervised codebook learning by iterative label propagation with graph quantization.
Signal Process., 2013

Bidirectional-isomorphic manifold learning at image semantic understanding & representation.
Multim. Tools Appl., 2013

Visual attention modeling based on short-term environmental adaption.
J. Vis. Commun. Image Represent., 2013

Background subtraction driven seeds selection for moving objects segmentation and matting.
Neurocomputing, 2013

A Bayesian framework for dense depth estimation based on spatial-temporal correlation.
Neurocomputing, 2013

Mining spatiotemporal video patterns towards robust action retrieval.
Neurocomputing, 2013

Learning Compact Visual Descriptors for Low Bit Rate Mobile Landmark Search.
AI Mag., 2013

Seeing actions through scene context.
Proceedings of the 2013 Visual Communications and Image Processing, 2013

Decomposed human localization in personal photo albums.
Proceedings of the 2013 Visual Communications and Image Processing, 2013

A new camera self-calibration method based on CSA.
Proceedings of the 2013 Visual Communications and Image Processing, 2013

Saliency detection by adaptive clustering.
Proceedings of the 2013 Visual Communications and Image Processing, 2013

Geographical Retagging.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

Stereotime: a wireless 2D and 3D switchable video communication system.
Proceedings of the ACM Multimedia Conference, 2013

Query-dependent visual dictionary adaptation for image reranking.
Proceedings of the ACM Multimedia Conference, 2013

Large-scale visual sentiment ontology and detectors using adjective noun pairs.
Proceedings of the ACM Multimedia Conference, 2013

SentiBank: large-scale ontology and classifiers for detecting sentiment and emotions in visual content.
Proceedings of the ACM Multimedia Conference, 2013

Semi-Supervised Learning with Manifold Fitted Graphs.
Proceedings of the IJCAI 2013, 2013

Spectral-spatial classification of hyperspectral imagery based on Random Forests.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

Visual Reranking through Weakly Supervised Multi-graph Learning.
Proceedings of the IEEE International Conference on Computer Vision, 2013

On the interoperability of local descriptors compression.
Proceedings of the IEEE International Conference on Acoustics, 2013

Label Propagation from ImageNet to 3D Point Clouds.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Salient Object Detection via Low-Rank and Structured Sparse Matrix Decomposition.
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

Localizing Web Videos from Heterogeneous Images.
Proceedings of the Late-Breaking Developments in the Field of Artificial Intelligence, 2013

2012
Active query sensing: Suggesting the best query view for mobile visual search.
ACM Trans. Multim. Comput. Commun. Appl., 2012

Context-Aware Semi-Local Feature Detector.
ACM Trans. Intell. Syst. Technol., 2012

Task-Dependent Visual-Codebook Compression.
IEEE Trans. Image Process., 2012

3-D Object Retrieval and Recognition With Hypergraph Analysis.
IEEE Trans. Image Process., 2012

Cross-View Down/Up-Sampling Method for Multiview Depth Video Coding.
IEEE Signal Process. Lett., 2012

k-Partite graph reinforcement and its application in multimedia information retrieval.
Inf. Sci., 2012

Location Discriminative Vocabulary Coding for Mobile Landmark Search.
Int. J. Comput. Vis., 2012

Robust Nonnegative Matrix Factorization via L1 Norm Regularization.
CoRR, 2012

Symbiotic Black-Box Tracker.
Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

View-based 3D object retrieval by bipartite graph matching.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Estimating viewing angles in mobile street view search.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Weakly supervised topic grouping of YouTube search results.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Multi-stage vector quantization towards low bit rate visual search.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Hyperspectral image classification with hypergraph modelling.
Proceedings of the 4th International Conference on Internet Multimedia Computing and Service, 2012

Learning multiple codebooks for low bit rate mobile visual search.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Predicting the effectiveness of queries for visual search.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Pruning tree-structured vector quantizer towards low bit rate mobile visual search.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Weak attributes for large-scale image retrieval.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

What are we looking for: Towards statistical modeling of saccadic eye movements and visual saliency.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Supervised hashing with kernels.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Towards compact topical descriptors.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Weakly supervised sparse coding with geometric consistency pooling.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
Mining flickr landmarks by modeling reconstruction sparsity.
ACM Trans. Multim. Comput. Commun. Appl., 2011

Actor-independent action search using spatiotemporal vocabulary with appearance hashing.
Pattern Recognit., 2011

Building descriptive and discriminative visual codebook for large-scale image applications.
Multim. Tools Appl., 2011

Vocabulary Hierarchy Optimization and Transfer for Scalable Image Search.
IEEE Multim., 2011

Grid-Based Retargeting with Transformation Consistency Smoothing.
Proceedings of the Advances in Multimedia Modeling, 2011

Video indexing and recommendation based on affective analysis of viewers.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

A mobile location search system with active query sensing.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Active query sensing for mobile location search.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Unsupervised fast anomaly detection in crowds.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Learning heterogeneous data for hierarchical web video classification.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Towards low bit rate mobile visual search with multiple-channel coding.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Learning Compact Visual Descriptor for Low Bit Rate Mobile Landmark Search.
Proceedings of the IJCAI 2011, 2011

Sparse representation based visual element analysis.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

Learning the trip suggestion from landmark photos on the web.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

PKUBench: A context rich mobile visual search benchmark.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

Video stabilization based on saliency driven SIFT matching and discriminative RANSAC.
Proceedings of the ICIMCS 2011, 2011

Contextual dictionaries for image super resolution.
Proceedings of the ICIMCS 2011, 2011

A spatiotemporal context phrase description for general dynamic texture.
Proceedings of the ICIMCS 2011, 2011

When codeword frequency meets geographical location.
Proceedings of the IEEE International Conference on Acoustics, 2011

A lowbit rate vocabulary coding scheme for mobile landmark search.
Proceedings of the IEEE International Conference on Acoustics, 2011

Sorting local descriptors for lowbit rate mobile visual search.
Proceedings of the IEEE International Conference on Acoustics, 2011

Nonnegative Spectral Clustering with Discriminative Regularization.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

Topic level sampling towards optimized locality sensitive vocabulary coding.
Proceedings of the 8th International Conference on Information, 2011

2010
A rotation and scale invariant texture description approach.
Proceedings of the Visual Communications and Image Processing 2010, 2010

3D silhouette tracking with occlusion inference.
Proceedings of the Visual Communications and Image Processing 2010, 2010

Saliency detection based on short-term sparse representation.
Proceedings of the International Conference on Image Processing, 2010

Visual saliency as sequential eye fixation probability.
Proceedings of the International Conference on Image Processing, 2010

A robust texture descriptor using multifractal analysis with Gabor filter.
Proceedings of the Second International Conference on Internet Multimedia Computing and Service, 2010

Visual topic model for web image annotation.
Proceedings of the Second International Conference on Internet Multimedia Computing and Service, 2010

Mining actor correlations with hierarchical concurrence parsing.
Proceedings of the IEEE International Conference on Acoustics, 2010

SIGMA: Spatial Integrated Matching Association algorithm for logo detection.
Proceedings of the IEEE International Conference on Acoustics, 2010

Exploring statistical properties for semantic annotation: sparse distributed and convergent assumptions for keywords.
Proceedings of the IEEE International Conference on Acoustics, 2010

Visual tracking via weakly supervised learning from multiple imperfect oracles.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Novel observation model for probabilistic object tracking.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Towards semantic embedding in visual vocabulary.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009
Visual and textual fusion for semantically supervised region-based retrieval.
Multim. Syst., 2009

Photo assessment based on computational visual attention model.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

What is a complete set of keywords for image description & annotation on the web.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Location sensitive indexing for image-based advertising.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Mining city landmarks from blogs by graph modeling.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

VisualCor system: search actor correlations in TV series.
Proceedings of the First International Conference on Internet Multimedia Computing and Service, 2009

Vocabulary hierarchy optimization for effective and transferable retrieval.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

2008
DRM: dynamic region matching for image retrieval using probabilistic fuzzy matching and boosting feature selection.
Signal Image Video Process., 2008

Vision-Based Semi-supervised Homecare with Spatial Constraint.
Proceedings of the Advances in Multimedia Information Processing, 2008

Attention-driven action retrieval with DTW-based 3d descriptor matching.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

Place retrieval with graph-based place-view model.
Proceedings of the 1st ACM SIGMM International Conference on Multimedia Information Retrieval, 2008

Cross-media manifold learning for image retrieval & annotation.
Proceedings of the 1st ACM SIGMM International Conference on Multimedia Information Retrieval, 2008

Flexible sub block ordering based intra 4×4 prediction.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Clustering-based subspace SVM ensemble for relevance feedback learning.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Directional correlation analysis of local Haar binary pattern for text detection.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Vocabulary tree incremental indexing for scalable location recognition.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Text Particles Multi-band Fusion for Robust Text Detection.
Proceedings of the Image Analysis and Recognition, 5th International Conference, 2008

2007
Visual & textual fusion for region retrieval: from both fuzzy matching and bayesian reasoning aspects.
Proceedings of the 9th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2007

Using Visual Dictionary to Associate Semantic Objects in Region-Based Image Retrieval.
Proceedings of the Image Analysis and Recognition, 4th International Conference, 2007

A Novel Retrieval Refinement and Interaction Pattern by Exploring Result Correlations for Image Retrieval.
Proceedings of the Adaptive Multimedia Retrieval: Retrieval, 2007

2006
A New Steganalysis Method for Adaptive Spread Spectrum Steganography.
Proceedings of the Second International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2006), 2006

Genetic Algorithm Based Optimal Block Mapping Method for LSB Substitution.
Proceedings of the Second International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2006), 2006

