Houqiang Li

ORCID: 0000-0003-2188-3028

Affiliations:
  • University of Science and Technology of China, Department of Electrical Engineering and Information Science, Hefei, China
  • Hefei Comprehensive National Science Center, Institute of Artificial Intelligence, Hefei, China


According to our database, Houqiang Li authored at least 650 papers between 2004 and 2025.

Bibliography

2025
Recovering Permuted Sequential Features for effective Reinforcement Learning.
Neural Networks, 2025

2024
SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection.
ACM Trans. Multim. Comput. Commun. Appl., November, 2024

Reconstruction-Free Image Compression for Machine Vision via Knowledge Transfer.
ACM Trans. Multim. Comput. Commun. Appl., October, 2024

Toward On-Demand Transmission: Joint Feature and Image Coding With Reversible Neural Networks.
IEEE Trans. Circuits Syst. Video Technol., October, 2024

Recurrent Generic Contour-Based Instance Segmentation With Progressive Learning.
IEEE Trans. Circuits Syst. Video Technol., September, 2024

Full DouZero+: Improving DouDizhu AI by Opponent Modeling, Coach-Guided Training and Bidding Learning.
IEEE Trans. Games, September, 2024

MCMARL: Parameterizing Value Function via Mixture of Categorical Distributions for Multi-Agent Reinforcement Learning.
IEEE Trans. Games, September, 2024

Domain-Agnostic Priors for Semantic Segmentation Under Unsupervised Domain Adaptation and Domain Generalization.
Int. J. Comput. Vis., September, 2024

CLIP2GAN: Toward Bridging Text With the Latent Space of GANs.
IEEE Trans. Circuits Syst. Video Technol., August, 2024

Coordinate-aligned multi-camera collaboration for active multi-object tracking.
Multim. Syst., August, 2024

Optimizing Camera Motion with MCTS and Target Motion Modeling in Multi-Target Active Object Tracking.
ACM Trans. Multim. Comput. Commun. Appl., July, 2024

HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation.
Int. J. Comput. Vis., July, 2024

Exploiting Spatial-Temporal Context for Interacting Hand Reconstruction on Monocular RGB Video.
ACM Trans. Multim. Comput. Commun. Appl., June, 2024

Lightweight Context Model Equipped aiWave in Response to the AVS Call for Evidence on Volumetric Medical Image Coding.
IEEE Trans. Circuits Syst. Video Technol., May, 2024

Detect Any Shadow: Segment Anything for Video Shadow Detection.
IEEE Trans. Circuits Syst. Video Technol., May, 2024

DaFIR: Distortion-Aware Representation Learning for Fisheye Image Rectification.
IEEE Trans. Circuits Syst. Video Technol., May, 2024

Exploring Neighbor Correspondence Matching for Multiple-hypotheses Video Frame Synthesis.
ACM Trans. Multim. Comput. Commun. Appl., April, 2024

DBVC: An End-to-End 3-D Deep Biomedical Video Coding Framework.
IEEE Trans. Circuits Syst. Video Technol., April, 2024

CTDS: Centralized Teacher With Decentralized Student for Multiagent Reinforcement Learning.
IEEE Trans. Games, March, 2024

Towards Codebook-Free Deep Probabilistic Quantization for Image Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., January, 2024

Progressive Recurrent Network for shadow removal.
Comput. Vis. Image Underst., January, 2024

Structure Similarity Preservation Learning for Asymmetric Image Retrieval.
IEEE Trans. Multim., 2024

Video Demoiréing With Deep Temporal Color Embedding and Video-Image Invertible Consistency.
IEEE Trans. Multim., 2024

Content-Adaptive Rate-Distortion Modeling for Frame-Level Rate Control in Versatile Video Coding.
IEEE Trans. Multim., 2024

Prior-Aware Cross Modality Augmentation Learning for Continuous Sign Language Recognition.
IEEE Trans. Multim., 2024

PersonMAE: Person Re-Identification Pre-Training With Masked AutoEncoders.
IEEE Trans. Multim., 2024

Deep Unrestricted Document Image Rectification.
IEEE Trans. Multim., 2024

Progressive Similarity Preservation Learning for Deep Scalable Product Quantization.
IEEE Trans. Multim., 2024

Multi-Granularity Matching Transformer for Text-Based Person Search.
IEEE Trans. Multim., 2024

Self-Supervised Representation Learning With Spatial-Temporal Consistency for Sign Language Recognition.
IEEE Trans. Image Process., 2024

A Robust Framework for One-Shot Key Information Extraction via Deep Partial Graph Matching.
IEEE Trans. Image Process., 2024

VNVC: A Versatile Neural Video Coding Framework for Efficient Human-Machine Vision.
IEEE Trans. Pattern Anal. Mach. Intell., 2024

MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning.
CoRR, 2024

StreetSurfGS: Scalable Urban Street Surface Reconstruction with Planar-based Gaussian Splatting.
CoRR, 2024

AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding.
CoRR, 2024

LaneTCA: Enhancing Video Lane Detection with Temporal Context Aggregation.
CoRR, 2024

Scaling up Multimodal Pre-training for Sign Language Understanding.
CoRR, 2024

RoFIR: Robust Fisheye Image Rectification Framework Impervious to Optical Center Deviation.
CoRR, 2024

Text-Animator: Controllable Visual Text Video Generation.
CoRR, 2024

Prediction and Reference Quality Adaptation for Learned Video Compression.
CoRR, 2024

Mini Honor of Kings: A Lightweight Environment for Multi-Agent Reinforcement Learning.
CoRR, 2024

TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy.
CoRR, 2024

MASA: Motion-aware Masked Autoencoder with Semantic Alignment for Sign Language Recognition.
CoRR, 2024

EG4D: Explicit Generation of 4D Object without Score Distillation.
CoRR, 2024

Learning Generalizable Human Motion Generator with Reinforcement Learning.
CoRR, 2024

TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding.
CoRR, 2024

GaussNav: Gaussian Splatting for Visual Navigation.
CoRR, 2024

Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction.
CoRR, 2024

DeepEraser: Deep Iterative Context Mining for Generic Text Eraser.
CoRR, 2024

Spatial Decomposition and Temporal Fusion based Inter Prediction for Learned Video Compression.
CoRR, 2024

Exploiting GPT-4 Vision for Zero-shot Point Cloud Understanding.
CoRR, 2024

Fact Embedding through Diffusion Model for Knowledge Graph Completion.
Proceedings of the ACM on Web Conference 2024, 2024

Practical Learned Image Compression with Online Encoder Optimization.
Proceedings of the Picture Coding Symposium, 2024

Refining Video-Based Person Re-Identification: An Integrated Framework with Facial and Body Cues.
Proceedings of the 1st ICMR Workshop on Multimedia Object Re-Identification, 2024

P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024

SEDS: Semantically Enhanced Dual-Stream Encoder for Sign Language Retrieval.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024

Progressive Multi-modal Conditional Prompt Tuning.
Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024

Content-adaptive Variable Resolution Framework for Intra Coding.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2024

Remember the Past for Better Future: Memory-Augmented Offline RL.
Proceedings of the International Joint Conference on Neural Networks, 2024

Learning Label Dependencies for Visual Information Extraction.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Cross-Lingual Transfer for Natural Language Inference via Multilingual Prompt Translator.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Heredity-aware Child Face Image Generation with Latent Space Disentanglement.
Proceedings of the Second Tiny Papers Track at ICLR 2024, 2024

Exploring GPT-4 Vision for Text-to-Image Synthesis Evaluation.
Proceedings of the Second Tiny Papers Track at ICLR 2024, 2024

BoolQuestions: Does Dense Retrieval Understand Boolean Logic in Language?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Interpretable Composition Attribution Enhancement for Visio-linguistic Compositional Understanding.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

FOREST2SEQ: Revitalizing Order Prior for Sequential Indoor Scene Synthesis.
Proceedings of the Computer Vision - ECCV 2024, 2024

Long-Term Temporal Context Gathering for Neural Video Compression.
Proceedings of the Computer Vision - ECCV 2024, 2024

Instance-Aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Generative Latent Coding for Ultra-Low Bitrate Image Compression.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Sinkhorn Distance Minimization for Knowledge Distillation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Semi-Supervised Spoken Language Glossification.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Revisiting Open-Set Panoptic Segmentation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

KGDM: A Diffusion Model to Capture Multiple Relation Semantics for Knowledge Graph Embedding.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

SUF: Stabilized Unconstrained Fine-Tuning for Offline-to-Online Reinforcement Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
TransVG++: End-to-End Visual Grounding With Language Conditioned Vision Transformer.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Improving Deep Reinforcement Learning With Mirror Loss.
IEEE Trans. Games, September, 2023

SignBERT+: Hand-Model-Aware Self-Supervised Pre-Training for Sign Language Understanding.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2023

Multi-Modal 3D Object Detection in Autonomous Driving: A Survey.
Int. J. Comput. Vis., August, 2023

Exploring the diversity and invariance in yourself for visual pre-training task.
Pattern Recognit., July, 2023

Unsupervised Person Re-Identification With Wireless Positioning Under Weak Scene Labeling.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

Masked Contrastive Representation Learning for Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

VPFNet: Improving 3D Object Detection With Virtual Point Based LiDAR and Stereo Data Fusion.
IEEE Trans. Multim., 2023

FI-WSOD: Foreground Information Guided Weakly Supervised Object Detection.
IEEE Trans. Multim., 2023

Hash Bit Selection With Reinforcement Learning for Image Retrieval.
IEEE Trans. Multim., 2023

Improving Person Re-Identification With Multi-Cue Similarity Embedding and Propagation.
IEEE Trans. Multim., 2023

Coherent Image Animation Using Spatial-Temporal Correspondence.
IEEE Trans. Multim., 2023

Radio-Assisted Human Detection.
IEEE Trans. Multim., 2023

Hybrid Motion Representation Learning for Prediction From Raw Sensor Data.
IEEE Trans. Multim., 2023

Frame-Level Rate Control for Geometry-Based LiDAR Point Cloud Compression.
IEEE Trans. Multim., 2023

Collaborative Multilingual Continuous Sign Language Recognition: A Unified Framework.
IEEE Trans. Multim., 2023

Plenoptic Point Cloud Compression Using Multiview Extension of High Efficiency Video Coding.
IEEE Trans. Multim., 2023

Deep Graph Convolutional Quantization Networks for Image Retrieval.
IEEE Trans. Multim., 2023

Model-Aware Pre-Training for Radial Distortion Rectification.
IEEE Trans. Image Process., 2023

Passive Non-Line-of-Sight Imaging with Light Transport Modulation.
CoRR, 2023

TinySAM: Pushing the Envelope for Efficient Segment Anything Model.
CoRR, 2023

DanZero+: Dominating the GuanDan Game through Reinforcement Learning.
CoRR, 2023

Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs.
CoRR, 2023

DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding.
CoRR, 2023

State Sequences Prediction via Fourier Transform for Representation Learning.
CoRR, 2023

I²MD: 3D Action Representation Learning with Inter- and Intra-modal Mutual Distillation.
CoRR, 2023

Accelerate Presolve in Large-Scale Linear Programming via Reinforcement Learning.
CoRR, 2023

MSight: An Edge-Cloud Infrastructure-based Perception System for Connected Automated Vehicles.
CoRR, 2023

Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning.
CoRR, 2023

UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding.
CoRR, 2023

DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory.
CoRR, 2023

LVVC: A Learned Versatile Video Coding Framework for Efficient Human-Machine Vision.
CoRR, 2023

Exploring Effective Mask Sampling Modeling for Neural Image Compression.
CoRR, 2023

Learning Transferable Pedestrian Representation from Multimodal Information Supervision.
CoRR, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.
CoRR, 2023

ROCO: A Roundabout Traffic Conflict Dataset.
CoRR, 2023

Discriminative Experience Replay for Efficient Multi-agent Reinforcement Learning.
CoRR, 2023

Recurrent Contour-based Instance Segmentation with Progressive Learning.
CoRR, 2023

OA-BEV: Bringing Object Awareness to Bird's-Eye-View Representation for Multi-Camera 3D Object Detection.
CoRR, 2023

Reinforcement Learning-based Frame-level Bit Allocation for VVC.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2023

End-to-end Action Quality Assessment with Action Parsing Transformer.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2023

Neural Network-based Occupancy Map Joint Sampling for Video-based Point Cloud Compression.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2023

Learning robust representation for reinforcement learning with distractions by reward sequence prediction.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Multi-Agent First Order Constrained Optimization in Policy Space.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

State Sequences Prediction via Fourier Transform for Representation Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Hierarchical Multi-Agent Skill Discovery.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

CLIP4HOI: Towards Adapting CLIP for Practical Zero-Shot HOI Detection.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DIFFER: Decomposing Individual Reward for Fair Experience Replay in Multi-Agent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Text-Only Training for Visual Storytelling.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Dual-view Molecular Pre-training.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Q-SAT: Value Factorization with Self-Attention for Deep Multi-Agent Reinforcement Learning.
Proceedings of the International Joint Conference on Neural Networks, 2023

MA2CL: Masked Attentive Contrastive Learning for Multi-Agent Reinforcement Learning.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Robust Person Re-Identification with Wireless Signals.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

DocMAE: Document Image Rectification via Self-supervised Representation Learning.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

𝒪-GNN: incorporating ring priors into molecular modeling.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Making Better Decision by Directly Planning in Continuous Control.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Alignment.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

A General Rank Preserving Framework for Asymmetric Image Retrieval.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Sign Language Translation with Iterative Prototype.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DIRE for Diffusion-Generated Image Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Masked Motion Predictors are Strong 3D Action Representation Learners.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Focus on Your Target: A Dual Teacher-Student Framework for Domain-adaptive Semantic Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Stare at What You See: Masked Image Modeling without Reconstruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Asymmetric Feature Fusion for Image Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

AltFreezing for More General Video Face Forgery Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Motion Information Propagation for Neural Video Compression.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

HandNeRF: Neural Radiance Fields for Animatable Interacting Hands.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Human Pose as Compositional Tokens.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DanZero: Mastering GuanDan Game with Reinforcement Learning.
Proceedings of the IEEE Conference on Games, 2023

Mastering Curling with RL-revised Decision Tree.
Proceedings of the IEEE Conference on Games, 2023

Sample Efficient Reinforcement Learning with Double Importance Sampling Weight Clipping.
Proceedings of the IEEE Conference on Games, 2023

Implementing First-Person Shooter Game AI in WILD-SCAV with Rule-Enhanced Deep Reinforcement Learning.
Proceedings of the IEEE Conference on Games, 2023

Multi-Agent Reinforcement Learning with Safety Layer for Active Voltage Control.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Hybrid and Collaborative Passage Reranking.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

BEST: BERT Pre-training for Sign Language Recognition with Coupling Tokenization.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Low-Light Video Enhancement with Synthetic Event Guidance.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Spatial-Temporal Multi-Cue Network for Sign Language Recognition and Translation.
IEEE Trans. Multim., 2022

Conditional Sentence Generation and Cross-Modal Reranking for Sign Language Translation.
IEEE Trans. Multim., 2022

Learning Temporal-Correlated and Channel-Decorrelated Siamese Networks for Visual Tracking.
IEEE Trans. Multim., 2022

Deep Enhanced Weakly-Supervised Hashing With Iterative Tag Refinement.
IEEE Trans. Multim., 2022

Weakly Supervised Temporal Adjacent Network for Language Grounding.
IEEE Trans. Multim., 2022

Multi-Modal Context Propagation for Person Re-Identification With Wireless Positioning.
IEEE Trans. Multim., 2022

Motion Estimation and Coding Structure for Inter-Prediction of LiDAR Point Cloud Geometry.
IEEE Trans. Multim., 2022

Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations.
IEEE Trans. Multim., 2022

E-Commerce Storytelling Recommendation Using Attentional Domain-Transfer Network and Adversarial Pre-Training.
IEEE Trans. Multim., 2022

Direct Molecular Conformation Generation.
Trans. Mach. Learn. Res., 2022

Passive Non-Line-of-Sight Imaging Using Optimal Transport.
IEEE Trans. Image Process., 2022

Anti-Distractor Active Object Tracking in 3D Environments.
IEEE Trans. Circuits Syst. Video Technol., 2022

EZFusion: A Close Look at the Integration of LiDAR, Millimeter-Wave Radar, and Camera for Accurate 3D Object Detection and Tracking.
IEEE Robotics Autom. Lett., 2022

End-to-End Optimized Versatile Image Compression With Wavelet-Like Transform.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Publisher Correction: Rotamer-free protein sequence design based on deep learning and self-consistency.
Nat. Comput. Sci., 2022

Rotamer-free protein sequence design based on deep learning and self-consistency.
Nat. Comput. Sci., 2022

PMIVec: a word embedding model guided by point-wise mutual information criterion.
Multim. Syst., 2022

Coach-assisted multi-agent reinforcement learning framework for unexpected crashed agents.
Frontiers Inf. Technol. Electron. Eng., 2022

Multi-Agent Reinforcement Learning with Shared Resources for Inventory Management.
CoRR, 2022

CLIP2GAN: Towards Bridging Text with the Latent Space of GANs.
CoRR, 2022

SinDiffusion: Learning a Diffusion Model from a Single Natural Image.
CoRR, 2022

CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment.
CoRR, 2022

Semantic Image Synthesis via Diffusion Models.
CoRR, 2022

Self-Adaptive Label Augmentation for Semi-supervised Few-shot Classification.
CoRR, 2022

A Self-Paced Mixed Distillation Method for Non-Autoregressive Generation.
CoRR, 2022

Simultaneous Double Q-learning with Conservative Advantage Learning for Actor-Critic Methods.
CoRR, 2022

Multi-Target Active Object Tracking with Monte Carlo Tree Search and Target Motion Modeling.
CoRR, 2022

Learning Enriched Illuminants for Cross and Single Sensor Color Constancy.
CoRR, 2022

CTDS: Centralized Teacher with Decentralized Student for Multi-Agent Reinforcement Learning.
CoRR, 2022

DQMIX: A Distributional Perspective on Multi-Agent Reinforcement Learning.
CoRR, 2022

Revisiting QMIX: Discriminative Credit Assignment by Gradient Entropy Regularization.
CoRR, 2022

Direct Molecular Conformation Generation.
CoRR, 2022

Global Homography Motion Compensation for Versatile Video Coding.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2022

PolyTracker: Progressive Contour Regression for Multiple Object Tracking and Segmentation.
Proceedings of the Pattern Recognition and Computer Vision - 5th Chinese Conference, 2022

LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Hand-Object Interaction Image Generation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

WiFi-Based Human Pose Image Generation.
Proceedings of the 24th IEEE International Workshop on Multimedia Signal Processing, 2022

UDoc-GAN: Unpaired Document Illumination Correction with Background Light Prior.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Neighbor Correspondence Matching for Flow-based Video Frame Synthesis.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Estimation of Reliable Proposal Quality for Temporal Action Detection.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Unified 2D and 3D Pre-Training of Molecular Representations.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Stabilizing Voltage in Power Distribution Networks via Multi-Agent Reinforcement Learning with Transformer.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Spatially Scalable Video-Based Point Cloud Compression.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

Supervised Off-Policy Ranking.
Proceedings of the International Conference on Machine Learning, 2022

Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent.
Proceedings of the International Conference on Machine Learning, 2022

Hardware-Oriented Shallow Joint Demosaicing and Denoising.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

Neural-based Mixture Probabilistic Query Embedding for Answering FOL queries on Knowledge Graphs.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

MVP: Multimodality-Guided Visual Pre-training.
Proceedings of the Computer Vision - ECCV 2022, 2022

CMD: Self-supervised 3D Action Representation Learning with Cross-Modal Mutual Distillation.
Proceedings of the Computer Vision - ECCV 2022, 2022

Multi-modal Sign Language Spotting by Multi/One-Shot Learning.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

TAPE: Task-Agnostic Prior Embedding for Image Restoration.
Proceedings of the Computer Vision - ECCV 2022, 2022

CMT: Context-Matching-Guided Transformer for 3D Tracking in Point Clouds.
Proceedings of the Computer Vision - ECCV 2022, 2022

Geometric Representation Learning for Document Image Rectification.
Proceedings of the Computer Vision - ECCV 2022, 2022

Contextual Similarity Distillation for Asymmetric Image Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Uformer: A General U-Shaped Transformer for Image Restoration.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Domain-Agnostic Prior for Transfer Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Large-Scale Pre-training for Person Re-identification with Noisy Labels.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning.
Proceedings of the IEEE Conference on Games, CoG 2022, Beijing, 2022

Mastering the Game of 3v3 Snakes with Rule-Enhanced Multi-Agent Reinforcement Learning.
Proceedings of the IEEE Conference on Games, CoG 2022, Beijing, 2022

Learning Token-Based Representation for Image Retrieval.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Sample-Efficient Reinforcement Learning via Conservative Model-Based Actor-Critic.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
MFECN: Multi-level Feature Enhanced Cumulative Network for Scene Text Detection.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Affinity Derivation for Accurate Instance Segmentation.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Residual Refinement Network with Attribute Guidance for Precise Saliency Detection.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Neural-Network-Based Cross-Channel Intra Prediction.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Global-Local Enhancement Network for NMF-Aware Sign Language Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Progressive Learning of Low-Precision Networks for Image Classification.
IEEE Trans. Multim., 2021

Progressive Unsupervised Person Re-Identification by Tracklet Association With Spatio-Temporal Regularization.
IEEE Trans. Multim., 2021

Collaborative Image Relevance Learning for Visual Re-Ranking.
IEEE Trans. Multim., 2021

Efficient Projected Frame Padding for Video-Based Point Cloud Compression.
IEEE Trans. Multim., 2021

Single Shot Video Object Detector.
IEEE Trans. Multim., 2021

Deep Relation Embedding for Cross-Modal Retrieval.
IEEE Trans. Image Process., 2021

SSSIC: Semantics-to-Signal Scalable Image Coding With Learned Structural Representations.
IEEE Trans. Image Process., 2021

Learning Diverse Models for End-to-End Ensemble Tracking.
IEEE Trans. Image Process., 2021

An End-to-End Foreground-Aware Network for Person Re-Identification.
IEEE Trans. Image Process., 2021

MINet: Meta-Learning Instance Identifiers for Video Object Detection.
IEEE Trans. Image Process., 2021

MCFD: A Hardware-Efficient Noniterative Multicue Fusion Demosaicing Algorithm.
IEEE Trans. Circuits Syst. Video Technol., 2021

Semantic Boundary Detection With Reinforcement Learning for Continuous Sign Language Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2021

Cascaded Regression Tracking: Towards Online Hard Distractor Discrimination.
IEEE Trans. Circuits Syst. Video Technol., 2021

Occupancy-Map-Based Rate Distortion Optimization and Partition for Video-Based Point Cloud Compression.
IEEE Trans. Circuits Syst. Video Technol., 2021

From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection.
IEEE Trans. Circuits Syst. Video Technol., 2021

Context-Adaptive Inverse Quantization for Inter-Frame Coding.
IEEE Open J. Circuits Syst., 2021

Unsupervised Deep Representation Learning for Real-Time Tracking.
Int. J. Comput. Vis., 2021

Semantics-to-Signal Scalable Image Compression with Learned Revertible Representations.
Int. J. Comput. Vis., 2021

State Representation Learning With Adjacent State Consistency Loss for Deep Reinforcement Learning.
IEEE Multim., 2021

Deep Learning-Based Video Coding: A Review and a Case Study.
ACM Comput. Surv., 2021

DocScanner: Robust Document Image Rectification with Progressive Learning.
CoRR, 2021

One-shot Key Information Extraction from Document with Deep Partial Graph Matching.
CoRR, 2021

Heredity-aware Child Face Image Generation with Latent Space Disentanglement.
CoRR, 2021

Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training.
CoRR, 2021

Dual-view Molecule Pre-training.
CoRR, 2021

Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Dual Progressive Prototype Network for Generalized Zero-Shot Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Contextual Similarity Aggregation with Self-attention for Visual Re-ranking.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Cross-modal Joint Prediction and Alignment for Composed Query Image Retrieval.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Learning Fine-Grained Motion Embedding for Landscape Animation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Semantic Scalable Image Compression with Cross-Layer Priors.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

SEMI: A Sequential Multi-Modal Information Transfer Network for E-Commerce Micro-Video Recommendations.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Efficient Integer-Arithmetic-Only Convolutional Networks with Bounded ReLU.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Random Sampling Weights Allocation Update for Deep Reinforcement Learning.
Proceedings of the Neural Information Processing - 28th International Conference, 2021

BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining.
Proceedings of the 38th International Conference on Machine Learning, 2021

Path Ranking Model for Entity Prediction.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Attentive Update of Multi-Critic for Deep Reinforcement Learning.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

IOT: Instance-wise Layer Reordering for Transformer Structures.
Proceedings of the 9th International Conference on Learning Representations, 2021

A Deeply Modulated Scheme for Variable-Rate Video Compression.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Learning Deep Local Features with Multiple Dynamic Attentions for Large-Scale Image Retrieval.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Instance-wise Hard Negative Example Generation for Contrastive Learning in Unpaired Image-to-Image Translation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Conditional DETR for Fast Training Convergence.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Joint Inductive and Transductive Learning for Video Object Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

3D Local Convolutional Neural Networks for Gait Recognition.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

SignBERT: Pre-Training of Hand-Model-Aware Representation for Sign Language Recognition.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

TransVG: End-to-End Visual Grounding with Transformers.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Fine-grained Semantic Alignment Network for Weakly Supervised Temporal Language Grounding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Discovering Representation Sprachbund For Multilingual Pre-Training.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Modulated Variable-Rate Deep Video Compression.
Proceedings of the 31st Data Compression Conference, 2021

Improving Sign Language Translation With Monolingual Data by Sign Back-Translation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Revisiting Knowledge Distillation: An Inheritance and Exploration Framework.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Model-Aware Gesture-to-Gesture Translation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Unsupervised Pre-Training for Person Re-Identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Representing Videos As Discriminative Sub-Graphs for Action Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ProphetNet-X: Large-Scale Pre-training Models for English, Chinese, Multi-lingual, Dialog, and Code Generation.
Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Instance Mining with Class Feature Banks for Weakly Supervised Object Detection.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Contrastive Transformation for Self-supervised Correspondence Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Task-Independent Knowledge Makes for Transferable Representations for Generalized Zero-Shot Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Auto-Encoding Transformations in Reparameterized Lie Groups for Unsupervised Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Hand-Model-Aware Sign Language Recognition.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
AB-LSTM: Attention-based Bidirectional LSTM Model for Scene Text Detection.
ACM Trans. Multim. Comput. Commun. Appl., 2020

Single-stage Instance Segmentation.
ACM Trans. Multim. Comput. Commun. Appl., 2020

MV2Flow: Learning Motion Representation for Fast Compressed Video Action Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2020

Neighborhood Pyramid Preserving Hashing.
IEEE Trans. Multim., 2020

On the Energy-Delay Tradeoff in Streaming Data: Finite Blocklength Analysis.
IEEE Trans. Inf. Theory, 2020

Real-Time Correlation Tracking Via Joint Model Compression and Transfer.
IEEE Trans. Image Process., 2020

Rate Control for Video-Based Point Cloud Compression.
IEEE Trans. Image Process., 2020

Hierarchical Recurrent Deep Fusion Using Adaptive Clip Summarization for Sign Language Translation.
IEEE Trans. Image Process., 2020

Improving Person Re-Identification With Iterative Impression Aggregation.
IEEE Trans. Image Process., 2020

Advanced 3D Motion Prediction for Video-Based Dynamic Point Cloud Compression.
IEEE Trans. Image Process., 2020

Quadtree-Based Coding Framework for High-Density Camera Array-Based Light Field Image.
IEEE Trans. Circuits Syst. Video Technol., 2020

MUcast: Linear Uncoded Multiuser Video Streaming With Channel Assignment and Power Allocation Optimization.
IEEE Trans. Circuits Syst. Video Technol., 2020

λ-Domain Perceptual Rate Control for 360-Degree Video Compression.
IEEE J. Sel. Top. Signal Process., 2020

Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations.
CoRR, 2020

Can Semantic Labels Assist Self-Supervised Visual Representation Learning?
CoRR, 2020

Masked Contrastive Representation Learning for Reinforcement Learning.
CoRR, 2020

Global-local Enhancement Network for NMFs-aware Sign Language Recognition.
CoRR, 2020

Efficient Integer-Arithmetic-Only Convolutional Neural Networks.
CoRR, 2020

Soft Hindsight Experience Replay.
CoRR, 2020

ProphetNet-Ads: A Looking Ahead Strategy for Generative Retrieval Models in Sponsored Search Engine.
Proceedings of the Natural Language Processing and Chinese Computing, 2020

Promoting Stochasticity for Expressive Policies via a Simple and Efficient Regularization Method.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Boosting Continuous Sign Language Recognition via Cross Modality Augmentation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Vision Meets Wireless Positioning: Effective Person Re-identification with Recurrent Context Propagation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Towards Semantically Scalable Image Coding using Semantic Map.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

Quantile Regression Hindsight Experience Replay.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

State Representation Learning For Effective Deep Reinforcement Learning.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Contextual Adversarial Attacks For Object Detection.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Sed-Net: Detecting Multi-Type Edits Of Images.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Phrase-Level Global-Local Hybrid Model For Sentence Embedding.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Time-Sensitive Collaborative Interest Aware Model for Session-Based Recommendation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Incorporating BERT into Neural Machine Translation.
Proceedings of the 8th International Conference on Learning Representations, 2020

Semantically Scalable Image Coding With Compression of Feature Maps.
Proceedings of the IEEE International Conference on Image Processing, 2020

The Eighth Visual Object Tracking VOT2020 Challenge Results.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

Video-Based Compression for Plenoptic Point Clouds.
Proceedings of the Data Compression Conference, 2020

Transformation GAN for Unsupervised Image Synthesis and Representation Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

M-LVC: Multiple Frames Prediction for Learned Video Compression.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Spatial-Temporal Multi-Cue Network for Continuous Sign Language Recognition.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Deep Model-Based Reinforcement Learning via Estimated Uncertainty and Conservative Policy Optimization.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

POST: POlicy-Based Switch Tracking.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Attentive Experience Replay.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Relation-Guided Spatial Attention and Temporal Refinement for Video-Based Person Re-Identification.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Multi-Question Learning for Visual Question Answering.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Deep Scalable Supervised Quantization by Self-Organizing Map.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Invertibility-Driven Interpolation Filter for Video Coding.
IEEE Trans. Image Process., 2019

Advanced Spherical Motion Model and Local Padding for 360° Video Compression.
IEEE Trans. Image Process., 2019

Learning a Convolutional Neural Network for Image Compact-Resolution.
IEEE Trans. Image Process., 2019

Mobile Visual Search Compression With Grassmann Manifold Embedding.
IEEE Trans. Circuits Syst. Video Technol., 2019

Convolutional Neural Network-Based Fractional-Pixel Motion Compensation.
IEEE Trans. Circuits Syst. Video Technol., 2019

Reliable Re-Detection for Long-Term Tracking.
IEEE Trans. Circuits Syst. Video Technol., 2019

Reference Clip for Inter Prediction in Video Coding.
IEEE Trans. Circuits Syst. Video Technol., 2019

Convolutional Neural Network-Based Block Up-Sampling for HEVC.
IEEE Trans. Circuits Syst. Video Technol., 2019

Attention-Based 3D-CNNs for Large-Vocabulary Sign Language Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2019

Multi-tracker fusion via adaptive outlier detection.
Multim. Tools Appl., 2019

Scene text detection with fully convolutional neural networks.
Multim. Tools Appl., 2019

Exploiting weak mask representation with convolutional neural networks for accurate object tracking.
Multim. Tools Appl., 2019

A Generalization Theory based on Independent and Task-Identically Distributed Assumption.
CoRR, 2019

AETv2: AutoEncoding Transformations for Self-Supervised Representation Learning by Minimizing Geodesic Distances in Lie Groups.
CoRR, 2019

Progressive Learning of Low-Precision Networks.
CoRR, 2019

Long Short-Term Relation Networks for Video Action Detection.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Densely Supervised Hierarchical Policy-Value Network for Image Paragraph Generation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Dynamic Pseudo Label Decoding for Continuous Sign Language Recognition.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Dynamic Cascaded Regression Network with Reinforcement Learning for Robust Face Alignment.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Learning Motion-Aware Policies for Robust Visual Tracking.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Objective Quality Assessment Method for Stereoscopic Image Retargeting.
Proceedings of the IEEE International Conference on Multimedia & Expo Workshops, 2019

Knowledge Distillation with Category-Aware Attention and Discriminant Logit Losses.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Continuous Sign Language Recognition via Reinforcement Learning.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Occupancy-Map-Based Rate Distortion Optimization for Video-Based Point Cloud Compression.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Relation Distillation Networks for Video Object Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Exploiting Channel Assignment and Power Allocation for Linear Uncoded Multiuser Video Streaming.
Proceedings of the 2019 IEEE International Conference on Communications, 2019

In Defense of the Classification Loss for Person Re-Identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Quantization Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Unsupervised Deep Tracking.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Iterative Alignment Network for Continuous Sign Language Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Deep Grammatical Multi-classifier for Continuous Sign Language Recognition.
Proceedings of the Fifth IEEE International Conference on Multimedia Big Data, 2019

Spatial and Temporal Mutual Promotion for Video-Based Person Re-Identification.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Online Early-Late Fusion Based on Adaptive HMM for Sign Language Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2018

Foveation-Based Wireless Soft Image Delivery.
IEEE Trans. Multim., 2018

Distortion Bounds for Source Broadcast Problems.
IEEE Trans. Inf. Theory, 2018

A General Framework for Linear Distance Preserving Hashing.
IEEE Trans. Image Process., 2018

Assessing Image Retrieval Quality at the First Glance.
IEEE Trans. Image Process., 2018

Retrieval Oriented Deep Feature Learning With Complementary Supervision Mining.
IEEE Trans. Image Process., 2018

MCast: High-Quality Linear Video Transmission With Time and Frequency Diversities.
IEEE Trans. Image Process., 2018

Automatic Generation of Social Event Storyboard From Image Click-Through Data.
IEEE Trans. Circuits Syst. Video Technol., 2018

An Efficient Four-Parameter Affine Motion Model for Video Coding.
IEEE Trans. Circuits Syst. Video Technol., 2018

λ-Domain Optimal Bit Allocation Algorithm for High Efficiency Video Coding.
IEEE Trans. Circuits Syst. Video Technol., 2018

Convolutional Neural Network-Based Block Up-Sampling for Intra Frame Coding.
IEEE Trans. Circuits Syst. Video Technol., 2018

Collaborative Index Embedding for Image Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Low-Latency Human Action Recognition with Weighted Multi-Region Convolutional Neural Network.
CoRR, 2018

Visual Attribute-augmented Three-dimensional Convolutional Neural Network for Enhanced Human Action Recognition.
CoRR, 2018

Convolutional Neural Networks with Generalized Attentional Pooling for Action Recognition.
Proceedings of the IEEE Visual Communications and Image Processing, 2018

Effective Similarity Measurement for Video-based Person Re-identification.
Proceedings of the IEEE Visual Communications and Image Processing, 2018

Convolutional Neural Network-Based Residue Super-Resolution for Video Coding.
Proceedings of the IEEE Visual Communications and Image Processing, 2018

Generative Adversarial Network-Based Frame Extrapolation for Video Coding.
Proceedings of the IEEE Visual Communications and Image Processing, 2018

Retrieval Across Optical and SAR Images with Deep Neural Network.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Residual Compression Network for Faster Correlation Tracking.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Reflectance Reference for Intra-Frame Coding of Surveillance Video.
Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

Scalable Bag of Selected Deep Features for Visual Instance Retrieval.
Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

Local Convolutional Neural Networks for Person Re-Identification.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Unregularized Auto-Encoder with Generative Adversarial Networks for Image Generation.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Cascaded Feature Augmentation with Diffusion for Image Retrieval.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Convolutional Neural Network-Based Motion Compensation Refinement for Video Coding.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Improving Deep Neural Network Sparsity through Decorrelation Regularization.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Dilated Convolutional Network with Iterative Optimization for Continuous Sign Language Recognition.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Adaptive Layerwise Quantization for Deep Neural Network Compression.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Online Filter Weakening and Pruning for Efficient Convnets.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Weighted Multi-Region Convolutional Neural Network for Action Recognition With Low-Latency Online Prediction.
Proceedings of the 2018 IEEE International Conference on Multimedia & Expo Workshops, 2018

Enhanced Action Recognition With Visual Attribute-Augmented 3D Convolutional Neural Network.
Proceedings of the 2018 IEEE International Conference on Multimedia & Expo Workshops, 2018

Robust Object Tracking Via Part-Based Correlation Particle Filter.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Major-Subordinate-Task Learning for Image Orientation Estimation.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Online Filter Clustering and Pruning for Efficient Convnets.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Convolutional Neural Network-Based Invertible Half-Pixel Interpolation Filter for Video Coding.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Cooperative Hybrid Digital-Analog Video Transmission in D2D Networks.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Facial Expression Recognition with Data Augmentation and Compact Feature Learning.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

A Hybrid Neural Network for Chroma Intra Prediction.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Cascaded Deep Convolutional Neural Network for Robust Face Alignment.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Affinity Derivation and Graph Merge for Instance Segmentation.
Proceedings of the Computer Vision - ECCV 2018, 2018

The Sixth Visual Object Tracking VOT2018 Challenge Results.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Feature Selective Networks for Object Detection.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Multi-Cue Correlation Filters for Robust Visual Tracking.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Towards Open-Set Identity Preserving Face Synthesis.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

CCNet: Cluster-Coordinated Net for Learning Multi-agent Communication Protocols with Reinforcement Learning.
Proceedings of The 10th Asian Conference on Machine Learning, 2018

Video-Based Sign Language Recognition Without Temporal Segmentation.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Hierarchical LSTM for Sign Language Translation.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Joint Source-Channel Secrecy Using Uncoded Schemes: Towards Secure Source Broadcast.
IEEE Trans. Inf. Theory, 2017

Source-Channel Secrecy for Shannon Cipher System.
IEEE Trans. Inf. Theory, 2017

Block-Composed Background Reference for High Efficiency Video Coding.
IEEE Trans. Circuits Syst. Video Technol., 2017

Video restoration based on a novel second order nonlocal total variation model.
Signal Process., 2017

Pseudo-Sequence-Based 2-D Hierarchical Coding Structure for Light-Field Image Compression.
IEEE J. Sel. Top. Signal Process., 2017

Local residual similarity for image re-ranking.
Inf. Sci., 2017

Recent Advance in Content-based Image Retrieval: A Literature Survey.
CoRR, 2017

An Efficient Four-Parameter Affine Motion Model for Video Coding.
CoRR, 2017

Neural network-based arithmetic coding of intra prediction modes in HEVC.
Proceedings of the 2017 IEEE Visual Communications and Image Processing, 2017

Hierarchical piece-wise linear projections for efficient intra-prediction coding.
Proceedings of the 2017 IEEE Visual Communications and Image Processing, 2017

Fast encoding of surveillance videos based on HEVC.
Proceedings of the 2017 IEEE Visual Communications and Image Processing, 2017

Seeing Bot.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

Exploiting Time and Frequency Diversities for High-Quality Linear Video Transmission: A MCast Framework.
Proceedings of the Advances in Multimedia Information Processing - PCM 2017, 2017

No-Reference Image Quality Assessment Based on Internal Generative Mechanism.
Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

Deep Supervised Quantization by Self-Organizing Map.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

To Create What You Tell: Generating Videos from Captions.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

A convolutional neural network approach for half-pel interpolation in video coding.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2017

Co-projection-plane based 3-D padding for polyhedron projection for 360-degree video.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Visual query compression with locality preserving projection on Grassmann manifold.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Deep network-based image coding for simultaneous compression and retrieval.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Quasi rate distortion optimization for binary hashing.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Projection based advanced motion model for cubic mapping for 360-degree video.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Orientation Estimation Network.
Proceedings of the Image and Graphics - 9th International Conference, 2017

The Visual Object Tracking VOT2017 Challenge Results.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Video Captioning with Transferred Semantic Attributes.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Scalable Object Retrieval with Compact Image Representation from Generic Object Regions.
ACM Trans. Multim. Comput. Commun. Appl., 2016

λ-Domain Rate Control Algorithm for HEVC Scalable Extension.
IEEE Trans. Multim., 2016

Effective Active Skeleton Representation for Low Latency Human Action Recognition.
IEEE Trans. Multim., 2016

Comments on "Approximate Characterizations for the Gaussian Source Broadcast Distortion Region".
IEEE Trans. Inf. Theory, 2016

Subpixel Image Quality Assessment Syncretizing Local Subpixel and Global Pixel Features.
IEEE Trans. Image Process., 2016

Robust Blur Kernel Estimation for License Plate Images From Fast Moving Vehicles.
IEEE Trans. Image Process., 2016

An Efficient Fast Mode Decision Method for Inter Prediction in HEVC.
IEEE Trans. Circuits Syst. Video Technol., 2016

Making Residual Vector Distribution Uniform for Distinctive Image Representation.
IEEE Trans. Circuits Syst. Video Technol., 2016

Scalable Feature Matching by Dual Cascaded Scalar Quantization for Image Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., 2016

A no-reference Image sharpness metric based on structural information using sparse representation.
Inf. Sci., 2016

Joint Source-Channel Secrecy Using Analog Coding: Towards Secure Source Broadcast.
CoRR, 2016

Distortion Bounds for Sending Source over Degraded Broadcast Channel.
CoRR, 2016

Generalized Common Information: Common Information Extraction and Private Sources Synthesis.
CoRR, 2016

Pseudo Sequence based 2-D hierarchical reference structure for Light-Field Image Compression.
CoRR, 2016

Compressive tracking with adaptive color feature selection and foreground modeling.
Proceedings of the 2016 Visual Communications and Image Processing, 2016

Diagonal motion partitions for inter prediction in HEVC.
Proceedings of the 2016 Visual Communications and Image Processing, 2016

No-reference image quality assessment based on global and local content perception.
Proceedings of the 2016 Visual Communications and Image Processing, 2016

Two-stage picture padding for high efficiency video coding.
Proceedings of the 2016 Picture Coding Symposium, 2016

Sparse Matrix Based Hashing for Approximate Nearest Neighbor Search.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Sign Language Recognition with Multi-modal Features.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Respiration Motion State Estimation on 4D CT Rib Cage Images.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Sign Language Recognition Based on Trajectory Modeling with HMMs.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Linear Distance Preserving Pseudo-Supervised and Unsupervised Hashing.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Distortion bounds for source broadcast over degraded channel.
Proceedings of the IEEE International Symposium on Information Theory, 2016

OMP-based transform for inter coding in HEVC.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2016

Hybrid digital-analog scheme for video transmission over fading channel.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2016

Learning Deep Intrinsic Video Representation by Exploring Temporal Coherence and Graph Structure.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Chinese sign language recognition with adaptive HMM.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

Sign language recognition with long short-term memory.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Combining directional intra prediction and intra block copy with block partition for HEVC.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Sign language recognition based on adaptive HMMs with data augmentation.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Improve Visual Tracking by End-to-end Multi-Tracker Selection.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2016

Adaptively Weighted Graph Fusion for Image Retrieval.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2016

Jointly Modeling Embedding and Translation to Bridge Video and Language.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Comparative Deep Learning of Hybrid Representations for Image Recommendations.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Distortion bounds for transmitting correlated sources with common part over MAC.
Proceedings of the 54th Annual Allerton Conference on Communication, 2016

2015
Uniting Keypoints: Local Visual Information Fusion for Large-Scale Image Search.
IEEE Trans. Multim., 2015

A Real-Time Hand Posture Recognition System Using Deep Neural Networks.
ACM Trans. Intell. Syst. Technol., 2015

BSIFT: Toward Data-Independent Codebook for Large Scale Image Search.
IEEE Trans. Image Process., 2015

Robust Transmission of Scalable Video Coding Bitstream Over Heterogeneous Networks.
IEEE Trans. Circuits Syst. Video Technol., 2015

Wireless Cooperative Video Coding Using a Hybrid Digital-Analog Scheme.
IEEE Trans. Circuits Syst. Video Technol., 2015

Visual word expansion and BSIFT verification for large-scale image search.
Multim. Syst., 2015

Accurate sensing of scene geo-context via mobile visual localization.
Multim. Syst., 2015

Distributed Lossless Coding Techniques for Hyperspectral Images.
IEEE J. Sel. Top. Signal Process., 2015

Secrecy Communication with Security Rate Measure.
CoRR, 2015

A Rate-Distortion Optimized Coding Method for Region of Interest in Scalable Video Coding.
Adv. Multim., 2015

Efficient background picture coding for videos obtained from static cameras.
Proceedings of the 2015 Visual Communications and Image Processing, 2015

Semi-supervised Hashing with Semantic Confidence for Large Scale Visual Search.
Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015

Improved downstream rate-distortion performance of SHVC in DASH using sub-layer-selective interlayer prediction.
Proceedings of the 17th IEEE International Workshop on Multimedia Signal Processing, 2015

A Novel Error Concealment Algorithm for H.264/AVC.
Proceedings of the MultiMedia Modeling - 21st International Conference, 2015

Photo Quality Assessment with DCNN that Understands Image Well.
Proceedings of the MultiMedia Modeling - 21st International Conference, 2015

Improved Rate-Distortion Optimization Algorithms for HEVC Lossless Coding.
Proceedings of the MultiMedia Modeling - 21st International Conference, 2015

Attribute Mining for Scalable 3D Human Action Recognition.
Proceedings of the 23rd Annual ACM Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

On the energy-delay tradeoff in lossy network communications.
Proceedings of the 2015 IEEE Information Theory Workshop, 2015

Image deblocking via group sparsity optimization.
Proceedings of the 2015 IEEE International Symposium on Circuits and Systems, 2015

An affine motion compensation framework for high efficiency video coding.
Proceedings of the 2015 IEEE International Symposium on Circuits and Systems, 2015

λ Domain based optimal bit allocation for scalable high efficiency video coding.
Proceedings of the 2015 IEEE International Symposium on Circuits and Systems, 2015

Disparity-compensated inter-layer motion prediction using standardized HEVC extensions.
Proceedings of the 2015 IEEE International Symposium on Circuits and Systems, 2015

Sign Language Recognition using 3D convolutional neural networks.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Seamless switching of H.265/HEVC-coded DASH representations with open GOP prediction structure.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Rank-aware graph fusion with contextual dissimilarity measurement for image retrieval.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Inter-picture prediction based on 3D point cloud model.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

A Bayesian adaptive weighted total generalized variation model for image restoration.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Overview of the multiview high efficiency video coding (MV-HEVC) standard.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Scalable local feature matching without visual codebook training.
Proceedings of the 7th International Conference on Internet Multimedia Computing and Service, 2015

SOM: Semantic obviousness metric for image quality assessment.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Semi-supervised Domain Adaptation with Subspace Learning for visual recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

A new system for Chinese sign language recognition.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

Saliency-aware semantic image coding for mobile visual search.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

Single image super-resolution based on nonlocal similarity and sparse representation.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

Sign language recognition using real-sense.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

Three-Dimensional Point-Cloud Plus Patches: Towards Model-Based Image Coding in the Cloud.
Proceedings of the 2015 IEEE International Conference on Multimedia Big Data, BigMM 2015, 2015

Vision-Based Fine-Grained Location Estimation.
Proceedings of the Multimodal Location Estimation of Videos and Images, 2015

2014
Towards Codebook-Free: Scalable Cascaded Hashing for Mobile Image Search.
IEEE Trans. Multim., 2014

Cross-Indexing of Binary SIFT Codes for Large-Scale Image Search.
IEEE Trans. Image Process., 2014

Contextual Hashing for Large-Scale Image Search.
IEEE Trans. Image Process., 2014

λ Domain Rate Control Algorithm for High Efficiency Video Coding.
IEEE Trans. Image Process., 2014

Wireless Scalable Video Coding Using a Hybrid Digital-Analog Scheme.
IEEE Trans. Circuits Syst. Video Technol., 2014

Denoising of Hyperspectral Images Employing Two-Phase Matrix Decomposition.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2014

Image restoration and enhancement: Recent advances and applications.
Signal Process., 2014

Encoding Spatial Context for Large-Scale Partial-Duplicate Web Image Retrieval.
J. Comput. Sci. Technol., 2014

Click-through-based cross-view learning for image search.
Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014

Click-through-based Subspace Learning for Image Search.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

A new non-local video denoising scheme using low-rank representation and total variation regularization.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2014

Hybrid transform for HEVC-based lossless coding.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2014

A Threshold-based HMM-DTW Approach for Continuous Sign Language Recognition.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2014

Search by Detection: Object-Level Feature for Image Retrieval.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2014

Separable Kernel for Image Deblurring.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
SIFT match verification by geometric coding for large-scale partial-duplicate web image search.
ACM Trans. Multim. Comput. Commun. Appl., 2013

Robust and accurate mobile visual localization and its applications.
ACM Trans. Multim. Comput. Commun. Appl., 2013

Interactive Multimodal Visual Search on Mobile Device.
IEEE Trans. Multim., 2013

Multiview-Video-Plus-Depth Coding Based on the Advanced Video Coding Standard.
IEEE Trans. Image Process., 2013

Luma-Chroma Space Filter Design for Subpixel-Based Monochrome Image Downsampling.
IEEE Trans. Image Process., 2013

Multi-Level Video Frame Interpolation: Exploiting the Interaction Among Different Levels.
IEEE Trans. Circuits Syst. Video Technol., 2013

Robust Temporal-Spatial Decomposition and Its Applications in Video Processing.
IEEE Trans. Circuits Syst. Video Technol., 2013

Detection of Blotch and Scratch in Video Based on Video Decomposition.
IEEE Trans. Circuits Syst. Video Technol., 2013

Multiview-video-plus-depth coding and inter-component prediction in high-level-syntax extension of H.265/HEVC.
Proceedings of the 30th Picture Coding Symposium, 2013

A Video Communication System Based on Spatial Rewriting and ROI Rewriting.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

Image search by graph-based label propagation with image representation from DNN.
Proceedings of the ACM Multimedia Conference, 2013

Scale based region growing for scene text detection.
Proceedings of the ACM Multimedia Conference, 2013

Adaptive packet encapsulation of Scalable Video Coding bitstream.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

Line-based distributed coding scheme for onboard lossless compression of high-resolution stereo images.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

Hybrid digital-analog scheme for video transmission over wireless.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

QP refinement according to Lagrange multiplier for High Efficiency Video Coding.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

Rate-distortion optimization with adaptive weighted distortion in High Efficiency Video Coding.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

Noise reduction for hyperspectral images based on structural sparse and low-rank matrix decomposition.
Proceedings of the 2013 IEEE International Geoscience and Remote Sensing Symposium, 2013

Semantic-Spatial Matching for image classification.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Refining QP to improve coding efficiency in AVS.
Proceedings of the IEEE International Conference on Image Processing, 2013

Coding of mixed-resolution multiview video in 3D video application.
Proceedings of the IEEE International Conference on Image Processing, 2013

Low Bit-Rate Subpixel-Based Color Image Compression.
Proceedings of the 2013 Data Compression Conference, 2013

Video denoising based on matrix recovery with total variation priori.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

View order alternation symmetrization and temporal inter-view combined prediction for 3D video coding.
Proceedings of the 3DTV-Conference 2013: The True Vision, 2013

2012
Principal Visual Word Discovery for Automatic License Plate Detection.
IEEE Trans. Image Process., 2012

Rate-Distortion Optimized Reference Picture Management for High Efficiency Video Coding.
IEEE Trans. Circuits Syst. Video Technol., 2012

Intra coding for depth maps using adaptive boundary location.
Proceedings of the 2012 Visual Communications and Image Processing, 2012

Depth-based motion vector prediction in 3D video coding.
Proceedings of the 2012 Picture Coding Symposium, 2012

Gradual view refresh in depth-enhanced multiview video.
Proceedings of the 2012 Picture Coding Symposium, 2012

Scalar quantization for large scale image search.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Finding perfect rendezvous on the go: accurate mobile visual localization and its applications to routing.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Embedding spatial context information into inverted file for large-scale image retrieval.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Mixed Gaussian-impulse video noise removal via temporal-spatial decomposition.
Proceedings of the 2012 IEEE International Symposium on Circuits and Systems, 2012

Counter based adaptation for CAVLC in HEVC.
Proceedings of the 2012 IEEE International Symposium on Circuits and Systems, 2012

An adaptive down-sampling based video coding with hybrid super-resolution method.
Proceedings of the 2012 IEEE International Symposium on Circuits and Systems, 2012

Fast Transcoding from H.264 AVC to High Efficiency Video Coding.
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, 2012

Video error concealment via total variation regularized matrix completion.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Video frame interpolation using 3-D total variation regularized completion.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Joint view filtering for multiview depth map sequences.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Binary SIFT: towards efficient feature matching verification for image search.
Proceedings of the 4th International Conference on Internet Multimedia Computing and Service, 2012

Efficient and scalable cloud-assisted SVC video streaming through mesh networks.
Proceedings of the International Conference on Computing, Networking and Communications, 2012

2011
Peak Tree: A New Tool for Multiscale Hierarchical Representation and Peak Detection of Mass Spectrometry Data.
IEEE ACM Trans. Comput. Biol. Bioinform., 2011

Latent visual context learning for web image applications.
Pattern Recognit., 2011

Modeling spatial and semantic cues for large-scale near-duplicated image retrieval.
Comput. Vis. Image Underst., 2011

Wyner-Ziv video coding using progressive encoding and decoding.
Proceedings of the 2011 IEEE Visual Communications and Image Processing, 2011

Region based motion vector prediction using data hiding and decoder side reasoning.
Proceedings of the 2011 IEEE Visual Communications and Image Processing, 2011

Parsing robustness in High Efficiency Video Coding - analysis and improvement.
Proceedings of the 2011 IEEE Visual Communications and Image Processing, 2011

Intermedia: system and application for video adaptation.
Proceedings of the 10th International Conference on Mobile and Ubiquitous Multimedia, 2011

Optimized reference frame selection for video coding by cloud.
Proceedings of the IEEE 13th International Workshop on Multimedia Signal Processing (MMSP 2011), 2011

Large scale image search with geometric coding.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

JIGSAW: interactive mobile visual search with multimodal queries.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Distributed residual coding for multi-view video with joint motion vector projection and 3-D warping.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2011), 2011

A novel tracking-by-encoding scheme based on linear programming matching.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2011), 2011

Smoothing rate control for multiple video streams using game theory.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2011), 2011

2010
Introduction of the TCSVT Associate Editors.
IEEE Trans. Circuits Syst. Video Technol., 2010

Rate control based on intermediate description.
Proceedings of the Visual Communications and Image Processing 2010, 2010

Hybrid bit-stream rewriting from scalable video coding to H.264/AVC.
Proceedings of the Visual Communications and Image Processing 2010, 2010

Time-variable camera separation for compression of stereoscopic video.
Proceedings of the Visual Communications and Image Processing 2010, 2010

MAP spatial pyramid mean shift for object tracking.
Proceedings of the Visual Communications and Image Processing 2010, 2010

Large scale partially duplicated web image retrieval.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Spatial coding for large scale partial-duplicate web image search.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Error resilient scalability for video bit-stream over heterogeneous packet loss networks.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

Canonical Image Selection by Visual Context Learning.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

Depth-level-adaptive view synthesis for 3D video.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Distributed lossless coding of hyperspectral images.
Proceedings of the International Conference on Image Processing, 2010

Joint multiview video plus depth coding.
Proceedings of the International Conference on Image Processing, 2010

Low-complexity rate control based on rho-domain model for Scalable Video Coding.
Proceedings of the International Conference on Image Processing, 2010

Congestion-aware transmission rate control using Medium Grain Scalability of Scalable Video Coding.
Proceedings of the International Conference on Image Processing, 2010

Inter-view-predicted redundant pictures for viewpoint switching in multiview video streaming.
Proceedings of the IEEE International Conference on Acoustics, 2010

Large scale partial-duplicate image retrieval with bi-space quantization and geometric consistency.
Proceedings of the IEEE International Conference on Acoustics, 2010

Latent visual context analysis for image re-ranking.
Proceedings of the 9th ACM International Conference on Image and Video Retrieval, 2010

2009
Error Resilient Video Coding Using Redundant Pictures.
IEEE Trans. Circuits Syst. Video Technol., 2009

Error Resilient Coding and Error Concealment in Scalable Video Coding.
IEEE Trans. Circuits Syst. Video Technol., 2009

Intermediate description for multiple video adaptation.
IEEE Trans. Consumer Electron., 2009

3D neuron dendritic spine detection and dendrite reconstruction.
Int. J. Comput. Aided Eng. Technol., 2009

Efficient hierarchical inter picture coding for H.264/AVC baseline profile.
Proceedings of the 2009 Picture Coding Symposium, 2009

Coding techniques in Multiview Video Coding and Joint Multiview Video Model.
Proceedings of the 2009 Picture Coding Symposium, 2009

Progressive distributed coding of multispectral images.
Proceedings of the 5th International Conference on Mobile Multimedia Communications, 2009

Joint Texture and Depth Map Video Coding based on the Scalable Extension of H.264/AVC.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

Distributed coding techniques for onboard lossless compression of multispectral images.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Content-based hierarchical motion description for multiple video adaptation.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Spatial transcoding from Scalable Video Coding to H.264/AVC.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Distributed multiview video coding using the fusion of triple side information.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Visual block link analysis for image re-ranking.
Proceedings of the First International Conference on Internet Multimedia Computing and Service, 2009

2008
Video Error Concealment Using Spatio-Temporal Boundary Matching and Partial Differential Equation.
IEEE Trans. Multim., 2008

A comparison between SVC and transcoding.
IEEE Trans. Consumer Electron., 2008

Peak detection using peak tree approach for mass spectrometry data.
Int. J. Hybrid Intell. Syst., 2008

Distributed image coding based on integrated Markov random field modeling and LDPC decoding.
Proceedings of the International Workshop on Multimedia Signal Processing, 2008

Error resilient transcoding of Scalable Video bitstreams.
Proceedings of the International Workshop on Multimedia Signal Processing, 2008

3D Dendrite Reconstruction and Spine Identification.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention, 2008

Frame loss error concealment for multiview video coding.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2008), 2008

Video coding with spatio-temporal texture synthesis and edge-based inpainting.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Distributed image coding based on integrated Markov modeling and LDPC decoding.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Priority-based template matching intra prediction.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Low-complexity asymmetric multiview video coding.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

2007
Adaptive Directional Lifting-Based Wavelet Transform for Image Coding.
IEEE Trans. Image Process., 2007

An Attention-Information-Based Spatial Adaptation Framework for Browsing Videos via Mobile Devices.
EURASIP J. Adv. Signal Process., 2007

Accurate 3D Facial Synthesis for Plastic Surgery Simulation.
Proceedings of the Advances in Multimedia Modeling, 2007

Distributed Video Coding with Trellis Coded Quantization.
Proceedings of the Advances in Multimedia Modeling, 2007

Automated Segmentation of Drosophila RNAi Fluorescence Cellular Images Using Graph Cuts.
Proceedings of the Advances in Multimedia Modeling, 2007

Discardable data adaptation in scalable video coding.
Proceedings of the International Workshop on Mobile Video, 2007

Video Streaming to Mobile Handheld Devices: Challenges in Decoding, Adaptation, and Browsing.
Proceedings of the Multimedia Content Analysis and Mining, International Workshop, 2007

Graph Cut Based Active Contour for Automated Cellular Image Segmentation in High Throughput RNA Interface (RNAi) Screening.
Proceedings of the 2007 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2007

Video Coding with Spatio-Temporal Texture Synthesis.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

A Transform Domain Classification Based Wyner-Ziv Video Codec.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Volume Graph Model for 3D Facial Surface Extraction.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Video Inpainting for Largely Occluded Moving Human.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Adaptive Redundant Picture for Error Resilient Video Coding.
Proceedings of the International Conference on Image Processing, 2007

2006
Nuclei Segmentation Using Marker-Controlled Watershed, Tracking Using Mean-Shift, and Kalman Filter in Time-Lapse Microscopy.
IEEE Trans. Circuits Syst. I Regul. Pap., 2006

An Attention Based Spatial Adaptation Scheme for H.264 Videos on Mobiles.
Int. J. Pattern Recognit. Artif. Intell., 2006

Attention Information Based Spatial Adaptation Framework for Browsing Videos Via Mobile Devices.
Proceedings of the Advances in Multimedia Information Processing, 2006

An attention based spatial adaptation scheme for H.264 videos on mobiles.
Proceedings of the 12th International Conference on Multi Media Modeling (MMM 2006), 2006

Identification of Cell-Cycle Phases Using Neural Network and Steerable Filter Features.
Proceedings of the Advances in Neural Networks - ISNN 2006, Third International Symposium on Neural Networks, Chengdu, China, May 28, 2006

Off-Line Motion Description for Fast Video Stream Generation in MPEG-4 AVC/H.264.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

A Fast Downsizing Video Transcoder for H.264/AVC with Rate-Distortion Optimal Mode Decision.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Transcoding to FGS Streams from H.264/AVC Hierarchical B-Pictures.
Proceedings of the International Conference on Image Processing, 2006

Error Resilient Mode Decision in Scalable Video Coding.
Proceedings of the International Conference on Image Processing, 2006

Sketch-Guided Texture-Based Image Inpainting.
Proceedings of the International Conference on Image Processing, 2006

An Image Inpainting Approach Based on the Poisson Equation.
Proceedings of the Second International Workshop on Document Image Analysis for Libraries (DIAL 2006), 2006

2005
Face Recognition Using Neighborhood Preserving Projections.
Proceedings of the Advances in Multimedia Information Processing, 2005

Neighborhood Preserving Projections (NPP): A Novel Linear Dimension Reduction Method.
Proceedings of the Advances in Intelligent Computing, 2005

2004
A novel video coding scheme for mobile devices.
Proceedings of the 3rd International Conference on Mobile and Ubiquitous Multimedia, 2004

