Junsong Yuan

ORCID: 0000-0002-7901-8793

Affiliations:
  • State University of New York at Buffalo, Department of Computer Science and Engineering, NY, USA
  • Northwestern University, Evanston, IL, USA (PhD 2009)
  • Nanyang Technological University, School of Electrical and Electronic Engineering, Singapore (former)


According to our database, Junsong Yuan authored at least 423 papers between 2004 and 2024.

Bibliography

2024
A Modular Neural Motion Retargeting System Decoupling Skeleton and Shape Perception.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2024

FADE: A Dataset for Detecting Falling Objects around Buildings in Video.
Dataset, June, 2024

Characters Link Shots: Character Attention Network for Movie Scene Segmentation.
ACM Trans. Multim. Comput. Commun. Appl., April, 2024

A Dual Reinforcement Learning Framework for Weakly Supervised Phrase Grounding.
IEEE Trans. Multim., 2024

Shared Latent Membership Enables Joint Shape Abstraction and Segmentation With Deformable Superquadrics.
IEEE Trans. Image Process., 2024

Pluralistic Salient Object Detection.
CoRR, 2024

FADE: A Dataset for Detecting Falling Objects around Buildings in Video.
CoRR, 2024

Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation.
CoRR, 2024

STAT: Towards Generalizable Temporal Action Localization.
CoRR, 2024

AM^2-EmoJE: Adaptive Missing-Modality Emotion Recognition in Conversation via Joint Embedding Learning.
CoRR, 2024

Show Your Face: Restoring Complete Facial Images from Partial Observations for VR Meeting.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation.
Proceedings of the Computer Vision - ECCV 2024, 2024

IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Forecasting Future Videos from Novel Views via Disentangled 3D Scene Representation.
Proceedings of the Computer Vision - ECCV 2024, 2024

GRiT: A Generative Region-to-Text Transformer for Object Understanding.
Proceedings of the Computer Vision - ECCV 2024, 2024

Interaction-Centric Spatio-Temporal Context Reasoning for Multi-person Video HOI Recognition.
Proceedings of the Computer Vision - ECCV 2024, 2024

Divide and Fuse: Body Part Mesh Recovery from Partially Visible Human Images.
Proceedings of the Computer Vision - ECCV 2024, 2024

FSC: Few-Point Shape Completion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Spectrum AUC Difference (SAUCD): Human-Aligned 3D Shape Evaluation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
SignRing: Continuous American Sign Language Recognition Using IMU Rings and Virtual IMU Data.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., September, 2023

Consistent 3D Hand Reconstruction in Video via Self-Supervised Learning.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2023

Adaptive Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

NeRFPlayer: A Streamable Dynamic Scene Representation with Decomposed Neural Radiance Fields.
IEEE Trans. Vis. Comput. Graph., 2023

Joint-Bone Fusion Graph Convolutional Network for Semi-Supervised Skeleton Action Recognition.
IEEE Trans. Multim., 2023

Federated Learning With Privacy-Preserving Ensemble Attention Distillation.
IEEE Trans. Medical Imaging, 2023

DTCM: Joint Optimization of Dark Enhancement and Action Recognition in Videos.
IEEE Trans. Image Process., 2023

Eyelid's Intrinsic Motion-Aware Feature Learning for Real-Time Eyeblink Detection in the Wild.
IEEE Trans. Inf. Forensics Secur., 2023

Efficient-NeRF2NeRF: Streamlining Text-Driven 3D Editing with Multiview Correspondence-Enhanced Diffusion Models.
CoRR, 2023

RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture.
CoRR, 2023

Dynamic Voxel Grid Optimization for High-Fidelity RGB-D Supervised Surface Reconstruction.
CoRR, 2023

Harnessing Low-Frequency Neural Fields for Few-Shot View Synthesis.
CoRR, 2023

Self-Supervised Distilled Learning for Multi-modal Misinformation Identification.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Chain-of-Look Prompting for Verb-centric Surgical Triplet Recognition in Endoscopic Videos.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Exploring the Knowledge Transferred by Response-Based Teacher-Student Distillation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Relit-NeuLF: Efficient Relighting and Novel View Synthesis via Neural 4D Light Field.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Multi-label Emotion Analysis in Conversation via Multimodal Knowledge Distillation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Language-guided Human Motion Synthesis with Atomic Actions.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Source-Free Domain Adaptation for Medical Image Segmentation via Prototype-Anchored Feature Alignment and Contrastive Learning.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Open Set Video HOI detection from Action-centric Chain-of-Look Prompting.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SOAR: Scene-debiasing Open-set Action Recognition.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Towards Generic Image Manipulation Detection with Weakly-Supervised Self-Consistency Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

POINTACL: Adversarial Contrastive Learning for Robust Point Clouds Representation Under Adversarial Attack.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

3D-aware Facial Landmark Detection via Multi-view Consistent Training on Synthetic Data.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

High Fidelity 3D Hand Shape Reconstruction via Scalable Graph Frequency Decomposition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Neural Voting Field for Camera-Space 3D Hand Pose Estimation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

AMuSE: Adaptive Multimodal Analysis for Speaker Emotion Recognition in Group Conversations.
Proceedings of the Ninth IEEE Multimedia Big Data, 2023

Progressive Multi-View Human Mesh Recovery with Self-Supervision.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
ForestDet: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation.
IEEE Trans. Multim., 2022

Motion-Driven Visual Tempo Learning for Video-Based Action Recognition.
IEEE Trans. Image Process., 2022

MAT: Multianchor Visual Tracking With Selective Search Region.
IEEE Trans. Cybern., 2022

Forest Graph Convolutional Network for Surgical Action Triplet Recognition in Endoscopic Videos.
IEEE Trans. Circuits Syst. Video Technol., 2022

AppFuse: An Appearance Fusion Framework for Saliency Cues.
IEEE Trans. Circuits Syst. Video Technol., 2022

Hierarchical domain adaptation with local feature patterns.
Pattern Recognit., 2022

Video anomaly detection with spatio-temporal dissociation.
Pattern Recognit., 2022

Adversarial structured prediction for domain-adaptive semantic segmentation.
Mach. Vis. Appl., 2022

An Image-Based Approach to Detecting Structural Similarity Among Mixed Integer Programs.
INFORMS J. Comput., 2022

Slow-Fast Visual Tempo Learning for Video-based Action Recognition.
CoRR, 2022

Optical flow for video super-resolution: a survey.
Artif. Intell. Rev., 2022

NeuLF: Efficient Novel View Synthesis with Neural 4D Light Field.
Proceedings of the 33rd Eurographics Symposium on Rendering, 2022

Personalized Prediction of Indoor Comfort Using Graph Convolutional Matrix Completion.
Proceedings of the 5th IEEE International Conference on Multimedia Information Processing and Retrieval, 2022

Multimodal Attentive Learning for Real-time Explainable Emotion Recognition in Conversations.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

Multi-view Knowledge Graph for Explainable Course Content Recommendation in Course Discussion Posts.
Proceedings of the 26th International Conference on Pattern Recognition, 2022

Joint Global-Local Alignment for Domain Adaptive Semantic Segmentation.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Deformable VisTR: Spatio Temporal Deformable Attention for Video Instance Segmentation.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Generation for Unsupervised Domain Adaptation: A Gan-Based Approach for Object Classification with 3D Point Cloud Data.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

PREF: Predictability Regularized Neural Motion Fields.
Proceedings of the Computer Vision - ECCV 2022, 2022

Neural Correspondence Field for Object Pose Estimation.
Proceedings of the Computer Vision - ECCV 2022, 2022

AiATrack: Attention in Attention for Transformer Visual Tracking.
Proceedings of the Computer Vision - ECCV 2022, 2022

MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Efficient Video Instance Segmentation via Tracklet Query and Proposal.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning Transferable Human-Object Interaction Detector with Natural Language Supervision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

OVIS: Open-Vocabulary Visual Instance Search via Visual-Semantic Aligned Representation Learning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Survey on depth and RGB image-based 3D hand shape and pose estimation.
Virtual Real. Intell. Hardw., 2021

Introduction to the Special Issue on Explainable AI on Multimedia Computing.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Hierarchical Soft Quantization for Skeleton-Based Human Action Recognition.
IEEE Trans. Multim., 2021

3D Object Representation Learning: A Set-to-Set Matching Perspective.
IEEE Trans. Image Process., 2021

Image Co-Skeletonization via Co-Segmentation.
IEEE Trans. Image Process., 2021

Joint Hand-Object 3D Reconstruction From a Single Image With Cross-Branch Feature Fusion.
IEEE Trans. Image Process., 2021

Revisiting Modified Greedy Algorithm for Monotone Submodular Maximization with a Knapsack Constraint.
Proc. ACM Meas. Anal. Comput. Syst., 2021

SibNet: Sibling Convolutional Encoder for Video Captioning.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

3D Hand Pose Estimation Using Synthetic Data and Weakly Labeled RGB Images.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Human pose estimation and its application to action recognition: A survey.
J. Vis. Commun. Image Represent., 2021

Pseudo Supervised Monocular Depth Estimation with Teacher-Student Network.
CoRR, 2021

Two-Stream Consensus Network: Submission to HACS Challenge 2021 Weakly-Supervised Learning Track.
CoRR, 2021

NeLF: Practical Novel View Synthesis with Neural Light Field.
CoRR, 2021

NeCH: Neural Clothed Human Model.
Proceedings of the International Conference on Visual Communications and Image Processing, 2021

Learning to Detect Monoclonal Protein in Electrophoresis Images.
Proceedings of the International Conference on Visual Communications and Image Processing, 2021

Handling Difficult Labels for Multi-label Image Classification via Uncertainty Distillation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Learning Kinematic Formulas from Multiple View Videos.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Rethinking Soft Labels for Knowledge Distillation: A Bias-Variance Tradeoff Perspective.
Proceedings of the 9th International Conference on Learning Representations, 2021

Discovering Human Interactions with Large-Vocabulary Objects via Query and Multi-Scale Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Stacked Homography Transformations for Multi-View Pedestrian Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

A Unified 3D Human Motion Synthesis Model via Conditional Variational Auto-Encoder.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

High Quality Disparity Remapping with Two-Stage Warping.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Track To Detect and Segment: An Online Multi-Object Tracker.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Model-Based 3D Hand Reconstruction via Self-Supervised Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Multimodal Co-training for Fake News Identification Using Attention-aware Fusion.
Proceedings of the Pattern Recognition - 6th Asian Conference, 2021

Proactive Student Persistence Prediction in MOOCs via Multi-domain Adversarial Learning.
Proceedings of the Pattern Recognition - 6th Asian Conference, 2021

Robust Knowledge Transfer via Hybrid Forward on the Teacher-Student Model.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

ACSNet: Action-Context Separation Network for Weakly Supervised Temporal Action Localization.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Weakly Supervised Temporal Action Localization Through Learning Explicit Subspaces for Action and Context.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Pruning 3D Filters For Accelerating 3D ConvNets.
IEEE Trans. Multim., 2020

Unsupervised Learning of Optical Flow With CNN-Based Non-Local Filtering.
IEEE Trans. Image Process., 2020

Context-Integrated and Feature-Refined Network for Lightweight Object Parsing.
IEEE Trans. Image Process., 2020

Towards Real-Time Eyeblink Detection in the Wild: Dataset, Theory and Practices.
IEEE Trans. Inf. Forensics Secur., 2020

Occlusion Pattern Discovery for Object Detection and Occlusion Reasoning.
IEEE Trans. Circuits Syst. Video Technol., 2020

Early Action Recognition With Category Exclusion Using Policy-Based Reinforcement Learning.
IEEE Trans. Circuits Syst. Video Technol., 2020

Asymmetric Mapping Quantization for Nearest Neighbor Search.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Detecting spatiotemporal irregularities in videos via a 3D convolutional autoencoder.
J. Vis. Commun. Image Represent., 2020

Product Quantization Network for Fast Visual Search.
Int. J. Comput. Vis., 2020

Interventional Domain Adaptation.
CoRR, 2020

Attention-Aware Noisy Label Learning for Image Classification.
CoRR, 2020

Deep Reinforcement Learning with Label Embedding Reward for Supervised Image Hashing.
CoRR, 2020

Towards Understanding the Adversarial Vulnerability of Skeleton-based Action Recognition.
CoRR, 2020

Temporal Pulses Driven Spiking Neural Network for Fast Object Recognition in Autonomous Driving.
CoRR, 2020

Learning Depth for Scene Reconstruction Using an Encoder-Decoder Model.
IEEE Access, 2020

Multipath Event-Based Network for Low-Power Human Action Recognition.
Proceedings of the 6th IEEE World Forum on Internet of Things, 2020

Self-Mimic Learning for Small-scale Pedestrian Detection.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

HOT-Net: Non-Autoregressive Transformer for 3D Hand-Object Pose Estimation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Dynamic Graph CNN for Event-Camera Based Gesture Recognition.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

Temporal Pulses Driven Spiking Neural Network for Time and Power Efficient Object Recognition in Autonomous Driving.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

S3F: A Multi-View Slow-Fast Network For Alzheimer's Disease Diagnosis.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization.
Proceedings of the Computer Vision - ECCV 2020, 2020

Structure-Aware Human-Action Generation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Temporal Distinct Representation Learning for Action Recognition.
Proceedings of the Computer Vision - ECCV 2020, 2020

Hand-Transformer: Non-Autoregressive Structured Modeling for 3D Hand Pose Estimation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Clustering Driven Deep Autoencoder for Video Anomaly Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

Learning Progressive Joint Propagation for Human Motion Prediction.
Proceedings of the Computer Vision - ECCV 2020, 2020

Temporal-Context Enhanced Detection of Heavily Occluded Pedestrians.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Discovering Human Interactions With Novel Objects via Zero-Shot Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

3DV: 3D Dynamic Voxel for Action Recognition in Depth Video.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Semantic Enhanced Sketch Based Image Retrieval with Incomplete Multimodal Query.
Proceedings of the 6th IEEE International Conference on Multimedia Big Data, 2020

Learning Diverse Stochastic Human-Action Generators by Learning Smooth Latent Transitions.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Boosting Positive and Unlabeled Learning for Anomaly Detection With Multi-Features.
IEEE Trans. Multim., 2019

Codebook-Free Compact Descriptor for Scalable Visual Search.
IEEE Trans. Multim., 2019

Action-Stage Emphasized Spatiotemporal VLAD for Video Action Recognition.
IEEE Trans. Image Process., 2019

Hough Forest With Optimized Leaves for Global Hand Pose Estimation With Arbitrary Postures.
IEEE Trans. Cybern., 2019

Dictionary Learning-Based, Directional, and Optimized Prediction for Lenslet Image Coding.
IEEE Trans. Circuits Syst. Video Technol., 2019

Discriminative Spatio-Temporal Pattern Discovery for 3D Action Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2019

Semantic Cues Enhanced Multimodality Multistream CNN for Action Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2019

Robust Distracter-Resistive Tracker via Learning a Multi-Component Discriminative Dictionary.
IEEE Trans. Circuits Syst. Video Technol., 2019

Efficient Video Object Co-Localization With Co-Saliency Activated Tracklets.
IEEE Trans. Circuits Syst. Video Technol., 2019

Real-Time Detection of Fall From Bed Using a Single Depth Camera.
IEEE Trans. Autom. Sci. Eng., 2019

A survey of variational and CNN-based optical flow techniques.
Signal Process. Image Commun., 2019

Multi-label learning of part detectors for occluded pedestrian detection.
Pattern Recognit., 2019

Learning a robust representation via a deep network on symmetric positive definite manifolds.
Pattern Recognit., 2019

Real-Time 3D Hand Pose Estimation with 3D Convolutional Neural Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

SLTFNet: A spatial and language-temporal tensor fusion network for video moment retrieval.
Inf. Process. Manag., 2019

Context-Integrated and Feature-Refined Network for Lightweight Urban Scene Parsing.
CoRR, 2019

Progress Regression RNN for Online Spatial-Temporal Action Localization in Unconstrained Videos.
CoRR, 2019

Space-Time Event Clouds for Gesture Recognition: From RGB Cameras to Event Cameras.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Cross-Modal Video Moment Retrieval with Spatial and Language-Temporal Attention.
Proceedings of the 2019 on International Conference on Multimedia Retrieval, 2019

SPAGAN: Shortest Path Graph Attention Network.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Bayesian Uncertainty Matching for Unsupervised Domain Adaptation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Motion-Let Clustering for Skeleton-Based Action Recognition.
Proceedings of the IEEE International Conference on Multimedia & Expo Workshops, 2019

Spatio-Temporal Multi-scale Soft Quantization Learning for Skeleton-Based Human Action Recognition.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Discriminative Feature Transformation for Occluded Pedestrian Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

PointCloud Saliency Maps.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Temporal Structure Mining for Weakly Supervised Action Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation From a Single Depth Image.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

SO-HandNet: Self-Organizing Network for 3D Hand Pose Estimation With Semi-Supervised Learning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Exploiting Spatial-Temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Kervolutional Neural Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Joint Representative Selection and Feature Learning: A Semi-Supervised Approach.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

3D Hand Shape and Pose Estimation From a Single RGB Image.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Multi-View, Generative, Transfer Learning for Distributed Time Series Classification.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

Exploiting Local Feature Patterns for Unsupervised Domain Adaptation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Traffic-Optimized Data Placement for Social Media.
IEEE Trans. Multim., 2018

Quality-Guided Fusion-Based Co-Saliency Estimation for Image Co-Segmentation and Colocalization.
IEEE Trans. Multim., 2018

Query Adaptive Multiview Object Instance Search and Localization Using Sketches.
IEEE Trans. Multim., 2018

Profit Maximization for Viral Marketing in Online Social Networks: Algorithms and Analysis.
IEEE Trans. Knowl. Data Eng., 2018

Simultaneously Discovering and Localizing Common Objects in Wild Images.
IEEE Trans. Image Process., 2018

Video Summarization Via Multiview Representative Selection.
IEEE Trans. Image Process., 2018

Fried Binary Embedding: From High-Dimensional Visual Features to High-Dimensional Binary Codes.
IEEE Trans. Image Process., 2018

Robust 3D Hand Pose Estimation From Single Depth Images Using Multi-View CNNs.
IEEE Trans. Image Process., 2018

Minimizing Reconstruction Bias Hashing via Joint Projection Learning and Quantization.
IEEE Trans. Image Process., 2018

Local Large-Margin Multi-Metric Learning for Face and Kinship Verification.
IEEE Trans. Circuits Syst. Video Technol., 2018

Representative Selection on a Hypersphere.
IEEE Signal Process. Lett., 2018

An efficient and effective hop-based approach for influence maximization in social networks.
Soc. Netw. Anal. Min., 2018

Multi-stream CNN: Learning representations based on human-related regions for action recognition.
Pattern Recognit., 2018

Temporally enhanced image object proposals for online video object and action detections.
J. Vis. Commun. Image Represent., 2018

Learning Saliency Maps for Adversarial Point-Cloud Generation.
CoRR, 2018

Online Processing Algorithms for Influence Maximization.
Proceedings of the 2018 International Conference on Management of Data, 2018

Towards Profit Maximization for Online Social Network Providers.
Proceedings of the 2018 IEEE Conference on Computer Communications, 2018

3D Convolutional Generative Adversarial Networks for Detecting Temporal Irregularities in Videos.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Unsupervised Multiple-Instance Learning for Instance Search.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Selecting Informative Frames for Action Recognition with Partial Observations.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Bi-box Regression for Pedestrian Detection and Occlusion Estimation.
Proceedings of the Computer Vision - ECCV 2018, 2018

Product Quantization Network for Fast Image Retrieval.
Proceedings of the Computer Vision - ECCV 2018, 2018

Deformable Pose Traversal Convolution for 3D Action and Gesture Recognition.
Proceedings of the Computer Vision - ECCV 2018, 2018

Point-to-Point Regression PointNet for 3D Hand Pose Estimation.
Proceedings of the Computer Vision - ECCV 2018, 2018

Weakly-Supervised 3D Hand Pose Estimation from Monocular RGB Images.
Proceedings of the Computer Vision - ECCV 2018, 2018

Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Multi-View Harmonized Bilinear Network for 3D Object Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Salience Guided Depth Calibration for Perceptually Optimized Compressive Light Field 3D Display.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Recognizing Human Actions as the Evolution of Pose Estimation Maps.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Conditional Generative Adversarial Network for Structured Domain Adaptation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Hand PointNet: 3D Hand Pose Estimation Using Point Sets.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Understanding Human-Object Interaction in RGB-D videos for Human Robot Interaction.
Proceedings of Computer Graphics International 2018, 2018

Actor-Action Semantic Segmentation with Region Masks.
Proceedings of the British Machine Vision Conference 2018, 2018

Kernel Cross-Correlator.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Tensorized Projection for High-Dimensional Binary Embedding.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Distributed Composite Quantization.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Visual Pattern Discovery and Recognition.
Springer Briefs in Computer Science, Springer, ISBN: 978-981-10-4839-5, 2017

Sound-Event Classification Using Robust Texture Features for Robot Hearing.
IEEE Trans. Multim., 2017

Person Reidentification Using Multiple Egocentric Views.
IEEE Trans. Circuits Syst. Video Technol., 2017

Discovering Class-Specific Spatial Layouts for Scene Recognition.
IEEE Signal Process. Lett., 2017

LBP-Structure Optimization With Symmetry and Uniformity Regularizations for Scene Classification.
IEEE Signal Process. Lett., 2017

Representative Selection with Structured Sparsity.
Pattern Recognit., 2017

Fusing disparate object signatures for salient object detection in video.
Pattern Recognit., 2017

Learning location constrained pixel classifiers for image parsing.
J. Vis. Commun. Image Represent., 2017

3D Hand Pose Estimation: From Current Achievements to Future Goals.
CoRR, 2017

Non-Iterative Localization and Fast Mapping.
CoRR, 2017

Non-Iterative SLAM: A Fast Dense Method for Inertial-Visual SLAM.
CoRR, 2017

Positive and Unlabeled Learning for Anomaly Detection with Multi-features.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Efficient tracking of closely spaced objects in depth data using sequential dirichlet process clustering.
Proceedings of the IEEE Intelligent Vehicles Symposium, 2017

Efficient ground object segmentation in 3D LIDAR based on cascaded mode seeking.
Proceedings of the 20th IEEE International Conference on Intelligent Transportation Systems, 2017

Is My Object in This Video? Reconstruction-based Object Search in Videos.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Real time hand gesture recognition via finger-emphasized multi-scale description.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Temporally enhanced image object proposals for videos.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Context-aware graph-based analysis for detecting anomalous activities.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Efficient directional and L1-optimized intra-prediction for light field image compression.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Video Summarization via Multi-view Representative Selection.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Multi-label Learning of Part Detectors for Heavily Occluded Pedestrian Detection.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Compressive Quantization for Fast Object Instance Search in Videos.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Common Action Discovery and Localization in Unconstrained Videos.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Non-iterative SLAM.
Proceedings of the 18th International Conference on Advanced Robotics, 2017

Real-time hierarchical fusion system for semantic segmentation in offroad scenes.
Proceedings of the 20th International Conference on Information Fusion, 2017

HOPE: Hierarchical Object Prototype Encoding for Efficient Object Instance Search in Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Spatio-Temporal Naive-Bayes Nearest-Neighbor (ST-NBNN) for Skeleton-Based Action Recognition.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Object Co-skeletonization with Co-segmentation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Fried Binary Embedding for High-Dimensional Visual Features.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

3D Convolutional Neural Networks for Efficient and Robust Hand Pose Estimation from Single Depth Images.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Influence Maximization Meets Efficiency and Effectiveness: A Hop-Based Approach.
Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, Sydney, Australia, July 31, 2017

A framework based on deep learning and mathematical morphology for cabin door detection in an automated aerobridge docking system.
Proceedings of the 11th Asian Control Conference, 2017

Common visual pattern discovery and search.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Efficient Object Instance Search Using Fuzzy Objects Matching.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Body Movement Analysis and Recognition.
Proceedings of the Context Aware Human-Robot and Human-Agent Interaction, 2016

Face and Facial Expressions Recognition and Analysis.
Proceedings of the Context Aware Human-Robot and Human-Agent Interaction, 2016

Object Instance Search in Videos via Spatio-Temporal Trajectory Discovery.
IEEE Trans. Multim., 2016

Image Co-segmentation via Saliency Co-fusion.
IEEE Trans. Multim., 2016

Query-Adaptive Small Object Search Using Object Proposals and Shape-Aware Descriptors.
IEEE Trans. Multim., 2016

Fast Appearance Modeling for Automatic Primary Video Object Segmentation.
IEEE Trans. Image Process., 2016

Adobe Boxes: Locating Object Proposals Using Object Adobes.
IEEE Trans. Image Process., 2016

Discovering Primary Objects in Videos by Saliency Fusion and Iterative Appearance Estimation.
IEEE Trans. Circuits Syst. Video Technol., 2016

Introduction of New Associate Editors.
IEEE Trans. Circuits Syst. Video Technol., 2016

Discriminative Action States Discovery for Online Action Recognition.
IEEE Signal Process. Lett., 2016

Parsing 3D motion trajectory for gesture recognition.
J. Vis. Commun. Image Represent., 2016

Finding spatio-temporal salient paths for video objects discovery.
J. Vis. Commun. Image Represent., 2016

Guest Editorial: Human Activity Understanding from 2D and 3D Data.
Int. J. Comput. Vis., 2016

Invariant multi-scale descriptor for shape representation, matching and retrieval.
Comput. Vis. Image Underst., 2016

Barehanded music: real-time hand interaction for virtual piano.
Proceedings of the 20th ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, 2016

L1-optimized linear prediction for light field image compression.
Proceedings of the 2016 Picture Coding Symposium, 2016

A Compact Binary Aggregated Descriptor via Dual Selection for Visual Search.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Learning a Multi-class Discriminative Dictionary with Nonredundancy Constraints for Visual Classification.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Query Adaptive Instance Search using Object Sketches.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

To Project More or to Quantize More: Minimize Reconstruction Bias for Learning Compact Binary Codes.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Profit maximization for viral marketing in Online Social Networks.
Proceedings of the 24th IEEE International Conference on Network Protocols, 2016

Collaborative multi-view metric learning for visual classification.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

Invariant multi-scale shape descriptor for object matching and recognition.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Bayesian tracking of multiple objects with vision and radar.
Proceedings of the 14th International Conference on Control, 2016

CATS: Co-saliency Activated Tracklet Selection for Video Co-localization.
Proceedings of the Computer Vision - ECCV 2016, 2016

From Keyframes to Key Objects: Video Summarization by Representative Object Proposal Selection.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Learning to Integrate Occlusion-Specific Detectors for Heavily Occluded Pedestrian Detection.
Proceedings of the Computer Vision - ACCV 2016, 2016

Random Forest with Suppressed Leaves for Hough Voting.
Proceedings of the Computer Vision - ACCV 2016, 2016

Multi-layer light field display characterisation.
Proceedings of the True Vision - Capture, Transmission and Display of 3D Video, 2016

2015
Efficient Mining of Optimal AND/OR Patterns for Visual Recognition.
IEEE Trans. Multim., 2015

Topical Video Object Discovery From Key Frames by Modeling Word Co-Occurrence Prior.
IEEE Trans. Image Process., 2015

Robust Discriminative Tracking via Landmark-Based Label Propagation.
IEEE Trans. Image Process., 2015

Manifold Kernel Sparse Representation of Symmetric Positive-Definite Matrices and Its Applications.
IEEE Trans. Image Process., 2015

A Chi-Squared-Transformed Subspace of LBP Histogram for Visual Recognition.
IEEE Trans. Image Process., 2015

Randomized Spatial Context for Object Search.
IEEE Trans. Image Process., 2015

Collaborative Multifeature Fusion for Transductive Spectral Learning.
IEEE Trans. Cybern., 2015

Propagative Hough Voting for Human Activity Detection and Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2015

Resolving Ambiguous Hand Pose Predictions by Exploiting Part Correlations.
IEEE Trans. Circuits Syst. Video Technol., 2015

LBP Encoding Schemes Jointly Utilizing the Information of Current Bit and Other LBP Bits.
IEEE Signal Process. Lett., 2015

Learning LBP structure by maximizing the conditional mutual information.
Pattern Recognit., 2015

Flexible Trajectory Indexing for 3D Motion Recognition.
Proceedings of the 2015 IEEE Winter Conference on Applications of Computer Vision, 2015

First-Person Palm Pose Tracking and Gesture Recognition in Augmented Reality.
Proceedings of the Computer Vision, Imaging and Computer Graphics Theory and Applications, 2015

Two-layer optimized light field display using depth initialization.
Proceedings of the 2015 Visual Communications and Image Processing, 2015

Glasses-free light field 3D display.
Proceedings of the 2015 Visual Communications and Image Processing, 2015

QCCE: Quality constrained co-saliency estimation for common object detection.
Proceedings of the 2015 Visual Communications and Image Processing, 2015

AR in Hand: Egocentric Palm Pose Tracking and Gesture Recognition for Augmented Reality Applications.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Query-Adaptive Logo Search using Shape-Aware Descriptors.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Egocentric hand pose estimation and distance recovery in a single RGB image.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Fast object instance search in videos from one example.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Group saliency propagation for large scale and quick image co-segmentation.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Optimizing Inter-server Communication for Online Social Networks.
Proceedings of the 35th IEEE International Conference on Distributed Computing Systems, 2015

Adaptive Exponential Smoothing for Online Filtering of Pixel Prediction Maps.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Quantized fuzzy LBP for face recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Fast action proposals for human action detection and search.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Automatic Visual Pattern Discovery via Cohesive Subgraph Mining.
Proceedings of the Mobile Cloud Visual Media Computing - From Interaction to Service, 2015

2014
Visual pattern discovery in image and video data: a brief survey.
WIREs Data Mining Knowl. Discov., 2014

Parsing the Hand in Depth Images.
IEEE Trans. Multim., 2014

mCENTRIST: A Multi-Channel Feature Generation Mechanism for Scene Categorization.
IEEE Trans. Image Process., 2014

Context-Aware Discovery of Visual Co-Occurrence Patterns.
IEEE Trans. Image Process., 2014

Optimizing LBP Structure For Visual Recognition Using Binary Quadratic Programming.
IEEE Signal Process. Lett., 2014

Entropic image thresholding based on GLGM histogram.
Pattern Recognit. Lett., 2014

Modelling Multi-Party Interactions among Virtual Characters, Robots, and Humans.
Presence Teleoperators Virtual Environ., 2014

Human-Robot Interaction by Understanding Upper Body Gestures.
Presence Teleoperators Virtual Environ., 2014

Learning Actionlet Ensemble for 3D Human Action Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Video Event Detection: From Subvolume Localization to Spatiotemporal Path Search.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Fusion of 3D-LIDAR and camera data for scene parsing.
J. Vis. Commun. Image Represent., 2014

Tracking and fusion for multiparty interaction with a virtual character and a social robot.
Proceedings of the SIGGRAPH Asia 2014 Autonomous Virtual Humans and Social Robot for Telepresence, 2014

Activity recognition in unconstrained RGB-D video using 3D trajectories.
Proceedings of the SIGGRAPH Asia 2014 Autonomous Virtual Humans and Social Robot for Telepresence, 2014

Boosting cross-media retrieval via visual-auditory feature analysis and relevance feedback.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Scalable forest hashing for fast similarity search.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

Hierarchical multi-feature fusion for multimodal data analysis.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Automatic image co-segmentation using geometric mean saliency.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Efficient Online Spatio-Temporal Filtering for Video Event Detection.
Proceedings of the Computer Vision - ECCV 2014 Workshops, 2014

Multi-feature Spectral Clustering with Minimax Optimization.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Non-rectangular Part Discovery for Object Detection.
Proceedings of the British Machine Vision Conference, 2014

Location Constrained Pixel Classifiers for Image Parsing with Regular Spatial Layout.
Proceedings of the British Machine Vision Conference, 2014

Discriminative Orderlet Mining for Real-Time Recognition of Human-Object Interaction.
Proceedings of the Computer Vision - ACCV 2014, 2014

Large Margin Multi-metric Learning for Face and Kinship Verification in the Wild.
Proceedings of the Computer Vision - ACCV 2014, 2014

Height Gradient Histogram (HIGH) for 3D Scene Labeling.
Proceedings of the 2nd International Conference on 3D Vision, 2014

Low-Rank Online Metric Learning.
Proceedings of the Low-Rank and Sparse Modeling for Visual Analysis, 2014

2013
Model-based hand pose estimation via spatial-temporal hand parsing and 3D fingertip localization.
Vis. Comput., 2013

Robust Part-Based Hand Gesture Recognition Using Kinect Sensor.
IEEE Trans. Multim., 2013

Action Search by Example Using Randomized Visual Vocabularies.
IEEE Trans. Image Process., 2013

Noise-Resistant Local Binary Pattern With an Embedded Error-Correction Mechanism.
IEEE Trans. Image Process., 2013

Self-Supervised Online Metric Learning With Low Rank Constraint for Scene Categorization.
IEEE Trans. Image Process., 2013

Video Anomaly Search in Crowded Scenes via Spatio-Temporal Motion Context.
IEEE Trans. Inf. Forensics Secur., 2013

Hybrid Saliency Detection for Images.
IEEE Signal Process. Lett., 2013

A complete and fully automated face verification system on mobile devices.
Pattern Recognit., 2013

Abnormal event detection in crowded scenes using sparse representation.
Pattern Recognit., 2013

Minimum Near-Convex Shape Decomposition.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Human-virtual human interaction by upper body gesture understanding.
Proceedings of the 19th ACM Symposium on Virtual Reality Software and Technology, 2013

Salient object detection in videos by optimal spatio-temporal path discovery.
Proceedings of the ACM Multimedia Conference, 2013

Mobile media communication, processing, and analysis: A review of recent advances.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

Hierarchical sparse coding based on spatial pooling and multi-feature fusion.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Direct mining co-occurrence features for visual recognition: A branch and bound method.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Learning weighted geometric pooling for image classification.
Proceedings of the IEEE International Conference on Image Processing, 2013

Relaxed local ternary pattern for face recognition.
Proceedings of the IEEE International Conference on Image Processing, 2013

Learning binarized pixel-difference pattern for scene recognition.
Proceedings of the IEEE International Conference on Image Processing, 2013

Voxel labelling in CT images with data-driven contextual features.
Proceedings of the IEEE International Conference on Image Processing, 2013

Thematic Saliency Detection Using Spatial-Temporal Context.
Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, 2013

Dynamic texture recognition using enhanced LBP features.
Proceedings of the IEEE International Conference on Acoustics, 2013

Topical Video Object Discovery from Key Frames by Modeling Word Co-occurrence Prior.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Object instance search in videos.
Proceedings of the 9th International Conference on Information, 2013

2012
Mining Visual Collocation Patterns via Self-Supervised Subspace Learning.
IEEE Trans. Syst. Man Cybern. Part B, 2012

Towards Scalable Summarization of Consumer Videos Via Sparse Dictionary Selection.
IEEE Trans. Multim., 2012

Discovering Thematic Objects in Image Collections and Videos.
IEEE Trans. Image Process., 2012

Spatial Locality-Aware Sparse Coding and Dictionary Learning.
Proceedings of the 4th Asian Conference on Machine Learning, 2012

Location Discriminative Vocabulary Coding for Mobile Landmark Search.
Int. J. Comput. Vis., 2012

Hand pose estimation by combining fingertip tracking and articulated ICP.
Proceedings of the Virtual Reality Continuum and its Applications in Industry, 2012

Max-Margin Structured Output Regression for Spatio-Temporal Action Localization.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Predicting human activities using spatio-temporal structure of interest points.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

3D fingertip and palm tracking in depth image sequences.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Rapid object search engine for contextual advertisement.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Social Image Tagging by Mining Sparse Tag Patterns from Auxiliary Data.
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, 2012

Curb detection and tracking using 3D-LIDAR scanner.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Learning sparse tag patterns for social image classification.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Object tracking via online metric learning.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Fusion of Velodyne and camera data for scene parsing.
Proceedings of the 15th International Conference on Information Fusion, 2012

Propagative Hough Voting for Human Activity Recognition.
Proceedings of the Computer Vision - ECCV 2012, 2012

Randomized Spatial Partition for Scene Recognition.
Proceedings of the Computer Vision - ECCV 2012, 2012

Mining actionlet ensemble for action recognition with depth cameras.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Randomized visual phrases for object search.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Arbitrary-Shape Object Localization Using Adaptive Image Grids.
Proceedings of the Computer Vision - ACCV 2012, 2012

2011
Fast Action Detection via Discriminative Random Forest Voting and Top-K Subvolume Search.
IEEE Trans. Multim., 2011

Saliency Density Maximization for Efficient Visual Objects Discovery.
IEEE Trans. Circuits Syst. Video Technol., 2011

Discriminative Video Pattern Search for Efficient Action Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2011

Discovering the Thematic Object in Commercial Videos.
IEEE Multim., 2011

Learning spatio-temporal dependency of local patches for complex motion segmentation.
Comput. Vis. Image Underst., 2011

Anomalous video event detection using spatiotemporal context.
Comput. Vis. Image Underst., 2011

Real-time human action search using random forest based Hough voting.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Robust hand gesture recognition based on finger-earth mover's distance with a commodity depth camera.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Robust hand gesture recognition with Kinect sensor.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Grassmann Hashing for approximate nearest neighbor search in high dimensional space.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Salient region detection and its application to video retargeting.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Grid-based local feature bundling for efficient object search and localization.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

Discovering Thematic Patterns in Videos via Cohesive Sub-graph Mining.
Proceedings of the 11th IEEE International Conference on Data Mining, 2011

Combining Feature Context and Spatial Context for Image Pattern Discovery.
Proceedings of the 11th IEEE International Conference on Data Mining, 2011

A fast and accurate cascade subspace face/eye detector on mobile devices.
Proceedings of the IEEE International Conference on Computer Vision Workshops, 2011

Minimum near-convex decomposition for robust shape representation.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Mining discriminative co-occurrence patterns for visual recognition.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Unsupervised random forest indexing for fast action search.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Optimal spatio-temporal path discovery for video event detection.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Sparse reconstruction cost for abnormal event detection.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Multiple instance boosting with global smoothness regularization.
Proceedings of the 8th International Conference on Information, 2011

Depth camera based hand gesture recognition and its applications in Human-Computer-Interaction.
Proceedings of the 8th International Conference on Information, 2011

2010
Mining Compositional Features From GPS and Visual Cues for Event Recognition in Photo Collections.
IEEE Trans. Multim., 2010

Mining and cropping common objects from images.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

KPB-SIFT: a compact local feature descriptor.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Interactive visual object search through mutual information maximization.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Bipolar grouping.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Efficient search of Top-K video subvolumes for multi-instance action detection.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Video anomaly detection in spatiotemporal context.
Proceedings of the International Conference on Image Processing, 2010

Middle-Level Representation for Human Activities Recognition: The Role of Spatio-Temporal Relationships.
Proceedings of the Trends and Topics in Computer Vision, 2010

Saliency Density Maximization for Object Detection and Localization.
Proceedings of the Computer Vision - ACCV 2010, 2010

2009
Mining Repetitive Patterns in Multimedia Data.
Proceedings of the Encyclopedia of Data Warehousing and Mining, Second Edition (4 Volumes), 2009

Speeding up spatio-temporal sliding-window search for efficient event detection in crowded videos.
Proceedings of the 1st ACM international workshop on Events in multimedia, 2009

Multimodal partial estimates fusion.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

Discriminative subvolume search for efficient action detection.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

Multimedia Data Indexing.
Proceedings of the Semantic Mining Technologies for Multimedia Databases, 2009

2008
Mining Recurring Events Through Forest Growing.
IEEE Trans. Circuits Syst. Video Technol., 2008

Locality Versus Globality: Query-Driven Localized Linear Models for Facial Image Computing.
IEEE Trans. Circuits Syst. Video Technol., 2008

Mining GPS traces and visual words for event classification.
Proceedings of the 1st ACM SIGMM International Conference on Multimedia Information Retrieval, 2008

Mining Motifs from Human Motion.
Proceedings of the 29th Annual Conference of the European Association for Computer Graphics, 2008

Context-aware clustering.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

Mining compositional features for boosting.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

2007
Mining repetitive clips through finding continuous paths.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

From frequent itemsets to semantically meaningful visual patterns.
Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007

Query Driven Localized Linear Discriminant Models for Head Pose Estimation.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Common Spatial Pattern Discovery by Efficient Candidate Pruning.
Proceedings of the International Conference on Image Processing, 2007

Query-Driven Locally Adaptive Fisher Faces and Expert-Model for Face Recognition.
Proceedings of the International Conference on Image Processing, 2007

Spatial Random Partition for Common Visual Pattern Discovery.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

Discovery of Collocation Patterns: from Visual Words to Visual Phrases.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

Spatial selection for attentional visual tracking.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

2004
Fast and Robust Short Video Clip Search for Copy Detection.
Proceedings of the Advances in Multimedia Information Processing - PCM 2004, 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, November 30, 2004

Fast and robust video clip search using index structure.
Proceedings of the 12th ACM International Conference on Multimedia, 2004

Fast and robust short video clip search using an index structure.
Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2004

Fast and Robust Search Method for Short Video Clips from Large Video Collection.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

