Xiaodong Yang

Orcid: 0009-0003-4638-8039

  • NVIDIA Research, Santa Clara, CA, USA
  • The City College of New York, NY, USA (former)

According to our database, Xiaodong Yang authored at least 63 papers between 2010 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.




Attribute Descent: Simulating Object-Centric Datasets on the Content Level and Beyond.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2024

Cross-Modal Self-Supervised Learning with Effective Contrastive Units for LiDAR Point Clouds.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

Partial Convolution for Padding, Inpainting, and Image Synthesis.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

Attribute Descent: Simulating Object-Centric Datasets on the Content Level and Beyond.
CoRR, 2022

The 5th AI City Challenge.
CoRR, 2021

The 5th AI City Challenge.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Hierarchical Contrastive Motion Learning for Video Action Recognition.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

UFO$^2$: A Unified Framework towards Omni-supervised Object Detection.
CoRR, 2020

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification.
Proceedings of the Computer Vision - ECCV 2020, 2020

Simulating Content Consistent Vehicle Datasets with Attribute Descent.
Proceedings of the Computer Vision - ECCV 2020, 2020

UFO$^2$: A Unified Framework Towards Omni-supervised Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

Contrastive Learning for Weakly Supervised Phrase Grounding.
Proceedings of the Computer Vision - ECCV 2020, 2020

Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

The 4th AI City Challenge.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Discovering spatio-temporal action tubes.
J. Vis. Commun. Image Represent., 2019

Dancing to Music.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

PAMTRI: Pose-Aware Multi-Task Learning for Vehicle Re-Identification Using Highly Randomized Synthetic Data.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Joint Discriminative and Generative Learning for Person Re-Identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

STEP: Spatio-Temporal Progressive Learning for Video Action Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Video you only look once: Overall temporal convolutions for action recognition.
J. Vis. Commun. Image Represent., 2018

Making Convolutional Networks Recurrent for Visual Sequence Learning.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

MoCoGAN: Decomposing Motion and Content for Video Generation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Budget-Aware Activity Detection with A Recurrent Policy Network.
Proceedings of the British Machine Vision Conference 2018, 2018

Evaluation of Low-Level Features for Real-World Surveillance Event Detection.
IEEE Trans. Circuits Syst. Video Technol., 2017

Super Normal Vector for Human Activity Recognition with Depth Cameras.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

3D convolutional neural network with multi-model framework for action recognition.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Dynamic Facial Analysis: From Bayesian Filtering to Recurrent Neural Network.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Multilayer and Multimodal Fusion of Deep Neural Networks for Video Classification.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Region Trajectories for Video Semantic Concept Detection.
Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

Towards selecting robust hand gestures for automotive interfaces.
Proceedings of the 2016 IEEE Intelligent Vehicles Symposium, 2016

Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.
IEEE Trans. Neural Networks Learn. Syst., 2015

CCNY at TRECVID 2015: Localization.
Proceedings of the 2015 TREC Video Retrieval Evaluation, 2015

Exploring Pooling Strategies based on Idiosyncrasies of Spatio-Temporal Interest Points.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Hybrid Example-Based Single Image Super-Resolution.
Proceedings of the Advances in Visual Computing - 11th International Symposium, 2015

Assistive Clothing Pattern Recognition for Visually Impaired People.
IEEE Trans. Hum. Mach. Syst., 2014

Effective 3D action recognition using EigenJoints.
J. Vis. Commun. Image Represent., 2014

CCNY at TRECVID 2014: Surveillance Event Detection.
Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

Polynormal Fisher vector for activity recognition from depth sequences.
Proceedings of the SIGGRAPH Asia 2014 Autonomous Virtual Humans and Social Robot for Telepresence, 2014

Scene text recognition in multiple frames based on text tracking.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

Action Recognition Using Super Sparse Coding Vector with Spatio-temporal Awareness.
Proceedings of the Computer Vision - ECCV 2014, 2014

Super Normal Vector for Activity Recognition Using Depth Sequences.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Texture representations using subspace embeddings.
Pattern Recognit. Lett., 2013

Detecting signage and doors for blind navigation and wayfinding.
Netw. Model. Anal. Health Informatics Bioinform., 2013

Monitoring activity of taking medicine by incorporating RFID and video analysis.
Netw. Model. Anal. Health Informatics Bioinform., 2013

Toward a computer vision-based wayfinding aid for blind persons to access unfamiliar indoor environments.
Mach. Vis. Appl., 2013

AT&T Research at TRECVID 2013: Surveillance Event Detection.
Proceedings of the 2013 TREC Video Retrieval Evaluation, 2013

Feature Representations for Scene Text Character Recognition: A Comparative Study.
Proceedings of the 12th International Conference on Document Analysis and Recognition, 2013

Histogram of 3D Facets: A characteristic descriptor for hand gesture recognition.
Proceedings of the 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, 2013

Visual speech learning from an e-tutor via dynamic lip movement-based video segmentation and comparison.
Proceedings of the 2013 IEEE International Conference on Bioinformatics and Biomedicine, 2013

Robust and Effective Component-Based Banknote Recognition for the Blind.
IEEE Trans. Syst. Man Cybern. Part C, 2012

MediaCCNY at TRECVID 2012: Surveillance Event Detection.
Proceedings of the 2012 TREC Video Retrieval Evaluation, 2012

Recognizing actions using depth motion maps-based histograms of oriented gradients.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

EigenJoints-based action recognition using Naïve-Bayes-Nearest-Neighbor.
Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012

Robust and effective component-based banknote recognition by SURF features.
Proceedings of the 20th Annual Wireless and Optical Communications Conference, 2011

Recognizing clothes patterns for blind people by confidence margin based feature combination.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Context-based indoor object detection as an aid to blind persons accessing unfamiliar environments.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Computer Vision-Based Door Detection for Accessibility of Unfamiliar Environments to Blind Persons.
Proceedings of the Computers Helping People with Special Needs, 2010

Robust door detection in unfamiliar environments by combining edge and corner features.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2010
