Pichao Wang

Orcid: 0000-0002-1430-0237

According to our database, Pichao Wang authored at least 104 papers between 2013 and 2024.



SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels.
Int. J. Comput. Vis., 2024

DFN: A deep fusion network for flexible single and multi-modal action recognition.
Expert Syst. Appl., 2024

GQE: Generalized Query Expansion for Enhanced Text-Video Retrieval.
CoRR, 2024

Hallucination of Multimodal Large Language Models: A Survey.
CoRR, 2024

Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval.
CoRR, 2024

A Unified Multimodal De- and Re-Coupling Framework for RGB-D Motion Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

What Limits the Performance of Local Self-attention?
Int. J. Comput. Vis., October, 2023

Multi-hypothesis representation learning for transformer-based 3D human pose estimation.
Pattern Recognit., September, 2023

Exploiting Temporal Contexts With Strided Transformer for 3D Human Pose Estimation.
IEEE Trans. Multim., 2023

BP-triplet net for unsupervised domain adaptation: A Bayesian perspective.
Pattern Recognit., 2023

FT-HID: a large-scale RGB-D dataset for first- and third-person human interaction analysis.
Neural Comput. Appl., 2023

Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation.
CoRR, 2023

Human Pose-based Estimation, Tracking and Action Recognition with Deep Learning: A Survey.
CoRR, 2023

Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm.
CoRR, 2023

Multi-stage Factorized Spatio-Temporal Representation for RGB-D Action and Gesture Recognition.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Revisiting Vision Transformer from the View of Path Ensemble.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Selective Structured State-Spaces for Long-Form Video Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DOAD: Decoupled One Stage Action Detection Network.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Making Vision Transformers Efficient from A Token Sparsification View.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Frequency Domain Disentanglement for Arbitrary Neural Style Transfer.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Head-Free Lightweight Semantic Segmentation with Linear Transformer.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Class-Aware Feature Aggregation Network for Video Object Detection.
IEEE Trans. Circuits Syst. Video Technol., 2022

Trear: Transformer-Based RGB-D Egocentric Action Recognition.
IEEE Trans. Cogn. Dev. Syst., 2022

Effective Vision Transformer Training: A Data-Centric Perspective.
CoRR, 2022

BP-Triplet Net for Unsupervised Domain Adaptation: A Bayesian Perspective.
CoRR, 2022

VTC-LFC: Vision Transformer Compression with Low-Frequency Components.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation.
Proceedings of the 3rd International Workshop on Human-Centric Multimedia Analysis (HCMA@MM 2022), 2022

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Image-to-Video Re-Identification via Mutual Discriminative Knowledge Transfer.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

TransFGU: A Top-Down Approach to Fine-Grained Unsupervised Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

KVT: k-NN Attention for Boosting Vision Transformers.
Proceedings of the Computer Vision - ECCV 2022, 2022

Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Focal and Global Spatial-Temporal Transformer for Skeleton-Based Action Recognition.
Proceedings of the Computer Vision - ACCV 2022, 2022

Scaled ReLU Matters for Training Vision Transformers.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

BR²Net: Defocus Blur Detection Via a Bidirectional Channel Attention Residual Refining Network.
IEEE Trans. Multim., 2021

Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition.
IEEE Trans. Image Process., 2021

TransRPPG: Remote Photoplethysmography Transformer for 3D Mask Face Presentation Attack Detection.
IEEE Signal Process. Lett., 2021

Transformer guided geometry model for flow-based unsupervised visual odometry.
Neural Comput. Appl., 2021

Context and Structure Mining Network for Video Object Detection.
Int. J. Comput. Vis., 2021

ELSA: Enhanced Local Self-Attention for Vision Transformer.
CoRR, 2021

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification.
CoRR, 2021

KVT: k-NN Attention for Boosting Vision Transformers.
CoRR, 2021

Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation.
CoRR, 2021

Lifting Transformer for 3D Human Pose Estimation in Video.
CoRR, 2021

Zen-NAS: A Zero-Shot NAS for High-Performance Deep Image Recognition.
CoRR, 2021

Zen-NAS: A Zero-Shot NAS for High-Performance Image Recognition.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

TransReID: Transformer-based Object Re-Identification.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

A Hybrid Network for Large-Scale Action Recognition from RGB and Depth Modalities.
Sensors, 2020

A Review of Dynamic Maps for 3D Human Motion Recognition Using ConvNets and Its Improvement.
Neural Process. Lett., 2020

SAR-NAS: Skeleton-based action recognition via neural architecture searching.
J. Vis. Commun. Image Represent., 2020

RobustTAD: Robust Time Series Anomaly Detection via Decomposition and Convolutional Neural Networks.
CoRR, 2020

Exploiting Better Feature Aggregation for Video Object Detection.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

R²MRF: Defocus Blur Detection via Recurrently Refining Multi-Scale Residual Features.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Learning a Joint Affinity Graph for Multiview Subspace Clustering.
IEEE Trans. Multim., 2019

Adaptive Hypergraph Embedded Semi-Supervised Multi-Label Image Annotation.
IEEE Trans. Multim., 2019

Multiview-Based 3-D Action Recognition Using Deep Networks.
IEEE Trans. Hum. Mach. Syst., 2019

Unsupervised feature selection via latent representation learning and manifold regularization.
Neural Networks, 2019

Learning attentive dynamic maps (ADMs) for Understanding Human Actions.
J. Vis. Commun. Image Represent., 2019

DVONet: Unsupervised Monocular Depth Estimation and Visual Odometry.
Proceedings of the 2019 IEEE Visual Communications and Image Processing, 2019

Light Weight Stereo Matching via Deep Extraction and Integration of Low and High Level Information.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Self-Attention Guided Deep Features for Action Recognition.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Salient Object Detection via Recurrently Aggregating Spatial Attention Weighted Cross-Level Deep Features.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Depth Pooling Based Large-Scale 3-D Action Recognition With Convolutional Neural Networks.
IEEE Trans. Multim., 2018

Skeleton Optical Spectra-Based Action Recognition Using Convolutional Neural Networks.
IEEE Trans. Circuits Syst. Video Technol., 2018

Action recognition based on joint trajectory maps with convolutional neural networks.
Knowl. Based Syst., 2018

Robust unsupervised feature selection via dual self-representation and manifold regularization.
Knowl. Based Syst., 2018

Consensus learning guided multi-view unsupervised feature selection.
Knowl. Based Syst., 2018

Online human action recognition based on incremental learning of weighted covariance descriptors.
Inf. Sci., 2018

Saliency detection via affinity graph learning and weighted manifold ranking.
Neurocomputing, 2018

Robust graph regularized unsupervised feature selection.
Expert Syst. Appl., 2018

RGB-D-based human motion recognition with deep learning: A survey.
Comput. Vis. Image Underst., 2018

Depth Pooling Based Large-scale 3D Action Recognition with Convolutional Neural Networks.
CoRR, 2018

Spatially and Temporally Structured Global to Local Aggregation of Dynamic Depth Information for Action Recognition.
IEEE Access, 2018

Cooperative Training of Deep Aggregation Networks for RGB-D Action Recognition.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Salient Object Detection via Weighted Low Rank Matrix Recovery.
IEEE Signal Process. Lett., 2017

Joint Distance Maps Based Action Recognition With Convolutional Neural Networks.
IEEE Signal Process. Lett., 2017

An effective edge-preserving smoothing method for image manipulation.
Digit. Signal Process., 2017

Skeleton-based action recognition using LSTM and CNN.
Proceedings of the 2017 IEEE International Conference on Multimedia & Expo Workshops, 2017

Investigation of different skeleton features for CNN-based 3D action recognition.
Proceedings of the 2017 IEEE International Conference on Multimedia & Expo Workshops, 2017

Weakly structured information aggregation for upper-body posture assessment using ConvNets.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Large-Scale Multimodal Gesture Segmentation and Recognition Based on Convolutional Neural Networks.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Large-Scale Multimodal Gesture Recognition Using Heterogeneous Networks.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Structured Images for RGB-D Action Recognition.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Scene Flow to Action Map: A New Representation for RGB-D Based Action Recognition with Convolutional Neural Networks.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Action Recognition From Depth Maps Using Deep Convolutional Neural Networks.
IEEE Trans. Hum. Mach. Syst., 2016

A Spectral and Spatial Approach of Coarse-to-Fine Blurred Image Region Detection.
IEEE Signal Process. Lett., 2016

RGB-D-based action recognition datasets: A survey.
Pattern Recognit., 2016

Salient object detection using color spatial distribution and minimum spanning tree weight.
Multim. Tools Appl., 2016

Large-scale Continuous Gesture Recognition Using Convolutional Neutral Networks.
CoRR, 2016

Combining ConvNets with Hand-Crafted Features for Action Recognition Based on an HMM-SVM Classifier.
CoRR, 2016

Action Recognition Based on Joint Trajectory Maps Using Convolutional Neural Networks.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

A Large Scale RGB-D Dataset for Action Recognition.
Proceedings of the Understanding Human Activities Through 3D Sensors, 2016

Large-scale Continuous Gesture Recognition Using Convolutional Neural Networks.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Large-scale Isolated Gesture Recognition using Convolutional Neural Networks.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

A novel rate control algorithm for video coding based on fuzzy-PID controller.
Signal Image Video Process., 2015

Deep Convolutional Neural Networks for Action Recognition Using Depth Map Sequences.
CoRR, 2015

Online Action Recognition based on Incremental Learning of Weighted Covariance Descriptors.
CoRR, 2015

ConvNets-Based Action Recognition from Depth Maps through Virtual Cameras and Pseudocoloring.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Mining Mid-Level Features for Action Recognition Based on Effective Skeleton Representation.
Proceedings of the 2014 International Conference on Digital Image Computing: Techniques and Applications, 2014

An Improved Direction Finding Algorithm Based on Toeplitz Approximation.
Sensors, 2013
