Zheng Shou
ORCID: 0000-0002-7681-2166
Affiliations:
- National University of Singapore
- Columbia University, New York, NY, USA (former)
According to our database, Zheng Shou authored at least 183 papers between 2016 and 2025.
Bibliography
2025
Pattern Recognit., 2025
2024
IEEE Trans. Knowl. Data Eng., December, 2024
IEEE Trans. Circuits Syst. Video Technol., June, 2024
Enhancing Visual Grounding in Vision-Language Pre-Training With Position-Guided Text Prompts.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2024
DR-FER: Discriminative and Robust Representation Learning for Facial Expression Recognition.
IEEE Trans. Multim., 2024
Int. J. Comput. Vis., 2024
CoRR, 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.
CoRR, 2024
CoRR, 2024
CoRR, 2024
CoRR, 2024
High Quality Human Image Animation using Regional Supervision and Motion Blur Condition.
CoRR, 2024
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation.
CoRR, 2024
CoRR, 2024
CoRR, 2024
CoRR, 2024
Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation.
CoRR, 2024
Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters.
CoRR, 2024
CoRR, 2024
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions.
CoRR, 2024
CoRR, 2024
Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024
MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
AssistEditor: Multi-Agent Collaboration for GUI Workflow Automation in Video Creation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024
Proceedings of the 5th International Workshop on Human-centric Multimedia Analysis, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Free-ATM: Harnessing Free Attention Masks for Representation Learning on Diffusion-Generated Images.
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
L4D-Track: Language-to-4D Modeling Towards 6-DoF Tracking and Shape Reconstruction in 3D Point Cloud Stream.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
IEEE Trans. Image Process., 2023
CoRR, 2023
CoRR, 2023
ColonNeRF: Neural Radiance Fields for High-Fidelity Long-Sequence Colonoscopy Reconstruction.
CoRR, 2023
MD-Splatting: Learning Metric Deformation from 4D Gaussians in Highly Deformable Scenes.
CoRR, 2023
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing.
CoRR, 2023
CoRR, 2023
Bridging Sensor Gaps via Single-Direction Tuning for Hyperspectral Image Classification.
CoRR, 2023
Recap: Detecting Deepfake Video with Unpredictable Tampered Traces via Recovering Faces and Mapping Recovered Faces.
CoRR, 2023
Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks.
CoRR, 2023
CoRR, 2023
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn.
CoRR, 2023
Mover: Mask and Recovery based Facial Part Consistency Aware Method for Deepfake Video Detection.
CoRR, 2023
CoRR, 2023
CoRR, 2023
DeepfakeMAE: Facial Part Consistency Aware Masked Autoencoder for Deepfake Video Detection.
CoRR, 2023
STPrivacy: Spatio-Temporal Tubelet Sparsification and Anonymization for Privacy-preserving Action Recognition.
CoRR, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Transformer-based Open-world Instance Segmentation with Cross-task Consistency Regularization.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 1st Workshop on Large Generative Models Meet Multimodal Applications, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
GazeVQA: A Video Question Answering Dataset for Multiview Eye-Gaze Task-Oriented Collaborations.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
IEEE Trans. Image Process., 2022
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation.
CoRR, 2022
CoRR, 2022
An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022.
CoRR, 2022
Single-Stage Open-world Instance Segmentation with Cross-task Consistency Regularization.
CoRR, 2022
Egocentric Video-Language Pretraining @ EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022.
CoRR, 2022
Sense The Physical, Walkthrough The Virtual, Manage The Metaverse: A Data-centric Perspective.
CoRR, 2022
GEB+: A benchmark for generic event boundary captioning, grounding and text-based retrieval.
CoRR, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
From Token to Word: OCR Token Evolution via Contrastive Learning and Semantic Matching for Text-VQA.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proceedings of the HCMA@MM 2022: Proceedings of the 3rd International Workshop on Human-Centric Multimedia Analysis, 2022
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022
AssistQ: Affordance-Centric Question-Driven Task Completion for Egocentric Assistant.
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
Is Someone Speaking?: Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
2020
CoRR, 2020
Proceedings of the Computer Vision - ECCV 2020, 2020
2019
LPAT: Learning to Predict Adaptive Threshold for Weakly-supervised Temporal Action Localization.
CoRR, 2019
CDSA: Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation.
CoRR, 2019
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
2018
CoRR, 2018
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
Proceedings of the Computer Vision - ECCV 2018, 2018
Proceedings of the Computer Vision - ECCV 2018, 2018
2017
CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017
2016
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016