Xiaoshuai Sun
Orcid: 0000-0003-3912-9306
According to our database, Xiaoshuai Sun authored at least 219 papers between 2008 and 2025.
Bibliography
2025
Pattern Recognit., 2025
2024
Int. J. Comput. Vis., January, 2024
A Survivor in the Era of Large-Scale Pretraining: An Empirical Study of One-Stage Referring Expression Comprehension.
IEEE Trans. Multim., 2024
CoRR, 2024
DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion.
CoRR, 2024
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models.
CoRR, 2024
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model.
CoRR, 2024
Routing Experts: Learning to Route Dynamic Experts in Multi-modal Large Language Models.
CoRR, 2024
DiffusionFace: Towards a Comprehensive Dataset for Diffusion-Based Face Forgery Analysis.
CoRR, 2024
Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models.
CoRR, 2024
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models.
CoRR, 2024
StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
QueryMatch: A Query-based Contrastive Learning Framework for Weakly Supervised Visual Grounding.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
X-RefSeg3D: Enhancing Referring 3D Instance Segmentation via Structured Cross-Modal Graph Neural Networks.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
Improving Panoptic Narrative Grounding by Harnessing Semantic Relationships and Visual Confirmation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
A Real-Time Global Inference Network for One-Stage Referring Expression Comprehension.
IEEE Trans. Neural Networks Learn. Syst., 2023
Fast Monocular Depth Estimation via Side Prediction Aggregation with Continuous Spatial Refinement.
IEEE Trans. Multim., 2023
IEEE Trans. Multim., 2023
IEEE Trans. Multim., 2023
X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation.
CoRR, 2023
NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning.
CoRR, 2023
CoRR, 2023
Adapting Pre-trained Language Models to Vision-Language Tasks via Dynamic Visual Prompting.
CoRR, 2023
CoRR, 2023
IEEE Access, 2023
Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Beat: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
PixelFace+: Towards Controllable Face Generation and Manipulation with Text Descriptions and Segmentation Masks.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
IEEE Trans. Multim., 2022
Towards Lightweight Transformer Via Group-Wise Transformation for Vision-and-Language Tasks.
IEEE Trans. Image Process., 2022
IEEE Trans. Image Process., 2022
IEEE Trans. Pattern Anal. Mach. Intell., 2022
IEEE Trans. Pattern Anal. Mach. Intell., 2022
Modeling long-term video semantic distribution for temporal action proposal generation.
Neurocomputing, 2022
What Goes beyond Multi-modal Fusion in One-stage Referring Expression Comprehension: An Empirical Study.
CoRR, 2022
CoRR, 2022
CoRR, 2022
Differentiated Relevances Embedding for Group-based Referring Expression Comprehension.
CoRR, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
Deep Semantic Parsing of Freehand Sketches With Homogeneous Transformation, Soft-Weighted Loss, and Staged Learning.
IEEE Trans. Multim., 2021
IEEE Trans. Pattern Anal. Mach. Intell., 2021
Neurocomputing, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
IEEE Trans. Image Process., 2020
Pattern Recognit., 2020
IEEE Trans. Pattern Anal. Mach. Intell., 2020
Multim. Tools Appl., 2020
Neurocomputing, 2020
K-armed Bandit based Multi-Modal Network Architecture Search for Visual Question Answering.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
SSAH: Semi-Supervised Adversarial Deep Hashing with Self-Paced Hard Sample Generation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
2019
IEEE Trans. Multim., 2019
Multim. Tools Appl., 2019
CoRR, 2019
Social Media Based Topic Modeling for Smart Campus: A Deep Topical Correlation Analysis Method.
IEEE Access, 2019
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
Multi-modal Multi-layer Fusion Network with Average Binary Center Loss for Face Anti-spoofing.
Proceedings of the 27th ACM International Conference on Multimedia, 2019
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
Towards Optimal Fine Grained Retrieval via Decorrelated Centralized Loss with Normalize-Scale Layer.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
2018
Two-Stream 3-D convNet Fusion for Action Recognition in Videos With Arbitrary Size and Length.
IEEE Trans. Multim., 2018
Signal Process., 2018
Multim. Tools Appl., 2018
Comput. Vis. Image Underst., 2018
The Effectiveness of Instance Normalization: a Strong Baseline for Single Image Dehazing.
CoRR, 2018
Centralized Ranking Loss with Weakly Supervised Localization for Fine-Grained Object Retrieval.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018
Proceedings of the 10th International Conference on Internet Multimedia Computing and Service, 2018
Proceedings of the 10th International Conference on Internet Multimedia Computing and Service, 2018
Proceedings of the Database Systems for Advanced Applications, 2018
GroupCap: Group-Based Image Captioning With Structured Relevance and Diversity Constraints.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
Strong Baseline for Single Image Dehazing with Deep Features and Instance Normalization.
Proceedings of the British Machine Vision Conference 2018, 2018
2017
IEEE Trans. Multim., 2017
IEEE Trans. Image Process., 2017
Anomaly detection based on spatio-temporal sparse representation and visual attention analysis.
Multim. Tools Appl., 2017
Exploiting the complementary strengths of multi-layer CNN features for image retrieval.
Neurocomputing, 2017
Proceedings of the Advances in Multimedia Information Processing - PCM 2017, 2017
Proceedings of the Advances in Multimedia Information Processing - PCM 2017, 2017
Proceedings of the Advances in Multimedia Information Processing - PCM 2017, 2017
Proceedings of the Advances in Multimedia Information Processing - PCM 2017, 2017
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017
Dancing like a superstar: Action guidance based on pose estimation and conditional pose alignment.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017
SPTF: A Scalable Probabilistic Tensor Factorization Model for Semantic-Aware Behavior Prediction.
Proceedings of the 2017 IEEE International Conference on Data Mining, 2017
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017
2016
Neurocomputing, 2016
Neurocomputing, 2016
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
2015
深度学习中的自编码器的表达能力研究 (Representation Ability Research of Auto-encoders in Deep Learning).
计算机科学 (Computer Science), 2015
Neurocomputing, 2015
Strategy for aesthetic photography recommendation via collaborative composition model.
IET Comput. Vis., 2015
Proceedings of the Advances in Multimedia Information Processing - PCM 2015, 2015
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015
Proceedings of the 7th International Conference on Internet Multimedia Computing and Service, 2015
Boost sparse coding based abnormal event detection via explicitly applying temporal continuity constraint.
Proceedings of the 7th International Conference on Internet Multimedia Computing and Service, 2015
2014
IEEE Trans. Image Process., 2014
Where should I stand? Learning based human position recommendation for mobile photographing.
Multim. Tools Appl., 2014
Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2014
2013
Bidirectional-isomorphic manifold learning at image semantic understanding & representation.
Multim. Tools Appl., 2013
J. Vis. Commun. Image Represent., 2013
Neurocomputing, 2013
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013
Proceedings of the IEEE International Conference on Image Processing, 2013
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013
2012
Proceedings of the 2012 Visual Communications and Image Processing, 2012
Proceedings of the Advances in Multimedia Information Processing - PCM 2012, 2012
Real-Time Viewfinder Composition Assessment and Recommendation to Mobile Photographing.
Proceedings of the Advances in Multimedia Information Processing - PCM 2012, 2012
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012
Proceedings of the 19th IEEE International Conference on Image Processing, 2012
What are we looking for: Towards statistical modeling of saccadic eye movements and visual saliency.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012
2011
Actor-independent action search using spatiotemporal vocabulary with appearance hashing.
Pattern Recognit., 2011
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011
Proceedings of the 18th IEEE International Conference on Image Processing, 2011
Video stabilization based on saliency driven SIFT matching and discriminative RANSAC.
Proceedings of the ICIMCS 2011, 2011
Proceedings of the ICIMCS 2011, 2011
Proceedings of the ICIMCS 2011, 2011
Proceedings of the Sixth International Conference on Image and Graphics, 2011
Proceedings of the Sixth International Conference on Image and Graphics, 2011
2010
Proceedings of the Visual Communications and Image Processing 2010, 2010
Proceedings of the International Conference on Image Processing, 2010
Proceedings of the International Conference on Image Processing, 2010
Proceedings of the Second International Conference on Internet Multimedia Computing and Service, 2010
Proceedings of the Second International Conference on Internet Multimedia Computing and Service, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010
2009
Multim. Syst., 2009
Proceedings of the 17th International Conference on Multimedia 2009, 2009
Proceedings of the 17th International Conference on Multimedia 2009, 2009
Proceedings of the First International Conference on Internet Multimedia Computing and Service, 2009
2008
Proceedings of the Advances in Multimedia Information Processing, 2008
Proceedings of the 16th International Conference on Multimedia 2008, 2008
Proceedings of the 1st ACM SIGMM International Conference on Multimedia Information Retrieval, 2008
Proceedings of the 1st ACM SIGMM International Conference on Multimedia Information Retrieval, 2008
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008
Proceedings of the Image Analysis and Recognition, 5th International Conference, 2008