Xing Sun

Orcid: 0000-0002-9006-4512

According to our database, Xing Sun authored at least 129 papers between 2006 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Distilling consistent relations for multi-source domain adaptive person re-identification.
Pattern Recognit., 2025

2024
Turning a CLIP Model Into a Scene Text Spotter.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2024

Multi-dataset Detection with Transformers.
Int. J. Comput. Vis., July, 2024

Optimal Weighting Factor Design of Finite Control Set Model Predictive Control Based on Multiobjective Ant Colony Optimization.
IEEE Trans. Ind. Electron., 2024

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM.
CoRR, 2024

Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing.
CoRR, 2024

CJEval: A Benchmark for Assessing Large Language Models Using Chinese Junior High School Exam Data.
CoRR, 2024

Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models.
CoRR, 2024

VITA: Towards Open-Source Interactive Omni Multimodal LLM.
CoRR, 2024

Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models.
CoRR, 2024

VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models.
CoRR, 2024

FinVerse: An Autonomous Agent System for Versatile Financial Analysis.
CoRR, 2024

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis.
CoRR, 2024

HRVDA: High-Resolution Visual Document Assistant.
CoRR, 2024

RESTORE: Towards Feature Shift for Vision-Language Prompt Learning.
CoRR, 2024

FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema.
CoRR, 2024

Parameter optimization of S-LCC dual load WPT system.
Comput. Electr. Eng., 2024

Litchi picking points localization in natural environment based on the Litchi-YOSO model and branch morphology reconstruction algorithm.
Comput. Electron. Agric., 2024

Multimodal Inplace Prompt Tuning for Open-set Object Detection.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Cantor: Inspiring Multimodal Chain-of-Thought of MLLM.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Cloth Tiger Hunt: An Embodied Experiential Educational Game for the Intangible Cultural Heritage of Artistic Handicraft.
Proceedings of the HCI in Games, 2024

Utilizing Party Game Strategies for Language Acquisition: A Novel Approach to Language Learning.
Proceedings of the HCI in Games, 2024

Make NPC More Realistic: Design and Practice of a Hybrid Stealth Game NPC AI Framework Based on OODA Theory.
Proceedings of the HCI International 2024 Posters, 2024

Design Mobile Exergames to Large-Scalely Promote Adolescent Physical Activity Based on Interval Training Theory.
Proceedings of the Human-Centered Design, Operation and Evaluation of Mobile Communications, 2024

Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Multimodal Label Relevance Ranking via Reinforcement Learning.
Proceedings of the Computer Vision - ECCV 2024, 2024

Aligning and Prompting Everything All at Once for Universal Visual Perception.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

HRVDA: High-Resolution Visual Document Assistant.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

A General and Efficient Training for Transformer via Token Expansion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Sinkhorn Distance Minimization for Knowledge Distillation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Visual Hallucination Elevates Speech Recognition.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Grab What You Need: Rethinking Complex Table Structure Recognition with Flexible Components Deliberation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

SPD-DDPM: Denoising Diffusion Probabilistic Models in the Symmetric Positive Definite Space.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

SoftCLIP: Softer Cross-Modal Alignment Makes CLIP Stronger.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Reciprocal normalization for domain adaptation.
Pattern Recognit., August, 2023

Co-Salient Object Detection With Co-Representation Purification.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

Creative Console: A Player-Driven Game Based on a Modular Fast-Evolving-and-Verifying Framework.
Proc. ACM Hum. Comput. Interact., 2023

A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise.
CoRR, 2023

MMICT: Boosting Multi-Modal Fine-Tuning with In-Context Examples.
CoRR, 2023

Towards Robust Text Retrieval with Progressive Learning.
CoRR, 2023

Woodpecker: Hallucination Correction for Multimodal Large Language Models.
CoRR, 2023

Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration.
CoRR, 2023

Unified and Dynamic Graph for Temporal Character Grouping in Long Videos.
CoRR, 2023

MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation.
CoRR, 2023

A Survey on Multimodal Large Language Models.
CoRR, 2023

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models.
CoRR, 2023

Looking and Listening: Audio Guided Text Recognition.
CoRR, 2023

SoftCLIP: Softer Cross-modal Alignment Makes CLIP Stronger.
CoRR, 2023

Grab What You Need: Rethinking Complex Table Structure Recognition with Flexible Components Deliberation.
CoRR, 2023

Graph-Based Self-Learning for Robust Person Re-identification.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

CAPro: Webly Supervised Learning with Cross-modality Aligned Prototypes.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Research on the Gameplay Evolution Based on Warcraft 3 Mod Platform.
Proceedings of the Entertainment Computing - ICEC 2023, 2023

Mitigating Memorization of Noisy Labels via Regularization between Representations.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

Coarse-to-Fine: Learning Compact Discriminative Representation for Single-Stage Image Retrieval.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Towards Dual Optimization of Efficiency and Accuracy: Hyper Billion Scale Model Computing Platform for Address Services.
Proceedings of the 2023 7th International Conference on Computer Science and Artificial Intelligence, 2023

Create Ice Cream: Real-time Creative Element Synthesis Framework Based on GPT3.0.
Proceedings of the IEEE Conference on Games, 2023

Research on the Reconstruction of Ming Dynasty History Based on AIGC.
Proceedings of the Eleventh International Symposium of Chinese CHI, 2023

Span-level Aspect-based Sentiment Analysis via Table Filling.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Conditional Feature Embedding by Visual Clue Correspondence Graph for Person Re-Identification.
IEEE Trans. Image Process., 2022

Conditional Feature Learning Based Transformer for Text-Based Person Search.
IEEE Trans. Image Process., 2022

Self-supervised Models are Good Teaching Assistants for Vision Transformers.
Proceedings of the International Conference on Machine Learning, 2022

AS-MLP: An Axial Shifted MLP Architecture for Vision.
Proceedings of the Tenth International Conference on Learning Representations, 2022

PAC-Net: Highlight Your Video via History Preference Modeling.
Proceedings of the Computer Vision - ECCV 2022, 2022

DisCo: Remedying Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

Efficient Decoder-Free Object Detection with Transformers.
Proceedings of the Computer Vision - ECCV 2022, 2022

Training-free Transformer Architecture Search.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DIFNet: Boosting Visual Information Flow for Image Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Gulliver's Game: Multiviewer and Vtuber Extreme Asymmetric Game.
Proceedings of the IEEE Conference on Games, CoG 2022, Beijing, 2022

Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
High-Dimensional Dense Residual Convolutional Neural Network for Light Field Reconstruction.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Learning fused features with parallel training for person re-identification.
Knowl. Based Syst., 2021

RMNet: Equivalently Removing Residual Connection from Networks.
CoRR, 2021

Demystifying How Self-Supervised Features Improve Training from Noisy Labels.
CoRR, 2021

DisCo: Remedy Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning.
CoRR, 2021

On Evolving Attention Towards Domain Adaptation.
CoRR, 2021

Part2Whole: Iteratively Enrich Detail for Cross-Modal Retrieval with Partial Query.
CoRR, 2021

On The Consistency Training for Open-Set Semi-Supervised Learning.
CoRR, 2021

Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search.
CoRR, 2021

Image generation and constrained two-stage feature fusion for person re-identification.
Appl. Intell., 2021

Discriminator-free Generative Adversarial Attack.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Dig into Multi-modal Cues for Video Retrieval with Hierarchical Alignment.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Integrated Modalities And Multi-Level Granularity: Towards A Unified Video-Text Retrieval Framework.
Proceedings of the 2021 IEEE International Conference on Multimedia & Expo Workshops, 2021

Learning with Instance-Dependent Label Noise: A Sample Sieve Approach.
Proceedings of the 9th International Conference on Learning Representations, 2021

Learning to Know Where to See: A Visibility-Aware Approach for Occluded Person Re-identification.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Canonical View Representation for 3D Shape Recognition with Arbitrary Views.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

PR-Net: Preference Reasoning for Personalized Video Highlight Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Temporal Modulation Network for Controllable Space-Time Video Super-Resolution.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Removing the Background by Adding the Background: Towards Background Robust Self-Supervised Video Representation Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

One for More: Selecting Generalizable Samples for Generalizable ReID Model.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning.
CoRR, 2020

Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion.
CoRR, 2020

Devil's in the Detail: Graph-based Key-point Alignment and Embedding for Person Re-ID.
CoRR, 2020

DGD: Densifying the Knowledge of Neural Networks with Filter Grafting and Knowledge Distillation.
CoRR, 2020

Pruning Filter in Filter.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Research on Traceability of Agricultural Products Supply Chain System Based on Blockchain and Internet of Things Technology.
Proceedings of the Artificial Intelligence and Security - 6th International Conference, 2020

Do Not Disturb Me: Person Re-identification Under the Interference of Other Pedestrians.
Proceedings of the Computer Vision - ECCV 2020, 2020

Filter Grafting for Deep Neural Networks.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Viewpoint-Aware Loss with Angular Regularization for Person Re-Identification.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Asymmetric Co-Teaching for Unsupervised Cross-Domain Person Re-Identification.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Rethinking Temporal Fusion for Video-Based Person Re-Identification on Semantic and Time Aspect.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Synthesizing Virtual-Real Artworks Using Sun Orientation Estimation.
Proceedings of the Cognitive Internet of Things: Frameworks, Tools and Applications, 2019

Hierarchical multi-modal fusion FCN with attention model for RGB-D tracking.
Inf. Fusion, 2019

Computational Light Field Generation Using Deep Nonparametric Bayesian Learning.
IEEE Access, 2019

The Seventh Visual Object Tracking VOT2019 Challenge Results.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Estimation of Vessel Emissions Inventory in Qingdao Port Based on Big data Analysis.
Symmetry, 2018

A Coarse-to-fine Pyramidal Model for Person Re-identification via Multi-Loss Dynamic Training.
CoRR, 2018

Multiobject Tracking in Videos Based on LSTM and Deep Reinforcement Learning.
Complex., 2018

2017
Computationally Efficient Hyperspectral Data Learning Based on the Doubly Stochastic Dirichlet Process.
IEEE Trans. Geosci. Remote. Sens., 2017

Human arm pose modeling with learned features using joint convolutional neural network.
Mach. Vis. Appl., 2017

2016
Nonparametric Bayesian methods for visual data association.
PhD thesis, 2016

Unsupervised Tracking With the Doubly Stochastic Dirichlet Process Mixture Model.
IEEE Trans. Intell. Transp. Syst., 2016

Consistency Analysis for the Doubly Stochastic Dirichlet Process.
CoRR, 2016

Data-driven light field depth estimation using deep Convolutional Neural Networks.
Proceedings of the 2016 International Joint Conference on Neural Networks, 2016

Sparse Hierarchical Nonparametric Bayesian learning for light field representation and denoising.
Proceedings of the 2016 International Joint Conference on Neural Networks, 2016

Speed governor PID gains optimal tuning of hydraulic turbine generator set with an improved artificial fish swarm algorithm.
Proceedings of the IEEE International Conference on Information and Automation, 2016

Unsupervised tracking with a low computational cost using the doubly stochastic Dirichlet process mixture model.
Proceedings of the Image Processing: Machine Vision Applications IX, 2016

2015
SAMSVM: A tool for misalignment filtration of SAM-format sequences with support vector machine.
J. Bioinform. Comput. Biol., 2015

2013
Preference limits of the visual dynamic range for ultra high quality and aesthetic conveyance.
Proceedings of the Human Vision and Electronic Imaging XVIII, 2013

2006
Mining Approximate Frequent Itemsets In the Presence of Noise: Algorithm and Analysis.
Proceedings of the Sixth SIAM International Conference on Data Mining, 2006

Significance and Recovery of Block Structures in Binary Matrices with Noise.
Proceedings of the Learning Theory, 19th Annual Conference on Learning Theory, 2006

