Yan Xia

Orcid: 0000-0003-4631-741X

Affiliations:
  • Zhejiang University, Hangzhou, China


According to our database1, Yan Xia authored at least 12 papers between 2022 and 2024.

Collaborative distances:
  • Dijkstra number2 of five.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

Online presence:

On csauthors.net:

Bibliography

2024
Multi-Granularity Relational Attention Network for Audio-Visual Question Answering.
IEEE Trans. Circuits Syst. Video Technol., August, 2024

ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling.
CoRR, 2024

EAGER: Two-Stream Generative Recommender with Behavior-Semantic Collaboration.
CoRR, 2024

Unlocking the Potential of Multimodal Unified Discrete Representation through Training-Free Codebook Optimization and Hierarchical Alignment.
CoRR, 2024

EAGER: Two-Stream Generative Recommender with Behavior-Semantic Collaboration.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Multi-Modal Domain Adaptation Across Video Scenes for Temporal Video Grounding.
CoRR, 2023

Achieving Cross Modal Generalization with Multimodal Unified Representation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Tasks.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Scene-robust Natural Language Video Localization via Learning Domain-invariant Representations.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Video-Guided Curriculum Learning for Spoken Video Grounding.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Cross-modal Background Suppression for Audio-Visual Event Localization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022


  Loading...