Yitian Yuan

Orcid: 0000-0001-8701-7689

According to our database, Yitian Yuan authored at least 16 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
VidCompress: Memory-Enhanced Temporal Compression for Video Understanding in Large Language Models.
CoRR, 2024

Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models.
CoRR, 2024

3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach.
ACM Trans. Multim. Comput. Commun. Appl., November, 2023

A Survey on Temporal Sentence Grounding in Videos.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment.
CoRR, 2023

Curriculum Multi-Negative Augmentation for Debiased Video Grounding.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Syntax Customized Video Captioning by Imitating Exemplar Sentences.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

2021
A Closer Look at Temporal Sentence Grounding in Videos: Datasets and Metrics.
CoRR, 2021

A Closer Look at Temporal Sentence Grounding in Videos: Dataset and Metric.
HUMA'21: Proceedings of the 2nd International Workshop on Human-centric Multimedia Analysis, 2021

2020
Controllable Video Captioning with an Exemplar Sentence.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

2019
Video Summarization by Learning Deep Side Semantic Embedding.
IEEE Trans. Circuits Syst. Video Technol., 2019

Sentence Specified Dynamic Video Thumbnail Generation.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Cross-Modal Dual Learning for Sentence-to-Video Generation.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

To Find Where You Talk: Temporal Sentence Localization in Video with Attention Based Location Regression.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

