Shuhuai Ren

According to our database, Shuhuai Ren authored at least 28 papers between 2009 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey.
CoRR, 2024

Parallelized Autoregressive Visual Generation.
CoRR, 2024

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis.
CoRR, 2024

DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models.
CoRR, 2024

Towards Multimodal Video Paragraph Captioning Models Robust to Missing Modality.
CoRR, 2024

LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

TempCompass: Do Video LLMs Really Understand Videos?
Proceedings of the Findings of the Association for Computational Linguistics, 2024

PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond.
CoRR, 2023

M³IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning.
CoRR, 2023

Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Delving into the Openness of CLIP.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Rethinking the Openness of CLIP.
CoRR, 2022

2021
CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark.
CoRR, 2021

Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Dynamic Knowledge Distillation for Pre-trained Language Models.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Learning Relation Alignment for Calibrated Cross-modal Retrieval.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Accelerating Pre-trained Language Models via Calibrated Cascade.
CoRR, 2020

DCA: Diversified Co-attention Towards Informative Live Video Commenting.
Proceedings of the Natural Language Processing and Chinese Computing, 2020

2019
Diversified Co-Attention towards Informative Live Video Commenting.
CoRR, 2019

Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2010
The Virtual Learning Commons Architecture Based on Semantic Technologies.
Proceedings of the New Horizons in Web-Based Learning - ICWL 2010 Workshops, 2010

2009
From information commons to knowledge commons: Building a collaborative knowledge sharing environment for innovative communities.
Electron. Libr., 2009

