Guo Chen

Affiliations:
  • Nanjing University, State Key Laboratory for Novel Software Technology, China


According to our database, Guo Chen authored at least 23 papers between 2022 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Matching Compound Prototypes for Few-Shot Action Recognition.
Int. J. Comput. Vis., September, 2024

EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation.
CoRR, 2024

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding.
CoRR, 2024

Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding.
CoRR, 2024

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

InternVideo2: Scaling Foundation Models for Multimodal Video Understanding.
Proceedings of the Computer Vision - ECCV 2024, 2024

Retrieval-Augmented Egocentric Video Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
BasicTAD: An astounding RGB-Only baseline for temporal action detection.
Comput. Vis. Image Underst., July, 2023

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks.
CoRR, 2023

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark.
CoRR, 2023

AVSegFormer: Audio-Visual Segmentation with Transformer.
CoRR, 2023

VideoLLM: Modeling Video Sequence with Large Language Models.
CoRR, 2023

Champion Solution for the WSDM2023 Toloka VQA Challenge.
CoRR, 2023

MRSN: Multi-Relation Support Network for Video Action Detection.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

ELAN: Enhancing Temporal Action Detection with Location Awareness.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Memory-and-Anticipation Transformer for Online Action Understanding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
InternVideo: General Video Foundation Models via Generative and Discriminative Learning.
CoRR, 2022

InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges.
CoRR, 2022

Exploring State Change Capture of Heterogeneous Backbones @ Ego4D Hands and Objects Challenge 2022.
CoRR, 2022

DCAN: Improving Temporal Action Detection via Dual Context Aggregation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

