Shaoxiang Chen

ORCID: 0000-0002-7627-7124

Affiliations:
  • Fudan University, Shanghai Key Lab of Intelligent Information Processing, Shanghai, China


According to our database, Shaoxiang Chen authored at least 29 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
EAGLE: Towards Efficient Arbitrary Referring Visual Prompts Comprehension for Multimodal Large Language Models.
CoRR, 2024

EventHallusion: Diagnosing Event Hallucinations in Video LLMs.
CoRR, 2024

MindBench: A Comprehensive Benchmark for Mind Map Structure Recognition and Analysis.
CoRR, 2024

Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models.
CoRR, 2024

Lumen: Unleashing Versatile Vision-Centric Capabilities of Large Multimodal Models.
CoRR, 2024

LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs.
CoRR, 2024

Making Large Language Models Better Planners with Reasoning-Decision Alignment.
Proceedings of the Computer Vision - ECCV 2024, 2024

Instance-Aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
FT-TDR: Frequency-Guided Transformer and Top-Down Refinement Network for Blind Face Inpainting.
IEEE Trans. Multim., 2023

Scene Graph Refinement Network for Visual Question Answering.
IEEE Trans. Multim., 2023

Self-Supervised Learning for Semi-Supervised Temporal Language Grounding.
IEEE Trans. Multim., 2023

Prompting Large Language Models to Reformulate Queries for Moment Localization.
CoRR, 2023

MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection.
CoRR, 2022

MT-Net Submission to the Waymo 3D Detection Leaderboard.
CoRR, 2022

MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
Towards Bridging Video and Language by Caption Generation and Sentence Localization.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Motion Guided Region Message Passing for Video Captioning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Towards Bridging Event Captioner and Sentence Localizer for Weakly Supervised Dense Event Captioning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos.
Proceedings of the Computer Vision - ECCV 2020, 2020

Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
FDU Participation in TRECVID 2019 VTT Task.
Proceedings of the 2019 TREC Video Retrieval Evaluation, 2019

Black-box Adversarial Attacks on Video Recognition Models.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Deep Learning for Video Captioning: A Review.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Semantic Proposal for Activity Localization in Videos via Sentence Query.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Motion Guided Spatial Attention for Video Captioning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
ESU-P-Net: Cascading Network for Full Quantification of Left Ventricle from Cine MRI.
Proceedings of the Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges, 2018

Non-local NetVLAD Encoding for Video Classification.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

2017
Aggregating Frame-level Features for Large-Scale Video Classification.
CoRR, 2017

