Shaoxiang Chen

Orcid: 0000-0002-7627-7124

Affiliations:

Fudan University, Shanghai Key Lab of Intelligent Information Processing, Shanghai, China

According to our database, Shaoxiang Chen authored at least 29 papers between 2017 and 2024.

Collaborative distances:

Dijkstra number of five.
Erdős number of four.

Bibliography

2024

EAGLE: Towards Efficient Arbitrary Referring Visual Prompts Comprehension for Multimodal Large Language Models.

CoRR, 2024

EventHallusion: Diagnosing Event Hallucinations in Video LLMs.

CoRR, 2024

MindBench: A Comprehensive Benchmark for Mind Map Structure Recognition and Analysis.

CoRR, 2024

Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models.

CoRR, 2024

Lumen: Unleashing Versatile Vision-Centric Capabilities of Large Multimodal Models.

CoRR, 2024

LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs.

Shaoxiang Chen

Zequn Jie

Lin Ma

CoRR, 2024

Making Large Language Models Better Planners with Reasoning-Decision Alignment.

Proceedings of the Computer Vision - ECCV 2024, 2024

Instance-Aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

FT-TDR: Frequency-Guided Transformer and Top-Down Refinement Network for Blind Face Inpainting.

IEEE Trans. Multim., 2023

Scene Graph Refinement Network for Visual Question Answering.

IEEE Trans. Multim., 2023

Self-Supervised Learning for Semi-Supervised Temporal Language Grounding.

IEEE Trans. Multim., 2023

Prompting Large Language Models to Reformulate Queries for Moment Localization.

CoRR, 2023

MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection.

CoRR, 2022

MT-Net Submission to the Waymo 3D Detection Leaderboard.

CoRR, 2022

MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

Towards Bridging Video and Language by Caption Generation and Sentence Localization.

Shaoxiang Chen

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Motion Guided Region Message Passing for Video Captioning.

Shaoxiang Chen

Yu-Gang Jiang

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Towards Bridging Event Captioner and Sentence Localizer for Weakly Supervised Dense Event Captioning.

Shaoxiang Chen

Yu-Gang Jiang

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos.

Proceedings of the Computer Vision - ECCV 2020, 2020

Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language.

Shaoxiang Chen

Yu-Gang Jiang

Proceedings of the Computer Vision - ECCV 2020, 2020

2019

FDU Participation in TRECVID 2019 VTT Task.

Shaoxiang Chen

Yu-Gang Jiang

Proceedings of the 2019 TREC Video Retrieval Evaluation, 2019

Black-box Adversarial Attacks on Video Recognition Models.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Deep Learning for Video Captioning: A Review.

Shaoxiang Chen

Ting Yao

Yu-Gang Jiang

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Semantic Proposal for Activity Localization in Videos via Sentence Query.

Shaoxiang Chen

Yu-Gang Jiang

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Motion Guided Spatial Attention for Video Captioning.

Shaoxiang Chen

Yu-Gang Jiang

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

ESU-P-Net: Cascading Network for Full Quantification of Left Ventricle from Cine MRI.

Proceedings of the Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges, 2018

Non-local NetVLAD Encoding for Video Classification.

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

2017

Aggregating Frame-level Features for Large-Scale Video Classification.

CoRR, 2017
