Haodong Duan

Orcid: 0000-0002-3052-4177

According to our database, Haodong Duan authored at least 34 papers between 2017 and 2024.

Bibliography

2024
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models.
CoRR, 2024

CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution.
CoRR, 2024

GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI.
CoRR, 2024

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.
CoRR, 2024

MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning.
CoRR, 2024

Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs.
CoRR, 2024

MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding.
CoRR, 2024

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions.
CoRR, 2024

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD.
CoRR, 2024

Are We on the Right Way for Evaluating Large Vision-Language Models?
CoRR, 2024

InternLM2 Technical Report.
CoRR, 2024

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model.
CoRR, 2024

Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

VLMEvalKit: An Open-Source ToolKit for Evaluating Large Multi-Modality Models.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

MMBench: Is Your Multi-modal Model an All-Around Player?
Proceedings of the Computer Vision - ECCV 2024, 2024

MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition.
CoRR, 2023

SkeleTR: Towards Skeleton-based Action Recognition in the Wild.
CoRR, 2023

JourneyDB: A Benchmark for Generative Image Understanding.
CoRR, 2023

JourneyDB: A Benchmark for Generative Image Understanding.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SkeleTR: Towards Skeleton-based Action Recognition in the Wild.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Self-Supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition.
CoRR, 2022

PYSKL: Towards Good Practices for Skeleton Action Recognition.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Mitigating Representation Bias in Action Recognition: Algorithms and Benchmarks.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

OCSampler: Compressing Videos to One Clip with Single-step Sampling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Revisiting Skeleton-based Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Revisiting Skeleton-based Action Recognition.
CoRR, 2021

2020
Omni-Sourced Webly-Supervised Learning for Video Recognition.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
TRB: A Novel Triplet Representation for Understanding 2D Human Body.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2017
SRPGAN: Perceptual Generative Adversarial Network for Single Image Super Resolution.
CoRR, 2017

