Wei Li

ORCID: 0000-0002-1221-7915

Affiliations:

NewsBreak, Seattle, WA, USA
Google Research, Zurich, Switzerland (former)
Chinese University of Hong Kong, Hong Kong (former)

According to our database, Wei Li authored at least 38 papers between 2010 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions.

CoRR, 2024

GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI.

CoRR, 2024

MinerU: An Open-Source Solution for Precise Document Content Extraction.

CoRR, 2024

GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI.

CoRR, 2024

OpenDataLab: Empowering General Artificial Intelligence with Open Datasets.

CoRR, 2024

Investigating Public Fine-Tuning Datasets: A Complex Review of Current Practices from a Construction Perspective.

Runyuan Ma

Wei Li

Fukai Shang

CoRR, 2024

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

CoRR, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.

CoRR, 2024

FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models.

CoRR, 2024

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites.

CoRR, 2024

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD.

CoRR, 2024

InternLM2 Technical Report.

et al.

CoRR, 2024

RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition.

CoRR, 2024

WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset.

CoRR, 2024

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model.

CoRR, 2024

OMG-Seg: Is One Model Good Enough for all Segmentation?

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

VIGC: Visual Instruction Generation and Correction.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Learning to Overcome Noise in Weak Caption Supervision for Object Detection.

IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition.

CoRR, 2023

WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models.

CoRR, 2023

2022

Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

MeMOT: Multi-Object Tracking with Memory.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Semi-TCL: Semi-Supervised Track Contrastive Representation Learning.

CoRR, 2021

2020

SMOT: Single-Shot Multi Object Tracking.

CoRR, 2020

2019

Cap2Det: Learning to Amplify Weak Caption Supervision for Object Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2018

Learning to discover and localize visual objects with open vocabulary.

CoRR, 2018

Appearance-and-Relation Networks for Video Classification.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

WebVision Database: Visual Learning and Understanding from Web Data.

CoRR, 2017

WebVision Challenge: Visual Learning and Understanding With Web Data.

CoRR, 2017

2016

CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016.

CoRR, 2016

2014

Scene-Specific Pedestrian Detection for Static Video Surveillance.

Xiaogang Wang

Meng Wang

Wei Li

IEEE Trans. Pattern Anal. Mach. Intell., 2014

DeepReID: Deep Filter Pairing Neural Network for Person Re-identification.

Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013

Dimensionality Reduction with Generalized Linear Models.

Proceedings of the IJCAI 2013, 2013

Locally Aligned Feature Transforms across Views.

Wei Li

Xiaogang Wang

Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012

Transferring a generic pedestrian detector towards specific scenes.

Meng Wang

Wei Li

Xiaogang Wang

Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Human Reidentification with Transferred Metric Learning.

Wei Li

Rui Zhao

Xiaogang Wang

Proceedings of the Computer Vision - ACCV 2012, 2012

2010

Term Filtering with Bounded Error.

Proceedings of the ICDM 2010, 2010
