Lewei Lu

Orcid: 0009-0009-9809-3818

According to our database, Lewei Lu authored at least 48 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Timeline

[Chart: publications per year, 2020 to 2024, broken down by type: book, in proceedings, article, PhD thesis, dataset, other.]

Bibliography

2024
Delving Into the Devils of Bird's-Eye-View Perception: A Review, Evaluation and Recipe.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2024

Mini-InternVL: a flexible-transfer pocket multi-modal model with 5% parameters and 90% performance.
Vis. Intell., 2024

Multimodal 3D Reasoning Segmentation with Complex Scenes.
CoRR, 2024

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization.
CoRR, 2024

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance.
CoRR, 2024

MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity.
CoRR, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.
CoRR, 2024

VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks.
CoRR, 2024

Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning.
CoRR, 2024

Needle In A Multimodal Haystack.
CoRR, 2024

Learning 1D Causal Visual Representation with De-focus Attention Networks.
CoRR, 2024

Parameter-Inverted Image Pyramid Networks.
CoRR, 2024

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites.
CoRR, 2024

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures.
CoRR, 2024

MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer.
CoRR, 2024

Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization.
CoRR, 2024

Learning to Prompt Segment Anything Models.
CoRR, 2024

3D Data Augmentation for Driving Scenes on Camera.
Proceedings of the Pattern Recognition and Computer Vision - 7th Chinese Conference, 2024

ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

The All-Seeing Project V2: Towards General Relation Comprehension of the Open World.
Proceedings of the Computer Vision - ECCV 2024, 2024

ControlLLM: Augment Language Models with Tools by Searching on Graphs.
Proceedings of the Computer Vision - ECCV 2024, 2024

Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Weakly Supervised Monocular 3D Detection with a Single-View Image.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Masked AutoDecoder is Effective Multi-Task Vision Generalist.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Modeling Continuous Motion for 3D Point Cloud Object Tracking.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks.
CoRR, 2023

DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving.
CoRR, 2023

ControlLLM: Augment Language Models with Tools by Searching on Graphs.
CoRR, 2023

Exploring the Potential of Flexible 8-bit Format: Design and Algorithm.
CoRR, 2023

Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory.
CoRR, 2023

3D Data Augmentation for Driving Scenes on Camera.
CoRR, 2023

Scene as Occupancy.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Distilling Focal Knowledge from Imperfect Expert for 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Planning-oriented Autonomous Driving.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards All-in-One Pre-Training via Maximizing Multi-Modal Mutual Information.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Goal-oriented Autonomous Driving.
CoRR, 2022

Demystify Transformers & Convolutions in Modern Image Deep Networks.
CoRR, 2022

Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe.
CoRR, 2022

2021
Decoupled Spatial-Temporal Transformer for Video Inpainting.
CoRR, 2021

Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Proceedings of the 9th International Conference on Learning Representations, 2021

FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020
1st Place Solution of LVIS Challenge 2020: A Good Box is not a Guarantee of a Good Mask.
CoRR, 2020

VL-BERT: Pre-training of Generic Visual-Linguistic Representations.
Proceedings of the 8th International Conference on Learning Representations, 2020

