Gen Luo

ORCID: 0000-0001-5334-1843

According to our database, Gen Luo authored at least 37 papers between 2016 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding.

CoRR, January, 2025

2024

Towards Language-Guided Visual Recognition via Dynamic Convolutions.

Int. J. Comput. Vis., January, 2024

A Survivor in the Era of Large-Scale Pretraining: An Empirical Study of One-Stage Referring Expression Comprehension.

IEEE Trans. Multim., 2024

FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression.

CoRR, 2024

ChatRex: Taming Multimodal LLM for Joint Perception and Understanding.

CoRR, 2024

γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models.

CoRR, 2024

Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training.

CoRR, 2024

ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models.

CoRR, 2024

Routing Experts: Learning to Route Dynamic Experts in Multi-modal Large Language Models.

CoRR, 2024

Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models.

CoRR, 2024

ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

3D-GRES: Generalized 3D Referring Expression Segmentation.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Deep Instruction Tuning for Segment Anything Model.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

QueryMatch: A Query-based Contrastive Learning Framework for Weakly Supervised Visual Grounding.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

CaM: Cache Merging for Memory-efficient LLMs Inference.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Towards Omni-supervised Referring Expression Segmentation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

APL: Anchor-Based Prompt Learning for One-Stage Weakly Supervised Referring Expression Comprehension.

Proceedings of the Computer Vision - ECCV 2024, 2024

3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

A Real-Time Global Inference Network for One-Stage Referring Expression Comprehension.

IEEE Trans. Neural Networks Learn. Syst., 2023

Multi-Branch Distance-Sensitive Self-Attention Network for Image Captioning.

IEEE Trans. Multim., 2023

Towards End-to-end Semi-supervised Learning for One-stage Object Detection.

CoRR, 2023

Towards Efficient Visual Adaption via Structural Re-parameterization.

CoRR, 2023

Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Towards Lightweight Transformer Via Group-Wise Transformation for Vision-and-Language Tasks.

IEEE Trans. Image Process., 2022

What Goes beyond Multi-modal Fusion in One-stage Referring Expression Comprehension: An Empirical Study.

CoRR, 2022

SeqTR: A Simple Yet Universal Network for Visual Grounding.

Proceedings of the Computer Vision - ECCV 2022, 2022

Active Teacher for Semi-Supervised Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Towards Language-guided Visual Recognition via Dynamic Convolutions.

CoRR, 2021

Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

K-armed Bandit based Multi-Modal Network Architecture Search for Visual Question Answering.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Cascade Grouped Attention Network for Referring Expression Segmentation.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2016

No-reference image sharpness Algorithm based on gradient shape.

Proceedings of the 9th International Congress on Image and Signal Processing, 2016
