Gen Luo

ORCID: 0000-0001-5334-1843

According to our database, Gen Luo authored at least 35 papers between 2016 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Towards Language-Guided Visual Recognition via Dynamic Convolutions.
Int. J. Comput. Vis., January, 2024

A Survivor in the Era of Large-Scale Pretraining: An Empirical Study of One-Stage Referring Expression Comprehension.
IEEE Trans. Multim., 2024

FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression.
CoRR, 2024

RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation.
CoRR, 2024

ChatRex: Taming Multimodal LLM for Joint Perception and Understanding.
CoRR, 2024

γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models.
CoRR, 2024

Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training.
CoRR, 2024

ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models.
CoRR, 2024

Routing Experts: Learning to Route Dynamic Experts in Multi-modal Large Language Models.
CoRR, 2024

Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models.
CoRR, 2024

3D-GRES: Generalized 3D Referring Expression Segmentation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Deep Instruction Tuning for Segment Anything Model.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

QueryMatch: A Query-based Contrastive Learning Framework for Weakly Supervised Visual Grounding.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

CaM: Cache Merging for Memory-efficient LLMs Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Towards Omni-supervised Referring Expression Segmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

APL: Anchor-Based Prompt Learning for One-Stage Weakly Supervised Referring Expression Comprehension.
Proceedings of the Computer Vision - ECCV 2024, 2024

3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
A Real-Time Global Inference Network for One-Stage Referring Expression Comprehension.
IEEE Trans. Neural Networks Learn. Syst., 2023

Multi-Branch Distance-Sensitive Self-Attention Network for Image Captioning.
IEEE Trans. Multim., 2023

Towards End-to-end Semi-supervised Learning for One-stage Object Detection.
CoRR, 2023

Towards Efficient Visual Adaption via Structural Re-parameterization.
CoRR, 2023

Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Towards Lightweight Transformer Via Group-Wise Transformation for Vision-and-Language Tasks.
IEEE Trans. Image Process., 2022

What Goes beyond Multi-modal Fusion in One-stage Referring Expression Comprehension: An Empirical Study.
CoRR, 2022

SeqTR: A Simple Yet Universal Network for Visual Grounding.
Proceedings of the Computer Vision - ECCV 2022, 2022

Active Teacher for Semi-Supervised Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Towards Language-guided Visual Recognition via Dynamic Convolutions.
CoRR, 2021

Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
K-armed Bandit based Multi-Modal Network Architecture Search for Visual Question Answering.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Cascade Grouped Attention Network for Referring Expression Segmentation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2016
No-reference image sharpness Algorithm based on gradient shape.
Proceedings of the 9th International Congress on Image and Signal Processing, 2016
