Qinghong Lin

Orcid: 0000-0003-2568-2346

According to our database¹, Qinghong Lin authored at least 34 papers between 2020 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Timeline

[Chart: publications per year, 2020 to 2024, broken down by type: book, in proceedings, article, PhD thesis, dataset, other.]

Bibliography

2024

ROICtrl: Boosting Instance Control for Visual Generation.

,

,

,

,

,

,

Kevin Qinghong Lin

,

Mike Zheng Shou

CoRR, 2024

ShowUI: One Vision-Language-Action Model for GUI Visual Agent.

Kevin Qinghong Lin

,

,

,

,

,

,

,

,

Mike Zheng Shou

CoRR, 2024

MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation.

,

,

,

,

,

,

Kevin Qinghong Lin

,

,

Mike Zheng Shou

CoRR, 2024

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation.

,

,

,

David Junhao Zhang

,

,

Kevin Qinghong Lin

,

,

,

,

Mike Zheng Shou

CoRR, 2024

GUI Action Narrator: Where and When Did That Action Take Place?

,

,

Kevin Qinghong Lin

,

,

,

,

,

,

Mike Zheng Shou

CoRR, 2024

Learning Long-form Video Prior via Generative Pre-Training.

,

,

,

Kevin Qinghong Lin

,

,

,

,

,

,

,

Mike Zheng Shou

CoRR, 2024

COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training.

Alex Jinpeng Wang

,

,

Kevin Qinghong Lin

,

,

,

,

,

Mike Zheng Shou

CoRR, 2024

VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation.

,

,

Kevin Qinghong Lin

,

,

,

,

,

,

,

Mike Zheng Shou

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

VideoGUI: A Benchmark for GUI Automation from Instructional Videos.

Kevin Qinghong Lin

,

,

,

,

,

,

,

Mike Zheng Shou

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

AssistEditor: Multi-Agent Collaboration for GUI Workflow Automation in Video Creation.

,

,

,

,

Mike Zheng Shou

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

AssistGPT: Towards Multi-modal Agent for Human-Centric AI Assistant.

,

,

,

Mike Zheng Shou

Proceedings of the 5th International Workshop on Human-centric Multimedia Analysis, 2024

Learning Video Context as Interleaved Multimodal Sequences.

Kevin Qinghong Lin

,

Pengchuan Zhang

,

,

,

,

,

,

,

Mike Zheng Shou

Proceedings of the Computer Vision - ECCV 2024, 2024

Bootstrapping SparseFormers from Vision Foundation Models.

,

,

Kevin Qinghong Lin

,

,

Mike Zheng Shou

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VideoLLM-online: Online Video Large Language Model for Streaming Video.

,

,

,

Kevin Qinghong Lin

,

,

,

,

,

,

Mike Zheng Shou

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Unsupervised Cross-Modal Hashing With Modality-Interaction.

,

,

,

,

,

,

IEEE Trans. Circuits Syst. Video Technol., September, 2023

Unsupervised Cross-Modal Hashing via Semantic Text Mining.

,

,

,

,

,

,

IEEE Trans. Multim., 2023

Unsupervised Hashing with Semantic Concept Mining.

,

,

Kevin Qinghong Lin

,

,

,

,

,

Proc. ACM Manag. Data, 2023

DiffusionVMR: Diffusion Model for Video Moment Retrieval.

,

Kevin Qinghong Lin

,

,

CoRR, 2023

AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn.

,

,

,

Kevin Qinghong Lin

,

,

,

Mike Zheng Shou

CoRR, 2023

VisorGPT: Learning Visual Prior via Generative Pre-Training.

,

,

,

,

Kevin Qinghong Lin

,

,

,

Mike Zheng Shou

CoRR, 2023

Learning Visual Prior via Generative Pre-Training.

,

,

,

,

Kevin Qinghong Lin

,

,

,

Mike Zheng Shou

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Too Large; Data Reduction for Vision-Language Pre-Training.

Alex Jinpeng Wang

,

Kevin Qinghong Lin

,

David Junhao Zhang

,

Stan Weixian Lei

,

Mike Zheng Shou

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone.

Shraman Pramanick

,

,

,

Kevin Qinghong Lin

,

,

Mike Zheng Shou

,

,

Pengchuan Zhang

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

UniVTG: Towards Unified Video-Language Temporal Grounding.

Kevin Qinghong Lin

,

Pengchuan Zhang

,

,

Shraman Pramanick

,

,

Alex Jinpeng Wang

,

,

Mike Zheng Shou

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

All in One: Exploring Unified Video-Language Pre-Training.

,

,

,

,

Kevin Qinghong Lin

,

Satoshi Tsutsui

,

,

,

,

,

,

Mike Zheng Shou

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Affordance Grounding from Demonstration Video to Target Image.

,

,

Kevin Qinghong Lin

,

Mike Zheng Shou

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Egocentric Video-Language Pretraining @ Ego4D Challenge 2022.

Kevin Qinghong Lin

,

Alex Jinpeng Wang

,

,

,

,

Eric Zhongcong Xu

,

,

,

,

,

,

,

,

,

,

Mike Zheng Shou

CoRR, 2022

Egocentric Video-Language Pretraining @ EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022.

Kevin Qinghong Lin

,

Alex Jinpeng Wang

,

,

Eric Zhongcong Xu

,

,

,

,

,

,

,

,

Mike Zheng Shou

CoRR, 2022

Egocentric Video-Language Pretraining.

Kevin Qinghong Lin

,

Alex Jinpeng Wang

,

,

,

,

Eric Zhongcong Xu

,

,

,

,

,

,

,

,

,

,

Mike Zheng Shou

CoRR, 2022

Egocentric Video-Language Pretraining.

Kevin Qinghong Lin

,

,

,

,

,

Eric Zhongcong Xu

,

,

,

,

,

,

,

,

,

,

Mike Zheng Shou

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Deep Unsupervised Hashing with Latent Semantic Components.

,

,

,

,

,

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Deep Self-Adaptive Hashing for Image Retrieval.

,

,

,

,

Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

2020

Label Self-Adaption Hashing for Image Retrieval.

,

,

,

,

Proceedings of the 25th International Conference on Pattern Recognition, 2020

Deep Superpixel Cut for Unsupervised Image Segmentation.

,

,

Proceedings of the 25th International Conference on Pattern Recognition, 2020
