Junnan Li
ORCID: 0000-0002-1405-2034
Affiliations:
- National University of Singapore, Graduate School for Integrative Sciences and Engineering, Singapore
- Salesforce Research Asia, Singapore
According to our database, Junnan Li authored at least 56 papers between 2016 and 2024.
Bibliography
2024
What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-Modal Reasoning.
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
Pattern Recognit. Lett., April, 2023
X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning.
CoRR, 2023
Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023
BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models.
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
From Images to Textual Prompts: Zero-shot Visual Question Answering with Frozen Large Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
CoRR, 2022
BotSIM: An End-to-End Bot Simulation Toolkit for Commercial Task-Oriented Dialog Systems.
CoRR, 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation.
Proceedings of the International Conference on Machine Learning, 2022
Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
CoRR, 2021
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Proceedings of the 9th International Conference on Learning Representations, 2021
Proceedings of the 9th International Conference on Learning Representations, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
2020
IEEE Trans. Multim., 2020
Improving out-of-distribution generalization via multi-task self-supervised pretraining.
CoRR, 2020
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020
Proceedings of the 8th International Conference on Learning Representations, 2020
Learning on the Fly: An RNN-Based Online Throughput Prediction Framework for UAV Communications.
Proceedings of the 2020 IEEE International Conference on Communications Workshops, 2020
The Devil Is in Classification: A Simple Framework for Long-Tail Instance Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020
2019
ACM Trans. Multim. Comput. Commun. Appl., 2019
IEEE Robotics Autom. Lett., 2019
Proceedings of the 27th ACM International Conference on Multimedia, 2019
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
2018
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
2017
Hierarchical & multimodal video captioning: Discovering and transferring multimodal knowledge for vision to language.
Comput. Vis. Image Underst., 2017
Proceedings of the 2017 ACM on Multimedia Conference, 2017
Proceedings of the IEEE International Conference on Computer Vision, 2017
2016
Proceedings of the IEEE International Symposium on Multimedia, 2016
Proceedings of the IEEE International Symposium on Multimedia, 2016