Xi Chen

Orcid: 0000-0002-1581-4627

Affiliations:

Google DeepMind, Google Research, Mountain View, USA
Harvard University, Cambridge, MA, USA

According to our database, Xi Chen authored at least 20 papers between 2020 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

PaliGemma: A versatile 3B VLM for transfer.

CoRR, 2024

Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment.

Proceedings of the Computer Vision - ECCV 2024, 2024

On Scaling Up a Multilingual Vision and Language Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning.

CoRR, 2023

PaLI-3 Vision Language Models: Smaller, Faster, Stronger.

Ibrahim Alabdulmohsin

CoRR, 2023

PaLI-X: On Scaling up a Multilingual Vision and Language Model.

CoRR, 2023

PaLI: A Jointly-Scaled Multilingual Language-Image Model.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

PreSTU: Pre-Training for Scene-Text Understanding.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MaXM: Towards Multilingual Visual Question Answering.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control.

Proceedings of the Conference on Robot Learning, 2023

2022

PaLI: A Jointly-Scaled Multilingual Language-Image Model.

CoRR, 2022

Towards Multi-Lingual Visual Question Answering.

CoRR, 2022

PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks.

CoRR, 2022

All You May Need for VQA are Image Captions.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

Bridging the Gap Between Practice and PAC-Bayes Theory in Few-Shot Meta-Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020

Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance.

Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems, 2020
