Xi Chen

Orcid: 0000-0002-1581-4627

Affiliations:
  • Google DeepMind, Google Research, Mountain View, USA
  • Harvard University, Cambridge, MA, USA


According to our database, Xi Chen authored at least 20 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
PaliGemma: A versatile 3B VLM for transfer.
CoRR, 2024

Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment.
Proceedings of the Computer Vision - ECCV 2024, 2024


2023
GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning.
CoRR, 2023

PaLI-3 Vision Language Models: Smaller, Faster, Stronger.
CoRR, 2023

PaLI-X: On Scaling up a Multilingual Vision and Language Model.
CoRR, 2023

PaLI: A Jointly-Scaled Multilingual Language-Image Model.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

PreSTU: Pre-Training for Scene-Text Understanding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MaXM: Towards Multilingual Visual Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023


2022
PaLI: A Jointly-Scaled Multilingual Language-Image Model.
CoRR, 2022

Towards Multi-Lingual Visual Question Answering.
CoRR, 2022

PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks.
CoRR, 2022

All You May Need for VQA are Image Captions.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
Bridging the Gap Between Practice and PAC-Bayes Theory in Few-Shot Meta-Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020
Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance.
Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems, 2020

