Radu Soricut

Orcid: 0000-0003-1565-3365

Affiliations:
  • Google Research, USA


According to our database, Radu Soricut authored at least 80 papers between 2002 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
PaliGemma: A versatile 3B VLM for transfer.
CoRR, 2024

Wavelet-Based Image Tokenizer for Vision Transformers.
CoRR, 2024

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.
CoRR, 2024

CausalLM is not optimal for in-context learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

ImageInWords: Unlocking Hyper-Detailed Image Descriptions.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-Rank Experts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024


2023
GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning.
CoRR, 2023

Gemini: A Family of Highly Capable Multimodal Models.
CoRR, 2023

Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling.
CoRR, 2023

PaLI-3 Vision Language Models: Smaller, Faster, Stronger.
CoRR, 2023

PaLI-X: On Scaling up a Multilingual Vision and Language Model.
CoRR, 2023

PreSTU: Pre-Training for Scene-Text Understanding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MaXM: Towards Multilingual Visual Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Connecting Vision and Language with Video Localized Narratives.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023


2022
2.5D visual relationship detection.
Comput. Vis. Image Underst., 2022

PaLI: A Jointly-Scaled Multilingual Language-Image Model.
CoRR, 2022

Towards Multi-Lingual Visual Question Answering.
CoRR, 2022

PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks.
CoRR, 2022

All You May Need for VQA are Image Captions.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks.
Proceedings of the Computer Vision - ECCV 2022, 2022

End-to-end Dense Video Captioning as Sequence Generation.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Denoising Large-Scale Image Captioning from Alt-text Data Using Content Selection Models.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021
Telling the What while Pointing the Where: Fine-grained Mouse Trace and Language Supervision for Improved Image Retrieval.
CoRR, 2021

Bridging the Gap Between Practice and PAC-Bayes Theory in Few-Shot Meta-Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Quality Estimation for Image Captions Based on Large-scale Human Evaluations.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Telling the What while Pointing to the Where: Multimodal Queries for Image Retrieval.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

COSMic: A Coherence-Aware Generation Metric for Image Descriptions.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

CrossVQA: Scalably Generating Benchmarks for Systematically Testing VQA Generalization.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Understanding Guided Image Captioning Performance across Domains.
Proceedings of the 25th Conference on Computational Natural Language Learning, 2021

H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Attention that does not Explain Away.
CoRR, 2020

Weakly Supervised Content Selection for Improved Image Captioning.
CoRR, 2020

Multi-Image Summarization: Textual Summary from a Set of Cohesive Images.
CoRR, 2020

Clue: Cross-modal Coherence Modeling for Caption Generation.
CoRR, 2020

Multimodal Pretraining for Dense Video Captioning.
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations.
Proceedings of the 8th International Conference on Learning Representations, 2020

Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance.
Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems, 2020

Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

TeaForN: Teacher-Forcing with N-grams.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Connecting Vision and Language with Localized Narratives.
Proceedings of the Computer Vision - ECCV 2020, 2020

Cross-modal Language Generation using Pivot Stabilization for Web-scale Language Coverage.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Cross-modal Coherence Modeling for Caption Generation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Reinforcing an Image Caption Generator Using Off-Line Human Feedback.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Multi-stage Pretraining for Abstractive Summarization.
CoRR, 2019

Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

A Case Study on Combining ASR and Visual Features for Generating Instructional Video Captions.
Proceedings of the 23rd Conference on Computational Natural Language Learning, 2019

Informative Image Captioning with External Sources of Information.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
SHAPED: Shared-Private Encoder-Decoder for Text Style Adaptation.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Cold-Start Reinforcement Learning with Softmax Policy Gradients.
CoRR, 2017

Cold-Start Reinforcement Learning with Softmax Policy Gradient.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

2016
Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning.
Trans. Assoc. Comput. Linguistics, 2016

Multilingual Word Embeddings using Multigraphs.
CoRR, 2016

Building Large Machine Reading-Comprehension Datasets using Paragraph Vectors.
CoRR, 2016

Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task.
CoRR, 2016

2015
Unsupervised Morphology Induction Using Word Embeddings.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

2014
Findings of the 2014 Workshop on Statistical Machine Translation.
Proceedings of the Ninth Workshop on Statistical Machine Translation, 2014

2013
Quality estimation for machine translation: preface.
Mach. Transl., 2013

Findings of the 2013 Workshop on Statistical Machine Translation.
Proceedings of the Eighth Workshop on Statistical Machine Translation, 2013

2012
Combining Quality Prediction and System Selection for Improved Automatic Translation Output.
Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012

The SDL Language Weaver Systems in the WMT12 Quality Estimation Shared Task.
Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012

Findings of the 2012 Workshop on Statistical Machine Translation.
Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012

2010
TrustRank: Inducing Trust in Automatic Translations via Ranking.
Proceedings of the ACL 2010, 2010

2008
Automatic Prediction of Parser Accuracy.
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 2008

2007
Abstractive headline generation using WIDL-expressions.
Inf. Process. Manag., 2007

2006
Automatic question answering using the web: Beyond the Factoid.
Inf. Retr., 2006

Discourse Generation Using Utility-Trained Coherence Models.
Proceedings of the ACL 2006, 2006

Stochastic Language Generation Using WIDL-Expressions and its Application in Machine Translation and Summarization.
Proceedings of the ACL 2006, 2006

2005
Towards Developing Generation Algorithms for Text-to-Text Applications.
Proceedings of the ACL 2005, 2005

Natural Language Generation for Text-to-Text Applications Using an Information-Slim Representation.
Proceedings of the Proceedings, 2005

2004
Automatic Question Answering: Beyond the Factoid.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, 2004

A Unified Framework For Automatic Evaluation Using 4-Gram Co-occurrence Statistics.
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, 2004

2003
Sentence Level Discourse Parsing using Syntactic and Lexical Information.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, 2003

2002
Using a Large Monolingual Corpus to Improve Translation Accuracy.
Proceedings of the Machine Translation: From Research to Real Users, 2002

