Yinfei Yang

According to our database, Yinfei Yang authored at least 92 papers between 2012 and 2024.

Bibliography

2024
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms.
CoRR, 2024

Improve Vision Language Model Chain-of-thought Reasoning.
CoRR, 2024

MM-Ego: Towards Building Egocentric Multimodal LLMs.
CoRR, 2024

Contrastive Localized Language-Image Pre-Training.
CoRR, 2024

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models.
CoRR, 2024

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning.
CoRR, 2024

Understanding Alignment in Multimodal LLMs: A Comprehensive Study.
CoRR, 2024

MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs.
CoRR, 2024

Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models.
CoRR, 2024

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training.
CoRR, 2024

How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts.
CoRR, 2024

Empowering Unsupervised Domain Adaptation with Large-scale Pre-trained Vision-Language Models.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Ferret: Refer and Ground Anything Anywhere at Any Granularity.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MOFI: Learning Image Representations from Noisy Entity Annotated Images.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Compressing LLMs: The Truth is Rarely Pure and Never Simple.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Guiding Instruction-based Image Editing via Multimodal Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs.
Proceedings of the Computer Vision - ECCV 2024, 2024


VeCLIP: Improving CLIP Training via Visual-Enriched Captions.
Proceedings of the Computer Vision - ECCV 2024, 2024

On the Intractability to Synthesize Factual Inconsistencies in Summarization.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

2023
Hierarchical temporal transformer network for tool wear state recognition.
Adv. Eng. Informatics, October, 2023

From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions.
CoRR, 2023

Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts.
CoRR, 2023

MOFI: Learning Image Representations from Noisy Entity Annotated Images.
CoRR, 2023

Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness.
CoRR, 2023

On Robustness in Multimodal Learning.
CoRR, 2023

STAIR: Learning Sparse Text and Image Representation in Grounded Tokens.
CoRR, 2023

Self Supervision Does Not Help Natural Language Supervision at Scale.
CoRR, 2023

Robustness in Multimodal Learning under Train-Test Modality Mismatch.
Proceedings of the International Conference on Machine Learning, 2023

Perceptual Grouping in Contrastive Vision-Language Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

STAIR: Learning Sparse Text and Image Representation in Grounded Tokens.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

DocAsRef: An Empirical Study on Repurposing Reference-based Summary Quality Metrics as Reference-free Metrics.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Masked Autoencoding Does Not Help Natural Language Supervision at Scale.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Simple and Effective Synthesis of Indoor 3D Scenes.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation.
Trans. Mach. Learn. Res., 2022

Perceptual Grouping in Vision-Language Models.
CoRR, 2022

LongT5: Efficient Text-To-Text Transformer for Long Sequences.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Large Dual Encoders Are Generalizable Retrievers.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Language-agnostic BERT Sentence Embedding.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
An approach for optimising the fixturing configuration in flexible machining fixtures.
Int. J. Prod. Res., 2021

MURAL: Multimodal, Multitask Retrieval Across Languages.
CoRR, 2021

Interpretability Analysis for Named Entity Recognition to Understand System Predictions and How They Can Improve.
Comput. Linguistics, 2021

Text-to-Image Generation Grounded by Fine-Grained User Attention.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision.
Proceedings of the 38th International Conference on Machine Learning, 2021

Pathdreamer: A World Model for Indoor Navigation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Universal Sentence Representation Learning with Conditional Masked Language Model.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

A Simple and Effective Method To Eliminate the Self Language Bias in Multilingual Representations.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Multi-stage Training with Improved Negative Contrast for Neural Passage Retrieval.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

MURAL: Multimodal, Multitask Representations Across Languages.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Crisscrossed Captions: Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Zero-shot Neural Passage Retrieval via Domain-targeted Synthetic Question Generation.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Cross-Modal Contrastive Learning for Text-to-Image Generation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Neural Retrieval for Question Answering with Cross-Attention Supervised Data Augmentation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Uncertainty quantification in machining deformation based on Bayesian network.
Reliab. Eng. Syst. Saf., 2020

Neural Passage Retrieval with Improved Negative Contrast.
CoRR, 2020

End-to-end Semantics-based Summary Quality Assessment for Single-document Summarization.
CoRR, 2020

MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models.
CoRR, 2020

Zero-shot Neural Retrieval via Domain-targeted Synthetic Query Generation.
CoRR, 2020

Entity-Switched Datasets: An Approach to Auditing the In-Domain Robustness of Named Entity Recognition Models.
CoRR, 2020

Self-Supervised Learning for Pairwise Data Refinement.
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020

LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Multilingual Universal Sentence Encoder for Semantic Retrieval.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

Learning a Multi-Domain Curriculum for Neural Machine Translation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Learning a Multitask Curriculum for Neural Machine Translation.
CoRR, 2019

Multi-Domain Gated CNN for Review Helpfulness Prediction.
Proceedings of the World Wide Web Conference, 2019

Hierarchical Document Encoder for Parallel Corpus Mining.
Proceedings of the Fourth Conference on Machine Translation, 2019

Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model.
Proceedings of the 4th Workshop on Representation Learning for NLP, 2019

Predicting Annotation Difficulty to Improve Task Routing and Model Performance for Biomedical Information Extraction.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

ReQA: An Evaluation for End-to-End Answer Retrieval Models.
Proceedings of the 2nd Workshop on Machine Reading for Question Answering, 2019

2018
Review Helpfulness Prediction with Embedding-Gated CNN.
CoRR, 2018

Universal Sentence Encoder.
CoRR, 2018

Effective Parallel Corpus Mining using Bilingual Sentence Embeddings.
Proceedings of the Third Conference on Machine Translation: Research Papers, 2018

Learning Semantic Textual Similarity from Conversations.
Proceedings of The Third Workshop on Representation Learning for NLP, 2018

Syntactic Patterns Improve Information Extraction for Medical Search.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Cross-Domain Review Helpfulness Prediction Based on Convolutional Neural Networks with Auxiliary Domain Discriminators.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Universal Sentence Encoder for English.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018

A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Combining Lexical and Syntactic Features for Detecting Content-Dense Texts in News.
J. Artif. Intell. Res., 2017

Aspect Extraction from Product Reviews Using Category Hierarchy Information.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Detecting (Un)Important Content for Single-Document News Summarization.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

2016
Aspect-Based Helpfulness Prediction for Online Product Reviews.
Proceedings of the 28th IEEE International Conference on Tools with Artificial Intelligence, 2016

2015
Semantic Analysis and Helpfulness Prediction of Text for Online Product Reviews.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Online Multiple targets Detection and Tracking from Mobile robot in Cluttered indoor Environments with Depth Camera.
Int. J. Pattern Recognit. Artif. Intell., 2014

Single image 3D object detection and pose estimation for grasping.
Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014

Detecting Information-Dense Texts in Multiple News Domains.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2012
Linking Named Entities to Any Database.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

Navigation toward Non-static Target Object Using Footprint Detection Based Tracking.
Proceedings of the Computer Vision - ACCV 2012, 2012

