Yonatan Belinkov

According to our database, Yonatan Belinkov authored at least 127 papers between 2013 and 2024.

Bibliography

2024
Distinguishing Ignorance from Error in LLM Hallucinations.
CoRR, 2024

Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics.
CoRR, 2024

Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods.
CoRR, 2024

LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations.
CoRR, 2024

Jamba-1.5: Hybrid Transformer-Mamba Models at Scale.
CoRR, 2024

The Quest for the Right Mediator: A History, Survey, and Theoretical Grounding of Causal Interpretability.
CoRR, 2024

Answer, Assemble, Ace: Understanding How Transformers Answer Multiple Choice Questions.
CoRR, 2024

Confidence Regulation Neurons in Language Models.
CoRR, 2024

REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space.
CoRR, 2024

DEPTH: Discourse Education through Pre-Training Hierarchically.
CoRR, 2024

Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs.
CoRR, 2024

Jamba: A Hybrid Transformer-Mamba Language Model.
CoRR, 2024

Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models.
CoRR, 2024

Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms.
CoRR, 2024

Effect of tokenization on transformers for biological sequences.
Bioinform., 2024

Unified Concept Editing in Diffusion Models.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

ContraSim - Analyzing Neural Representations Based on Contrastive Learning.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Leveraging Prototypical Representations for Mitigating Social Bias without Demographic Information.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Short Papers, 2024

ReFACT: Updating Text-to-Image Models by Editing the Text Encoder.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Linearity of Relation Decoding in Transformer Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Fast Forwarding Low-Rank Training.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Backward Lens: Projecting Language Model Gradients into the Vocabulary Space.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

A Dataset for Metaphor Detection in Early Medieval Hebrew Poetry.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

Generating Benchmarks for Factuality Evaluation of Language Models.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Concept-Best-Matching: Evaluating Compositionality In Emergent Communication.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Accelerating the Global Aggregation of Local Explanations.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.
Trans. Mach. Learn. Res., 2023

Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias.
CoRR, 2023

Understanding Arithmetic Reasoning in Language Models using Causal Mediation Analysis.
CoRR, 2023

Interpreting Transformer's Attention Dynamic Memory and Visualizing the Semantic Information Flow of GPT.
CoRR, 2023

ContraSim - A Similarity Measure Based on Contrastive Learning.
CoRR, 2023

Mass-Editing Memory in a Transformer.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Multiple sequence alignment as a sequence-to-sequence learning problem.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Editing Implicit Assumptions in Text-to-Image Diffusion Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

VISIT: Visualizing and Interpreting the Semantic Information Flow of Transformers.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

When Language Models Fall in Love: Animacy Processing in Transformer Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

FigureOut - Automatic Detection of Metaphors in Hebrew Across the Eras.
Proceedings of the Annual International Conference of the Alliance of Digital Humanities Organizations, 2023

Parallel Context Windows for Large Language Models.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

What Are You Token About? Dense Retrieval as Distributions Over the Vocabulary.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

BLIND: Bias Removal With No Demographics.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Shielded Representations: Protecting Sensitive Attributes Through Iterative Gradient-Based Projection.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Emergent Quantized Communication.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Parallel Context Windows Improve In-Context Learning of Large Language Models.
CoRR, 2022

Debiasing NLP Models Without Demographic Information.
CoRR, 2022

Choose Your Lenses: Flaws in Gender Bias Evaluation.
CoRR, 2022

Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions.
CoRR, 2022

IDANI: Inference-time Domain Adaptation via Neuron-level Interventions.
CoRR, 2022

MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning.
CoRR, 2022

Locating and Editing Factual Knowledge in GPT.
CoRR, 2022

Probing Classifiers: Promises, Shortcomings, and Advances.
Comput. Linguistics, 2022

A Generative Approach for Mitigating Structural Biases in Natural Language Inference.
Proceedings of the 11th Joint Conference on Lexical and Computational Semantics, 2022

Locating and Editing Factual Associations in GPT.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Measures of Information Reflect Memorization Patterns.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

How Gender Debiasing Affects Internal Model Representations, and Why It Matters.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

On the Pitfalls of Analyzing Individual Neurons in Language Models.
Proceedings of the Tenth International Conference on Learning Representations, 2022

A Multilingual Perspective Towards the Evaluation of Attribution Methods in Natural Language Inference.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Supervising Model Attention with Human Explanations for Robust Natural Language Inference.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Natural Language Inference with a Human Touch: Using Human Explanations to Guide Model Attention.
CoRR, 2021

Probing Classifiers: Promises, Shortcomings, and Alternatives.
CoRR, 2021

IRM - when it works and when it doesn't: A test case of natural language inference.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning from others' mistakes: Avoiding dataset biases without modeling them.
Proceedings of the 9th International Conference on Learning Representations, 2021

Variational Information Bottleneck for Effective Low-Resource Fine-Tuning.
Proceedings of the 9th International Conference on Learning Representations, 2021

Similarity Analysis of Self-Supervised Speech Representations.
Proceedings of the IEEE International Conference on Acoustics, 2021

Debiasing Methods in Natural Language Understanding Make Bias More Accessible.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance?
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Probing Neural Dialog Models for Conversational Understanding.
CoRR, 2020

Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias.
CoRR, 2020

Exploiting Redundancy in Pre-trained Language Models for Efficient Transfer Learning.
CoRR, 2020

On the Linguistic Representational Power of Neural Machine Translation Models.
Comput. Linguistics, 2020

Findings of the WMT 2020 Shared Task on Machine Translation Robustness.
Proceedings of the Fifth Conference on Machine Translation, 2020

Investigating Gender Bias in Language Models Using Causal Mediation Analysis.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

A Constructive Prediction of the Generalization Error Across Scales.
Proceedings of the 8th International Conference on Learning Representations, 2020

Analyzing Individual Neurons in Pre-trained Language Models.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Analyzing Redundancy in Pretrained Transformer Models.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Similarity Analysis of Contextual Word Representation Models.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

End-to-End Bias Mitigation by Modelling Biases in Corpora.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Interpretability and Analysis in Neural NLP.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts, 2020

The Sensitivity of Language Models and Humans to Winograd Schema Perturbations.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Analysis Methods in Neural Language Processing: A Survey.
Trans. Assoc. Comput. Linguistics, 2019

Studying the history of the Arabic language: language technology and a large-scale historical corpus.
Lang. Resour. Evaluation, 2019

Language processing and learning models for community question answering in Arabic.
Inf. Process. Manag., 2019

Memory-Augmented Recurrent Neural Networks Can Learn Generalized Dyck Languages.
CoRR, 2019

Adversarial Regularization for Visual Question Answering: Strengths, Shortcomings, and Side Effects.
CoRR, 2019

LSTM Networks Can Perform Dynamic Counting.
CoRR, 2019

Character-based Surprisal as a Model of Human Reading in the Presence of Errors.
CoRR, 2019

Findings of the First Shared Task on Machine Translation Robustness.
Proceedings of the Fourth Conference on Machine Translation, 2019

On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference.
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics, 2019

Linguistic Knowledge and Transferability of Contextual Representations.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

One Size Does Not Fit All: Comparing NMT Representations of Different Granularities.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Analyzing Phonetic and Graphemic Representations in End-to-End Automatic Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Identifying and Controlling Important Neurons in Neural Machine Translation.
Proceedings of the 7th International Conference on Learning Representations, 2019

Character-based Surprisal as a Model of Reading Difficulty in the Presence of Errors.
Proceedings of the 41st Annual Meeting of the Cognitive Science Society, 2019

Analyzing the Structure of Attention in a Transformer Language Model.
Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, 2019

Improving Neural Language Models by Segmenting, Attending, and Predicting the Future.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Don't Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

NeuroX: A Toolkit for Analyzing Individual Neurons in Neural Networks.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
On internal language representations in deep learning: an analysis of machine translation and speech recognition.
PhD thesis, 2018

On Evaluating the Generalization of LSTM Models in Formal Languages.
CoRR, 2018

On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Synthetic and Natural Noise Both Break Neural Machine Translation.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Analysis of sentence embedding models using prediction tasks in natural language processing.
IBM J. Res. Dev., 2017

Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Neural Machine Translation Training in a Multi-Domain Scenario.
Proceedings of the 14th International Conference on Spoken Language Translation, 2017

QMDIS: QCRI-MIT Advanced Dialect Identification System.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Understanding and Improving Morphological Learning in the Neural Machine Translation Decoder.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks.
Proceedings of the 5th International Conference on Learning Representations, 2017

Challenging Language-Dependent Segmentation for Arabic: An Application to Machine Translation and Part-of-Speech Tagging.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

What do Neural Machine Translation Models Learn about Morphology?
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Large-Scale Machine Translation between Arabic and Hebrew: Available Corpora and Initial Results.
CoRR, 2016

A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects.
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects, 2016

Improving Sequence to Sequence Learning for Morphological Inflection Generation: The BIU-MIT Systems for the SIGMORPHON 2016 Shared Task for Morphological Reinflection.
Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, 2016

SLS at SemEval-2016 Task 3: Neural-based Approaches for Ranking in Community Question Answering.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016

Shamela: A Large-Scale Historical Arabic Corpus.
Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities, 2016

Neural Attention for Learning to Rank Questions in Community Question Answering.
Proceedings of COLING 2016, 2016

2015
Erratum: "Exploring Compositional Architectures and Word Vector Representations for Prepositional Phrase Attachment".
Trans. Assoc. Comput. Linguistics, 2015

Answer Selection in Arabic Community Question Answering: A Feature-Rich Approach.
Proceedings of the Second Workshop on Arabic Natural Language Processing, 2015

VectorSLU: A Continuous Word Vector Approach to Answer Selection in Community Question Answering Systems.
Proceedings of the 9th International Workshop on Semantic Evaluation, 2015

Arabic Diacritization with Recurrent Neural Networks.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

2014
Exploring Compositional Architectures and Word Vector Representations for Prepositional Phrase Attachment.
Trans. Assoc. Comput. Linguistics, 2014

arTenTen: Arabic Corpus and Word Sketches.
J. King Saud Univ. Comput. Inf. Sci., 2014

2013
Translating Dialectal Arabic to English.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

