Michael Collins

  • Google Inc.
  • Columbia University, Department of Computer Science
  • Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory (2003 - 2010)
  • AT&T Labs Research (1999 - 2002)
  • University of Pennsylvania, Department of Computer and Information Science (PhD 1998)

According to our database, Michael Collins authored at least 131 papers between 1993 and 2024.


Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation.
CoRR, 2024

Learning to Reject with a Fixed Predictor: Application to Decontextualization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Improving Low-Resource Cross-lingual Parsing with Expected Statistic Regularization.
Trans. Assoc. Comput. Linguistics, 2023

Coreference Resolution through a seq2seq Transition-Based System.
Trans. Assoc. Comput. Linguistics, 2023

Learning to Reject with a Fixed Predictor: Application to Decontextualization.
CoRR, 2023

Measuring Attribution in Natural Language Generation Models.
Comput. Linguistics, 2023

Query Refinement Prompts for Closed-Book Long-Form QA.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Evaluating Explanations: How Much Do Explanations from the Teacher Aid Students?
Trans. Assoc. Comput. Linguistics, 2022

Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models.
CoRR, 2022

Towards Computationally Verifiable Semantic Grounding for Language Models.
CoRR, 2022

Query Refinement Prompts for Closed-Book Long-Form Question Answering.
CoRR, 2022

Honest Students from Untrusted Teachers: Learning an Interpretable Question-Answering Pipeline from a Pretrained Language Model.
CoRR, 2022

A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Sparse, Dense, and Attentional Representations for Text Retrieval.
Trans. Assoc. Comput. Linguistics, 2021

QED: A Framework and Dataset for Explanations in Question Answering.
Trans. Assoc. Comput. Linguistics, 2021

Partially Supervised Named Entity Recognition via the Expected Entity Ratio Loss.
Trans. Assoc. Comput. Linguistics, 2021

Decontextualization: Making Sentences Stand-Alone.
Trans. Assoc. Comput. Linguistics, 2021

Measuring Attribution in Natural Language Generation Models.
CoRR, 2021

Investigating the Effect of Background Knowledge on Natural Questions.
Proceedings of Deep Learning Inside Out: The 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, 2021

TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages.
Trans. Assoc. Comput. Linguistics, 2020

Evaluating Explanations: How much do explanations from the teacher aid students?
CoRR, 2020

Natural Questions: a Benchmark for Question Answering Research.
Trans. Assoc. Comput. Linguistics, 2019

Kernel Approximation Methods for Speech Recognition.
J. Mach. Learn. Res., 2019

A BERT Baseline for the Natural Questions.
CoRR, 2019

Low-Resource Syntactic Transfer with Unsupervised Source Reordering.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Fusion of Detected Objects in Text for Visual Question Answering.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Synthetic QA Corpora Generation with Roundtrip Consistency.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Improving Span-based Question Answering Systems with Coarsely Labeled Data.
CoRR, 2018

Cross-Lingual Syntactic Transfer with Limited Resources.
Trans. Assoc. Comput. Linguistics, 2017

A Polynomial-Time Dynamic Programming Algorithm for Phrase-Based Decoding with a Fixed Distortion Limit.
Trans. Assoc. Comput. Linguistics, 2017

SyntaxNet Models for the CoNLL 2017 Shared Task.
CoRR, 2017

Source-Side Left-to-Right or Target-Side Left-to-Right? An Empirical Comparison of Two Phrase-Based Decoding Algorithms.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Unsupervised Part-Of-Speech Tagging with Anchor Hidden Markov Models.
Trans. Assoc. Comput. Linguistics, 2016

Transforming Dependency Structures to Logical Forms for Semantic Parsing.
Trans. Assoc. Comput. Linguistics, 2016

Predicting the impact of scientific concepts using full-text features.
J. Assoc. Inf. Sci. Technol., 2016

Compact kernel models for acoustic modeling via random feature selection.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

A comparison between deep neural nets and kernel acoustic models for speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Towards a Convex HMM Surrogate for Word Alignment.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Globally Normalized Transition-Based Neural Networks.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Simple Semi-Supervised POS Tagging.
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, 2015

On A Strictly Convex IBM Model 1.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Density-Driven Cross-Lingual Transfer of Dependency Parsers.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Structured Training for Neural Network Transition-Based Parsing.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

Model-based Word Embeddings from Decompositions of Count Matrices.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

A Family of Latent Variable Convex Relaxations for IBM Model 2.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

Spectral learning of latent-variable PCFGs: algorithms and sample complexity.
J. Mach. Learn. Res., 2014

How to Scale Up Kernel Methods to Be As Good As Deep Neural Nets.
CoRR, 2014

A Spectral Algorithm for Learning Class-Based n-gram Models of Natural Language.
Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, 2014

Some Experiments with a Convex IBM Model 2.
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, 2014

Learning Dictionaries for Named Entity Recognition using Minimal Supervision.
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, 2014

Modeling Weather Impact on a Secondary Electrical Grid.
Proceedings of the 5th International Conference on Ambient Systems, 2014

A Provably Correct Learning Algorithm for Latent-Variable PCFGs.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

A Constrained Viterbi Relaxation for Bidirectional Word Alignment.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Experiments with Spectral Learning of Latent-Variable PCFGs.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Approximate PCFG Parsing Using Tensor Decomposition.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Spectral Learning Algorithms for Natural Language Processing.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

A Convex Alternative to IBM Model 2.
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

Optimal Beam Search for Machine Translation.
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

Spectral Learning of Refinement HMMs.
Proceedings of the Seventeenth Conference on Computational Natural Language Learning, 2013

A Tutorial on Dual Decomposition and Lagrangian Relaxation for Inference in Natural Language Processing.
J. Artif. Intell. Res., 2012

Tensor Decomposition for Fast Parsing with Latent-Variable PCFGs.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

Spectral Dependency Parsing with Latent Variables.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

Spectral Learning of Latent-Variable PCFGs.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

Lagrangian Relaxation for Inference in Natural Language Processing.
Proceedings of the 12th International Conference on Parsing Technologies, 2011

Exact Decoding of Phrase-Based Translation Models through Lagrangian Relaxation.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

Dual Decomposition for Natural Language Processing.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011

Dialect recognition using a phone-GMM-supervector-based SVM kernel.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing.
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010

Dual Decomposition for Parsing with Non-Projective Head Automata.
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010

Maximum Margin Ranking Algorithms for Information Retrieval.
Proceedings of the Advances in Information Retrieval, 2010

Efficient Third-Order Dependency Parsers.
Proceedings of the ACL 2010, 2010

Learning Label Embeddings for Nearest-Neighbor Multi-class Classification with an Application to Speech Recognition.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

An efficient projection for l1,∞ regularization.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

An Empirical Study of Semi-supervised Structured Conditional Models for Dependency Parsing.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009

Non-Projective Parsing for Statistical Machine Translation.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009

Learning Context-Dependent Mappings from Sentences to Logical Form.
Proceedings of the ACL 2009, 2009

Exponentiated Gradient Algorithms for Conditional Random Fields and Max-Margin Markov Networks.
J. Mach. Learn. Res., 2008

Case-factor diagrams for structured probabilistic modeling.
J. Comput. Syst. Sci., 2008

Syntactic Reordering in Preprocessing for Japanese → English Translation: MIT System Description for NTCIR-7 Patent Translation Task.
Proceedings of the 7th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2008

Transfer learning for image classification with sparse prototype representations.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

TAG, Dynamic Programming, and the Perceptron for Efficient, Feature-Rich Parsing.
Proceedings of the Twelfth Conference on Computational Natural Language Learning, 2008

Simple Semi-supervised Dependency Parsing.
Proceedings of the ACL 2008, 2008

Hidden Conditional Random Fields.
IEEE Trans. Pattern Anal. Mach. Intell., 2007

Discriminative n-gram language modeling.
Comput. Speech Lang., 2007

Dimensionality reduction for speech recognition using neighborhood components analysis.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Exponentiated gradient algorithms for log-linear structured prediction.
Proceedings of the Machine Learning, 2007

Trigger-Based Language Modeling using a Loss-Sensitive Perceptron Algorithm.
Proceedings of the IEEE International Conference on Acoustics, 2007

Online Learning of Relaxed CCG Grammars for Parsing to Logical Form.
Proceedings of the EMNLP-CoNLL 2007, 2007

Chinese Syntactic Reordering for Statistical Machine Translation.
Proceedings of the EMNLP-CoNLL 2007, 2007

Structured Prediction Models via the Matrix-Tree Theorem.
Proceedings of the EMNLP-CoNLL 2007, 2007

Learning Visual Representations using Images with Captions.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

A Discriminative Model for Tree-to-Tree Translation.
Proceedings of the EMNLP 2006, 2006

Discriminative Reranking for Natural Language Parsing.
Comput. Linguistics, 2005

Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars.
Proceedings of the UAI '05, 2005

Hidden-Variable Models for Discriminative Reranking.
Proceedings of the HLT/EMNLP 2005, 2005

Morphology and Reranking for the Statistical Parsing of Spanish.
Proceedings of the HLT/EMNLP 2005, 2005

Discriminative Syntactic Language Modeling for Speech Recognition.
Proceedings of the ACL 2005, 2005

Clause Restructuring for Statistical Machine Translation.
Proceedings of the ACL 2005, 2005

Conditional Random Fields for Object Recognition.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Exponentiated Gradient Algorithms for Large-margin Structured Classification.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Corrective language modeling for large vocabulary ASR with the perceptron algorithm.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Max-Margin Parsing.
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 2004

Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm.
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, 2004

Incremental Parsing with the Perceptron Algorithm.
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, 2004

Head-Driven Statistical Models for Natural Language Parsing.
Comput. Linguistics, 2003

Tutorial: Machine Learning Methods in Natural Language Processing.
Proceedings of the Computational Learning Theory and Kernel Machines, 2003

Logistic Regression, AdaBoost and Bregman Distances.
Mach. Learn., 2002

Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms.
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, 2002

New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron.
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002

Ranking Algorithms for Named Entity Extraction: Boosting and the Voted Perceptron.
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002

A Generalization of Principal Components Analysis to the Exponential Family.
Proceedings of the Advances in Neural Information Processing Systems 14 (Neural Information Processing Systems: Natural and Synthetic), 2001

Convolution Kernels for Natural Language.
Proceedings of the Advances in Neural Information Processing Systems 14 (Neural Information Processing Systems: Natural and Synthetic), 2001

Parameter Estimation for Statistical Parsing Models: Theory and Practice of Distribution-Free Methods.
Proceedings of the Seventh International Workshop on Parsing Technologies (IWPT-2001), 2001

Discriminative Reranking for Natural Language Parsing.
Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Improving intonational phrasing with syntactic information.
Proceedings of the IEEE International Conference on Acoustics, 2000

Answer Extraction.
Proceedings of the 6th Applied Natural Language Processing Conference, 2000

The Rules Behind Roles: Identifying Speaker Role in Radio Broadcasts.
Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, July 30, 2000

AT&T at TREC-8.
Proceedings of The Eighth Text REtrieval Conference, 1999

Unsupervised Models for Named Entity Classification.
Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, 1999

A Statistical Parser for Czech.
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, 1999

Semantic Tagging using a Probabilistic Context Free Grammar.
Proceedings of the Sixth Workshop on Very Large Corpora, 1998

Three Generative, Lexicalised Models for Statistical Parsing.
Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics, 1997

A New Statistical Parser Based on Bigram Lexical Dependencies.
Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, 1996

University of Pennsylvania: description of the University of Pennsylvania system used for MUC-6.
Proceedings of the 6th Conference on Message Understanding, 1995

Prepositional Phrase Attachment through a Backed-off Model.
Proceedings of the Third Workshop on Very Large Corpora, 1995

Spoken language translation with MID-90's technology: a case study.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993
