Brian Roark

According to our database, Brian Roark authored at least 128 papers between 1998 and 2023.

Spelling convention sensitivity in neural language models.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

Graphemic Normalization of the Perso-Arabic Script.
CoRR, 2022

Beyond Arabic: Software for Perso-Arabic Script Manipulation.
Proceedings of the Seventh Arabic Natural Language Processing Workshop, 2022

Design principles of an open-source language modeling microservice package for AAC text-entry applications.
Proceedings of the Ninth Workshop on Speech and Language Processing for Assistive Technologies, 2022

Extensions to Brahmic script processing within the Nisaba library: new scripts, languages and utilities.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Criteria for Useful Automatic Romanization in South Asian Languages.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Approximating Probabilistic Models as Weighted Finite Automata.
Comput. Linguistics, 2021

Finding Concept-specific Biases in Form-Meaning Associations.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Structured abbreviation expansion in context.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Disambiguatory Signals are Stronger in Word-initial Positions.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Finite-state script normalization and processing utilities: The Nisaba Brahmic library.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, 2021

Phonotactic Complexity and its Trade-offs.
Trans. Assoc. Comput. Linguistics, 2020

Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Language-Agnostic Multilingual Modeling.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Neural Models of Text Normalization for Speech Applications.
Comput. Linguistics, 2019

Latin script keyboards for South Asian languages with finite-state normalization.
Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing, 2019

Distilling weighted finite automata from arbitrary probabilistic models.
Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing, 2019

Meaning to Form: Measuring Systematicity as Information.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

What Kind of Language Is Hard to Language-Model?
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Rethinking Phonotactic Complexity.
Proceedings of the 2019 Workshop on Widening NLP@ACL 2019, Florence, Italy, July 28, 2019, 2019

Transliteration Based Approaches to Improve Code-Switched Speech Recognition Performance.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Are All Languages Equally Hard to Language-Model?
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Transliterated Mobile Keyboard Input via Weighted Finite-State Transducers.
Proceedings of the 13th International Conference on Finite State Methods and Natural Language Processing, 2017

Learning N-Gram Language Models from Uncertain Data.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Contextual Prediction Models for Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Graph-Based Word Alignment for Clinical Language Evaluation.
Comput. Linguistics, 2015

Composition-based on-the-fly rescoring for salient n-gram biasing.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Bringing contextual information to Google speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Applications of Lexicographic Semirings to Problems in Speech and Language Processing.
Comput. Linguistics, 2014

Computational analysis of trajectories of linguistic development in autism.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Encoding linear models as weighted finite-state transducers.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Backoff inspired features for maximum entropy language models.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Data Driven Grammatical Error Detection in Transcripts of Children's Speech.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Transforming trees into hedges and parsing with "hedgebank" grammars.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Hippocratic Abbreviation Expansion.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Challenges in Automating Maze Detection.
Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 2014

Huffman scanning: Using language models within fixed-grid keyboard emulation.
Comput. Speech Lang., 2013

Speech and Language processing as assistive technologies.
Comput. Speech Lang., 2013

Distributional semantic models for the evaluation of disordered language.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Discriminative Joint Modeling of Lexical Variation and Acoustic Confusion for Automated Narrative Retelling Assessment.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Investigation of MT-based ASR confusion models for semi-supervised discriminative language modeling.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Pair Language Models for Deriving Alternative Pronunciations and Spellings from Pronunciation Dictionaries.
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

The Utility of Manual and Automatic Linguistic Error Codes for Identifying Neurodevelopmental Disorders.
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications, 2013

Improved inference and autotyping in EEG-based BCI typing systems.
Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility, 2013

Smoothed marginal distribution constraints for language modeling.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

Discriminative Language Modeling With Linguistic and Statistically Derived Features.
IEEE Trans. Speech Audio Process., 2012

Finite-State Chart Constraints for Reduced Complexity Context-Free Parsing Pipelines.
Comput. Linguistics, 2012

Hallucinating system outputs for discriminative language modeling.
Proceedings of the 2012 Symposium on Machine Learning in Speech and Language Processing, 2012

Phrasal Cohort Based Unsupervised Discriminative Language Modeling.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Fully Automated Neuropsychological Assessment for Detecting Mild Cognitive Impairment.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Deriving conversation-based features from unlabeled speech for discriminative language modeling.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

RSVP keyboard: An EEG based typing interface.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Improved accuracy using recursive Bayesian estimation based language model fusion in ERP-based BCI typing systems.
Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012

Designing and evaluating text entry methods.
Proceedings of the CHI Conference on Human Factors in Computing Systems, 2012

Graph-based alignment of narratives for automated neurological assessment.
Proceedings of the 2012 Workshop on Biomedical Natural Language Processing, 2012

The OpenGrm open-source finite-state grammar software libraries.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, 2012

Spoken Language Derived Measures for Detecting Mild Cognitive Impairment.
IEEE Trans. Speech Audio Process., 2011

Towards technology-assisted co-construction with communication partners.
Proceedings of the Second Workshop on Speech and Language Processing for Assistive Technologies, 2011

Asynchronous fixed-grid scanning with dynamic codes.
Proceedings of the Second Workshop on Speech and Language Processing for Assistive Technologies, 2011

Efficient Matrix-Encoded Grammars and Low Latency Parallelization Strategies for CYK.
Proceedings of the 12th International Conference on Parsing Technologies, 2011

Extraction of Narrative Recall Patterns for Neuropsychological Assessment.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

Fusion with language models improves spelling accuracy for ERP-based brain computer interface spellers.
Proceedings of the 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2011

Efficient determinization of tagged word lattices using categorial and lexicographic semirings.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Alignment of spoken narratives for automated neuropsychological assessment.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Context information significantly improves brain computer interface performance - a case study on text entry using a language model assisted BCI.
Proceedings of the Conference Record of the Forty Fifth Asilomar Conference on Signals, 2011

Lexicographic Semirings for Exact Automata Encoding of Sequence Models.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011

Semi-Supervised Modeling for Prenominal Modifier Ordering.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011

An ERP-based Brain-Computer Interface for text entry using Rapid Serial Visual Presentation and Language Modeling.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011

Unary Constraints for Efficient Context-Free Parsing.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011

Beam-Width Prediction for Efficient Context-Free Parsing.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

Classification of Atypical Language in Autism.
Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics, 2011

Scanning methods and language modeling for binary switch typing.
Proceedings of the Workshop on Speech and Language Processing for Assistive Technologies, 2010

Demo Session Abstracts.
Proceedings of the Workshop on Speech and Language Processing for Assistive Technologies, 2010

Prenominal Modifier Ordering via Multiple Sequence Alignment.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2010

Syntactic and sub-lexical features for Turkish discriminative language models.
Proceedings of the IEEE International Conference on Acoustics, 2010

OHSU Summarization and Entity Linking Systems.
Proceedings of the Second Text Analysis Conference, 2009

Linear Complexity Context-Free Parsing Pipelines via Chart Constraints.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009

Deriving lexical and syntactic expectation-based measures for psycholinguistic modeling via incremental top-down parsing.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009

Multiple Sequence Alignment for Morphology Induction.
Proceedings of the Working Notes for CLEF 2009 Workshop co-located with the 13th European Conference on Digital Libraries (ECDL 2009), Corfù, Greece, September 30, 2009

Morphological Analysis by Multiple Sequence Alignment.
Proceedings of the Multilingual Information Access Evaluation I. Text Retrieval Experiments, 2009

Probabilistic ParaMor.
Proceedings of the Working Notes for CLEF 2009 Workshop co-located with the 13th European Conference on Digital Libraries (ECDL 2009), Corfù, Greece, September 30, 2009

Simulating Morphological Analyzers with Stochastic Taggers for Confidence Estimation.
Proceedings of the Multilingual Information Access Evaluation I. Text Retrieval Experiments, 2009

Query-focused Supervised Sentence Ranking for Update Summaries.
Proceedings of the First Text Analysis Conference, 2008

Discriminative n-gram language modeling for Turkish.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Classifying Chart Cells for Quadratic Complexity Context-Free Inference.
Proceedings of the COLING 2008, 2008

Discriminative n-gram language modeling.
Comput. Speech Lang., 2007

Putting Linguistics into Speech Recognition: The Regulus Grammar Compiler. Manny Rayner, Beth Ann Hockey, and Pierette Bouillon (NASA Ames Research Center and University of Geneva) Stanford, CA: CSLI Publications (CSLI studies in computational linguistics, edited by Ann Copestake), 2006, xiv+305 pp; hardbound, ISBN 1-57586-525-4.
Comput. Linguistics, 2007

The OHSU Biomedical Question Answering System Framework.
Proceedings of The Sixteenth Text REtrieval Conference, 2007

The SRI/OGI 2006 spoken term detection system.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Syntactic complexity measures for detecting Mild Cognitive Impairment.
Proceedings of the Biological, translational, and clinical language processing, 2007

Pipeline Iteration.
Proceedings of the ACL 2007, 2007

The utility of parse-derived features for automatic discourse segmentation.
Proceedings of the ACL 2007, 2007

Utterance classification with discriminative language modeling.
Speech Commun., 2006

MAP adaptation of stochastic grammars.
Comput. Speech Lang., 2006

Combining Lexicon Expansion, Information Retrieval, and Cluster-based Ranking for Biomedical Question Answering.
Proceedings of the Fifteenth Text REtrieval Conference, 2006

Probabilistic Context-Free Grammar Induction Based on Structural Zeros.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2006

SParseval: Evaluation Metrics for Parsing Speech.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Reranking for Sentence Boundary Detection in Conversational Speech.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

PCFGs with Syntactic and Prosodic Indicators of Speech Repairs.
Proceedings of the ACL 2006, 2006

The design principles and algorithms of a weighted grammar library.
Int. J. Found. Comput. Sci., 2005

Comparing and Combining Finite-State and Context-Free Parsers.
Proceedings of the HLT/EMNLP 2005, 2005

Joint Discriminative Language Modeling and Utterance Classification.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Discriminative Syntactic Language Modeling for Speech Recognition.
Proceedings of the ACL 2005, 2005

Robust garden path parsing.
Nat. Lang. Eng., 2004

A General Weighted Grammar Library.
Proceedings of the Implementation and Application of Automata, 2004

Language Model Adaptation with MAP Estimation and the Perceptron Algorithm.
Proceedings of HLT-NAACL 2004: Short Papers, Boston, Massachusetts, USA, May 2-7, 2004, 2004

Corrective language modeling for large vocabulary ASR with the perceptron algorithm.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Improved name recognition with meta-data dependent name networks.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Meta-data conditional language modeling.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

A generalized construction of integrated speech recognition transducers.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm.
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, 2004

Incremental Parsing with the Perceptron Algorithm.
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, 2004

Supervised and unsupervised PCFG adaptation to novel domains.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, 2003

Unsupervised language model adaptation.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Generalized Algorithms for Constructing Statistical Language Models.
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, 2003

Markov Parsing: Lattice Rescoring with a Statistical Parser.
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002

Robust Probabilistic Predictive Syntactic Processing.
CoRR, 2001

Probabilistic Top-Down Parsing and Language Modeling.
Comput. Linguistics, 2001

Noun-phrase co-occurrence statistics for semi-automatic semantic lexicon construction.
CoRR, 2000

Compact non-left-recursive grammars using the selective left-corner transform and factoring.
Proceedings of the COLING 2000, 18th International Conference on Computational Linguistics, Proceedings of the Conference, 2 Volumes, July 31, 2000

Measuring Efficiency in High-accuracy, Broad-coverage Statistical Parsing.
Proceedings of the Workshop on Efficiency In Large-Scale Parsing Systems, 2000

Efficient probabilistic top-down and left-corner parsing.
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, 1999

Noun-Phrase Co-Occurrence Statistics for Semi-Automatic Semantic Lexicon Construction.
Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, 1998
