Sharon Goldwater

Affiliations:
  • University of Edinburgh, UK


According to our database, Sharon Goldwater authored at least 97 papers between 1998 and 2024.

Bibliography

2024
Orthogonality and isotropy of speaker and phonetic information in self-supervised speech representations.
CoRR, 2024

Estimating the Level of Dialectness Predicts Interannotator Agreement in Multi-dialect Arabic Datasets.
CoRR, 2024

A predictive learning model can simulate temporal dynamics and context effects found in neural representations of continuous speech.
CoRR, 2024

2023
Infant Phonetic Learning as Perceptual Space Learning: A Crosslinguistic Evaluation of Computational Models.
Cogn. Sci., July, 2023

Prosodic features improve sentence segmentation and parsing.
CoRR, 2023

Acoustic Word Embeddings for Untranscribed Target Languages with Continued Pretraining and Learned Pooling.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Parsing dialog turns with prosodic features in English.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Self-supervised Predictive Coding Models Encode Speaker and Phonetic Information in Orthogonal Subspaces.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Analyzing Acoustic Word Embeddings from Pre-Trained Self-Supervised Speech Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

ALDi: Quantifying the Arabic Level of Dialectness of Text.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022
Regularization or lexical probability-matching? How German speakers generalize plural morphology.
Proceedings of the 44th Annual Meeting of the Cognitive Science Society, 2022

2021
Improved Acoustic Word Embeddings for Zero-Resource Languages Using Multilingual Transfer.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Black or White but Never Neutral: How Readers Perceive Identity from Yellow or Skin-toned Emoji.
Proc. ACM Hum. Comput. Interact., 2021

Multilingual and unsupervised subword modeling for zero-resource languages.
Comput. Speech Lang., 2021

Cross-linguistically Consistent Semantic and Syntactic Annotation of Child-directed Speech.
CoRR, 2021

On the Difficulty of Segmenting Words with Attention.
CoRR, 2021

Identity Signals in Emoji do not Influence Perception of Factual Truth on Twitter.
Workshop Proceedings of the 15th International AAAI Conference on Web and Social Media, 2021

A phonetic model of non-native spoken word processing.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Prosodic segmentation for parsing spoken dialogue.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Emoji Skin Tone Modifiers: Analyzing Variation in Usage on Social Media.
ACM Trans. Soc. Comput., 2020

LemMED: Fast and Effective Neural Morphological Analysis with Short Context Windows.
CoRR, 2020

Analyzing autoencoder-based acoustic word embeddings.
CoRR, 2020

Analyzing ASR Pretraining for Low-Resource Speech-to-Text Translation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Multilingual Acoustic Word Embedding Models for Processing Zero-resource Languages.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Cross-Lingual Topic Prediction For Speech Using Translations.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

The role of context in neural pitch accent detection in English.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Evaluating computational models of infant phonetic learning across languages.
Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020

Input matters in the modeling of early phonetic learning.
Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020

Inflecting When There's No Majority: Limitations of Encoder-Decoder Neural Networks as Cognitive Models for German Plurals.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Classifying topics in speech when all you have is crummy translations.
CoRR, 2019

Data Augmentation for Context-Sensitive Neural Lemmatization Using Inflection Tables and Raw Text.
CoRR, 2019

Training Data Augmentation for Context-Sensitive Neural Lemmatizer Using Inflection Tables and Raw Text.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Pre-training on high-resource speech recognition improves low-resource speech-to-text translation.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Are we there yet? Encoder-decoder neural networks as cognitive models of English past tense inflection.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Evaluating Historical Text Normalization Systems: How Well Do They Generalize?
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Context Sensitive Neural Lemmatization with Lematus.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Multilingual Bottleneck Features for Subword Modeling in Zero-resource Languages.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Low-Resource Speech-to-Text Translation.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Self-Representation on Twitter Using Emoji Skin Color Modifiers.
Proceedings of the Twelfth International Conference on Web and Social Media, 2018

Inducing a lexicon of sociolinguistic variables from code-mixed text.
Proceedings of the 4th Workshop on Noisy User-generated Text, 2018

2017
A segmental framework for fully-unsupervised large-vocabulary speech recognition.
Comput. Speech Lang., 2017

Weakly supervised spoken term discovery using cross-lingual side information.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Spoken Term Discovery for Language Documentation using Translations.
Proceedings of the Workshop on Speech-Centric Natural Language Processing, 2017

Aye or naw, whit dae ye hink? Scottish independence and linguistic identity on social media.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Towards speech-to-text translation without speech recognition.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

From Segmentation to Analyses: a Probabilistic Model for Unsupervised Morphology Induction.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Training Data Augmentation for Low-Resource Morphological Inflection.
Proceedings of the CoNLL SIGMORPHON 2017 Shared Task: Universal Morphological Reinflection, 2017

An embedded segmental K-means model for unsupervised segmentation and clustering of speech.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016
Unsupervised Word Segmentation and Lexicon Discovery Using Acoustic Word Embeddings.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Towards robust cross-linguistic comparisons of phonological networks.
Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, 2016

2015
A comparison of neural network methods for unsupervised representation learning on the zero resource speech challenge.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Fully unsupervised small-vocabulary speech recognition using a segmental Bayesian model.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Unsupervised neural network based feature extraction using weak top-down constraints.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Unsupervised lexical clustering of speech segments using fixed-dimensional acoustic embeddings.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

POS induction with distributional and morphological information using a distance-dependent Chinese restaurant process.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Weak semantic context helps phonetic learning in a model of infant language acquisition.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2013
Adding Sentence Types to a Model of Syntactic Category Acquisition.
Top. Cogn. Sci., 2013

Minimally-Supervised Morphological Segmentation using Adaptor Grammars.
Trans. Assoc. Comput. Linguistics, 2013

Unsupervised Dependency Parsing with Acoustic Cues.
Trans. Assoc. Comput. Linguistics, 2013


Modeling Graph Languages with Grammars Extracted via Tree Decompositions.
Proceedings of the 11th International Conference on Finite State Methods and Natural Language Processing, 2013

Exploring the Utility of Joint Morphological and Syntactic Learning from Child-directed Speech.
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

A Joint Learning Model of Word Segmentation, Lexical Acquisition, and Phonetic Variability.
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

2012
A Probabilistic Model of Syntactic and Semantic Acquisition from Child-Directed Utterances and their Meanings.
Proceedings of the EACL 2012, 2012

Semantic Parsing with Bayesian Tree Transducers.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

Bootstrapping a Unified Model of Lexical and Phonetic Acquisition.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

2011
Producing Power-Law Distributions and Damping Word Frequencies with Two-Stage Language Models.
J. Mach. Learn. Res., 2011

Computational Modeling of Human Language Acquisition Afra Alishahi (University of the Saarland) Morgan & Claypool (Synthesis Lectures on Human Language Technologies, edited by Graeme Hirst, volume 11), 2010, xiv+93 pp; paperbound, ISBN 978-1-60845-339-9, $40.00; ebook, ISBN 978-1-60845-340-5, $30.00 or by subscription.
Comput. Linguistics, 2011

Lexical Generalization in CCG Grammar Induction for Semantic Parsing.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

Unsupervised NLP and Human Language Acquisition: Making Connections to Make Progress.
Proceedings of the First workshop on Unsupervised Learning in NLP@EMNLP 2011, 2011

A Bayesian Mixture Model for PoS Induction Using Multiple Features.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

Predictability effects in adult-directed and infant-directed speech: Does the listener matter?
Proceedings of the 33rd Annual Meeting of the Cognitive Science Society, 2011

Unsupervised Extraction of Recurring Words from Infant-Directed Speech.
Proceedings of the 33rd Annual Meeting of the Cognitive Science Society, 2011

Unsupervised Syntactic Chunking with Acoustic Cues: Computational Models for Prosodic Bootstrapping.
Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics, 2011

Formalizing Semantic Parsing with Tree Transducers.
Proceedings of the Australasian Language Technology Association Workshop 2011, 2011

2010
Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase speech recognition error rates.
Speech Commun., 2010

Inducing Tree-Substitution Grammars.
J. Mach. Learn. Res., 2010

Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification.
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010

Two Decades of Unsupervised POS Induction: How Far Have We Come?
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010

Using Sentence Type Information for Syntactic Category Acquisition.
Proceedings of the 2010 Workshop on Cognitive Modeling and Computational Linguistics, 2010

2009
Improving nonparameteric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009

Inducing Compact but Accurate Tree-Substitution Grammars.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009

Improving Morphology Induction by Learning Spelling Rules.
Proceedings of the IJCAI 2009, 2009

A Note on the Implementation of Hierarchical Dirichlet Processes.
Proceedings of the ACL 2009, 2009

2008
Which Words Are Hard to Recognize? Prosodic, Lexical, and Disfluency Factors that Increase ASR Error Rates.
Proceedings of the ACL 2008, 2008

2007
Bayesian Inference for PCFGs via Markov Chain Monte Carlo.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

A fully Bayesian approach to unsupervised part-of-speech tagging.
Proceedings of the ACL 2007, 2007

2006
Adaptor Grammars: A Framework for Specifying Compositional Nonparametric Bayesian Models.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

A Non-Parametric Bayesian Approach to Spike Sorting.
Proceedings of the 28th International Conference of the IEEE Engineering in Medicine and Biology Society, 2006

Contextual Dependencies in Unsupervised Word Segmentation.
Proceedings of the ACL 2006, 2006

2005
Interpolating between types and tokens by estimating power-law generators.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Improving Statistical MT through Morphological Analysis.
Proceedings of the HLT/EMNLP 2005, 2005

Representational Bias in Unsupervised Learning of Syllable Structure.
Proceedings of the Ninth Conference on Computational Natural Language Learning, 2005

2004
Priors in Bayesian Learning of Phonological Rules.
Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology: Current Themes in Computational Phonology and Morphology, 2004

2003
A Type System for Statically Detecting Spreadsheet Errors.
Proceedings of the 18th IEEE International Conference on Automated Software Engineering (ASE 2003), 2003

2000
Compiling Language Models from a Linguistically Motivated Unification Grammar.
Proceedings of the COLING 2000, 18th International Conference on Computational Linguistics, Proceedings of the Conference, 2 Volumes, July 31, 2000

1998
Edge-Based Best-First Chart Parsing.
Proceedings of the Sixth Workshop on Very Large Corpora, 1998
