Kenneth Church

ORCID: 0000-0001-8378-6069

Affiliations:
  • Northeastern University, MA, USA
  • Johns Hopkins University (former)
  • Microsoft Research (former)


According to our database, Kenneth Church authored at least 172 papers between 1979 and 2024.

Awards

ACM Fellow

ACM Fellow 2023, "For contributions to empirical methods in natural language processing".

Bibliography

2024
Emerging trends: a gentle introduction to RAG.
Nat. Lang. Eng., 2024

Emerging trends: When can users trust GPT, and when should they intervene?
Nat. Lang. Eng., 2024

Academic Article Recommendation Using Multiple Perspectives.
CoRR, 2024

Are Generative Language Models Multicultural? A Study on Hausa Culture and Emotions using ChatGPT.
CoRR, 2024

Since the Scientific Literature Is Multilingual, Our Models Should Be Too.
CoRR, 2024

Comparing Edge-based and Node-based Methods on a Citation Prediction Task.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

No Culture Left Behind: ArtELingo-28, a Benchmark of WikiArt with Captions in 28 Languages.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

On Translating Technical Terminology: A Translation Workflow for Machine-Translated Acronyms.
Proceedings of the 16th Conference of the Association for Machine Translation in the Americas, 2024

2023
Emerging trends: Smooth-talking machines.
Nat. Lang. Eng., September, 2023

Benchmark for Pairs of Papers in Semantic Scholar: 1 hop vs. 2-4 hops version 0.0.
Dataset, August, 2023

Emerging trends: Risks 3.0 and proliferation of spyware to 50,000 cell phones.
Nat. Lang. Eng., May, 2023

Emerging trends: Unfair, biased, addictive, dangerous, deadly, and insanely profitable.
Nat. Lang. Eng., March, 2023

A Research-Based Guide for the Creation and Deployment of a Low-Resource Machine Translation System.
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, 2023

Improved Contextualized Speech Representations for Tonal Analysis.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

An Example of (Too Much) Hyper-Parameter Tuning In Suicide Ideation Detection.
Proceedings of the Seventeenth International AAAI Conference on Web and Social Media, 2023

Some Useful Things to Know When Combining IR and NLP: the Easy, the Hard and the Ugly.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

2022
Emerging Trends: SOTA-Chasing.
Nat. Lang. Eng., 2022

Emerging trends: General fine-tuning (gft).
Nat. Lang. Eng., 2022

Emerging trends: Deep nets thrive on scale.
Nat. Lang. Eng., 2022

Data-Driven Adaptive Simultaneous Machine Translation.
CoRR, 2022

Efficiently Disentangle Causal Representations.
CoRR, 2022

Training on Lexical Resources.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

W-CTC: a Connectionist Temporal Classification Loss with Wild Cards.
Proceedings of the Tenth International Conference on Learning Representations, 2022

ArtELingo: A Million Emotion Annotations of WikiArt with Emphasis on Diversity over Language and Culture.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

A Gentle Introduction to Deep Nets and Opportunities for the Future.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022

2021
Emerging trends: Deep nets for poets.
Nat. Lang. Eng., 2021

Emerging trends: Ethics, intimidation, and the Cold War.
Nat. Lang. Eng., 2021

Emerging trends: A gentle introduction to fine-tuning.
Nat. Lang. Eng., 2021

Acronyms and Opportunities for Improving Deep Nets.
Frontiers Artif. Intell., 2021

The Future of Computational Linguistics: On Beyond Alchemy.
Frontiers Artif. Intell., 2021

The Role of Phonetic Units in Speech Emotion Recognition.
CoRR, 2021

Automatic recognition of suprasegmentals in speech.
CoRR, 2021

Better than BERT but Worse than Baseline.
CoRR, 2021

Exploiting a Zoo of Checkpoints for Unseen Tasks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On Attention Redundancy: A Comprehensive Study.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

The Third DIHARD Diarization Challenge.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Speech Emotion Recognition with Multi-Task Learning.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

SSPF: a Simple and Scalable Parameter Free Clustering Method.
Proceedings of the 2021 International Conference on Data Mining, 2021

Isotropy in the Contextual Embedding Space: Clusters and Manifolds.
Proceedings of the 9th International Conference on Learning Representations, 2021

Exploring Long Tail Visual Relationship Recognition with Large Vocabulary.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Pause-Encoded Language Models for Recognition of Alzheimer's Disease and Emotion.
Proceedings of the IEEE International Conference on Acoustics, 2021

Speaking Rate and Tonal Realization in Mandarin Chinese: What Can We Learn From Large Speech Corpora?
Proceedings of the IEEE International Conference on Acoustics, 2021

Large Margin Training Improves Language Models for ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

Data Collection vs. Knowledge Graph Completion: What is Needed to Improve Coverage?
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Decoupling Recognition and Transcription in Mandarin ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Benchmarks and goals.
Nat. Lang. Eng., 2020

Emerging trends: Subwords, seriously?
Nat. Lang. Eng., 2020

Emerging trends: Reviewing the reviewers (again).
Nat. Lang. Eng., 2020

Pauses for Detection of Alzheimer's Disease.
Frontiers Comput. Sci., 2020

Third DIHARD Challenge Evaluation Plan.
CoRR, 2020

Long-tail Visual Relationship Recognition with a Visiolinguistic Hubless Loss.
CoRR, 2020

Disfluencies and Fine-Tuning Pre-Trained Language Models for Detection of Alzheimer's Disease.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Compositional Language Continual Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Incremental Text-to-Speech Synthesis with Prefix-to-Prefix Framework.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Improving Bilingual Lexicon Induction for Low Frequency Words.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

2019
A survey of 25 years of evaluation.
Nat. Lang. Eng., 2019

GANs vs. good enough.
Nat. Lang. Eng., 2019

Language Modeling at Scale.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

The Second DIHARD Diarization Challenge: Dataset, Task, and Baselines.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Hubless Nearest Neighbor Search for Bilingual Lexicon Induction.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Emerging trends: APIs for speech and machine translation and more.
Nat. Lang. Eng., 2018

Emerging trends: Artificial Intelligence, China and my new job at Baidu.
Nat. Lang. Eng., 2018

Emerging trends: A tribute to Charles Wayne.
Nat. Lang. Eng., 2018

Minsky, Chomsky and Deep Nets.
Proceedings of the Text, Speech, and Dialogue - 21st International Conference, 2018

Enhancement and Analysis of Conversational Speech: JSALT 2017.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Emerging trends: Inflation.
Nat. Lang. Eng., 2017

Emerging trends: I did it, I did it, I did it, but...
Nat. Lang. Eng., 2017

Word2Vec.
Nat. Lang. Eng., 2017

Symbol Sequence Search from Telephone Conversation.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Speaker diarization: A perspective on challenges and opportunities from theory to practice.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Corpus Methods in a Digitized World.
Proceedings of the Computational and Corpus-Based Phraseology, 2017

2016
The next generation.
Nat. Lang. Eng., 2016

2014
TALIP Perspectives, Guest Editorial Commentary: What Counts (and What Ought to Count)?
ACM Trans. Asian Lang. Inf. Process., 2014

2013
How many multiword expressions do people know?
ACM Trans. Speech Lang. Process., 2013

Approximate inference: A sampling based modeling technique to capture complex dependencies in a language model.
Speech Commun., 2013

Intent classification of voice queries on mobile devices.
Proceedings of the 22nd International World Wide Web Conference, 2013

Deep neural network features and semi-supervised training for low resource speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013


Intent focused summarization of caller-agent conversations.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Inverting the Point Process Model for Fast Phonetic Keyword Search.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011
Large science databases - are cloud services ready for them?
Sci. Program., 2011

Report on the first summer school on NLP and IR in Beijing.
SIGIR Forum, 2011

Towards Unsupervised Training of Speaker Independent Acoustic Models.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Fast Re-scoring Strategy to Capture Long-Distance Dependencies.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

Estimating document frequencies in a speech corpus.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Bootstrapping a spoken language identification system using unsupervised integrated sensing and processing decision trees.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

Repetition and Language Models and Comparable Corpora.
Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora, 2011

2010
New Tools for Web-Scale N-grams.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Towards spoken term discovery at scale with zero resources.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

NLP on Spoken Documents Without ASR.
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010

Using Web-scale N-grams to Improve Base NP Parsing Performance.
Proceedings of the COLING 2010, 2010

More is More.
Proceedings of the A Way with Words, 2010

2009
A Data Structure for Sponsored Search.
Proceedings of the 25th International Conference on Data Engineering, 2009

Using Word-Sense Disambiguation Methods to Classify Web Queries by Intent.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009

Substring Statistics.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2009

Has Computational Linguistics Become More Applied?
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2009

2008
Entropy of search logs: how hard is search? with personalization? with backoff?
Proceedings of the International Conference on Web Search and Web Data Mining, 2008

One sketch for all: Theory and Application of Conditional Random Sampling.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

On Delivering Embarrassingly Distributed Cloud Services.
Proceedings of the 7th ACM Workshop on Hot Topics in Networks, 2008

Query suggestion using hitting time.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

Variable selection for ad prediction.
Proceedings of the 2nd International Workshop on Data Mining and Audience Intelligence for Advertising, 2008

2007
Nonlinear Estimators and Tail Bounds for Dimension Reduction in $l_1$ Using Cauchy Random Projections.
J. Mach. Learn. Res., 2007

A Sketch Algorithm for Estimating Two-Way and Multi-Way Associations.
Comput. Linguistics, 2007

The wild thing goes local.
Proceedings of the SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2007

Heavy-tailed distributions and multi-keyword queries.
Proceedings of the SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2007

K-Best Suffix Arrays.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

Compressing Trigram Language Models With Golomb Coding.
Proceedings of the EMNLP-CoNLL 2007, 2007

2006
Nonlinear Estimators and Tail Bounds for Dimension Reduction in $l_1$ Using Cauchy Random Projections.
CoRR, 2006

Conditional Random Sampling: A Sketch-based Sampling Technique for Sparse Data.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

Very sparse random projections.
Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006

Improving Random Projections Using Marginal Information.
Proceedings of the Learning Theory, 19th Annual Conference on Learning Theory, 2006

2005
Reviewing the Reviewers.
Comput. Linguistics, 2005

Using Sketches to Estimate Associations.
Proceedings of the HLT/EMNLP 2005, 2005

The Wild Thing.
Proceedings of the ACL 2005, 2005

2004
The submatrices character count problem: an efficient solution using separable values.
Inf. Comput., 2004

Speech and Language Processing: Can We Use the Past to Predict the Future?
Proceedings of the Text, Speech and Dialogue, 7th International Conference, 2004

2003
Speech and language processing: where have we been and where are we going?
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002
Dedication to William A. Gale.
Nat. Lang. Eng., 2002

Separable attributes: a technique for solving the sub matrices character count problem.
Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2002

NLP Found Helpful (at least for one Text Categorization Task).
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, 2002

2001
Using Suffix Arrays to Compute Term Frequency and Document Frequency for All Substrings in a Corpus.
Comput. Linguistics, 2001

Using Bins to Empirically Estimate Term Weights for Text Categorization.
Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2001

2000
Engineering the compression of massive tables: an experimental approach.
Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms, 2000

Dynamic programming: a method for taking advantage of technical terminology in Japanese documents.
Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages, 2000, Hong Kong, China, September 30, 2000

Using variable length ngrams for retrieving technical abstracts in Japanese (poster session).
Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages, 2000, Hong Kong, China, September 30, 2000

Empirical Term Weighting and Expansion Frequency.
Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, 2000

Empirical Estimates of Adaptation: The chance of Two Noriegas is closer to p/2 than p².
Proceedings of the COLING 2000, 18th International Conference on Computational Linguistics, Proceedings of the Conference, 2 Volumes, July 31, 2000

1999
Virtual Data Warehousing, Data Publishing, and Call Detail.
Proceedings of the Databases in Telecommunications, 1999

Japanese Word Segmentation Using Similarity Measure for IR.
Proceedings of the First NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, 1999

What's Happened Since the First SIGDAT Meeting?
Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, 1999

1997
Preface.
Mach. Transl., 1997

Termight: Coordinating Humans and Machines in Bilingual Terminology Acquisition.
Mach. Transl., 1997

Applications of Natural Language Processing.
Künstliche Intell., 1997

1996
Panel: The limits of automation: optimists vs skeptics.
Proceedings of the Conference of the Association for Machine Translation in the Americas, 1996

1995
Poisson mixtures.
Nat. Lang. Eng., 1995

Commercial Applications of Natural Language Processing.
Commun. ACM, 1995

Discrimination decisions for 100,000-dimensional spaces.
Ann. Oper. Res., 1995

One Term Or Two?
Proceedings of the SIGIR'95, 1995

Inverse Document Frequency (IDF): A Measure of Deviations from Poisson.
Proceedings of the Third Workshop on Very Large Corpora, 1995

1994
Using OCR and equalization to downsample documents.
Proceedings of the 12th IAPR International Conference on Pattern Recognition, 1994

K-vec: A New Approach for Aligning Parallel Texts.
Proceedings of the 15th International Conference on Computational Linguistics, 1994

Fax: An Alternative to SGML.
Proceedings of the 15th International Conference on Computational Linguistics, 1994

Termight: Identifying and Translating Technical Terminology.
Proceedings of the 4th Applied Natural Language Processing Conference, 1994

Is MT Research Doing Any Good?
Proceedings of the First Conference of the Association for Machine Translation in the Americas, 1994

1993
Good applications for crummy machine translation.
Mach. Transl., 1993

A Program for Aligning Sentences in Bilingual Corpora.
Comput. Linguistics, 1993

Introduction to the Special Issue on Computational Linguistics Using Large Corpora.
Comput. Linguistics, 1993

Char_align: A Program for Aligning Parallel Texts at the Character Level.
Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, 1993

Robust Bilingual Word Alignment for Machine Aided Translation.
Proceedings of the Very Large Corpora: Academic and Industrial Perspectives, 1993

1992
A method for disambiguating word senses in a large corpus.
Comput. Humanit., 1992

One Sense Per Discourse.
Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Harriman, 1992

Estimating Upper and Lower Bounds on the Performance of Word-Sense Disambiguation Programs.
Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics, 28 June, 1992

1991
Identifying Word Correspondences in Parallel Texts.
Proceedings of the Speech and Natural Language, 1991

1990
Word Association Norms, Mutual Information, and Lexicography.
Comput. Linguistics, 1990

Morphology and rhyming: two powerful alternatives to letter-to-sound rules for speech synthesis.
Proceedings of the ESCA Workshop on Speech Synthesis, 1990

Poor Estimates of Context are Worse than None.
Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, 1990

A Spelling Correction Program Based on a Noisy Channel Model.
Proceedings of the 13th International Conference on Computational Linguistics, 1990

1989
Parsing, Word Associations and Typical Predicate-Argument Relations.
Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Cape Cod, 1989

Enhanced Good-Turing and Cat.Cal: Two New Methods for Estimating Probabilities of English Bigrams (abbreviated version).
Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Cape Cod, 1989

Session 11 Natural Language III.
Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Cape Cod, 1989

A stochastic parts program and noun phrase parser for unrestricted text.
Proceedings of the IEEE International Conference on Acoustics, 1989

1988
Complexity, two-level morphology and Finnish.
Proceedings of the 12th International Conference on Computational Linguistics, 1988

1986
Stress assignment in letter to sound rules for speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 1986

Morphological Decomposition and Stress Assignment for Speech Synthesis.
Proceedings of the 24th Annual Meeting of the Association for Computational Linguistics, 1986

1983
Phrase-structure parsing: a method for taking advantage of allophonic constraints.
PhD thesis, 1983

Allophonic and Phonotactic Constraints Are Useful.
Proceedings of the 8th International Joint Conference on Artificial Intelligence. Karlsruhe, 1983

A Finite-State Parser for Use in Speech Recognition.
Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics, 1983

1982
Coping with Syntactic Ambiguity or How to Put the Block in the Box on the Table.
Am. J. Comput. Linguistics, 1982

1980
On Parsing Strategies and Closure.
Proceedings of the 18th Annual Meeting of the Association for Computational Linguistics, 1980

1979
Co-ordinate Square: Solution to Many Chess Pawn Endgames.
Proceedings of the Sixth International Joint Conference on Artificial Intelligence, 1979
