David M. Mimno

Orcid: 0000-0001-7510-9404

According to our database, David M. Mimno authored at least 79 papers between 2004 and 2024.

Bibliography

2024
Automate or Assist? The Role of Computational Models in Identifying Gendered Discourse in US Capital Trial Transcripts.
CoRR, 2024

How Chinese are Chinese Language Models? The Puzzling Lack of Language Policy in China's LLMs.
CoRR, 2024

Stronger Random Baselines for In-Context Learning.
CoRR, 2024

[Lions: 1] and [Tigers: 2] and [Bears: 3], Oh My! Literary Coreference Annotation with LLMs.
CoRR, 2024

The Afterlives of Shakespeare and Company in Online Social Readership.
CoRR, 2024

A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Sensemaking about Contraceptive Methods across Online Platforms.
Proceedings of the Eighteenth International AAAI Conference on Web and Social Media, 2024

Contextualized Topic Coherence Metrics.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

LLMs as Research Tools: Applications and Evaluations in HCI Data Work.
Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2024

2023
Report of the 1st Workshop on Generative AI and Law.
CoRR, 2023

Data Similarity is Not Enough to Explain Language Model Performance.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Modeling Legal Reasoning: LM Annotation at the Edge of Human Agreement.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Large Language Models and NER: better results with less work.
Proceedings of the Annual International Conference of the Alliance of Digital Humanities Organizations, 2023

T5 meets Tybalt: Author Attribution in Early Modern English Drama Using Large Language Models.
Proceedings of the Computational Humanities Research Conference 2023, 2023

The Chatbot and the Canon: Poetry Memorization in LLMs.
Proceedings of the Computational Humanities Research Conference 2023, 2023

2022
Breaking BERT: Evaluating and Optimizing Sparsified Attention.
CoRR, 2022

Honest Students from Untrusted Teachers: Learning an Interpretable Question-Answering Pipeline from a Pretrained Language Model.
CoRR, 2022

2021
Tags, Borders, and Catalogs: Social Re-Working of Genre on LibraryThing.
Proc. ACM Hum. Comput. Interact., 2021

Tecnologica cosa: Modeling Storyteller Personalities in Boccaccio's Decameron.
CoRR, 2021

On-the-fly Rectification for Robust Large-Vocabulary Topic Inference.
Proceedings of the 38th International Conference on Machine Learning, 2021

Comparing Text Representations: A Theory-Driven Approach.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Bad Seeds: Evaluating Lexical Methods for Bias Measurement.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
How We Do Things With Words: Analyzing Text as Social and Cultural Data.
Frontiers Artif. Intell., 2020

Topic Modeling with Contextualized Word Representation Clusters.
CoRR, 2020

Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Network Analysis Finds Shifts in the History of Modern Architecture.
Proceedings of the 15th Annual International Conference of the Alliance of Digital Humanities Organizations, 2020

Constructing and Analyzing Short Science Fiction at Scale.
Proceedings of the 15th Annual International Conference of the Alliance of Digital Humanities Organizations, 2020

Replication and Computational Literary Studies.
Proceedings of the 15th Annual International Conference of the Alliance of Digital Humanities Organizations, 2020

Prior-aware Composition Inference for Spectral Topic Models.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
Boosted negative sampling by quadratically constrained entropy maximization.
Pattern Recognit. Lett., 2019

Narrative Paths and Negotiation of Power in Birth Stories.
Proc. ACM Hum. Comput. Interact., 2019

Practical Correlated Topic Modeling and Analysis via the Rectified Anchor Word Algorithm.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018
Evaluating the Stability of Embedding-based Word Similarities.
Trans. Assoc. Comput. Linguistics, 2018

Learning topic models - provably and efficiently.
Commun. ACM, 2018

Quantifying the Visual Concreteness of Words and Topics in Multimodal Datasets.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Authorless Topic Models: Biasing Models Away from Known Structure.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

2017
Comparing grounded theory and topic modeling: Extreme divergence or unlikely convergence?
J. Assoc. Inf. Sci. Technol., 2017

Applications of Topic Models.
Found. Trends Inf. Retr., 2017

Prior-aware Dual Decomposition: Document-specific Topic Inference for Spectral Topic Models.
CoRR, 2017

Cats and Captions vs. Creators and the Clock: Comparing Multimodal Content to Context in Predicting Relative Popularity.
Proceedings of the 26th International Conference on World Wide Web, 2017

Quantifying the Effects of Text Duplication on Semantic Models.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

The strange geometry of skip-gram with negative sampling.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Pulling Out the Stops: Rethinking Stopword Removal for Topic Models.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

2016
Comparing Apples to Apple: The Effects of Stemmers on Topic Models.
Trans. Assoc. Comput. Linguistics, 2016

Beyond Exchangeability: The Chinese Voting Process.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Machine Learning and Grounded Theory Method: Convergence, Divergence, and Combination.
Proceedings of the 19th International Conference on Supporting Group Work, Sanibel Island, FL, USA, November 13, 2016

2015
What do Vegans do in their Spare Time? Latent Interest Detection in Multi-Community Networks.
CoRR, 2015

Robust Spectral Inference for Joint Stochastic Matrix Factorization.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Evaluation methods for unsupervised word embeddings.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

2014
Care and Feeding of Topic Models.
Handbook of Mixed Membership Models and Their Applications, 2014

Low-dimensional Embeddings for Interpretable Anchor-based Topic Inference.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

The Telltale Hat: LDA and Classification Problems in a Large Folklore Corpus.
Proceedings of the 9th Annual International Conference of the Alliance of Digital Humanities Organizations, 2014

2013
A Practical Algorithm for Topic Modeling with Provable Guarantees.
Proceedings of the 30th International Conference on Machine Learning, 2013

2012
Computational historiography: Data mining in a century of classics journals.
ACM Journal on Computing and Cultural Heritage, 2012

Scalable Inference of Overlapping Communities.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Topic models for taxonomies.
Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries, 2012

Sparse stochastic inference for latent Dirichlet allocation.
Proceedings of the 29th International Conference on Machine Learning, 2012

Topic Modeling the Past.
Proceedings of the 7th Annual International Conference of the Alliance of Digital Humanities Organizations, 2012

2011
Reconstructing Pompeian Households.
Proceedings of the UAI 2011, 2011

Optimizing Semantic Coherence in Topic Models.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

Bayesian Checking for Topic Models.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

2009
Classics in the Million Book Library.
Digit. Humanit. Q., 2009

Rethinking LDA: Why Priors Matter.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Efficient methods for topic model inference on streaming document collections.
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28, 2009

Evaluation methods for topic models.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Polylingual Topic Models.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009

2008
Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression.
Proceedings of the UAI 2008, 2008

InterNano: e-Science for the Nanomanufacturing Community.
Proceedings of the Fourth International Conference on e-Science, 2008

2007
Expertise modeling for matching papers with reviewers.
Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007

Organizing the OCA: learning faceted subjects from a library of digital books.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2007

Mining a digital library for influential authors.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2007

Mixtures of hierarchical topics with Pachinko allocation.
Proceedings of the 24th International Conference on Machine Learning, 2007

2006
Bibliometric impact measures leveraging topic analysis.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2006

Beyond Digital Incunabula: Modeling the Next Generation of Digital Libraries.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2006

2005
Hierarchical Catalog Records: Implementing a FRBR Catalog.
D Lib Mag., 2005

Finding a catalog: generating analytical catalog records from well-structured digital texts.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2005

2004
Services for a customizable authority linking environment.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

