Mark Dredze

ORCID: 0000-0002-0422-2474

Affiliations:
  • Johns Hopkins University, Baltimore, MD, USA
  • University of Pennsylvania, Philadelphia, PA, USA (former)


According to our database, Mark Dredze authored at least 250 papers between 2003 and 2024.

Bibliography

2024
Can Optimization Trajectories Explain Multi-Task Transfer?
CoRR, 2024

Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models.
CoRR, 2024

Evaluating Biases in Context-Dependent Health Questions.
CoRR, 2024

Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions.
CoRR, 2024

A Closer Look at Claim Decomposition.
Proceedings of the 13th Joint Conference on Lexical and Computational Semantics, 2024

Do LLMs Plan Like Human Writers? Comparing Journalist Coverage of Press Releases with LLMs.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Evaluating Biases in Context-Dependent Sexual and Reproductive Health Questions.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Gender Bias in Decision-Making with Large Language Models: A Study of Relationship Conflicts.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Academics Can Contribute to Domain-Specialized Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Schema-Driven Information Extraction from Heterogeneous Tables.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Multi-Task Transfer Matters During Instruction-Tuning.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Use large language models to promote equity.
CoRR, 2023

An Eye on Clinical BERT: Investigating Language Model Generalization for Diabetic Eye Disease Phenotyping.
CoRR, 2023

Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models.
CoRR, 2023

Generalizing Fairness using Multi-Task Learning without Demographic Information.
CoRR, 2023

BloombergGPT: A Large Language Model for Finance.
CoRR, 2023

MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Geo-Seq2seq: Twitter User Geolocation on Noisy Data through Sequence to Sequence Learning.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Joint End-to-end Semantic Proto-role Labeling.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

Characterization of Stigmatizing Language in Medical Records.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

2022
Using Open-Ended Stressor Responses to Predict Depressive Symptoms across Demographics.
CoRR, 2022

Then and Now: Quantifying the Longitudinal Validity of Self-Disclosed Depression Diagnoses.
CoRR, 2022

What Makes Data-to-Text Generation Hard for Pretrained Language Models?
CoRR, 2022

Crystal Cube: Forecasting Disruptive Events.
Appl. Artif. Intell., 2022

The Problem of Semantic Shift in Longitudinal Monitoring of Social Media: A Case Study on Mental Health During the COVID-19 Pandemic.
Proceedings of the WebSci '22: 14th ACM Web Science Conference 2022, Barcelona, Spain, June 26, 2022

Zero-shot Cross-lingual Transfer is Under-specified Optimization.
Proceedings of the 7th Workshop on Representation Learning for NLP, 2022

Do Text-to-Text Multi-Task Learners Suffer from Task Conflict?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Bernice: A Multilingual Pre-trained Encoder for Twitter.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Changes in Tweet Geolocation over Time: A Study with Carmen 2.0.
Proceedings of the Eighth Workshop on Noisy User-generated Text, 2022

Enriching Unsupervised User Embedding via Medical Concepts.
Proceedings of the Conference on Health, Inference, and Learning, 2022

Model Distillation for Faithful Explanations of Medical Code Predictions.
Proceedings of the 21st Workshop on Biomedical Language Processing, 2022

Updated Headline Generation: Creating Updated Summaries for Evolving News Stories.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Demographic Representation and Collective Storytelling in the Me Too Twitter Hashtag Activism Movement.
Proc. ACM Hum. Comput. Interact., 2021

Learning to Look Inside: Augmenting Token-Based Encoders with Character-Level Information.
CoRR, 2021

Improving Zero-Shot Multi-Lingual Entity Linking.
CoRR, 2021

Faithful and Plausible Explanations of Medical Code Predictions.
CoRR, 2021

User Factor Adaptation for User Embedding via Multitask Learning.
CoRR, 2021

Generating Synthetic Text Data to Evaluate Causal Inference Methods.
CoRR, 2021

Fine-tuning Encoders for Improved Monolingual and Zero-shot Polylingual Neural Topic Modeling.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Proxy Model Explanations for Time Series RNNs.
Proceedings of the 20th IEEE International Conference on Machine Learning and Applications, 2021

Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Gender and Racial Fairness in Depression Research using Social Media.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Study of Manifestation of Civil Unrest on Twitter.
Proceedings of the Seventh Workshop on Noisy User-generated Text, 2021

Cross-Lingual Transfer in Zero-Shot Cross-Language Entity Linking.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Using Noisy Self-Reports to Predict Twitter User Demographics.
Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media, 2021

2020
Coronavirus Twitter Data: A collection of COVID-19 tweets with automated annotations.
Dataset, March, 2020

Examining the Feasibility of Off-the-Shelf Algorithms for Masking Directly Identifiable Information in Social Media Data.
CoRR, 2020

On the State of Social Media Data for Mental Health Research.
CoRR, 2020

The COVID-19 Social Media Infodemic Reflects Uncertainty and State-Sponsored Propaganda.
CoRR, 2020

The Twitter Social Mobility Index: Measuring Social Distancing Practices from Geolocated Tweets.
CoRR, 2020

Are All Languages Created Equal in Multilingual BERT?
Proceedings of the 5th Workshop on Representation Learning for NLP, 2020

Aligning Public Feedback to Requests for Comments on Regulations.gov.
Proceedings of the Fourteenth International AAAI Conference on Web and Social Media, 2020

Examining Peer-to-Peer and Patient-Provider Interactions on a Social Media Community Facilitating Ask the Doctor Services.
Proceedings of the Fourteenth International AAAI Conference on Web and Social Media, 2020

Do Explicit Alignments Robustly Improve Multilingual Encoders?
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Do Models of Mental Health Based on Social Media Data Generalize?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Crowd-Diagnosis: When the Public Turns to Social Media to Obtain Clinical Diagnoses.
Proceedings of the AMIA 2020, 2020

Civil Unrest on Twitter (CUT): A Dataset of Tweets to Support Research on Civil Unrest.
Proceedings of the Sixth Workshop on Noisy User-generated Text, 2020

Clinical Concept Linking with Contextualized Neural Representations.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Sources of Transfer in Multilingual Named Entity Recognition.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Identifying vulnerable older adult populations by contextualizing geriatric syndrome information in clinical notes of electronic health records.
J. Am. Medical Informatics Assoc., 2019

Elites and foreign actors among the alt-right: The Gab social media platform.
First Monday, 2019

Phenotyping of Clinical Notes with Improved Document Classification Models Using Contextualized Neural Language Models.
CoRR, 2019

Visual Attention Model for Cross-Sectional Stock Return Prediction and End-to-End Multimodal Market Representation Learning.
Proceedings of the Thirty-Second International Florida Artificial Intelligence Research Society Conference, 2019

Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Discriminative Candidate Generation for Medical Concept Linking.
Proceedings of the 1st Conference on Automated Knowledge Base Construction, 2019

2018
Don't quote me: reverse identification of research participants in social media studies.
npj Digit. Medicine, 2018

Summarizing Entities using Distantly Supervised Information Extractors.
Proceedings of the Joint Proceedings of the First International Workshop on Professional Search (ProfS2018); the Second Workshop on Knowledge Graphs and Semantics for Text Retrieval, 2018

Deep Dirichlet Multinomial Regression.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Enhancing Scientific Collaboration Through Knowledge Base Population and Linking for Meetings.
Proceedings of the 51st Hawaii International Conference on System Sciences, 2018

Challenges of Using Text Classifiers for Causal Inference.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Convolutions Are All You Need (For Classifying Character Sequences).
Proceedings of the 4th Workshop on Noisy User-generated Text, 2018

Using Author Embeddings to Improve Tweet Stance Classification.
Proceedings of the 4th Workshop on Noisy User-generated Text, 2018

Johns Hopkins or johnny-hopkins: Classifying Individuals versus Organizations on Twitter.
Proceedings of the Second Workshop on Computational Modeling of People's Opinions, 2018

Predicting Twitter User Demographics from Names Alone.
Proceedings of the Second Workshop on Computational Modeling of People's Opinions, 2018

2017
Social Monitoring for Public Health
Synthesis Lectures on Information Concepts, Retrieval, and Services, Morgan & Claypool Publishers, ISBN: 978-3-031-02311-8, 2017

Person entity linking in email with NIL detection.
J. Assoc. Inf. Sci. Technol., 2017

Feature Generation for Robust Semantic Role Labeling.
CoRR, 2017

Harmonic Grammar, Optimality Theory, and Syntax Learnability: An Empirical Exploration of Czech Word Order.
CoRR, 2017

Support for Interactive Identification of Mentioned Entities in Conversational Speech.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

Multi-task Domain Adaptation for Sequence Tagging.
Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017


Ethical Research Protocols for Social Media Health Research.
Proceedings of the First ACL Workshop on Ethics in Natural Language Processing, 2017

Leveraging side information for speaker identification with the Enron conversational telephone speech collection.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Monitoring Real-time Spatial Public Health Discussions in the Context of Vaccine Hesitancy.
Proceedings of the 2nd Social Media Mining for Health Research and Applications Workshop co-located with the American Medical Informatics Association Annual Symposium (AMIA 2017), 2017

Constructing an Alias List for Named Entities during an Event.
Proceedings of the 3rd Workshop on Noisy User-generated Text, 2017

Pocket Knowledge Base Population.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

Bayesian Modeling of Lexical Resources for Low-Resource Settings.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

How Does Twitter User Behavior Vary Across Demographic Groups?
Proceedings of the Second Workshop on NLP and Computational Social Science, 2017

Examining Patterns of Influenza Vaccination in Social Media.
Proceedings of the Workshops of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Multi-task Multi-domain Representation Learning for Sequence Tagging.
CoRR, 2016

Learning Word Segmentation Representations to Improve Named Entity Recognition for Chinese Social Media.
CoRR, 2016

Twitter as a Source of Global Mobility Patterns for Social Good.
CoRR, 2016

After Sandy Hook Elementary: A Year in the Gun Control Debate on Twitter.
CoRR, 2016

Can Big Media Data Revolutionarize Gun Violence Prevention?
CoRR, 2016

Embedding Lexical Features via Low-Rank Tensors.
Proceedings of the NAACL HLT 2016, 2016

Geolocation for Twitter: Timing Matters.
Proceedings of the NAACL HLT 2016, 2016

A Study of Imitation Learning Methods for Semantic Role Labeling.
Proceedings of the Workshop on Structured Prediction for NLP@EMNLP 2016, 2016

How Twitter is Changing the Nature of Financial News Discovery.
Proceedings of the Second International Workshop on Data Science for Macro-Modeling, 2016

Discovering Shifts to Suicidal Ideation from Mental Health Content in Social Media.
Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016

Knowledge Base Population for Organization Mentions in Email.
Proceedings of the 5th Workshop on Automated Knowledge Base Construction, 2016

Name Variation in Community Question Answering Systems.
Proceedings of the 2nd Workshop on Noisy User-generated Text, 2016

Improving Named Entity Recognition for Chinese Social Media with Word Segmentation Representation Learning.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Learning Multiview Embeddings of Twitter Users.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Twitter at the Grammys: A Social Media Corpus for Entity Linking and Disambiguation.
Proceedings of The Fourth International Workshop on Natural Language Processing for Social Media, 2016

Demographer: Extremely Simple Name Demographics.
Proceedings of the First Workshop on NLP and Computational Social Science, 2016

Studying Anonymous Health Issues and Substance Use on College Campuses with Yik Yak.
Proceedings of the World Wide Web and Population Health Intelligence, 2016

Collective Supervision of Topic Models for Predicting Surveys with Social Media.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Learning Composition Models for Phrase Embeddings.
Trans. Assoc. Comput. Linguistics, 2015

SPRITE: Generalizing Topic Models with Structured Priors.
Trans. Assoc. Comput. Linguistics, 2015

Approximation-Aware Dependency Parsing by Belief Propagation.
Trans. Assoc. Comput. Linguistics, 2015

Combining Search, Social Media, and Traditional Data Sources to Improve Influenza Surveillance.
PLoS Comput. Biol., 2015

Interactive Knowledge Base Population.
CoRR, 2015

Combining Word Embeddings and Feature Embeddings for Fine-grained Relation Extraction.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Predicate Argument Alignment using a Global Coherence Model.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

A Concrete Chinese NLP Pipeline.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

CLPsych 2015 Shared Task: Depression and PTSD on Twitter.
Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 2015

From ADHD to SAD: Analyzing the Language of Mental Health on Twitter through Self-Reported Diagnoses.
Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 2015

Entity Linking for Spoken Language.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Detecting Changes in Suicide Content Manifested in Social Media Following Celebrity Suicides.
Proceedings of the 26th ACM Conference on Hypertext & Social Media, 2015

Named Entity Recognition for Chinese Social Media with Jointly Trained Embeddings.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Improved Relation Extraction with Feature-Rich Compositional Embedding Models.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

An Empirical Study of Chinese Name Matching and Applications.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

FrameNet+: Fast Paraphrastic Tripling of FrameNet.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

The Hurricane Sandy Twitter Corpus.
Proceedings of the World Wide Web and Public Health Intelligence, 2015

Worldwide Influenza Surveillance through Twitter.
Proceedings of the World Wide Web and Public Health Intelligence, 2015

2014
A large-scale quantitative analysis of latent factors and sentiment in online doctor reviews.
J. Am. Medical Informatics Assoc., 2014

Social Media Analytics for Smart Health.
IEEE Intell. Syst., 2014

Facebook, Twitter and Google Plus for Breaking News: Is There a Winner?
Proceedings of the Eighth International Conference on Weblogs and Social Media, 2014

Measuring Post Traumatic Stress Disorder in Twitter.
Proceedings of the Eighth International Conference on Weblogs and Social Media, 2014

Improving Lexical Embeddings with Semantic Knowledge.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Learning Polylingual Topic Models from Code-Switched Social Media Documents.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Low-Resource Semantic Role Labeling.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Robust Entity Clustering via Phylogenetic Inference.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Quantifying Mental Health Signals in Twitter.
Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 2014

2013
Entity Linking: Finding Extracted Entities in a Knowledge Base.
Proceedings of the Multi-source, Multilingual Information Extraction and Summarization, 2013

Adaptive regularization of weight vectors.
Mach. Learn., 2013

Estimating Confusions in the ASR Channel for Improved Topic-based Language Model Adaptation
CoRR, 2013

Topic Models and Metadata for Visualizing Text Corpora.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Drug Extraction from the Web: Summarizing Drug Experiences with Multi-Dimensional Topic Models.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Separating Fact from Fear: Tracking Flu Infections on Twitter.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

What's in a Domain? Multi-Domain Learning for Multi-Attribute Data.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Broadly Improving User Classification via Communication-Based Name and Location Clustering on Twitter.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

PARMA: A Predicate Argument Aligner.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

2012
Confidence-Weighted Linear Classification for Text Categorization.
J. Mach. Learn. Res., 2012

How Social Media Will Change Public Health.
IEEE Intell. Syst., 2012

Factorial LDA: Sparse Multi-Dimensional Text Models.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Revisiting the Case for Explicit Syntactic Information in Language Models.
Proceedings of the Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT, 2012

Entity Clustering Across Languages.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2012

Shared Components Topic Models.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2012

Efficient Structured Language Modeling for Speech Recognition.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Deriving conversation-based features from unlabeled speech for discriminative language modeling.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

New H∞ bounds for the recursive least squares algorithm exploiting input structure.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Multi-Domain Learning: When Do Domains Matter?
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

Name Phylogeny: A Generative Model of String Variation.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

Twitter as a Source for Learning about Patient Safety Events.
Proceedings of the AMIA 2012, 2012

Fast Syntactic Analysis for Statistical Language Modeling via Substructure Sharing and Uptraining.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

Experimenting with Drugs (and Topic Models): Multi-Dimensional Exploration of Recreational Drug Discussions.
Proceedings of the Information Retrieval and Knowledge Discovery in Biomedical Text, 2012

Malpractice and Malcontent: Analyzing Medical Complaints in Twitter.
Proceedings of the Information Retrieval and Knowledge Discovery in Biomedical Text, 2012

Investigating Twitter as a Source for Studying Behavioral Responses to Epidemics.
Proceedings of the Information Retrieval and Knowledge Discovery in Biomedical Text, 2012

2011
OOV Sensitive Named-Entity Recognition in Speech.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

You Are What You Tweet: Analyzing Twitter for Public Health.
Proceedings of the Fifth International Conference on Weblogs and Social Media, 2011

Hill climbing on speech lattices: A new rescoring framework.
Proceedings of the IEEE International Conference on Acoustics, 2011

Adapting n-gram maximum entropy language models with conditional entropy regularization.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Efficient discriminative training of long-span language models.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Estimating document frequencies in a speech corpus.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Learning Sub-Word Units for Open Vocabulary Speech Recognition.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

2010
Multi-domain learning by confidence-weighted parameter combination.
Mach. Learn., 2010

Exploiting Feature Covariance in High-Dimensional Online Learning.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Contextual Information Improves OOV Detection in Speech.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2010

Non-Expert Correction of Automatically Generated Relation Annotations.
Proceedings of the 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, 2010

Annotating Named Entities in Twitter Data with Crowdsourcing.
Proceedings of the 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, 2010

Creating Speech and Language Data With Amazon's Mechanical Turk.
Proceedings of the 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, 2010

A spoken term detection framework for recovering out-of-vocabulary words using the web.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

We're Not in Kansas Anymore: Detecting Domain Changes in Streams.
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010

NLP on Spoken Documents Without ASR.
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010

Streaming Cross Document Entity Coreference Resolution.
Proceedings of the COLING 2010, 2010

Entity Disambiguation for Knowledge Base Population.
Proceedings of the COLING 2010, 2010

2009
AAAI 2008 Workshop Reports.
AI Mag., 2009

HLTCOE Approaches to Knowledge Base Population at TAC 2009.
Proceedings of the Second Text Analysis Conference, 2009

Suggesting Email View Filters for Triage and Search.
Proceedings of the IJCAI 2009, 2009

Multi-Class Confidence Weighted Algorithms.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009

2008
Exact Convex Confidence-Weighted Learning.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Generating summary keywords for emails using topics.
Proceedings of the 13th International Conference on Intelligent User Interfaces, 2008

Intelligent email: reply and attachment prediction.
Proceedings of the 13th International Conference on Intelligent User Interfaces, 2008

Confidence-weighted linear classification.
Proceedings of the Machine Learning, 2008

Online Methods for Multi-Domain Learning and Adaptation.
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 2008

Reading the Markets: Forecasting Public Opinion of Political Candidates by News Analysis.
Proceedings of the COLING 2008, 2008

Icelandic Data Driven Part of Speech Tagging.
Proceedings of the ACL 2008, 2008

Active Learning with Confidence.
Proceedings of the ACL 2008, 2008

Intelligent Email: Aiding Users with AI.
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

2007
Frustratingly Hard Domain Adaptation for Dependency Parsing.
Proceedings of the EMNLP-CoNLL 2007, 2007

Learning Fast Classifiers for Image Spam.
Proceedings of the CEAS 2007, 2007

Automatic Code Assignment to Medical Text.
Proceedings of the Biological, translational, and clinical language processing, 2007

Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification.
Proceedings of the ACL 2007, 2007

2006
Automatically classifying emails into activities.
Proceedings of the 11th International Conference on Intelligent User Interfaces, 2006

"Sorry, I Forgot the Attachment": Email Attachment Prediction.
Proceedings of the CEAS 2006, 2006

Activity-Centric Email: A Machine Learning Approach.
Proceedings of the Proceedings, 2006

2005
TREC 2005 Genomics Track Experiments at IBM Watson.
Proceedings of the Fourteenth Text REtrieval Conference, 2005

Managers' email: beyond tasks and to-dos.
Proceedings of the Extended Abstracts Proceedings of the 2005 Conference on Human Factors in Computing Systems, 2005

Reply Expectation Prediction for Email Management.
Proceedings of the CEAS 2005, 2005

2003
Beyond broadcast: a demo.
Proceedings of the 8th International Conference on Intelligent User Interfaces, 2003

Beyond broadcast.
Proceedings of the 8th International Conference on Intelligent User Interfaces, 2003

