Hong Yu

ORCID: 0000-0001-9263-5035

Affiliations:
  • University of Massachusetts Lowell, MA, USA
  • University of Massachusetts Amherst, MA, USA
  • University of Wisconsin-Milwaukee, WI, USA (former)
  • Columbia University, NY, USA (PhD)


According to our database, Hong Yu authored at least 162 papers between 1999 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., March, 2024

BioInstruct: instruction tuning of large language models for biomedical natural language processing.
J. Am. Medical Informatics Assoc., 2024

Blocks Architecture (BloArk): Efficient, Cost-Effective, and Incremental Dataset Architecture for Wikipedia Revision History.
CoRR, 2024

MedQA-CS: Benchmarking Large Language Models Clinical Skills Using an AI-SCE Framework.
CoRR, 2024

Exploring Interdisciplinary Team Collaboration in Clinical NLP Projects Through the Lens of Activity Theory.
CoRR, 2024

Large Language Model-based Role-Playing for Personalized Medical Jargon Extraction.
CoRR, 2024

Exploring LLM Multi-Agents for ICD Coding.
CoRR, 2024

ReadCtrl: Personalizing text generation with readability-controlled instruction learning.
CoRR, 2024

Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text.
CoRR, 2024

JMLR: Joint Medical LLM and Retrieval Training for Enhancing Reasoning and Professional Question Answering Capability.
CoRR, 2024

ODD: A Benchmark Dataset for the Natural Language Processing Based Opioid Related Aberrant Behavior Detection.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Large Language Models are In-context Teachers for Knowledge Reasoning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

README: Bridging Medical Jargon and Lay Understanding for Patient Education through Data-Centric NLP.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LlamaCare: An Instruction Fine-Tuned Large Language Model for Clinical NLP.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

LocalTweets to LocalHealth: A Mental Health Surveillance Framework Based on Twitter Data.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Do Clinicians Know How to Prompt? The Need for Automatic Prompt Optimization Help in Clinical Note Generation.
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, 2024

NoteChat: A Dataset of Synthetic Patient-Physician Conversations Conditioned on Clinical Notes.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

ClinicalMamba: A Generative Clinical Language Model on Longitudinal Clinical Notes.
Proceedings of the 6th Clinical Natural Language Processing Workshop, 2024

UMass-BioNLP at MEDIQA-M3G 2024: DermPrompt - A Systematic Exploration of Prompt Engineering with GPT-4V for Dermatological Diagnosis.
Proceedings of the 6th Clinical Natural Language Processing Workshop, 2024

2023
Automated identification of eviction status from electronic health record notes.
J. Am. Medical Informatics Assoc., July, 2023

Evaluating the efficacy of NoteAid on EHR note comprehension among US Veterans through Amazon Mechanical Turk.
Int. J. Medical Informatics, April, 2023

PaniniQA: Enhancing Patient Education Through Interactive Question Answering.
Trans. Assoc. Comput. Linguistics, 2023

EHR Interaction Between Patients and AI: NoteAid EHR Interaction.
CoRR, 2023

README: Bridging Medical Jargon and Lay Understanding for Patient Education through Data-Centric NLP.
CoRR, 2023

Do Physicians Know How to Prompt? The Need for Automatic Prompt Optimization Help in Clinical Note Generation.
CoRR, 2023

SELF-EXPLAIN: Teaching Large Language Models to Reason Complex Questions by Themselves.
CoRR, 2023

Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization.
CoRR, 2023

EHRTutor: Enhancing Patient Understanding of Discharge Instructions.
CoRR, 2023

NoteChat: A Dataset of Synthetic Doctor-Patient Conversations Conditioned on Clinical Notes.
CoRR, 2023

Leveraging Large Language Models for Mental Health Prediction via Online Text Data.
CoRR, 2023

Early Prediction of Alzheimers Disease Leveraging Symptom Occurrences from Longitudinal Electronic Health Records of US Military Veterans.
CoRR, 2023

ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection.
CoRR, 2023

Intent-based Web Page Summarization with Structure-Aware Chunking and Generative Language Models.
Proceedings of the Companion Proceedings of the ACM Web Conference 2023, 2023

Web Information Extraction for Social Good: Food Pantry Answering As an Example.
Proceedings of the ACM Web Conference 2023, 2023

Two Directions for Clinical Data Generation with Large Language Models: Data-to-Label and Label-to-Data.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Vision Meets Definitions: Unsupervised Visual Word Sense Disambiguation Incorporating Gloss Information.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Revisiting the Architectures like Pointer Networks to Efficiently Improve the Next Word Distribution, Summarization Factuality, and Beyond.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Generating User-Engaging News Headlines.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

UMASS_BioNLP at MEDIQA-Chat 2023: Can LLMs generate high-quality synthetic note-oriented doctor-patient conversations?
Proceedings of the 5th Clinical Natural Language Processing Workshop, 2023

Multi-Label Few-Shot ICD Coding as Autoregressive Generation with Prompt.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Enhancing the prediction of disease outcomes using electronic health records and pretrained deep learning models.
CoRR, 2022

Associations Between Natural Language Processing (NLP) Enriched Social Determinants of Health and Suicide Death among US Veterans.
CoRR, 2022

An Automatic SOAP Classification System Using Weakly Supervision And Transfer Learning.
CoRR, 2022

Context Variance Evaluation of Pretrained Language Models for Prompt-based Biomedical Knowledge Probing.
CoRR, 2022

ScAN: Suicide Attempt and Ideation Events Dataset.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Learning as Conversation: Dialogue Systems Reinforced for Information Acquisition.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

MedJEx: A Medical Jargon Extraction Model with Wiki's Hyperlink Span and Contextualized Masked Language Model Score.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Knowledge Injected Prompt Based Fine-tuning for Multi-label Few-shot ICD Coding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Generation of Patient After-Visit Summaries to Support Physicians.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Extracting Biomedical Factual Knowledge Using Pretrained Language Model and Electronic Health Record Context.
Proceedings of the AMIA 2022, 2022

An Investigation of the Representation of Social Determinants of Health in the UMLS.
Proceedings of the AMIA 2022, 2022

Parameter Efficient Transfer Learning for Suicide Attempt and Ideation Detection.
Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis, 2022

2021
Neural data-to-text generation with dynamic content planning.
Knowl. Based Syst., 2021

Membership Inference Attack Susceptibility of Clinical Language Models.
CoRR, 2021

Epinoter: A Natural Language Processing Tool for Epidemiological Studies.
Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies, 2021

Improving Formality Style Transfer with Context-Aware Rule Injection.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Neural Data-to-Text Generation with Dynamic Content Planning.
CoRR, 2020

Improved Pretraining for Domain-specific Contextual Embedding Models.
CoRR, 2020

Generating Accurate EHR Assessment from Medical Graph.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Dynamic Data Selection for Curriculum Learning via Ability Estimation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Conversational Machine Comprehension: a Literature Review.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Inferring ADR causality by predicting the Naranjo Score from Clinical Notes.
Proceedings of the AMIA 2020, 2020

Bleeding Entity Recognition in Electronic Health Records: A Comprehensive Analysis of End-to-End Systems.
Proceedings of the AMIA 2020, 2020

Neural Multi-Task Learning for Adverse Drug Reaction Extraction.
Proceedings of the AMIA 2020, 2020

BENTO: A Visual Platform for Building Clinical NLP Pipelines Based on CodaLab.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

Calibrating Structured Output Predictors for Natural Language Processing.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

MetaMT, a Meta Learning Method Leveraging Multiple Domain Data for Low Resource Machine Translation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

ICD Coding from Clinical Text Using Multi-Filter Residual Convolutional Neural Network.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Learning to detect and understand drug discontinuation events from clinical narratives.
J. Am. Medical Informatics Assoc., 2019

An investigation of single-domain and multidomain medication and adverse drug event relation extraction from electronic health record notes using advanced deep learning models.
J. Am. Medical Informatics Assoc., 2019

MetaMT, a MetaLearning Method Leveraging Multiple Domain Data for Low Resource Machine Translation.
CoRR, 2019

A Neural Abstractive Summarization Model Guided with Topic Sentences.
Aust. J. Intell. Inf. Process. Syst., 2019

Clinical Judgement Study using Question Answering from Electronic Health Records.
Proceedings of the Machine Learning for Healthcare Conference, 2019

Naranjo Question Answering using End-to-End Multi-task Learning Model.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019

Generating Classical Chinese Poems from Vernacular Chinese.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Learning Latent Parameters without Human Response Patterns: Item Response Theory with Artificial Crowds.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Bacteria Biotope Relation Extraction via Lexical Chains and Dependency Graphs.
Proceedings of The 5th Workshop on BioNLP Open Shared Tasks, 2019

Extracting Drug-drug Interactions with a Dependency-based Graph Convolution Neural Network.
Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine, 2019

2018
HYPE: A High Performing NLP System for Automatically Detecting Hypoglycemia Events from Electronic Health Record Notes.
CoRR, 2018

Sentence Simplification with Memory-Augmented Neural Networks.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Accuracy of International Classification of Disease Clinical Modification Codes for Detecting Bleeding Events in Electronic Health Records and When to Use Them.
Proceedings of the AMIA 2018, 2018

Reference Standard Development to Train Natural Language Processing Algorithms to Detect Problematic Buprenorphine-Naloxone Therapy.
Proceedings of the AMIA 2018, 2018

Detecting Hypoglycemia Incidents from Patients' Secure Messages.
Proceedings of the AMIA 2018, 2018

2017
Unsupervised ensemble ranking of terms in electronic health record notes based on their importance to patients.
J. Biomed. Informatics, 2017

Improving Machine Learning Ability with Fine-Tuning.
CoRR, 2017

An Analysis of Machine Learning Intelligence.
CoRR, 2017

Meta Networks.
Proceedings of the 34th International Conference on Machine Learning, 2017

Reasoning with Memory Augmented Neural Networks for Language Comprehension.
Proceedings of the 5th International Conference on Learning Representations, 2017

Exploiting PubMed for Protein Molecular Function Prediction via NMF Based Multi-label Classification.
Proceedings of the 2017 IEEE International Conference on Data Mining Workshops, 2017

Neural Semantic Encoders.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Neural Tree Indexers for Text Understanding.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Assessing Electronic Health Record Readability.
Proceedings of the AMIA 2017, 2017

Detecting Opioid-Related Aberrant Behavior using Natural Language Processing.
Proceedings of the AMIA 2017, 2017

A hybrid Neural Network Model for Joint Prediction of Medical Presence and Period Assertions in Clinical Notes.
Proceedings of the AMIA 2017, 2017

Generating a Test of Electronic Health Record Narrative Comprehension with Item Response Theory.
Proceedings of the AMIA 2017, 2017

2016
Methods for linking EHR notes to education materials.
Inf. Retr. J., 2016

Learning for Biomedical Information Extraction: Methodological Review of Recent Advances.
CoRR, 2016

Learning to Rank Scientific Documents from the Crowd.
CoRR, 2016

Beyond Majority Voting: Generating Evaluation Scales using Item Response Theory.
CoRR, 2016

Bidirectional Recurrent Neural Networks for Medical Event Detection in Electronic Health Records.
CoRR, 2016

Ranking medical jargon in electronic health record notes by adapted distant supervision.
CoRR, 2016

Bidirectional RNN for Medical Event Detection in Electronic Health Records.
Proceedings of the NAACL HLT 2016, 2016

Building an Evaluation Scale using Item Response Theory.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Structured prediction models for RNN based sequence labeling in clinical text.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Citation Analysis with Neural Attention Models.
Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis, 2016

2015
Key Concept Identification for Medical Information Retrieval.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Rethinking Document Retrieval for Scientific Literature: A Learning to Rank Approach.
Proceedings of the AMIA 2015, 2015

Identifying Key Concepts from EHR Notes Using Domain Adaptation.
Proceedings of the Sixth International Workshop on Health Text Mining and Information Analysis, 2015

Mining and Ranking Biomedical Synonym Candidates from Wikipedia.
Proceedings of the Sixth International Workshop on Health Text Mining and Information Analysis, 2015

2014
Automatically Detecting Acute Myocardial Infarction Events from EHR Text: A Preliminary Study.
Proceedings of the AMIA 2014, 2014

2013
CiteGraph: A Citation Network System for MEDLINE Articles and Analysis.
Proceedings of the MEDINFO 2013, 2013

Improving Patients' Electronic Health Record Comprehension with NoteAid.
Proceedings of the MEDINFO 2013, 2013

Automatically Identifying Health- and Clinical-Related Content in Wikipedia.
Proceedings of the MEDINFO 2013, 2013

2012
Automatic discourse connective detection in biomedical text.
J. Am. Medical Informatics Assoc., 2012

MedTxting: Learning based and Knowledge Rich SMS-style Medical Text Contraction.
Proceedings of the AMIA 2012, 2012

2011
Toward automated consumer question answering: Automatically separating consumer questions from professional questions in the healthcare domain.
J. Biomed. Informatics, 2011

Automatic figure classification in bioscience literature.
J. Biomed. Informatics, 2011

AskHERMES: An online question answering system for complex clinical questions.
J. Biomed. Informatics, 2011

Towards spoken clinical-question answering: evaluating and adapting automatic speech-recognition systems for spoken clinical questions.
J. Am. Medical Informatics Assoc., 2011

Parsing citations in biomedical articles using conditional random fields.
Comput. Biol. Medicine, 2011

The Biomedical Discourse Relation Bank.
BMC Bioinform., 2011

BioNOT: A searchable database of biomedical negated sentences.
BMC Bioinform., 2011

Simple and efficient machine learning frameworks for identifying protein-protein interaction relevant articles and experimental methods used to study the interactions.
BMC Bioinform., 2011

Figure summarizer browser extensions for PubMed Central.
Bioinform., 2011

2010
An IR-Aided Machine Learning Framework for the BioCreative II.5 Challenge.
IEEE ACM Trans. Comput. Biol. Bioinform., 2010

Automatically extracting information needs from complex clinical questions.
J. Biomed. Informatics, 2010

Detecting hedge cues and their scope in biomedical text with conditional random fields.
J. Biomed. Informatics, 2010

Lancet: a high precision medication event extraction system for clinical text.
J. Am. Medical Informatics Assoc., 2010

Biomedical negation scope detection with conditional random fields.
J. Am. Medical Informatics Assoc., 2010

2009
Automatically classifying sentences in full-text biomedical articles into Introduction, Methods, Results and Discussion.
Bioinform., 2009

Evaluation of the Clinical Question Answering Presentation.
Proceedings of the BioNLP Workshop, BioNLP@HLT-NAACL 2009, 2009

Hierarchical Image Classification in the Bioscience Literature.
Proceedings of the AMIA 2009, 2009

FigSum: Automatically Generating Structured Text Summaries for Figures in Biomedical Literature.
Proceedings of the AMIA 2009, 2009

2008
Session Introduction.
Proceedings of the Biocomputing 2008, 2008

A Pilot Annotation to Investigate Discourse Connectivity in Biomedical Text.
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing, 2008

Automatically Extracting Information Needs from Ad Hoc Clinical Questions.
Proceedings of the AMIA 2008, 2008

2007
Development, implementation, and a cognitive evaluation of a definitional question answering system for physicians.
J. Biomed. Informatics, 2007

Using MEDLINE as a knowledge source for disambiguating abbreviations and acronyms in full-text biomedical journal articles.
J. Biomed. Informatics, 2007

Natural language processing and visualization in the molecular imaging domain.
J. Biomed. Informatics, 2007

Frontiers of biomedical text mining: current progress.
Briefings Bioinform., 2007

Session Introduction.
Proceedings of the Biocomputing 2007, 2007

2006
A large scale, corpus-based approach for automatically disambiguating biomedical abbreviations.
ACM Trans. Inf. Syst., 2006

Exploring supervised and unsupervised methods to detect topics in biomedical text.
BMC Bioinform., 2006

BioEx: A Novel User-Interface that Accesses Images from Abstract Sentences.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2006

Accessing bioscience images from abstract sentences.
Proceedings of the 14th International Conference on Intelligent Systems for Molecular Biology 2006, 2006

Exploring Text and Image Features to Classify Images in Bioscience Literature.
Proceedings of the Workshop on Linking Natural Language and Biology, 2006

Beyond Information Retrieval - Medical Question Answering.
Proceedings of the AMIA 2006, 2006

Extracting Opinion Propositions and Opinion Holders using Syntactic and Lexical Cues.
Proceedings of the Computing Attitude and Affect in Text: Theory and Applications, 2006

2005
Question Analysis for Biomedical Question Answering.
Proceedings of the AMIA 2005, 2005

2004
GeneWays: a system for extracting, analyzing, visualizing, and integrating molecular pathway data.
J. Biomed. Informatics, 2004

Using MEDLINE as a Knowledge Source for Disambiguating Abbreviations in Full-Text Biomedical Journal Articles.
Proceedings of the 17th IEEE Symposium on Computer-Based Medical Systems (CBMS 2004), 2004

2002
Automatically identifying gene/protein terms in MEDLINE abstracts.
J. Biomed. Informatics, 2002

Research Paper: Mapping Abbreviations to Full Forms in Biomedical Articles.
J. Am. Medical Informatics Assoc., 2002

Automatic extraction of gene and protein synonyms from MEDLINE and journal articles.
Proceedings of the AMIA 2002, 2002

2001
GENIES: a natural-language processing system for the extraction of molecular pathways from journal articles.
Proceedings of the Ninth International Conference on Intelligent Systems for Molecular Biology, 2001

GeneWays: A System for Mining Text and for Integrating Data on Molecular Pathways.
Proceedings of the Computer science and biology: Proceedings of the German Conference on Bioinformatics, 2001

2000
A Large Scale, Cross-disease Family Health History Data Set.
Proceedings of the AMIA 2000, 2000

Hereditary Disease Discovery from a Clinical Data Warehouse.
Proceedings of the AMIA 2000, 2000

1999
Representing genomic knowledge in the UMLS semantic network.
Proceedings of the AMIA 1999, 1999

