2025
SUMIE: A Synthetic Benchmark for Incremental Entity Summarization.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
2024
STRUM-LLM: Attributed and Structured Contrastive Summarization.
CoRR, 2024
FieldSwap: Data Augmentation for Effective Form-Like Document Extraction.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024
Enhancing Incremental Summarization with Structured Representations.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
2023
VRDU: A Benchmark for Visually-rich Document Understanding.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023
Selective Labeling: How to Radically Lower Data-Labeling Costs for Document Extraction Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
2022
An Augmentation Strategy for Visually Rich Documents.
CoRR, 2022
A Benchmark for Structured Extractions from Complex Documents.
CoRR, 2022
Radically Lower Data-Labeling Costs for Visually Rich Document Extraction Models.
CoRR, 2022
Learning Transferable Node Representations for Attribute Extraction from Web Documents.
Proceedings of the WSDM '22: The Fifteenth ACM International Conference on Web Search and Data Mining, Virtual Event / Tempe, AZ, USA, February 21, 2022
ReLiable: Offline Reinforcement Learning for Tactical Strategies in Professional Basketball Games.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022
2021
Learning Robust Representations for Low-resource Information Extraction.
PhD thesis, 2021
Clinical Named Entity Recognition using Contextualized Token Representations.
CoRR, 2021
Simplified DOM Trees for Transferable Attribute Extraction from the Web.
CoRR, 2021
CREATe: Clinical Report Extraction and Annotation Technology.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021
#StayHome or #Marathon?: Social Media Enhanced Pandemic Surveillance on Spatial-temporal Dynamic Graphs.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021
Clinical Temporal Relation Extraction with Probabilistic Soft Logic Regularization and Global Inference.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Recommending Themes for Ad Creative Design via Visual-Linguistic Representations.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020
Social Media User Geolocation via Hybrid Attention.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020
Domain Knowledge Empowered Structured Neural Net for End-to-End Event Temporal Relation Extraction.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020
Learning to Create Better Ads: Generation and Ranking Approaches for Ad Creative Refinement.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020
"The Boating Store Had Its Best Sail Ever": Pronunciation-attentive Contextualized Pun Recognition.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
2019
Understanding Consumer Journey using Attention based Recurrent Neural Networks.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019
Learning to Discriminate Perturbations for Blocking Adversarial Attacks in Text Classification.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
2018
Learning Gender-Neutral Word Embeddings.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018
2017
Aztec: A Platform to Render Biomedical Software Findable, Accessible, Interoperable, and Reusable.
CoRR, 2017
AZTEC: A Cloud-based Computational Platform to Integrate Biomedical Resources.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017