Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Short Papers, 2024

On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering.

Linyong Nan

Ellen Zhang

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Investigating Data Contamination in Modern Benchmarks for Large Language Models.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Revisiting Automated Evaluation for Long-form Table Question Answering.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

OMG-QA: Building Open-Domain Multi-Modal Generative Question Answering Systems.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, 2024

M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

FOLIO: Natural Language Reasoning with First-Order Logic.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

FinDVer: Explainable Claim Verification over Long and Hybrid-content Financial Documents.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Unveiling the Spectrum of Data Contamination in Language Model: A Survey from Detection to Remediation.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

KnowledgeFMath: A Knowledge-Intensive Math Reasoning Dataset in Finance Domains.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Financial Documents.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

TaPERA: Enhancing Faithfulness and Interpretability in Long-Form Table QA by Content Planning and Execution-based Reasoning.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning.

CoRR, 2023

ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks.

CoRR, 2023

DocMath-Eval: Evaluating Numerical Reasoning Capabilities of LLMs in Understanding Long Documents with Tabular Data.

CoRR, 2023

KnowledgeMath: Knowledge-Intensive Math Word Problem Solving in Finance Domains.

CoRR, 2023

L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models.

CoRR, 2023

Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?

CoRR, 2023

ODSum: New Benchmarks for Open Domain Multi-Document Summarization.

CoRR, 2023

Large Language Models are Effective Table-to-Text Generators, Evaluators, and Feedback Providers.

CoRR, 2023

QTSumm: A New Benchmark for Query-Focused Table Summarization.

CoRR, 2023

Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies.

CoRR, 2023

Enhancing Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023, 2023

QTSumm: Query-Focused Summarization over Tabular Data.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

LoFT: Enhancing Faithfulness and Diversity for Table-to-Text Generation via Logic Form Control.

Yilun Zhao

Zhenting Qi

Linyong Nan

Lorenzo Jaime Yu Flores

Dragomir Radev

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

OpenRT: An Open-source Framework for Reasoning Over Tabular Data.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

Apparel-Invariant Feature Learning for Person Re-Identification.

IEEE Trans. Multim., 2022

FOLIO: Natural Language Reasoning with First-Order Logic.

CoRR, 2022

FinMath: Injecting a Tree-structured Solver for Question Answering over Financial Reports.

Chenying Li

Wenbo Ye

Yilun Zhao

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

R2D2: Robust Data-to-Text with Replacement Detection.

Linyong Nan

Lorenzo Jaime Yu Flores

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data.

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021

MusiCoder: A Universal Music-Acoustic Encoder Based on Transformer.

Yilun Zhao

Jia Guo

Proceedings of the MultiMedia Modeling - 27th International Conference, 2021

2020

LAMP: Label Augmented Multimodal Pretraining.

CoRR, 2020

Apparel-invariant Feature Learning for Apparel-changed Person Re-identification.

CoRR, 2020

MusiCoder: A Universal Music-Acoustic Encoder Based on Transformers.

CoRR, 2020