Xiangru Tang

ORCID: 0009-0006-2700-4513

According to our database, Xiangru Tang authored at least 62 papers between 2019 and 2024.

Bibliography

2024
Data preparation for Deep Learning based Code Smell Detection: A systematic literature review.
J. Syst. Softw., 2024

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents.
CoRR, 2024

Unveiling the Spectrum of Data Contamination in Language Models: A Survey from Detection to Remediation.
CoRR, 2024

Step-Back Profiling: Distilling User History for Personalized Scientific Writing.
CoRR, 2024

PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes.
CoRR, 2024

Lessons from the Trenches on Reproducible Evaluation of Language Models.
CoRR, 2024

MIMIR: A Streamlined Platform for Personalized Agent Tuning in Domain Expertise.
CoRR, 2024

StarCoder 2 and The Stack v2: The Next Generation.
CoRR, 2024

Data Interpreter: An LLM Agent For Data Science.
CoRR, 2024

A Survey of Generative AI for De Novo Drug Design: New Frontiers in Molecule and Protein Generation.
CoRR, 2024

Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science.
CoRR, 2024

Weaver: Foundation Models for Creative Writing.
CoRR, 2024

A survey of generative AI for de novo drug design: new frontiers in molecule and protein generation.
Briefings Bioinform., 2024

Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data?
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Short Papers, 2024

Investigating Data Contamination in Modern Benchmarks for Large Language Models.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

OctoPack: Instruction Tuning Code Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

FinDVer: Explainable Claim Verification over Long and Hybrid-content Financial Documents.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Unveiling the Spectrum of Data Contamination in Language Model: A Survey from Detection to Remediation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Financial Documents.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents.
CoRR, 2023

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning.
CoRR, 2023

ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks.
CoRR, 2023

DocMath-Eval: Evaluating Numerical Reasoning Capabilities of LLMs in Understanding Long Documents with Tabular Data.
CoRR, 2023

Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity.
CoRR, 2023

Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models.
CoRR, 2023

Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
CoRR, 2023

BioCoder: A Benchmark for Bioinformatics Code Generation with Contextual Pragmatic Knowledge.
CoRR, 2023

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs.
CoRR, 2023

Large Language Models are Effective Table-to-Text Generators, Evaluators, and Feedback Providers.
CoRR, 2023

QTSumm: A New Benchmark for Query-Focused Table Summarization.
CoRR, 2023

RWKV: Reinventing RNNs for the Transformer Era.
CoRR, 2023


Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023, 2023

QTSumm: Query-Focused Summarization over Tabular Data.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Crosslingual Generalization through Multitask Finetuning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

GersteinLab at MEDIQA-Chat 2023: Clinical Note Summarization from Doctor-Patient Conversations through Fine-tuning and In-context Learning.
Proceedings of the 5th Clinical Natural Language Processing Workshop, 2023

Aligning Factual Consistency for Clinical Studies Summarization through Reinforcement Learning.
Proceedings of the 5th Clinical Natural Language Processing Workshop, 2023

2022
FeTaQA: Free-form Table Question Answering.
Trans. Assoc. Comput. Linguistics, 2022

EHRKit: A Python Natural Language Processing Toolkit for Electronic Health Record Texts.
CoRR, 2022

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts.
CoRR, 2022

CONFIT: Toward Faithful Dialogue Summarization with Linguistically-Informed Contrastive Fine-tuning.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Surfer100: Generating Surveys From Web Resources, Wikipedia-style.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022


2021
CLICKER: A Computational LInguistics Classification Scheme for Educational Resources.
CoRR, 2021

Improving RNA Secondary Structure Design using Deep Reinforcement Learning.
CoRR, 2021

Multi-modal Self-supervised Pre-training for Regulatory Genome Across Cell Types.
CoRR, 2021

Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries.
CoRR, 2021

FeTaQA: Free-form Table Question Answering.
CoRR, 2021

DART: Open-Domain Structured Data Record to Text Generation.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

2020
FILM: A Fast, Interpretable, and Low-rank Metric Learning Approach for Sentence Matching.
CoRR, 2020

Multi-Granularity Modularized Network for Abstract Visual Reasoning.
CoRR, 2020

DART: Open-Domain Structured Data Record to Text Generation.
CoRR, 2020

CUHK at SemEval-2020 Task 4: CommonSense Explanation, Reasoning and Prediction with Multi-task Learning.
CoRR, 2020

CUHK at SemEval-2020 Task 4: CommonSense Explanation, Reasoning and Prediction with Multi-task Learning.
Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

Categorizing Offensive Language in Social Networks: A Chinese Corpus, Systems and an Explanation Tool.
Proceedings of the Chinese Computational Linguistics - 19th China National Conference, CCL 2020, Hainan, China, October 30, 2020

2019
Improving Code Generation From Descriptive Text By Combining Deep Learning and Syntax Rules.
Proceedings of the 31st International Conference on Software Engineering and Knowledge Engineering, 2019

Knowledge-Aware Self-Attention Networks for Document Grounded Dialogue Generation.
Proceedings of the Knowledge Science, Engineering and Management, 2019

