2025
Tool Learning with Foundation Models.
ACM Comput. Surv., April 2025

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems.
CoRR, April 2025

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents.
CoRR, March 2025

2024
Exploring Format Consistency for Instruction Tuning.
Trans. Mach. Learn. Res., 2024

ResearchTown: Simulator of Human Research Community.
CoRR, 2024

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework.
CoRR, 2024

How Far Are We From AGI.
CoRR, 2024

Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science.
CoRR, 2024

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
QASnowball: An Iterative Bootstrapping Framework for High-Quality Question-Answering Data Generation.
CoRR, 2023

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs.
CoRR, 2023

Tool Learning with Foundation Models.
CoRR, 2023

WebCPM: Interactive Web Search for Chinese Long-form Question Answering.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023