2025
Can LLMs Simulate Personas with Reversed Performance? A Benchmark for Counterfactual Instruction Following.
CoRR, April, 2025

A Survey of Large Language Model Agents for Question Answering.
CoRR, March, 2025

Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack.
CoRR, March, 2025

DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education.
CoRR, 2024

Large Language Model Cascades with Mixture of Thought Representations for Cost-Efficient Reasoning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Can LLM Find the Green Circle? Investigation and Human-Guided Tool Manipulation for Compositional Generalization.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023
Can LLM Find the Green Circle? Investigation and Human-Guided Tool Manipulation for Compositional Generalization.
CoRR, 2023

Large Language Model Cascades with Mixture of Thoughts Representations for Cost-Efficient Reasoning.
CoRR, 2023

Gentopia: A Collaborative Platform for Tool-Augmented LLMs.
CoRR, 2023

Gentopia.AI: A Collaborative Platform for Tool-Augmented LLMs.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023