2025
Guess What I am Thinking: A Benchmark for Inner Thought Reasoning of Role-Playing Language Agents.
CoRR, March, 2025

AURORA: Automated Training Framework of Universal Process Reward Models via Ensemble Prompting and Reverse Verification.
CoRR, February, 2025

SCP-116K: A High-Quality Problem-Solution Dataset and a Generalized Pipeline for Automated Extraction in the Higher Education Science Domain.
CoRR, January, 2025

2024
MINDECHO: Role-Playing Language Agents for Key Opinion Leaders.
CoRR, 2024

ConcEPT: Concept-Enhanced Pre-Training for Language Models.
CoRR, 2024

Source Prompt: Coordinated Pre-training of Language Models on Diverse Corpora from Multiple Sources.
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024

2023
Source Prompt: Coordinated Pre-training of Language Models on Diverse Corpora from Multiple Sources.
CoRR, 2023

BBT-Fin: Comprehensive Construction of Chinese Financial Domain Pre-trained Language Model, Corpus and Benchmark.
CoRR, 2023