2025
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values.
CoRR, April, 2025

A Comprehensive Survey on Long Context Language Modeling.
CoRR, March, 2025

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines.
CoRR, February, 2025

Enhancing LLMs via High-Knowledge Data Selection.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
MIO: A Foundation Model on Multimodal Tokens.
CoRR, 2024

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models.
CoRR, 2024

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models.
CoRR, 2024

D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

DDK: Distilling Domain Knowledge for Efficient Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

RoleAgent: Building, Interacting, and Benchmarking High-quality Role-Playing Agents from Scripts.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models.
CoRR, 2023