2025
Baichuan-Omni-1.5 Technical Report.
CoRR, January 2025

SysBench: Can LLMs Follow System Message?
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
SysBench: Can Large Language Models Follow System Messages?
CoRR, 2024

MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark.
CoRR, 2024

CFBench: A Comprehensive Constraints-Following Benchmark for LLMs.
CoRR, 2024

PAS: Data-Efficient Plug-and-Play Prompt Augmentation System.
CoRR, 2024