2025
Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model.
CoRR, April, 2025

ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning.
CoRR, February, 2025

Correct like humans: Progressive learning framework for Chinese text error correction.
Expert Syst. Appl., 2025

2024
UltraWiki: Ultra-fine-grained Entity Set Expansion with Negative Seed Entities.
CoRR, 2024

LatEval: An Interactive LLMs Evaluation Benchmark with Incomplete Information from Lateral Thinking Puzzles.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024

From Retrieval to Generation: Efficient and Effective Entity Set Expansion.
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024

MESED: A Multi-Modal Entity Set Expansion Dataset with Fine-Grained Semantic Classes and Hard Negative Entities.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Automatic Context Pattern Generation for Entity Set Expansion.
IEEE Trans. Knowl. Data Eng., December, 2023

EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models with Semi-structured Data.
CoRR, 2023

Progressive Multi-task Learning Framework for Chinese Text Error Correction.
CoRR, 2023

From Retrieval to Generation: Efficient and Effective Entity Set Expansion.
CoRR, 2023

2022
Towards Attribute-Entangled Controllable Text Generation: A Pilot Study of Blessing Generation.
CoRR, 2022

Linguistic Rules-Based Corpus Generation for Native Chinese Grammatical Error Correction.
Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Learning from the Dictionary: Heterogeneous Knowledge Guided Fine-tuning for Chinese Spell Checking.
Findings of the Association for Computational Linguistics: EMNLP 2022, 2022