2024
UOE: Unlearning One Expert Is Enough For Mixture-of-experts LLMs.
CoRR, 2024

Social Science Meets LLMs: How Reliable Are Large Language Models in Social Simulations?
CoRR, 2024

LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs.
CoRR, 2024

ScholarChemQA: Unveiling the Power of Language Models in Chemical Research Question Answering.
CoRR, 2024

Defending Jailbreak Prompts via In-Context Adversarial Game.
CoRR, 2024

Can LLMs Solve Molecule Puzzles? A Multimodal Benchmark for Molecular Structure Elucidation.
Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Defending Jailbreak Prompts via In-Context Adversarial Game.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

SceMQA: A Scientific College Entrance Level Multimodal Question Answering Benchmark.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024

Uncertainty-Aware Yield Prediction with Multimodal Molecular Features.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Modeling non-uniform uncertainty in Reaction Prediction via Boosting and Dropout.
CoRR, 2023

What indeed can GPT models do in chemistry? A comprehensive benchmark on eight tasks.
CoRR, 2023

What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks.
Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Graph-based Molecular Representation Learning.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Compositional Mathematical Encoding for Math Word Problems.
Findings of the Association for Computational Linguistics: ACL 2023, 2023