2025
SAE-V: Interpreting Multimodal Models for Enhanced Alignment.
CoRR, February, 2025

Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction.
Proceedings of AAAI-25, sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback.
CoRR, 2024

Language Models Resist Alignment.
CoRR, 2024

Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction.
CoRR, 2024

Aligner: Efficient Alignment by Learning to Correct.
Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
AI Alignment: A Comprehensive Survey.
CoRR, 2023