2025
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model.
CoRR, March 2025

Multiscale Feature Fusion for Salient Object Detection of Strip Steel Surface Defects.
IEEE Access, 2025

HeavyLocker: Lock Heavy Hitters in Distributed Data Streams.
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

Is Your Multimodal Language Model Oversensitive to Safe Queries?
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

A Simple Approach to Unifying Diffusion-based Conditional Generation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
Teaching Video Diffusion Model with Latent Physical Phenomenon Knowledge.
CoRR, 2024

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent.
CoRR, 2024

MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?
CoRR, 2024

DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers.
CoRR, 2024

Frame Fusion with Vehicle Motion Prediction for 3D Object Detection.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLMs Jailbreakers.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

VidToMe: Video Token Merging for Zero-Shot Video Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2022
Calibration Method for Industrial Robots Based on the Principle of Perigon Error Close.
IEEE Access, 2022

2021
Gait Identification under Surveillance Environment based on Human Skeleton.
CoRR, 2021