R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model.
CoRR, March, 2025
Multiscale Feature Fusion for Salient Object Detection of Strip Steel Surface Defects.
IEEE Access, 2025
HeavyLocker: Lock Heavy Hitters in Distributed Data Streams.
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025
Is Your Multimodal Language Model Oversensitive to Safe Queries?
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
A Simple Approach to Unifying Diffusion-based Conditional Generation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Teaching Video Diffusion Model with Latent Physical Phenomenon Knowledge.
CoRR, 2024
MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?
CoRR, 2024
DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers.
CoRR, 2024
Frame Fusion with Vehicle Motion Prediction for 3D Object Detection.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024
DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLMs Jailbreakers.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
VidToMe: Video Token Merging for Zero-Shot Video Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Calibration Method for Industrial Robots Based on the Principle of Perigon Error Close.
IEEE Access, 2022
Gait Identification under Surveillance Environment based on Human Skeleton.
CoRR, 2021