Measurement of LLM's Philosophies of Human Nature.
CoRR, April, 2025
Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
FineRAG: Fine-grained Retrieval-Augmented Text-to-Image Generation.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
Don't Let Your Robot be Harmful: Responsible Robotic Manipulation.
CoRR, 2024
AutoDirector: Online Auto-scheduling Agents for Multi-sensory Composition.
CoRR, 2024
StrokeNUWA - Tokenizing Strokes for Vector Graphic Synthesis.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Multi-Attentional Distance for Zero-Shot Classification with Text-to-Image Diffusion Model.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
Responsible Visual Editing.
Proceedings of the Computer Vision - ECCV 2024, 2024
ORES: Open-Vocabulary Responsible Visual Synthesis.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
Ref-Diff: Zero-shot Referring Image Segmentation with Generative Models.
CoRR, 2023
NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.
CoRR, 2023
Learning 3D Photography Videos via Self-supervised Diffusion on Single Images.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
ImaginaryNet: Learning Object Detectors without Real Images and Annotations.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
NÜWA-LIP: Language-guided Image Inpainting with Defect-free VQGAN.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Multi-domain Spoken Language Understanding Using Domain- and Task-aware Parameterization.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2022
NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN.
CoRR, 2022
Knowing Where to Leverage: Context-Aware Graph Convolutional Network With an Adaptive Fusion Layer for Contextual Spoken Language Understanding.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-Training.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Co-GAT: A Co-Interactive Graph Attention Network for Joint Dialog Act Recognition and Sentiment Classification.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training.
CoRR, 2020
Multi-Domain Spoken Language Understanding Using Domain- and Task-Aware Parameterization.
CoRR, 2020
CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020
DCR-Net: A Deep Co-Interactive Relation Network for Joint Dialog Act Recognition and Sentiment Classification.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020