2025

Measurement of LLM's Philosophies of Human Nature.

Minheng Ni

Ennan Wu

CoRR, April, 2025

Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

FineRAG: Fine-grained Retrieval-Augmented Text-to-Image Generation.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

2024

Don't Let Your Robot be Harmful: Responsible Robotic Manipulation.

CoRR, 2024

AutoDirector: Online Auto-scheduling Agents for Multi-sensory Composition.

CoRR, 2024

StrokeNUWA - Tokenizing Strokes for Vector Graphic Synthesis.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Multi-Attentional Distance for Zero-Shot Classification with Text-to-Image Diffusion Model.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Responsible Visual Editing.

Proceedings of Computer Vision - ECCV 2024, 2024

ORES: Open-Vocabulary Responsible Visual Synthesis.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Ref-Diff: Zero-shot Referring Image Segmentation with Generative Models.

CoRR, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.

CoRR, 2023

Learning 3D Photography Videos via Self-supervised Diffusion on Single Images.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

ImaginaryNet: Learning Object Detectors without Real Images and Annotations.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

NÜWA-LIP: Language-guided Image Inpainting with Defect-free VQGAN.

Minheng Ni

Xiaoming Li

Wangmeng Zuo

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

Multi-domain Spoken Language Understanding Using Domain- and Task-aware Parameterization.

ACM Trans. Asian Low Resour. Lang. Inf. Process., 2022

NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN.

CoRR, 2022

2021

Knowing Where to Leverage: Context-Aware Graph Convolutional Network With an Adaptive Fusion Layer for Contextual Spoken Language Understanding.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-Training.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Co-GAT: A Co-Interactive Graph Attention Network for Joint Dialog Act Recognition and Sentiment Classification.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training.

CoRR, 2020

Multi-Domain Spoken Language Understanding Using Domain- and Task-Aware Parameterization.

CoRR, 2020

CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

DCR-Net: A Deep Co-Interactive Relation Network for Joint Dialog Act Recognition and Sentiment Classification.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020