Chenfei Wu

Orcid: 0000-0002-5678-9691

According to our database, Chenfei Wu authored at least 42 papers between 2018 and 2024.

Bibliography

2024
AutoDirector: Online Auto-scheduling Agents for Multi-sensory Composition.
CoRR, 2024

Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification.
CoRR, 2024

LVLM-Intrepret: An Interpretability Tool for Large Vision-Language Models.
CoRR, 2024

StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis.
CoRR, 2024

Low-code LLM: Graphical User Interface over Large Language Models.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: System Demonstrations, 2024

StrokeNUWA - Tokenizing Strokes for Vector Graphic Synthesis.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Using Left and Right Brains Together: Towards Vision and Language Planning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Learning to Plan by Updating Natural Language.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

HORIZON: High-Resolution Semantically Controlled Panorama Synthesis.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

ORES: Open-Vocabulary Responsible Visual Synthesis.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation.
CoRR, 2023

GameEval: Evaluating LLMs on Conversational Games.
CoRR, 2023

DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory.
CoRR, 2023

Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining.
CoRR, 2023

Learning to Program with Natural Language.
CoRR, 2023

Low-code LLM: Visual Programming over LLMs.
CoRR, 2023

TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs.
CoRR, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.
CoRR, 2023

Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models.
CoRR, 2023

Learning 3D Photography Videos via Self-supervised Diffusion on Single Images.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

ReCo: Region-Controlled Text-to-Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

BridgeTower: Building Bridges between Encoders in Vision-Language Representation Learning.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
HORIZON: A High-Resolution Panorama Synthesis Framework.
CoRR, 2022

Bridge-Tower: Building Bridges Between Encoders in Vision-Language Representation Learning.
CoRR, 2022

DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder.
CoRR, 2022

NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN.
CoRR, 2022

Learning Temporal Video Procedure Segmentation from an Automatically Collected Large Dataset.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Trace Controlled Text to Image Generation.
Proceedings of the Computer Vision - ECCV 2022, 2022

NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion.
Proceedings of the Computer Vision - ECCV 2022, 2022

VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions.
CoRR, 2021

GEM: A General Evaluation Benchmark for Multimodal Tasks.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2019
Deep Reason: A Strong Baseline for Real-World Visual Reasoning.
CoRR, 2019

Differential Networks for Visual Question Answering.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Chain of Reasoning for Visual Question Answering.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Object-Difference Attention: A Simple Relational Attention for Visual Question Answering.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Sequential Visual Reasoning for Visual Question Answering.
Proceedings of the 5th IEEE International Conference on Cloud Computing and Intelligence Systems, 2018

