Wanrong Zhu

ORCID: 0009-0005-3448-0078

According to our database, Wanrong Zhu authored at least 32 papers between 2018 and 2024.

Bibliography

2024
MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension.
CoRR, 2024

MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos.
CoRR, 2024

List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs.
CoRR, 2024

Automatic Layout Planning for Visually-Rich Documents with Instruction-Following Models.
CoRR, 2024

High Confidence Level Inference is Almost Free using Parallel Stochastic Optimization.
CoRR, 2024

VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation.
CoRR, 2023

VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use.
CoRR, 2023

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models.
CoRR, 2023

Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality.
CoRR, 2023

Multimodal Procedural Planning via Dual Text-Image Prompting.
CoRR, 2023

Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning.
CoRR, 2023

Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved with Text.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LayoutGPT: Compositional Visual Planning and Generation with Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

VisIT-Bench: A Dynamic Benchmark for Evaluating Instruction-Following Vision-and-Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Neuro-Symbolic Procedural Planning with Commonsense Prompting.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Visualize Before You Write: Imagination-Guided Open-Ended Text Generation.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

2022
Beyond Sub-Gaussian Noises: Sharp Concentration Analysis for Stochastic Gradient Descent.
J. Mach. Learn. Res., 2022

CLIP also Understands Text: Prompting CLIP for Phrase Understanding.
CoRR, 2022

Neuro-Symbolic Causal Language Planning with Commonsense Prompting.
CoRR, 2022

Diagnosing Vision-and-Language Navigation: What Really Matters.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Imagination-Augmented Natural Language Understanding.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

End-to-end Dense Video Captioning as Sequence Generation.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021
Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

2020
A Fully Online Approach for Covariance Matrices Estimation of Stochastic Gradient Descent Solutions.
CoRR, 2020

Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

2019
Text Infilling.
CoRR, 2019

Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation.
CoRR, 2018

