Bryan A. Plummer

CoRR, 2024

RECAST: Reparameterized, Compact weight Adaptation for Sequential Tasks.

Nazia Tasnim

CoRR, 2024

SLANT: Spurious Logo ANalysis Toolkit.

CoRR, 2024

Enhancing Feature Diversity Boosts Channel-Adaptive Vision Transformers.

Chau Pham

CoRR, 2024

Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks.

CoRR, 2024

Text-to-image Editing by Image Information Removal.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Movie Genre Classification by Language Augmentation and Shot Sampling.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Learning to Compose SuperWeights for Neural Parameter Allocation Search.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

MixtureGrowth: Growing Neural Networks by Recombining Learned Parameters.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Multimodal Representation and Retrieval [MRR 2024].

Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Let Models Speak Ciphers: Multiagent Debate through Embeddings.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

LNL+K: Enhancing Learning with Noisy Labels Through Noise Source Knowledge Integration.

Siqi Wang

Proceedings of the Computer Vision - ECCV 2024, 2024

From Fake to Real: Pretraining on Balanced Synthetic Images to Prevent Spurious Correlations in Image Recognition.

Proceedings of the Computer Vision - ECCV 2024, 2024

PanoFree: Tuning-Free Holistic Multi-view Image Generation with Cross-View Self-guidance.

Proceedings of the Computer Vision - ECCV 2024, 2024

Koala: Key Frame-Conditioned Long Video-LLM.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

UniHuman: A Unified Model For Editing Human Images in the Wild.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Machine-Generated Text Localization.

Zhongping Zhang

Wenda Qin

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Tell Me What's Next: Textual Foresight for Generic UI Representations.

Andrea Burns

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

CLAMP: Contrastive LAnguage Model Prompt-tuning.

CoRR, 2023

A Unified Framework for Connecting Noise Modeling to Boost Noise Detection.

Siqi Wang

Chau Pham

CoRR, 2023

Socratis: Are large multimodal models emotionally aware?

CoRR, 2023

From Fake to Real (FFR): A two-stage training pipeline for mitigating spurious correlations with synthetic data.

CoRR, 2023

Multiscale Video Pretraining for Long-Term Activity Forecasting.

CoRR, 2023

LNL+K: Learning with Noisy Labels and Noise Source Distribution Knowledge.

Siqi Wang

CoRR, 2023

WikiWeb2M: A Page-Level Multimodal Wikipedia Dataset.

CoRR, 2023

COLA: How to adapt vision-language models to Compose Objects Localized with Attributes?

CoRR, 2023

ERM++: An Improved Baseline for Domain Generalization.

Piotr Teterwak

Kuniaki Saito

Theodoros Tsiligkaridis

CoRR, 2023

Cola: A Benchmark for Compositional Text-to-image Retrieval.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

CHAMMI: A benchmark for channel-adaptive models in microscopy imaging.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Collecting The Puzzle Pieces: Disentangled Self-Driven Human Pose Transfer by Permuting Textures.

Nannan Li

Kevin J. Shih

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Show, Write, and Retrieve: Entity-aware Article Generation and Retrieval.

Zhongping Zhang

Yiwen Gu

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Language-Guided Audio-Visual Source Separation via Trimodal Consistency.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Bias Mimicking: A Simple Sampling Approach for Bias Mitigation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Complex Scene Image Editing by Scene Graph Comprehension.

Proceedings of the 34th British Machine Vision Conference 2023, 2023

2022

Revisiting Image-Language Networks for Open-Ended Phrase Detection.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark.

CoRR, 2022

Effectively leveraging Multi-modal Features for Movie Genre Classification.

CoRR, 2022

Semantic Image Manipulation with Background-guided Internal Learning.

CoRR, 2022

Interactive Mobile App Navigation with Uncertain or Under-specified Natural Language Commands.

CoRR, 2022

Explaining Reinforcement Learning Policies through Counterfactual Trajectories.

CoRR, 2022

Neural Parameter Allocation Search.

Proceedings of the Tenth International Conference on Learning Representations, 2022

NewsStories: Illustrating Articles with Visual Summaries.

Proceedings of the Computer Vision - ECCV 2022, 2022

Supervised Attribute Information Removal and Reconstruction for Image Manipulation.

Nannan Li

Proceedings of the Computer Vision - ECCV 2022, 2022

A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

Show and Write: Entity-aware News Generation with Image Information.

Zhongping Zhang

Yiwen Gu

CoRR, 2021

Learning to Reason from General Concepts to Fine-grained Tokens for Discriminative Phrase Detection.

CoRR, 2021

Anchoring to Exemplars for Training Mixture-of-Expert Cell Embeddings.

CoRR, 2021

Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments.

CoRR, 2021

LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval.

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Look at What I'm Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Self-supervised Visual Attribute Learning for Fashion Compatibility.

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

MILA: Multi-Task Learning from Videos via Efficient Inter-Frame Attention.

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

CDS: Cross-Domain Self-supervised Pre-training.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Effectively Leveraging Attributes for Visual Similarity.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

2020

Self-supervised Visual Attribute Learning for Fashion Compatibility.

CoRR, 2020

Shapeshifter Networks: Cross-layer Parameter Sharing for Scalable and Effective Deep Learning.

CoRR, 2020

Cross-domain Self-supervised Learning for Domain Adaptation with Few Source Labels.

CoRR, 2020

Multi-Task Learning from Videos via Efficient Inter-Frame Attention.

CoRR, 2020

Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News.

Reuben Tan

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Why Do These Match? Explaining the Behavior of Image Similarity Models.

Proceedings of the Computer Vision - ECCV 2020, 2020

Learning to Scale Multilingual Representations for Vision-Language Tasks.

Proceedings of the Computer Vision - ECCV 2020, 2020

MULE: Multimodal Universal Language Embedding.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Combining Multiple Cues for Visual Madlibs Question Answering.

Int. J. Comput. Vis., 2019

wMAN: Weakly-supervised Moment Alignment Network for Text-based Video Segment Retrieval.

CoRR, 2019

Give Me a Hint! Navigating Image Databases Using Human-in-the-Loop Feedback.

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Learning Similarity Conditions Without Explicit Supervision.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Language Features Matter: Effective Language Representations for Vision-Language Tasks.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Multilevel Language and Vision Integration for Text-to-Clip Retrieval.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Grounding natural language phrases in images and video.

PhD thesis, 2018

Open-vocabulary Phrase Detection.

CoRR, 2018

Learning Type-Aware Embeddings for Fashion Compatibility.

Proceedings of the Computer Vision - ECCV 2018, 2018

Conditional Image-Text Embedding Networks.

Proceedings of the Computer Vision - ECCV 2018, 2018

2017

Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models.

Int. J. Comput. Vis., 2017

Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues.

Arun Mallya

Christopher M. Cervantes

Julia Hockenmaier

Svetlana Lazebnik

Proceedings of the IEEE International Conference on Computer Vision, 2017

Enhancing Video Summarization via Vision-Language Embedding.

Matthew Brown

Svetlana Lazebnik

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016

Phrase Localization and Visual Relationship Detection with Comprehensive Linguistic Cues.

Arun Mallya

Christopher M. Cervantes

Julia Hockenmaier

Svetlana Lazebnik

CoRR, 2016

Visual Analogies: A Framework for Defining Aspect Categorization.

P. Daphne Tsatsoulis