Jedidiah Schloesser

CoRR, 2024

TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models.

[DOI]

CoRR, 2024

Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos.

[DOI]

Jianrui Zhang

CoRR, 2024

Interpolating Video-LLMs: Toward Longer-sequence LMMs in a Training-free Manner.

[DOI]

CoRR, 2024

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy.

[DOI]

CoRR, 2024

Matryoshka Multimodal Models.

[DOI]

CoRR, 2024

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models.

[DOI]

CoRR, 2024

LLM Inference Unveiled: Survey and Roofline Model Insights.

[DOI]

CoRR, 2024

Cohere3D: Exploiting Temporal Coherence for Unsupervised Representation Learning of Vision-based Autonomous Driving.

[DOI]

CoRR, 2024

Computer Vision on the Edge: Individual Cattle Identification in Real-time with ReadMyCow System.

[DOI]

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Testing Learning-Enabled Cyber-Physical Systems with Large-Language Models: A Formal Approach.

[DOI]

Bhaskar Krishnamachari

Dakai Zhu

Oleg Sokolsky

Insup Lee

Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, 2024

Interfacing Foundation Models' Embeddings.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Yo'LLaVA: Your Personalized Language and Vision Assistant.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Cross-Modal Self-Supervised Learning with Effective Contrastive Units for LiDAR Point Clouds.

[DOI]

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation.

[DOI]

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

MATE: Meet At The Embedding - Connecting Images with Long Texts.

[DOI]

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Removing Distributional Discrepancies in Captions Improves Image-Text Alignment.

[DOI]

Proceedings of the Computer Vision - ECCV 2024, 2024

Edit One for All: Interactive Batch Image Editing.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Improved Baselines with Visual Instruction Tuning.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts.

[DOI]

Raghuraman Krishnamoorthi

Siva Karthik Mustikovela

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples.

[DOI]

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Delving Deeper into Anti-Aliasing in ConvNets.

[DOI]

Int. J. Comput. Vis., 2023

Interfacing Foundation Models' Embeddings.

[DOI]

CoRR, 2023

Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images.

[DOI]

Zhuoran Yu

Chenchen Zhu

Sean Chang Culatana

CoRR, 2023

Making Large Multimodal Models Understand Arbitrary Visual Prompts.

[DOI]

Siva Karthik Mustikovela

CoRR, 2023

Testing learning-enabled cyber-physical systems with Large-Language Models: A Formal Approach.

[DOI]

Bhaskar Krishnamachari

Dakai Zhu

Oleg Sokolsky

Insup Lee

CoRR, 2023

Investigating the Catastrophic Forgetting in Multimodal Large Language Models.

[DOI]

CoRR, 2023

Visual Instruction Inversion: Image Editing via Visual Prompting.

[DOI]

CoRR, 2023

Benchmarking and Analyzing Generative Data for Visual Recognition.

[DOI]

CoRR, 2023

Generate Anything Anywhere in Any Scene.

[DOI]

CoRR, 2023

Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding.

[DOI]

CoRR, 2023

Segment Everything Everywhere All at Once.

[DOI]

CoRR, 2023

Segment Everything Everywhere All at Once.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

What Knowledge Gets Distilled in Knowledge Distillation?

[DOI]

Yuheng Li

Anirudh Sundara Rajan

Yingyu Liang

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Visual Instruction Inversion: Image Editing via Visual Prompting.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Visual Instruction Tuning.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

InPL: Pseudo-labeling the Inliers First for Imbalanced Semi-supervised Learning.

[DOI]

Zhuoran Yu

Yin Li

Proceedings of the Eleventh International Conference on Learning Representations, 2023

A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Exploring the Capabilities of a General-Purpose Robotic Arm in Chess Gameplay.

[DOI]

Proceedings of the 22nd IEEE-RAS International Conference on Humanoid Robots, 2023

Generalized Decoding for Pixel, Image, and Language.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards Universal Fake Image Detectors that Generalize Across Generative Models.

[DOI]

Yuheng Li

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Customized Visual Models with Retrieval-Augmented Knowledge.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

GLIGEN: Open-Set Grounded Text-to-Image Generation.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

YOLACT++ Better Real-Time Instance Segmentation.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding.

[DOI]

CoRR, 2022

EnergyMatch: Energy-based Pseudo-Labeling for Semi-Supervised Learning.

[DOI]

Zhuoran Yu

Yin Li

CoRR, 2022

What Knowledge Gets Distilled in Knowledge Distillation?

[DOI]

Yuheng Li

CoRR, 2022

The Two Dimensions of Worst-case Training and the Integrated Effect for Out-of-domain Generalization.

[DOI]

CoRR, 2022

End-to-End Instance Edge Detection.

[DOI]

Xueyan Zou

CoRR, 2022

Equine Pain Behavior Classification via Self-Supervised Disentangled Pose Representation.

[DOI]

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

Toward learning human-aligned cross-domain robust models by countering misaligned features.

[DOI]

Proceedings of the Uncertainty in Artificial Intelligence, 2022

ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Masked Discrimination for Self-supervised Learning on Point Clouds.

[DOI]

Proceedings of the Computer Vision - ECCV 2022, 2022

Contrastive Learning for Diverse Disentangled Foreground Generation.

[DOI]

Proceedings of the Computer Vision - ECCV 2022, 2022

GIRAFFE HD: A High-Resolution 3D-aware Generative Model.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

The Two Dimensions of Worst-case Training and Their Integrated Effect for Out-of-domain Generalization.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Generating Furry Cars: Disentangling Object Shape & Appearance across Multiple Domains.

[DOI]

CoRR, 2021

SinGAN-GIF: Learning a Generative Video Model from a Single GIF.

[DOI]

Rajat Arora

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

YolactEdge: Real-time Instance Segmentation on the Edge.

[DOI]

Rafael A. Rivera Soto

Proceedings of the IEEE International Conference on Robotics and Automation, 2021

Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains.

[DOI]

Proceedings of the 9th International Conference on Learning Representations, 2021

Seeing the Unseen: Predicting the First-Person Camera Wearer's Location and Pose in Third-Person Scenes.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Collaging Class-specific GANs for Semantic Image Synthesis.

[DOI]

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Progressive Temporal Feature Alignment Network for Video Inpainting.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Few-Shot Image Generation via Cross-Domain Correspondence.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

PartGAN: Unsupervised Part Decomposition for Image Generation and Segmentation.

[DOI]

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020

YolactEdge: Real-time Instance Segmentation on the Edge (Jetson AGX Xavier: 30 FPS, RTX 2080 Ti: 170 FPS).

[DOI]

Rafael A. Rivera Soto

CoRR, 2020

Audiovisual SlowFast Networks for Video Recognition.

[DOI]

Christoph Feichtenhofer

CoRR, 2020

Action Graphs: Weakly-supervised Action Localization with Graph Convolution Networks.

[DOI]

Maheen Rashid

Hedvig Kjellström

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Boxer: Preventing fraud by scanning credit cards.

[DOI]

Proceedings of the 29th USENIX Security Symposium, 2020

Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Class-Imbalanced Data.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Password-Conditioned Anonymization and Deanonymization with Face Identity Transformers.

[DOI]

Proceedings of the Computer Vision - ECCV 2020, 2020

Don't Judge an Object by Its Context: Learning to Overcome Contextual Bias.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Delving Deeper into Anti-aliasing in ConvNets.

[DOI]

Proceedings of the 31st British Machine Vision Conference 2020, 2020

2019

A 16-Gb, 18-Gb/s/pin GDDR6 DRAM With Per-Bit Trainable Single-Ended DFE and PLL-Less Clocking.

[DOI]

IEEE J. Solid State Circuits, 2019

Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Imbalanced Data.

[DOI]

CoRR, 2019

Identity From Here, Pose From There: Self-Supervised Disentanglement and Generation of Objects Using Unlabeled Videos.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

YOLACT: Real-Time Instance Segmentation.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

You Reap What You Sow: Using Videos to Generate High Precision Object Proposals for Weakly-Supervised Object Detection.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-Scale Point Clouds.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond.

[DOI]

CoRR, 2018

Transferring Common-Sense Knowledge for Object Detection.

[DOI]

Santosh Kumar Divvala

Ali Farhadi

CoRR, 2018

Who Will Share My Image?: Predicting the Content Diffusion Path in Online Social Networks.

[DOI]

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 2018

A 16Gb 18Gb/S/pin GDDR6 DRAM with per-bit trainable single-ended DFE and PLL-less clocking.

[DOI]

Proceedings of the 2018 IEEE International Solid-State Circuits Conference, 2018

A Visual Attention Grounding Neural Model for Multimodal Machine Translation.

[DOI]

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Video Object Detection with an Aligned Spatial-Temporal Memory.

[DOI]

Proceedings of the Computer Vision - ECCV 2018, 2018

DOCK: Detecting Objects by Transferring Common-Sense Knowledge.

[DOI]

Santosh Kumar Divvala

Ali Farhadi

Proceedings of the Computer Vision - ECCV 2018, 2018

Learning to Anonymize Faces for Privacy Preserving Action Detection.

[DOI]

Zhongzheng Ren

Michael S. Ryoo

Proceedings of the Computer Vision - ECCV 2018, 2018

Cross-Domain Self-Supervised Multi-Task Feature Learning Using Synthetic Imagery.

[DOI]

Zhongzheng Ren

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Analyzing the Adoption and Cascading Process of OSN-Based Gifting Applications: An Empirical Study.

[DOI]

Mohammad Rezaur Rahman

Jinyoung Han

Chen-Nee Chuah

ACM Trans. Web, 2017

Spatial-Temporal Memory Networks for Video Object Detection.

[DOI]

CoRR, 2017

Who Moved My Cheese? Automatic Annotation of Rodent Behaviors with Convolutional Neural Networks.

[DOI]

Zhongzheng Ren

Adriana Noronha

Annie Vogel Ciernia

Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision, 2017

Hide-and-Seek: Forcing a Network to be Meticulous for Weakly-Supervised Object and Action Localization.

[DOI]

Proceedings of the IEEE International Conference on Computer Vision, 2017

Weakly-Supervised Visual Grounding of Phrases with Linguistic Structures.

[DOI]

Leonid Sigal

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Interspecies Knowledge Transfer for Facial Keypoint Detection.

[DOI]

Maheen Rashid

Xiuye Gu

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Identifying First-Person Camera Wearers in Third-Person Videos.

[DOI]

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016

Discovering Mid-level Visual Connections in Space and Time.

[DOI]

Alexei A. Efros

Martial Hebert

Proceedings of the Deep Learning and Convolutional Neural Networks for Medical Image Computing, 2016

End-to-End Localization and Ranking for Relative Attributes.

[DOI]

Proceedings of the Computer Vision - ECCV 2016, 2016

Track and Segment: An Iterative Unsupervised Approach for Video Object Proposals.

[DOI]

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection.

[DOI]

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015

Predicting Important Objects for Egocentric Video Summarization.

[DOI]

Int. J. Comput. Vis., 2015

Discovering the Spatial Extent of Relative Attributes.

[DOI]

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

FlowWeb: Joint image set alignment by weaving consistent, pixel-wise correspondences.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014

AverageExplorer: interactive exploration and alignment of visual data collections.

[DOI]

Jun-Yan Zhu

Alexei A. Efros

ACM Trans. Graph., 2014

Development of a Monitoring System for Multichannel Cables Using TDR.

[DOI]

IEEE Trans. Instrum. Meas., 2014

Weakly-supervised Discovery of Visual Pattern Configurations.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

An Introduction to the 3rd Workshop on Egocentric (First-Person) Vision.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013

Style-Aware Mid-level Representation for Discovering Visual Connections in Space and Time.

[DOI]

Alexei A. Efros

Martial Hebert

Proceedings of the IEEE International Conference on Computer Vision, 2013

2012

Object-Graphs for Context-Aware Visual Category Discovery.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., 2012

Discovering important people and objects for egocentric video summarization.

[DOI]

Joydeep Ghosh

Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011

ShadowDraw: real-time user guidance for freehand drawing.

[DOI]

C. Lawrence Zitnick

Michael F. Cohen

ACM Trans. Graph., 2011

Face Tracking for Augmented Reality Game Interface and Brand Placement.

[DOI]

Young Jae Lee

Proceedings of the Ubiquitous Computing and Multimedia Applications, 2011

Key-segments for video object segmentation.

[DOI]

Jaechul Kim

Proceedings of the IEEE International Conference on Computer Vision, 2011

Learning the easy things first: Self-paced visual category discovery.

[DOI]

Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Face Discovery with Social Context.

[DOI]

Proceedings of the British Machine Vision Conference, 2011

2010

Simple, extensible and flexible random key predistribution schemes for wireless sensor networks using reusable key pools.

[DOI]

J. Intell. Manuf., 2010

Interface of Augmented Reality Game Using Face Tracking and Its Application to Advertising.

[DOI]

Young Jae Lee

Proceedings of the Security-Enriched Urban Computing and Smart Grid, 2010

Collect-cut: Segmentation with top-down cues discovered in multi-object images.

[DOI]

Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Object-graphs for context-aware category discovery.

[DOI]

Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009

Foreground Focus: Unsupervised Learning from Partially Matching Images.

[DOI]

Int. J. Comput. Vis., 2009

Shape discovery from unlabeled image collections.

[DOI]

Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

2008

Ray-based Color Image Segmentation.

[DOI]

Changhai Xu

Benjamin Kuipers

Proceedings of the Fifth Canadian Conference on Computer and Robot Vision, 2008

Foreground Focus: Finding Meaningful Features in Unlabeled Images.

[DOI]

Proceedings of the British Machine Vision Conference 2008, Leeds, UK, September 2008, 2008

2007

The Analysis of PPL Attention Effects in the Screen of Multimedia Contents.

[DOI]

Young Jae Lee