Yong Jae Lee

ORCID: 0000-0001-9863-1270

Affiliations:
  • University of Wisconsin, Madison, USA
  • University of California, Davis, CA, USA (former)
  • University of Texas at Austin, USA (PhD 2012)


According to our database, Yong Jae Lee authored at least 124 papers between 2007 and 2024.

Bibliography

2024
Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos.
CoRR, 2024

Interpolating Video-LLMs: Toward Longer-sequence LMMs in a Training-free Manner.
CoRR, 2024

Cross-Modal Self-Supervised Learning with Effective Contrastive Units for LiDAR Point Clouds.
CoRR, 2024

VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation.
CoRR, 2024

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy.
CoRR, 2024

Yo'LLaVA: Your Personalized Language and Vision Assistant.
CoRR, 2024

Matryoshka Multimodal Models.
CoRR, 2024

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models.
CoRR, 2024

LLM Inference Unveiled: Survey and Roofline Model Insights.
CoRR, 2024

Cohere3D: Exploiting Temporal Coherence for Unsupervised Representation Learning of Vision-based Autonomous Driving.
CoRR, 2024

Computer Vision on the Edge: Individual Cattle Identification in Real-time with ReadMyCow System.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Testing Learning-Enabled Cyber-Physical Systems with Large-Language Models: A Formal Approach.
Proceedings of the Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, 2024

VGBench: A Comprehensive Benchmark of Vector Graphics Understanding and Generation for Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

MATE: Meet At The Embedding - Connecting Images with Long Texts.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Removing Distributional Discrepancies in Captions Improves Image-Text Alignment.
Proceedings of the Computer Vision - ECCV 2024, 2024

Edit One for All: Interactive Batch Image Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Improved Baselines with Visual Instruction Tuning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Delving Deeper into Anti-Aliasing in ConvNets.
Int. J. Comput. Vis., 2023

Interfacing Foundation Models' Embeddings.
CoRR, 2023

Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images.
CoRR, 2023

Making Large Multimodal Models Understand Arbitrary Visual Prompts.
CoRR, 2023

Testing learning-enabled cyber-physical systems with Large-Language Models: A Formal Approach.
CoRR, 2023

Investigating the Catastrophic Forgetting in Multimodal Large Language Models.
CoRR, 2023

Visual Instruction Inversion: Image Editing via Visual Prompting.
CoRR, 2023

Benchmarking and Analyzing Generative Data for Visual Recognition.
CoRR, 2023

Generate Anything Anywhere in Any Scene.
CoRR, 2023

Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding.
CoRR, 2023

Segment Everything Everywhere All at Once.
CoRR, 2023

Segment Everything Everywhere All at Once.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

What Knowledge Gets Distilled in Knowledge Distillation?
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Visual Instruction Inversion: Image Editing via Image Prompting.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Visual Instruction Tuning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

InPL: Pseudo-labeling the Inliers First for Imbalanced Semi-supervised Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Exploring the Capabilities of a General-Purpose Robotic Arm in Chess Gameplay.
Proceedings of the 22nd IEEE-RAS International Conference on Humanoid Robots, 2023

Generalized Decoding for Pixel, Image, and Language.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards Universal Fake Image Detectors that Generalize Across Generative Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Customized Visual Models with Retrieval-Augmented Knowledge.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

GLIGEN: Open-Set Grounded Text-to-Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
YOLACT++ Better Real-Time Instance Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding.
CoRR, 2022

EnergyMatch: Energy-based Pseudo-Labeling for Semi-Supervised Learning.
CoRR, 2022

What Knowledge Gets Distilled in Knowledge Distillation?
CoRR, 2022

The Two Dimensions of Worst-case Training and the Integrated Effect for Out-of-domain Generalization.
CoRR, 2022

End-to-End Instance Edge Detection.
CoRR, 2022

Equine Pain Behavior Classification via Self-Supervised Disentangled Pose Representation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

Toward learning human-aligned cross-domain robust models by countering misaligned features.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Masked Discrimination for Self-supervised Learning on Point Clouds.
Proceedings of the Computer Vision - ECCV 2022, 2022

Contrastive Learning for Diverse Disentangled Foreground Generation.
Proceedings of the Computer Vision - ECCV 2022, 2022

GIRAFFE HD: A High-Resolution 3D-aware Generative Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

The Two Dimensions of Worst-case Training and Their Integrated Effect for Out-of-domain Generalization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Generating Furry Cars: Disentangling Object Shape & Appearance across Multiple Domains.
CoRR, 2021

SinGAN-GIF: Learning a Generative Video Model from a Single GIF.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

YolactEdge: Real-time Instance Segmentation on the Edge.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains.
Proceedings of the 9th International Conference on Learning Representations, 2021

Seeing the Unseen: Predicting the First-Person Camera Wearer's Location and Pose in Third-Person Scenes.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Collaging Class-specific GANs for Semantic Image Synthesis.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Progressive Temporal Feature Alignment Network for Video Inpainting.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Few-Shot Image Generation via Cross-Domain Correspondence.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

PartGAN: Unsupervised Part Decomposition for Image Generation and Segmentation.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020
YolactEdge: Real-time Instance Segmentation on the Edge (Jetson AGX Xavier: 30 FPS, RTX 2080 Ti: 170 FPS).
CoRR, 2020

Audiovisual SlowFast Networks for Video Recognition.
CoRR, 2020

Action Graphs: Weakly-supervised Action Localization with Graph Convolution Networks.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Boxer: Preventing fraud by scanning credit cards.
Proceedings of the 29th USENIX Security Symposium, 2020

Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Class-Imbalanced Data.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Password-Conditioned Anonymization and Deanonymization with Face Identity Transformers.
Proceedings of the Computer Vision - ECCV 2020, 2020

Don't Judge an Object by Its Context: Learning to Overcome Contextual Bias.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Delving Deeper into Anti-aliasing in ConvNets.
Proceedings of the 31st British Machine Vision Conference 2020, 2020

2019
A 16-Gb, 18-Gb/s/pin GDDR6 DRAM With Per-Bit Trainable Single-Ended DFE and PLL-Less Clocking.
IEEE J. Solid State Circuits, 2019

Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Imbalanced Data.
CoRR, 2019

Identity From Here, Pose From There: Self-Supervised Disentanglement and Generation of Objects Using Unlabeled Videos.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

YOLACT: Real-Time Instance Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

You Reap What You Sow: Using Videos to Generate High Precision Object Proposals for Weakly-Supervised Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-Scale Point Clouds.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond.
CoRR, 2018

Transferring Common-Sense Knowledge for Object Detection.
CoRR, 2018

Who Will Share My Image?: Predicting the Content Diffusion Path in Online Social Networks.
Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 2018


A Visual Attention Grounding Neural Model for Multimodal Machine Translation.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Video Object Detection with an Aligned Spatial-Temporal Memory.
Proceedings of the Computer Vision - ECCV 2018, 2018

DOCK: Detecting Objects by Transferring Common-Sense Knowledge.
Proceedings of the Computer Vision - ECCV 2018, 2018

Learning to Anonymize Faces for Privacy Preserving Action Detection.
Proceedings of the Computer Vision - ECCV 2018, 2018

Cross-Domain Self-Supervised Multi-Task Feature Learning Using Synthetic Imagery.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Analyzing the Adoption and Cascading Process of OSN-Based Gifting Applications: An Empirical Study.
ACM Trans. Web, 2017

Spatial-Temporal Memory Networks for Video Object Detection.
CoRR, 2017

Who Moved My Cheese? Automatic Annotation of Rodent Behaviors with Convolutional Neural Networks.
Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision, 2017

Hide-and-Seek: Forcing a Network to be Meticulous for Weakly-Supervised Object and Action Localization.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Weakly-Supervised Visual Grounding of Phrases with Linguistic Structures.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Interspecies Knowledge Transfer for Facial Keypoint Detection.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Identifying First-Person Camera Wearers in Third-Person Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Discovering Mid-level Visual Connections in Space and Time.
Proceedings of the Deep Learning and Convolutional Neural Networks for Medical Image Computing, 2016

End-to-End Localization and Ranking for Relative Attributes.
Proceedings of the Computer Vision - ECCV 2016, 2016

Track and Segment: An Iterative Unsupervised Approach for Video Object Proposals.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Predicting Important Objects for Egocentric Video Summarization.
Int. J. Comput. Vis., 2015

Discovering the Spatial Extent of Relative Attributes.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

FlowWeb: Joint image set alignment by weaving consistent, pixel-wise correspondences.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
AverageExplorer: interactive exploration and alignment of visual data collections.
ACM Trans. Graph., 2014

Development of a Monitoring System for Multichannel Cables Using TDR.
IEEE Trans. Instrum. Meas., 2014

Weakly-supervised Discovery of Visual Pattern Configurations.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

An Introduction to the 3rd Workshop on Egocentric (First-Person) Vision.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
Style-Aware Mid-level Representation for Discovering Visual Connections in Space and Time.
Proceedings of the IEEE International Conference on Computer Vision, 2013

2012
Object-Graphs for Context-Aware Visual Category Discovery.
IEEE Trans. Pattern Anal. Mach. Intell., 2012

Discovering important people and objects for egocentric video summarization.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
ShadowDraw: real-time user guidance for freehand drawing.
ACM Trans. Graph., 2011

Face Tracking for Augmented Reality Game Interface and Brand Placement.
Proceedings of the Ubiquitous Computing and Multimedia Applications, 2011

Key-segments for video object segmentation.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Learning the easy things first: Self-paced visual category discovery.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Face Discovery with Social Context.
Proceedings of the British Machine Vision Conference, 2011

2010
Simple, extensible and flexible random key predistribution schemes for wireless sensor networks using reusable key pools.
J. Intell. Manuf., 2010

Interface of Augmented Reality Game Using Face Tracking and Its Application to Advertising.
Proceedings of the Security-Enriched Urban Computing and Smart Grid, 2010

Collect-cut: Segmentation with top-down cues discovered in multi-object images.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Object-graphs for context-aware category discovery.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009
Foreground Focus: Unsupervised Learning from Partially Matching Images.
Int. J. Comput. Vis., 2009

Shape discovery from unlabeled image collections.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

2008
Ray-based Color Image Segmentation.
Proceedings of the Fifth Canadian Conference on Computer and Robot Vision, 2008

Foreground Focus: Finding Meaningful Features in Unlabeled Images.
Proceedings of the British Machine Vision Conference 2008, Leeds, UK, September 2008, 2008

2007
The Analysis of PPL Attention Effects in the Screen of Multimedia Contents.
Proceedings of the Future Generation Communication and Networking, 2007

