Kate Saenko

Orcid: 0000-0002-7564-7218

According to our database, Kate Saenko authored at least 255 papers between 2004 and 2024.

Bibliography

2024
DualCoOp++: Fast and Effective Adaptation to Multi-Label Recognition With Limited Annotations.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2024

Guest Editorial: Special Issue on the Promises and Dangers of Large Vision Models.
Int. J. Comput. Vis., April, 2024

Video Frame Interpolation With Many-to-Many Splatting and Spatial Selective Refinement.
IEEE Trans. Pattern Anal. Mach. Intell., February, 2024

KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models.
CoRR, 2024

SLANT: Spurious Logo ANalysis Toolkit.
CoRR, 2024

An Introduction to Vision-Language Modeling.
CoRR, 2024

Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks.
CoRR, 2024

SynCDR: Training Cross Domain Retrieval Models with Synthetic Data.
CoRR, 2024

Learning to Compose SuperWeights for Neural Parameter Allocation Search.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

From Fake to Real: Pretraining on Balanced Synthetic Images to Prevent Spurious Correlations in Image Recognition.
Proceedings of the Computer Vision - ECCV 2024, 2024

Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

Koala: Key Frame-Conditioned Long Video-LLM.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Tell Me What's Next: Textual Foresight for Generic UI Representations.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
CLAMP: Contrastive LAnguage Model Prompt-tuning.
CoRR, 2023

Lasagna: Layered Score Distillation for Disentangled Object Relighting.
CoRR, 2023

Socratis: Are large multimodal models emotionally aware?
CoRR, 2023

Label Budget Allocation in Multi-Task Learning.
CoRR, 2023

From Fake to Real (FFR): A two-stage training pipeline for mitigating spurious correlations with synthetic data.
CoRR, 2023

Multiscale Video Pretraining for Long-Term Activity Forecasting.
CoRR, 2023

Hardwiring ViT Patch Selectivity into CNNs using Patch Mixing.
CoRR, 2023

WikiWeb2M: A Page-Level Multimodal Wikipedia Dataset.
CoRR, 2023

COLA: How to adapt vision-language models to Compose Objects Localized with Attributes?
CoRR, 2023

ERM++: An Improved Baseline for Domain Generalization.
CoRR, 2023

Mind the Backbone: Minimizing Backbone Distortion for Robust Object Detection.
CoRR, 2023

The SwaNNFlight System: On-the-Fly Sim-to-Real Adaptation via Anchored Learning.
CoRR, 2023

RIFT: Disentangled Unsupervised Image Translation via Restricted Information Flow.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Cola: A Benchmark for Compositional Text-to-image Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DIME-FM: DIstilling Multimodal and Efficient Foundation Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Language-Guided Audio-Visual Source Separation via Trimodal Consistency.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Prefix Conditioning Unifies Language and Label Supervision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Bias Mimicking: A Simple Sampling Approach for Bias Mitigation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MaskSketch: Unpaired Structure-guided Masked Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Leveraging Geometric Structure for Label-Efficient Semi-Supervised Scene Segmentation.
IEEE Trans. Image Process., 2022

Revisiting Image-Language Networks for Open-Ended Phrase Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Ani-GIFs: A benchmark dataset for domain generalization of action recognition from GIFs.
Frontiers Comput. Sci., 2022

Exploring Consistency in Cross-Domain Transformer for Domain Adaptive Semantic Segmentation.
CoRR, 2022

Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark.
CoRR, 2022

FETA: Towards Specializing Foundation Models for Expert Task Applications.
CoRR, 2022

Temporal Relevance Analysis for Video Action Models.
CoRR, 2022

Interactive Mobile App Navigation with Uncertain or Under-specified Natural Language Commands.
CoRR, 2022

Explaining Reinforcement Learning Policies through Counterfactual Trajectories.
CoRR, 2022

Evaluation of Correctness in Unsupervised Many-to-Many Image Translation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Finding Differences Between Transformers and ConvNets Using Counterfactual Simulation Testing.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

How Transferable are Video Representations Based on Synthetic Data?
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

FETA: Towards Specializing Foundational Models for Expert Task Applications.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Extending the WILDS Benchmark for Unsupervised Adaptation.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Neural Parameter Allocation Search.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Multi-Critic Actor Learning: Teaching RL Policies to Act with Style.
Proceedings of the Tenth International Conference on Learning Representations, 2022

NewsStories: Illustrating Articles with Visual Summaries.
Proceedings of the Computer Vision - ECCV 2022, 2022

Learning to Detect Every Thing in an Open World.
Proceedings of the Computer Vision - ECCV 2022, 2022

A Broad Study of Pre-training for Domain Generalization and Adaptation.
Proceedings of the Computer Vision - ECCV 2022, 2022

A Unified Framework for Domain Adaptive Pose Estimation.
Proceedings of the Computer Vision - ECCV 2022, 2022

The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning.
Proceedings of the Computer Vision - ECCV 2022, 2022

A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility.
Proceedings of the Computer Vision - ECCV 2022, 2022

MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Task2Sim: Towards Effective Pre-training and Transfer from Synthetic Data.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Many-to-many Splatting for Efficient Video Frame Interpolation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Unsupervised Domain Generalization by Learning a Bridge Across Domains.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

ZeroWaste Dataset: Towards Deformable Object Segmentation in Cluttered Scenes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
How to Train Your Quadrotor: A Framework for Consistently Smooth and Responsive Flight Control via Reinforcement Learning.
ACM Trans. Cyber Phys. Syst., 2021

Real-Time Semantic Segmentation With Fast Attention.
IEEE Robotics Autom. Lett., 2021

Guided Zoom: Zooming into Network Evidence to Refine Fine-Grained Model Decisions.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Disentangled Unsupervised Image Translation via Restricted Information Flow.
CoRR, 2021

VisDA-2021 Competition Universal Domain Adaptation to Improve Performance on Out-of-Distribution Data.
CoRR, 2021

ZeroWaste Dataset: Towards Automated Waste Recycling.
CoRR, 2021

OpenMatch: Open-set Consistency Regularization for Semi-supervised Learning with Outliers.
CoRR, 2021

Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments.
CoRR, 2021

All at Once Network Quantization via Collaborative Knowledge Transfer.
CoRR, 2021

Good Actors can come in Smaller Sizes: A Case Study on the Value of Actor-Critic Asymmetry.
CoRR, 2021

VA-RED2: Video Adaptive Redundancy Reduction.
CoRR, 2021

LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Look at What I'm Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

OpenMatch: Open-Set Semi-supervised Learning with Open-set Consistency Regularization.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021


VisDA-2021 Competition: Universal Domain Adaptation to Improve Performance on Out-of-Distribution Data.
Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, 2021

Weakly Supervised Domain Adaptation using Super-pixel labeling for Semantic Segmentation.
Proceedings of the 17th International Conference on Machine Vision and Applications, 2021

Regularizing Action Policies for Smooth Control with Reinforcement Learning.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

VA-RED2: Video Adaptive Redundancy Reduction.
Proceedings of the 9th International Conference on Learning Representations, 2021

AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition.
Proceedings of the 9th International Conference on Learning Representations, 2021

Self-supervised Visual Attribute Learning for Fashion Compatibility.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Dynamic Network Quantization for Efficient Video Inference.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Temporal Action Detection with Multi-level Supervision.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

OVANet: One-vs-All Network for Universal Domain Adaptation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Tune it the Right Way: Unsupervised Validation of Domain Adaptation via Soft Neighborhood Density.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Cross-Modal Contrastive Features for Video Domain Adaptation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

CDS: Cross-Domain Self-supervised Pre-training.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Detector-Free Weakly Supervised Grounding by Separation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Explainable Deep Classification Models for Domain Generalization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Separating Skills and Concepts for Novel Visual Question Answering.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Semi-Supervised Action Recognition With Temporal Contrastive Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Black-Box Explanation of Object Detectors via Saliency Maps.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Fine-Grained Angular Contrastive Learning With Coarse Labels.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Honey. I Shrunk The Actor: A Case Study on Preserving Performance with Smaller Actors in Actor-Critic RL.
Proceedings of the 2021 IEEE Conference on Games (CoG), 2021

Surprisingly Simple Semi-Supervised Domain Adaptation with Pretraining and Consistency.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020
Learning from Lexical Perturbations for Consistent Visual Question Answering.
CoRR, 2020

Self-supervised Visual Attribute Learning for Fashion Compatibility.
CoRR, 2020

Shapeshifter Networks: Cross-layer Parameter Sharing for Scalable and Effective Deep Learning.
CoRR, 2020

Spatio-Temporal Action Detection with Multi-Object Interaction.
CoRR, 2020

Revisiting Few-shot Activity Detection with Class Similarity Control.
CoRR, 2020

Cross-domain Self-supervised Learning for Domain Adaptation with Few Source Labels.
CoRR, 2020

TwoStreamVAN: Improving Motion Modeling in Video Generation.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

DIPNet: Dynamic Identity Propagation Network for Video Object Segmentation.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Auxiliary Task Reweighting for Minimum-data Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Universal Domain Adaptation through Self Supervision.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Uncertainty-Aware Learning for Zero-Shot Semantic Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Learning Visual Servo Policies via Planner Cloning.
Proceedings of the Experimental Robotics - The 17th International Symposium, 2020

Beyond the Visual Analysis of Deep Model Saliency.
Proceedings of the xxAI - Beyond Explainable AI, 2020

Federated Adversarial Domain Adaptation.
Proceedings of the 8th International Conference on Learning Representations, 2020

Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Class-Imbalanced Domain Adaptation: An Empirical Odyssey.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder.
Proceedings of the Computer Vision - ECCV 2020, 2020

Why Do These Match? Explaining the Behavior of Image Similarity Models.
Proceedings of the Computer Vision - ECCV 2020, 2020

Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation.
Proceedings of the Computer Vision - ECCV 2020, 2020

AR-Net: Adaptive Frame Resolution for Efficient Action Recognition.
Proceedings of the Computer Vision - ECCV 2020, 2020

A Broader Study of Cross-Domain Few-Shot Learning.
Proceedings of the Computer Vision - ECCV 2020, 2020

Learning to Scale Multilingual Representations for Vision-Language Tasks.
Proceedings of the Computer Vision - ECCV 2020, 2020

MULE: Multimodal Universal Language Embedding.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Two-Stream Region Convolutional 3D Network for Temporal Activity Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Generalized Domain Adaptation with Covariate and Label Shift CO-ALignment.
CoRR, 2019

wMAN: Weakly-supervised Moment Alignment Network for Text-based Video Segment Retrieval.
CoRR, 2019

Weakly-supervised Compositional Feature Aggregation for Few-shot Recognition.
CoRR, 2019

PuppetGAN: Transferring Disentangled Properties from Synthetic to Real Images.
CoRR, 2019

Joint Event Detection and Description in Continuous Video Streams.
Proceedings of the IEEE Winter Applications of Computer Vision Workshops, 2019

Adversarial Self-Defense for Cycle-Consistent GANs.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Domain Agnostic Learning with Disentangled Representations.
Proceedings of the 36th International Conference on Machine Learning, 2019

Learning Multi-Level Hierarchies with Hindsight.
Proceedings of the 7th International Conference on Learning Representations, 2019

PuppetGAN: Cross-Domain Image Manipulation by Demonstration.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Learning Similarity Conditions Without Explicit Supervision.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Semi-Supervised Domain Adaptation via Minimax Entropy.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Moment Matching for Multi-Source Domain Adaptation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Language-Conditioned Graph Networks for Relational Reasoning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Language Features Matter: Effective Language Representations for Vision-Language Tasks.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Strong-Weak Distribution Alignment for Adaptive Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Are CNN Predictions based on Reasonable Evidence?
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Guided Zoom: Questioning Network Evidence for Fine-grained Classification.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Multilevel Language and Vision Integration for Text-to-Clip Retrieval.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Vision and Language Integration Meets Multimedia Fusion.
IEEE Multim., 2018

Similarity R-C3D for Few-shot Temporal Activity Detection.
CoRR, 2018

A Two-Stream Variational Adversarial Network for Video Generation.
CoRR, 2018

SPLAT: Semantic Pixel-Level Adaptation Transforms for Detection.
CoRR, 2018

Open-vocabulary Phrase Detection.
CoRR, 2018

Adapting control policies from simulation to reality using a pairwise loss.
CoRR, 2018

Women also Snowboard: Overcoming Bias in Captioning Models (Extended Abstract).
CoRR, 2018

Syn2Real: A New Benchmark for Synthetic-to-Real Visual Domain Adaptation.
CoRR, 2018

Unsupervised Video-to-Video Translation.
CoRR, 2018

Text-to-Clip Video Retrieval with Early Fusion and Re-Captioning.
CoRR, 2018

Contextual Multi-Scale Region Convolutional 3D Network for Activity Detection.
CoRR, 2018

Synthetic to Real Adaptation with Generative Correlation Alignment Networks.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Speaker-Follower Models for Vision-and-Language Navigation.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Adapting Control Policies from Simulation to Reality Using a Pairwise Loss.
Proceedings of the 2018 International Symposium on Experimental Robotics, 2018

CyCADA: Cycle-Consistent Adversarial Domain Adaptation.
Proceedings of the 35th International Conference on Machine Learning, 2018

Stable Distribution Alignment Using the Dual of the Adversarial Distance.
Proceedings of the 6th International Conference on Learning Representations, 2018

Adversarial Dropout Regularization.
Proceedings of the 6th International Conference on Learning Representations, 2018

Object Hallucination in Image Captioning.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Explainable Neural Computation via Stack Neural Module Networks.
Proceedings of the Computer Vision - ECCV 2018, 2018

Women Also Snowboard: Overcoming Bias in Captioning Models.
Proceedings of the Computer Vision - ECCV 2018, 2018

Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

VisDA: A Synthetic-to-Real Benchmark for Visual Domain Adaptation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

RISE: Randomized Input Sampling for Explanation of Black-box Models.
Proceedings of the British Machine Vision Conference 2018, 2018

2017
Correlation Alignment for Unsupervised Domain Adaptation.
Proceedings of the Domain Adaptation in Computer Vision Applications, 2017

Simultaneous Deep Transfer Across Domains and Tasks.
Proceedings of the Domain Adaptation in Computer Vision Applications, 2017

Long-Term Recurrent Convolutional Networks for Visual Recognition and Description.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Grasp Pose Detection in Point Clouds.
Int. J. Robotics Res., 2017

Guest Editorial: Image and Language Understanding.
Int. J. Comput. Vis., 2017

Hierarchical Actor-Critic.
CoRR, 2017

VisDA: The Visual Domain Adaptation Challenge.
CoRR, 2017

Learning a visuomotor controller for real world robotic grasping using easily simulated depth images.
CoRR, 2017

Synthetic to Real Adaptation with Deep Generative Correlation Alignment Networks.
CoRR, 2017

Adversarial Discriminative Domain Adaptation (workshop extended abstract).
Proceedings of the 5th International Conference on Learning Representations, 2017

Ground2sky label transfer for fine-grained aerial car recognition.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

R-C3D: Region Convolutional 3D Network for Temporal Activity Detection.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Learning to Reason: End-to-End Module Networks for Visual Question Answering.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Captioning Images with Diverse Objects.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Adversarial Discriminative Domain Adaptation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Top-Down Visual Saliency Guided by Captions.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Modeling Relationships in Referential Expressions with Compositional Modular Networks.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Learning a visuomotor controller for real world robotic grasping using simulated depth images.
Proceedings of the 1st Annual Conference on Robot Learning, CoRL 2017, Mountain View, 2017

2016
Large Scale Visual Recognition through Adaptation using Joint Representation and Multiple Instance Learning.
J. Mach. Learn. Res., 2016

Understanding object descriptions in robotics by open-vocabulary object retrieval and detection.
Int. J. Robotics Res., 2016

Correlation Alignment for Unsupervised Domain Adaptation.
CoRR, 2016

Adapting Deep Visuomotor Representations with Weak Pairwise Constraints.
Proceedings of the Algorithmic Foundations of Robotics XII, 2016

Multimodal Video Description.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Vision and Language Integration Meets Multimedia Fusion: Proceedings of ACM Multimedia 2016 Workshop.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

High precision grasp pose detection in dense clutter.
Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016

Fine-to-coarse knowledge transfer for low-res image classification.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering.
Proceedings of the Computer Vision - ECCV 2016, 2016

Deep CORAL: Correlation Alignment for Deep Domain Adaptation.
Proceedings of the Computer Vision - ECCV 2016 Workshops, 2016

Natural Language Object Retrieval.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

MUTT: Metric Unit TesTing for Language Generation Tasks.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Combining Texture and Shape Cues for Object Recognition with Minimal Supervision.
Proceedings of the Computer Vision - ACCV 2016, 2016

Return of Frustratingly Easy Domain Adaptation.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
A Multi-scale Multiple Instance Video Description Network.
CoRR, 2015

Towards Adapting Deep Visuomotor Representations from Simulated to Real Environments.
CoRR, 2015

What Do Deep CNNs Learn About Objects?
Proceedings of the 3rd International Conference on Learning Representations, 2015

Translating Videos to Natural Language Using Deep Recurrent Neural Networks.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Sequence to Sequence - Video to Text.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Simultaneous Deep Transfer Across Domains and Tasks.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Learning Deep Object Detectors from 3D Models.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Spatial Semantic Regularisation for Large Scale Object Detection.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Detector discovery in the wild: Joint multiple instance and representation learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Subspace Distribution Alignment for Unsupervised Domain Adaptation.
Proceedings of the British Machine Vision Conference 2015, 2015

2014
Modeling Radiometric Uncertainty for Vision with Tone-Mapped Color Images.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Asymmetric and Category Invariant Feature Transformations for Domain Adaptation.
Int. J. Comput. Vis., 2014

Deep Domain Confusion: Maximizing for Domain Invariance.
CoRR, 2014

Exploring Invariances in Deep Convolutional Neural Networks Using Synthetic Images.
CoRR, 2014

One-Shot Adaptation of Supervised Deep Convolutional Models.
Proceedings of the 2nd International Conference on Learning Representations, 2014

LSDA: Large Scale Detection Through Adaptation.
CoRR, 2014

Open-vocabulary Object Retrieval.
Proceedings of the Robotics: Science and Systems X, 2014

LSDA: Large Scale Detection through Adaptation.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Interactive adaptation of real-time object detectors.
Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014

Continuous Manifold Based Adaptation for Evolving Visual Domains.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Confidence-Rated Multiple Instance Boosting for Object Detection.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Integrating Language and Vision to Generate Natural Language Descriptions of Videos in the Wild.
Proceedings of the COLING 2014, 2014

From Virtual to Reality: Fast Adaptation of Virtual Object Detectors to Real Domains.
Proceedings of the British Machine Vision Conference, 2014

2013
A Category-Level 3D Object Dataset: Putting the Kinect to Work.
Proceedings of the Consumer Depth Cameras for Computer Vision, 2013

Efficient Learning of Domain-invariant Image Representations.
Proceedings of the 1st International Conference on Learning Representations, 2013

Towards Adapting ImageNet to Reality: Scalable Domain Adaptation with Implicit Low-rank Transformations.
CoRR, 2013

"Off the grid": Self-contained landmarks for improved indoor probabilistic localization.
Proceedings of the 2013 IEEE Conference on Technologies for Practical Robot Applications, 2013

YouTube2Text: Recognizing and Describing Arbitrary Activities Using Semantic Hierarchies and Zero-Shot Recognition.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Semi-supervised Domain Adaptation with Instance Constraints.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Generating Natural-Language Video Descriptions Using Text-Mined Knowledge.
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

2012
Discovering Latent Domains for Multisource Domain Adaptation.
Proceedings of the Computer Vision - ECCV 2012, 2012

From pixels to physics: Probabilistic color de-rendering.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

A combined pose, object, and feature model for action understanding.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
Practical 3-D object detection using category and instance-level appearance models.
Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011

A category-level 3-D object dataset: Putting the Kinect to work.
Proceedings of the IEEE International Conference on Computer Vision Workshops, 2011

The NBNN kernel.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Learning object color models from multi-view constraints.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

What you saw is not what you get: Domain adaptation using asymmetric kernel transforms.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010
Size Matters: Metric Visual Search Constraints from Monocular Metadata.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Adapting Visual Category Models to New Domains.
Proceedings of the Computer Vision - ECCV 2010, 2010

2009
Multistream Articulatory Feature-Based Models for Visual Speech Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2009

Filtering Abstract Senses From Image Search Results.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

2008
Unsupervised Learning of Visual Sense Models for Polysemous Words.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

2007
Object Category Recognition Using Probabilistic Fusion of Speech and Image Classifiers.
Proceedings of the Machine Learning for Multimodal Interaction, 2007

Articulatory Feature-Based Methods for Acoustic and Audio-Visual Speech Recognition: Summary from the 2006 JHU Summer workshop.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
An Asynchronous DBN for Audio-Visual Speech Recognition.
Proceedings of the 2006 IEEE ACL Spoken Language Technology Workshop, 2006

Co-Adaptation of audio-visual speech and gesture classifiers.
Proceedings of the 8th International Conference on Multimodal Interfaces, 2006

2005
Visual Speech Recognition with Loosely Synchronized Feature Streams.
Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV 2005), 2005

Production domain modeling of pronunciation for visual speech recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Articulatory features for robust visual speech recognition.
Proceedings of the 6th International Conference on Multimodal Interfaces, 2004

A segment-based audio-visual speech recognizer: data collection, development, and initial experiments.
Proceedings of the 6th International Conference on Multimodal Interfaces, 2004

