Cordelia Schmid

Affiliations:
  • INRIA, France


According to our database, Cordelia Schmid authored at least 371 papers between 1993 and 2024.

Awards

IEEE Fellow 2012, "For contributions to large-scale image retrieval, classification and object detection".

Bibliography

2024
The Right Spin: Learning Object Motion from Rotation-Compensated Flow Fields.
Int. J. Comput. Vis., January, 2024

Contact Models in Robotics: A Comparative Analysis.
IEEE Trans. Robotics, 2024

Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy.
CoRR, 2024

Towards Zero-Shot Multimodal Machine Translation.
CoRR, 2024

mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus.
CoRR, 2024

Smoke and Mirrors in Causal Downstream Tasks.
CoRR, 2024

Learning text-to-video retrieval from image captioning.
CoRR, 2024

ViViDex: Learning Vision-based Dexterous Manipulation from Human Videos.
CoRR, 2024

SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code.
CoRR, 2024

RAVEN: Rethinking Adversarial Video Generation with Efficient Tri-plane Networks.
CoRR, 2024

Location-Aware Self-Supervised Transformers for Semantic Segmentation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Retrieval-Enhanced Contrastive Vision-Text Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

DataDream: Few-Shot Guided Dataset Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Streaming Dense Video Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Pixel Aligned Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Dense Optical Tracking: Connecting the Dots.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MoReVQA: Exploring Modular Reasoning Models for Video Question Answering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Time-, Memory- and Parameter-Efficient Visual Adaptation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Learning Correlation Structures for Vision Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

End-to-End Spatio-Temporal Action Localisation with Video Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SUGAR: Pre-training 3D Visual Representations for Robotics.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

A Generative Approach for Wikipedia-Scale Visual Entity Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

CoVR: Learning Composed Video Retrieval from Web Video Captions.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

POCO: 3D Pose and Shape Estimation with Confidence.
Proceedings of the International Conference on 3D Vision, 2024

2023
Dense Video Object Captioning from Disjoint Supervision.
CoRR, 2023

AVIS: Autonomous Visual Information Seeking with Large Language Models.
CoRR, 2023

Posterior Annealing: Fast Calibrated Uncertainty for Regression.
CoRR, 2023

VidChapters-7M: Video Chapters at Scale.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

AVIS: Autonomous Visual Information Seeking with Large Language Model Agent.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Does Visual Pretraining Help End-to-End Reasoning?
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Robust Visual Sim-to-Real Transfer for Robotic Manipulation.
IROS, 2023

Object Goal Navigation with Recursive Implicit Maps.
IROS, 2023

Enforcing the consensus between Trajectory Optimization and Policy Learning for precise robot control.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Learning Video-Conditioned Policies for Unseen Manipulation Tasks.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Learning Reward Functions for Robotic Manipulation by Observing Humans.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

UnLoc: A Unified Framework for Video Localization Tasks.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Waffling around for Performance: Visual Classification with Random Words and Broad Concepts.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Verbs in Action: Improving verb understanding in video-language models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Audiovisual Masked Autoencoders.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

How can objects help action recognition?
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Bridging the Gap Between Model Explanations in Partially Annotated Multi-Label Classification.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Improving Image Recognition by Retrieving from Web-Scale Image-Text Data.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Reveal: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

PolarNet: 3D Point Clouds for Language-Guided Robotic Manipulation.
Proceedings of the Conference on Robot Learning, 2023

Modular Visual Question Answering via Code Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Location-Aware Self-Supervised Transformers.
CoRR, 2022

AVATAR submission to the Ego4D AV Transcription Challenge.
CoRR, 2022

TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency.
CoRR, 2022

Beyond Transfer Learning: Co-finetuning for Action Localisation.
CoRR, 2022

Augmenting differentiable physics with randomized smoothing.
CoRR, 2022

M&M Mix: A Multimodal Multiview Transformer Ensemble.
CoRR, 2022

Learning to Answer Visual Questions from Web Videos.
CoRR, 2022

Weakly-supervised segmentation of referring expressions.
CoRR, 2022

Leveraging Randomized Smoothing for Optimal Control of Nonsmooth Dynamical Systems.
CoRR, 2022

Masking Modalities for Cross-modal Video Retrieval.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

Zero-Shot Video Question Answering via Frozen Bidirectional Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Language Conditioned Spatial Relation Reasoning for 3D Object Grounding.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Assembly Planning from Observations under Physical Constraints.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

AVATAR: Unconstrained Audiovisual Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

TL;DW? Summarizing Instructional Videos with Task Relevance and Cross-Modal Saliency.
Proceedings of the Computer Vision - ECCV 2022, 2022

Learning Audio-Video Modalities from Image Captions.
Proceedings of the Computer Vision - ECCV 2022, 2022

AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction.
Proceedings of the Computer Vision - ECCV 2022, 2022

Learning from Unlabeled 3D Environments for Vision-and-Language Navigation.
Proceedings of the Computer Vision - ECCV 2022, 2022

TubeDETR: Spatio-Temporal Video Grounding with Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Multiview Transformers for Video Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

End-to-end Generative Pretraining for Multimodal Video Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning with Neighbor Consistency for Noisy Labels.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Instruction-driven history-aware policies for robotic manipulations.
Proceedings of the Conference on Robot Learning, 2022

A Memory Transformer Network for Incremental Learning.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021
Differentiable Simulation for Physical System Identification.
IEEE Robotics Autom. Lett., 2021

On the Importance of Visual Context for Data Augmentation in Scene Understanding.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Synthetic Humans for Action Recognition from Unseen Viewpoints.
Int. J. Comput. Vis., 2021

Residual Reinforcement Learning from Demonstrations.
CoRR, 2021

Local Metrics for Multi-Object Tracking.
CoRR, 2021

Large-Scale Unsupervised Object Discovery.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Attention Bottlenecks for Multimodal Fusion.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

CCVS: Context-aware Controllable Video Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Differentiable rendering with perturbed optimizers.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

History Aware Multimodal Transformer for Vision-and-Language Navigation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Do you see what I see?: Large-scale Learning from Multimodal Videos.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Goal-Conditioned Reinforcement Learning with Imagined Subgoals.
Proceedings of the 38th International Conference on Machine Learning, 2021

Just Ask: Learning to Answer Questions from Millions of Narrated Videos.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Segmenter: Transformer for Semantic Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Improving robustness against common corruptions with frequency biased models.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Episodic Transformer for Vision-and-Language Navigation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Airbert: In-domain Pretraining for Vision-and-Language Navigation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Temporal Dynamics from Cycles in Narrated Video.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Unified Graph Structured Models for Video Understanding.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

ViViT: A Video Vision Transformer.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Composable Augmentation Encoding for Video Representation Learning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Look Before You Speak: Visually Contextualized Utterances.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Class-Balanced Distillation for Long-Tailed Visual Recognition.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Towards Unconstrained Joint Hand-Object Reconstruction From RGB Videos.
Proceedings of the International Conference on 3D Vision, 2021

2020
LCR-Net++: Multi-Person 2D and 3D Pose Detection in Natural Images.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020).
CoRR, 2020

Learning Video Representations from Textual Web Supervision.
CoRR, 2020

Selecting Relevant Features from a Universal Representation for Few-shot Classification.
CoRR, 2020

Beyond the Camera: Neural Networks in World Coordinates.
CoRR, 2020

Optimized Generic Feature Learning for Few-shot Classification across Domains.
CoRR, 2020

What Makes for Good Views for Contrastive Learning?
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Learning visual policies for building 3D shape categories.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Learning to combine primitive skills: A step towards versatile robotic manipulation.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

Radioactive data: tracing through training.
Proceedings of the 37th International Conference on Machine Learning, 2020

Ava Active Speaker: An Audio-Visual Dataset for Active Speaker Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Unsupervised Learning of Video Representations via Dense Trajectory Clustering.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

Memory-Efficient Incremental Learning Through Feature Adaptation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Graph Convolutional Networks for Learning with Few Clean and Many Noisy Labels.
Proceedings of the Computer Vision - ECCV 2020, 2020

Multi-modal Transformer for Video Retrieval.
Proceedings of the Computer Vision - ECCV 2020, 2020

Selecting Relevant Features from a Multi-domain Representation for Few-Shot Classification.
Proceedings of the Computer Vision - ECCV 2020, 2020

TAO: A Large-Scale Benchmark for Tracking Any Object.
Proceedings of the Computer Vision - ECCV 2020, 2020

Consistency Guided Scene Flow Estimation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos.
Proceedings of the Computer Vision - ECCV 2020, 2020

Speech2Action: Cross-Modal Supervision for Action Recognition.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Leveraging Photometric Consistency Over Time for Sparsely Supervised Hand-Object Reconstruction.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

VectorNet: Encoding HD Maps and Agent Dynamics From Vectorized Representation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

TNT: Target-driven Trajectory Prediction.
Proceedings of the 4th Conference on Robot Learning, 2020

Learning Obstacle Representations for Neural Motion Planning.
Proceedings of the 4th Conference on Robot Learning, 2020

2019
Learning to Segment Moving Objects.
Int. J. Comput. Vis., 2019

Learning to Track Any Object.
CoRR, 2019

Combining learned skills and reinforcement learning for robotic manipulations.
CoRR, 2019

Contrastive Bidirectional Transformer for Temporal Representation Learning.
CoRR, 2019

A Study on Action Detection in the Wild.
CoRR, 2019

AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection.
CoRR, 2019

Coverage and Quality Driven Training of Generative Image Models.
CoRR, 2019

Automatic Understanding of the Visual World.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Adaptive Density Estimation for Generative Models.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Learning to Augment Synthetic Images for Sim2Real Policy Transfer.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

White-box vs Black-box: Bayes Optimal Strategies for Membership Inference.
Proceedings of the 36th International Conference on Machine Learning, 2019

Spreading vectors for similarity search.
Proceedings of the 7th International Conference on Learning Representations, 2019

Supplementary Material: AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

VideoBERT: A Joint Model for Video and Language Representation Learning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Detecting Unseen Visual Relations Using Analogies.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Moulding Humans: Non-Parametric 3D Human Shape Estimation From Single Images.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Diversity With Cooperation: Ensemble Methods for Few-Shot Classification.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Self-Supervised Learning With Geometric Constraints in Monocular Video: Connecting Flow, Depth, and Camera.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

A Structured Model for Action Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Relational Action Forecasting.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning Joint Reconstruction of Hands and Manipulated Objects.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

MARS: Motion-Augmented RGB Stream for Action Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Focused Attention for Action Recognition.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

2018
Learning From Web Videos for Event Classification.
IEEE Trans. Circuits Syst. Video Technol., 2018

Long-Term Temporal Convolutions for Action Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Proposal Flow: Semantic Correspondences from Object Proposals.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Image-Based Synthesis for Deep 3D Human Pose Estimation.
Int. J. Comput. Vis., 2018

Detecting rare visual relations using analogies.
CoRR, 2018

Modulated Policy Hierarchies.
CoRR, 2018

Déjà Vu: an empirical evaluation of the memorization properties of ConvNets.
CoRR, 2018

Modeling Spatio-Temporal Human Track Structure for Action Localization.
CoRR, 2018

A neural network catalyzer for multi-dimensional similarity search.
CoRR, 2018

Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos.
CoRR, 2018

Unsupervised Learning of Artistic Styles with Archetypal Style Analysis.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

A flexible model for training action localization with varying levels of supervision.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

BodyNet: Volumetric Inference of 3D Human Body Shapes.
Proceedings of the Computer Vision - ECCV 2018, 2018

Actor-Centric Relation Network.
Proceedings of the Computer Vision - ECCV 2018, 2018

How Good Is My GAN?
Proceedings of the Computer Vision - ECCV 2018, 2018

Modeling Visual Context Is Key to Augmenting Object Detection Datasets.
Proceedings of the Computer Vision - ECCV 2018, 2018

End-to-End Incremental Learning.
Proceedings of the Computer Vision - ECCV 2018, 2018

Actor and Observer: Joint Modeling of First and Third-Person Videos.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

PoTion: Pose MoTion Representation for Action Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Expanded Parts Model for Semantic Description of Humans in Still Images.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Weakly Supervised Object Localization with Multi-Fold Multiple Instance Learning.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Convolutional Patch Representations for Image Retrieval: An Unsupervised Approach.
Int. J. Comput. Vis., 2017

Leveraging the Path Signature for Skeleton-based Human Action Recognition.
CoRR, 2017

SfM-Net: Learning of Structure and Motion from Video.
CoRR, 2017

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions.
CoRR, 2017

Inferring the Structure of Action Movies.
Proceedings of the 6th Workshop on Intelligent Cinematography and Editing, 2017

Learning Video Object Segmentation with Visual Memory.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Incremental Learning of Object Detectors without Catastrophic Forgetting.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Weakly-Supervised Learning of Visual Relations.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Areas of Attention for Image Captioning.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Action Tubelet Detector for Spatio-Temporal Action Localization.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Joint Learning of Object and Action Detectors.
Proceedings of the IEEE International Conference on Computer Vision, 2017

SCNet: Learning Semantic Correspondence.
Proceedings of the IEEE International Conference on Computer Vision, 2017

BlitzNet: A Real-Time Deep Network for Scene Understanding.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Learning from Synthetic Humans.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Learning Motion Patterns in Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

LCR-Net: Localization-Classification-Regression for Human Pose.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Detecting Parts for Action Localization.
Proceedings of the British Machine Vision Conference 2017, 2017

2016
Analysing Domain Shift Factors between Videos and Images for Object Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2016

Approximate Fisher Kernels of Non-iid Image Models for Image Categorization.
IEEE Trans. Pattern Anal. Mach. Intell., 2016

Label-Embedding for Image Classification.
IEEE Trans. Pattern Anal. Mach. Intell., 2016

A Robust and Efficient Video Representation for Action Recognition.
Int. J. Comput. Vis., 2016

DeepMatching: Hierarchical Deformable Dense Matching.
Int. J. Comput. Vis., 2016

Circulant Temporal Encoding for Video Retrieval and Temporal Alignment.
Int. J. Comput. Vis., 2016

Towards Weakly-Supervised Action Localization.
CoRR, 2016

MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Weakly-Supervised Semantic Segmentation Using Motion Cues.
Proceedings of the Computer Vision - ECCV 2016, 2016

Multi-region Two-Stream R-CNN for Action Detection.
Proceedings of the Computer Vision - ECCV 2016, 2016

Proposal Flow.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Deep Convolutional Matching.
CoRR, 2015

Beat-Event Detection in Action Movie Franchises.
CoRR, 2015

Unsupervised object discovery and localization in images and videos.
Proceedings of the 12th International Conference on Ubiquitous Robots and Ambient Intelligence, 2015

Learning to Track for Spatio-Temporal Action Localization.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Local Convolutional Features with Unsupervised Training for Image Retrieval.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Unsupervised Object Discovery and Tracking in Video Collections.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Online Object Tracking with Proposal Selection.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

P-CNN: Pose-Based CNN Features for Action Recognition.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Weakly-Supervised Alignment of Video with Text.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Learning to detect Motion Boundaries.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

EpicFlow: Edge-preserving interpolation of correspondences for optical flow.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Unsupervised object discovery and localization in the wild: Part-based matching with bottom-up region proposals.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
Good Practice in Large-Scale Learning for Image Classification.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Activity representation with motion hierarchies.
Int. J. Comput. Vis., 2014

The INRIA-LIM-VocR and AXES submissions to TrecVid 2014 Multimedia Event Detection.
Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

Convolutional Kernel Networks.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Category-Specific Video Summarization.
Proceedings of the Computer Vision - ECCV 2014, 2014

Spatio-temporal Object Detection Proposals.
Proceedings of the Computer Vision - ECCV 2014, 2014

Occlusion and Motion Reasoning for Long-Term Tracking.
Proceedings of the Computer Vision - ECCV 2014, 2014

Weakly Supervised Action Labeling in Videos under Ordering Constraints.
Proceedings of the Computer Vision - ECCV 2014, 2014

Transformation Pursuit for Image Classification.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Efficient Action Localization with Approximately Normalized Fisher Vectors.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Multi-fold MIL Training for Weakly Supervised Object Localization.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Mixing Body-Part Sequences for Human Pose Estimation.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
Explicit Modeling of Human-Object Interactions in Realistic Videos.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Temporal Localization of Actions with Actoms.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Dense Trajectories and Motion Boundary Descriptors for Action Recognition.
Int. J. Comput. Vis., 2013

Automatic Recognition of Human Activities in Realistic Videos.
ERCIM News, 2013


The AXES PRO video search system.
Proceedings of the International Conference on Multimedia Retrieval, 2013

Estimating Human Pose with Flowing Puppets.
Proceedings of the IEEE International Conference on Computer Vision, 2013

DeepFlow: Large Displacement Optical Flow with Deep Matching.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Action Recognition with Improved Trajectories.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Action and Event Recognition with Fisher Vectors on a Compact Feature Set.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Towards Understanding Action Recognition.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Stable Hyper-pooling and Query Expansion for Event Detection.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Segmentation Driven Object Detection with Fisher Vectors.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Finding Actors and Actions in Movies.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Expanded Parts Model for Human Attribute and Action Recognition in Still Images.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Event Retrieval in Large Video Collections with Circulant Temporal Encoding.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Label-Embedding for Attribute-Based Classification.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
Weakly Supervised Learning of Interactions between Humans and Objects.
IEEE Trans. Pattern Anal. Mach. Intell., 2012

Aggregating Local Image Descriptors into Compact Codes.
IEEE Trans. Pattern Anal. Mach. Intell., 2012

Accurate Object Recognition with Shape Masks.
Int. J. Comput. Vis., 2012

Face Recognition from Caption-Based Supervision.
Int. J. Comput. Vis., 2012


Correlation-based burstiness for logo retrieval.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Discriminative spatial saliency for image classification.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Learning object class detectors from weakly annotated video.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Towards good practice in large-scale learning for image classification.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Image categorization using Fisher kernels of non-iid image models.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Recognizing activities with cluster-trees of tracklets.
Proceedings of the British Machine Vision Conference, 2012

2011
Product Quantization for Nearest Neighbor Search.
IEEE Trans. Pattern Anal. Mach. Intell., 2011

INRIA @TRECVID 2011: Copy Detection & Multimedia Event Detection.
Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Unsupervised metric learning for face identification in TV video.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Action recognition by dense trajectories.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Actom sequence models for efficient action detection.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Combining attributes and Fisher vectors for efficient image retrieval.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

A time series kernel for action recognition.
Proceedings of the British Machine Vision Conference, 2011

2010
An Image-Based Approach to Video Copy Detection With Spatio-Temporal Post-Filtering.
IEEE Trans. Multim., 2010

Accurate Image Search Using the Contextual Dissimilarity Measure.
IEEE Trans. Pattern Anal. Mach. Intell., 2010

Improving Bag-of-Features for Large Scale Image Search.
Int. J. Comput. Vis., 2010

From Images to Shape Models for Object Detection.
Int. J. Comput. Vis., 2010

INRIA LEAR-TEXMEX: Video Copy Detection Task.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

Image annotation with tagprop on the MIRFLICKR set.
Proceedings of the 11th ACM SIGMM International Conference on Multimedia Information Retrieval, 2010

Human Focused Action Localization in Video.
Proceedings of the Trends and Topics in Computer Vision, 2010

Multiple Instance Metric Learning from Automatically Labeled Bags of Faces.
Proceedings of the Computer Vision, 2010

Compact Video Description for Copy Detection with Precise Temporal Alignment.
Proceedings of the Computer Vision, 2010

Multi-view object class detection with a 3D geometric model.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Aggregating local descriptors into a compact image representation.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Multimodal semi-supervised learning for image classification.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009
Learning Color Names for Real-World Applications.
IEEE Trans. Image Process., 2009

Description of interest regions with local binary patterns.
Pattern Recognit., 2009

Large Scale Image Search.
Proceedings of the IAPR Conference on Machine Vision Applications (IAPR MVA 2009), 2009

Packing bag-of-features.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

Combining efficient object localization and image classification.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

Is that you? Metric learning approaches for face identification.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

Actions in context.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

Learning shape prior models for object matching.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

On the burstiness of visual elements.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

INRIA-LEAR's Participation in ImageCLEF 2009.
Proceedings of the Working Notes for CLEF 2009 Workshop co-located with the 13th European Conference on Digital Libraries (ECDL 2009), Corfù, Greece, September 30, 2009

Evaluation of GIST descriptors for web-scale image search.
Proceedings of the 8th ACM International Conference on Image and Video Retrieval, 2009

Evaluation of Local Spatio-temporal Features for Action Recognition.
Proceedings of the British Machine Vision Conference, 2009

Mining Visual Actions from Movies.
Proceedings of the British Machine Vision Conference, 2009

2008
Groups of Adjacent Contour Segments for Object Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2008

Editorial.
Int. J. Comput. Vis., 2008

Learning to Recognize Objects with Little Supervision.
Int. J. Comput. Vis., 2008

INRIA-LEAR'S Video Copy Detection System.
Proceedings of the TRECVID 2008 workshop participants notebook papers, 2008

Query adaptative locality sensitive hashing.
Proceedings of the IEEE International Conference on Acoustics, 2008

Recent Advances in Large Scale Image Search.
Proceedings of the Emerging Trends in Visual Computing, 2008

Object Recognition by Integrating Multiple Image Segmentations.
Proceedings of the Computer Vision, 2008

Constructing Category Hierarchies for Visual Recognition.
Proceedings of the Computer Vision, 2008

Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search.
Proceedings of the Computer Vision, 2008

Viewpoint-independent object class detection using 3D Feature Maps.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

Learning realistic human actions from movies.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

Automatic face naming with caption-based supervision.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

A Spatio-Temporal Descriptor Based on 3D-Gradients.
Proceedings of the British Machine Vision Conference 2008, Leeds, UK, September 2008, 2008

2007
Segmenting, Modeling, and Matching Video Clips Containing Multiple Moving Objects.
IEEE Trans. Pattern Anal. Mach. Intell., 2007

Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study.
Int. J. Comput. Vis., 2007

High-dimensional data clustering.
Comput. Stat. Data Anal., 2007

Applying Color Names to Image Description.
Proceedings of the International Conference on Image Processing, 2007

Using High-Level Visual Information for Color Constancy.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

Vector Quantizing Feature Space with a Regular Lattice.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

Learning Color Names from Real-World Images.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

Semantic Hierarchies for Visual Object Recognition.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

Accurate Object Localization with Shape Masks.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

Flexible Object Models for Category-Level 3D Object Recognition.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

A contextual dissimilarity measure for accurate and efficient image search.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

Accurate Object Detection with Deformable Shape Models Learnt from Images.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

2006
3D Object Modeling and Recognition Using Local Affine-Invariant Image Descriptors and Multi-View Spatial Constraints.
Int. J. Comput. Vis., 2006

Description of Interest Regions with Center-Symmetric Local Binary Patterns.
Proceedings of the Computer Vision, Graphics and Image Processing, 5th Indian Conference, 2006

Object Localization by Subspace Clustering of Local Descriptors.
Proceedings of the Computer Vision, Graphics and Image Processing, 5th Indian Conference, 2006

Blur Robust and Color Constant Image Description.
Proceedings of the International Conference on Image Processing, 2006

Coloring Local Feature Extraction.
Proceedings of the Computer Vision, 2006

Maximally Stable Local Description for Scale Selection.
Proceedings of the Computer Vision, 2006

Human Detection Using Oriented Histograms of Flow and Appearance.
Proceedings of the Computer Vision, 2006

Combining Regions and Patches for Object Class Localization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2006

Spatial Weighting for Bag-of-Features.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006

3D Object Modeling and Recognition from Photographs and Image Sequences.
Proceedings of the Toward Category-Level Object Recognition, 2006


A Discriminative Framework for Texture and Object Recognition Using Local Image Features.
Proceedings of the Toward Category-Level Object Recognition, 2006

A Semi-supervised Learning Approach to Object Recognition with Spatial Integration of Local Features and Segmentation Cues.
Proceedings of the Toward Category-Level Object Recognition, 2006

2005
A Performance Evaluation of Local Descriptors.
IEEE Trans. Pattern Anal. Mach. Intell., 2005

A Sparse Texture Representation Using Local Affine Regions.
IEEE Trans. Pattern Anal. Mach. Intell., 2005

A Comparison of Affine Region Detectors.
Int. J. Comput. Vis., 2005

Class-Specific Subspace Discriminant Analysis for High-Dimensional Data.
Proceedings of the Subspace, 2005


A Maximum Entropy Framework for Part-Based Texture and Object Recognition.
Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV 2005), 2005

Modèles markoviens pour l'organisation spatiale de descripteurs d'images.
Proceedings of the Actes de CAP 05, Conférence francophone sur l'apprentissage automatique, 2005

Markov Random Fields for Textures Recognition with Local Invariant Regions and their Geometric Relationships.
Proceedings of the British Machine Vision Conference 2005, Oxford, UK, September 2005, 2005

2004
Weakly Supervised Learning of Visual Models and Its Application to Content-Based Retrieval.
Int. J. Comput. Vis., 2004

Scale & Affine Invariant Interest Point Detectors.
Int. J. Comput. Vis., 2004

Image matching with scale adjustment.
Comput. Vis. Image Underst., 2004

Comparison of affine-invariant local detectors and descriptors.
Proceedings of the 2004 12th European Signal Processing Conference, 2004

Human Detection Based on a Probabilistic Assembly of Robust Part Detectors.
Proceedings of the Computer Vision, 2004

Scale-Invariant Shape Features for Recognition of Object Categories.
Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), with CD-ROM, 27 June, 2004

Semi-Local Affine Parts for Object Recognition.
Proceedings of the British Machine Vision Conference, 2004

2003
Face Detection and Tracking in a Video by Propagating Detection Probabilities.
IEEE Trans. Pattern Anal. Mach. Intell., 2003

Affine-Invariant Local Descriptors and Neighborhood Statistics for Texture Recognition.
Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV 2003), 2003

Selection of Scale-Invariant Parts for Object Class Recognition.
Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV 2003), 2003

3D Object Modeling and Recognition Using Affine-Invariant Patches and Multi-View Spatial Constraints.
Proceedings of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), 2003

A Sparse Texture Representation Using Affine-Invariant Regions.
Proceedings of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), 2003

Shape recognition with edge-based features.
Proceedings of the British Machine Vision Conference, 2003

2002
Learning to Parse Pictures of People.
Proceedings of the Computer Vision, 2002

An Affine Invariant Interest Point Detector.
Proceedings of the Computer Vision, 2002

On Pencils of Tangent Planes and the Recognition of Smooth 3D Shapes from Silhouettes.
Proceedings of the Computer Vision, 2002

2001
Indexing Based on Scale Invariant Interest Points.
Proceedings of the Eighth International Conference On Computer Vision (ICCV-01), Vancouver, British Columbia, Canada, July 7-14, 2001, 2001

Constructing models for content-based image retrieval.
Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), 2001

Face detection in a video sequence - a temporal approach.
Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), 2001

2000
The Geometry and Matching of Lines and Curves Over Multiple Views.
Int. J. Comput. Vis., 2000

Evaluation of Interest Point Detectors.
Int. J. Comput. Vis., 2000

Face Detection Based on Generic Local Descriptors and Spatial Constraints.
Proceedings of the 15th International Conference on Pattern Recognition, 2000

Matching Images with Different Resolutions.
Proceedings of the 2000 Conference on Computer Vision and Pattern Recognition (CVPR 2000), 2000

1999
Integrating Geometric and Photometric Information for Image Retrieval.
Proceedings of the Shape, Contour and Grouping in Computer Vision, 1999

A Structured Probabilistic Model for Recognition.
Proceedings of the 1999 Conference on Computer Vision and Pattern Recognition (CVPR '99), 1999

1998
Building and using hypervideos.
Proceedings of the Fourth IEEE Workshop on Applications of Computer Vision, 1998

Efficient Matching with Invariant Local Descriptors.
Proceedings of the Advances in Pattern Recognition, 1998

Comparing and Evaluating Interest Points.
Proceedings of the Sixth International Conference on Computer Vision (ICCV-98), 1998

The Geometry and Matching of Curves in Multiple Views.
Proceedings of the Computer Vision, 1998

1997
Local Grayvalue Invariants for Image Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., 1997

Automatic line matching across views.
Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97), 1997

Bayesian Decision Versus Voting for Image Retrieval.
Proceedings of the Computer Analysis of Images and Patterns, 7th International Conference, 1997

1996
Appariement d'images par invariants locaux de niveaux de gris. Application à l'indexation d'une base d'objets. (Image matching by local greyvalue invariants. Applied to indexing an object database).
PhD thesis, 1996

Image retrieval using local characterization.
Proceedings of the 1996 International Conference on Image Processing, 1996

An Image Oriented CAD Approach.
Proceedings of the Object Representation in Computer Vision II, 1996

Combining greyvalue invariants with local constraints for object recognition.
Proceedings of the 1996 Conference on Computer Vision and Pattern Recognition (CVPR '96), 1996

1994
Obstacle detection analysis.
Proceedings of the Conference on Computer Vision and Pattern Recognition, 1994

1993
Auto-calibration by direct observation of objects.
Image Vis. Comput., 1993

Dynamic calibration of an active stereo head.
Proceedings of the Fourth International Conference on Computer Vision, 1993

Maintaining stereo calibration by tracking image points.
Proceedings of the Conference on Computer Vision and Pattern Recognition, 1993

