Lorenzo Torresani

Affiliations:
  • Facebook AI Research, Meta, USA


According to our database, Lorenzo Torresani authored at least 129 papers between 1997 and 2024.

Bibliography

2024
Semantic Compositions Enhance Vision-Language Contrastive Learning.
CoRR, 2024

UNICORN: A Unified Causal Video-Oriented Language-Modeling Framework for Temporal Video-Language Tasks.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

4DIFF: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Learning to Segment Referred Objects from Narrated Egocentric Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Step Differences in Instructional Video.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Video ReCap: Recursive Captioning of Hour-Long Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.
CoRR, 2023

Multiscale Video Pretraining for Long-Term Activity Forecasting.
CoRR, 2023

MINOTAUR: Multi-task Video Grounding From Multimodal Queries.
CoRR, 2023

Egocentric Video Task Translation @ Ego4D Challenge 2022.
CoRR, 2023

What You Say Is What You Show: Visual Narration Detection in Instructional Videos.
CoRR, 2023

Ego-Only: Egocentric Action Detection without Exocentric Pretraining.
CoRR, 2023

Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

HT-Step: Aligning Instructional Articles with How-To Videos.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Ego-Only: Egocentric Action Detection without Exocentric Transferring.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning to Ground Instructional Articles in Videos through Narrations.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Relational Space-Time Query in Long-Form Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Egocentric Video Task Translation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

HierVL: Learning Hierarchical Video-Language Embeddings.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Generalized Few-Shot Video Classification With Video Retrieval and Feature Generation.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

HistoPerm: A Permutation-Based View Generation Approach for Learning Histopathologic Feature Representations.
CoRR, 2022

Deformable Video Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Long-Short Temporal Contrastive Learning of Video Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022


Learning To Recognize Procedural Activities with Distant Supervision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Calibrating Histopathology Image Classifiers Using Label Smoothing.
Proceedings of the Artificial Intelligence in Medicine, 2022

Label Hallucination for Few-Shot Classification.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video.
CoRR, 2021

Resolution-based distillation for efficient histology image classification.
Artif. Intell. Medicine, 2021

Learn like a Pathologist: Curriculum Learning by Annotator Agreement for Histopathology Image Classification.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Supervoxel Attention Graphs for Long-Range Video Modeling.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Only Time Can Tell: Discovering Temporal Data for Temporal Modeling.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Is Space-Time Attention All You Need for Video Understanding?
Proceedings of the 38th International Conference on Machine Learning, 2021

Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks.
Proceedings of the 38th International Conference on Machine Learning, 2021

A Multi-View Approach to Audio-Visual Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2021

Beyond Short Clips: End-to-End Video-Level Learning With Collaborative Memories.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

A Petri Dish for Histopathology Image Analysis.
Proceedings of the Artificial Intelligence in Medicine, 2021

2020
COBE: Contextualized Object Embeddings from Narrated Instructional Video.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Self-Supervised Learning by Cross-Modal Audio-Video Clustering.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Generalized Many-Way Few-Shot Video Classification.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

Video Modeling With Correlation Networks.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Listen to Look: Action Recognition by Previewing Audio.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Attentive Action and Context Factorization.
Proceedings of the 31st British Machine Vision Conference 2020, 2020

Stein Variational Inference for Discrete Distributions.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
Self-Supervised Learning by Cross-Modal Audio-Video Clustering.
CoRR, 2019

UniDual: A Unified Model for Image and Video Understanding.
CoRR, 2019

Learning Temporal Pose Estimation from Sparsely-Labeled Videos.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

STAR-Caps: Capsule Networks with Straight-Through Attentive Routing.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

HACS: Human Action Clips and Segments Dataset for Recognition and Temporal Localization.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Video Classification With Channel-Separated Convolutional Networks.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

SCSampler: Sampling Salient Clips From Video for Efficient Action Recognition.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

DistInit: Learning Video Representations Without a Single Labeled Video.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Leveraging the Present to Anticipate the Future in Videos.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Semantic Segmentation of the Growth Stages of Plasmodium Parasites using Convolutional Neural Networks.
Proceedings of the 2019 IEEE AFRICON, Accra, Ghana, September 25-27, 2019, 2019

2018
Learning Discriminative Motion Features Through Detection.
CoRR, 2018

Co-Training of Audio and Video Representations from Self-Supervised Temporal Synchronization.
CoRR, 2018

BranchConnect: Image Categorization with Learned Branch Connections.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Scenes-Objects-Actions: A Multi-task, Multi-label Video Dataset.
Proceedings of the Computer Vision - ECCV 2018, 2018

Object Detection in Video with Spatiotemporal Sampling Networks.
Proceedings of the Computer Vision - ECCV 2018, 2018

MaskConnect: Connectivity Learning by Gradient Descent.
Proceedings of the Computer Vision - ECCV 2018, 2018

Computer Vision in DH.
Proceedings of the 13th Annual International Conference of the Alliance of Digital Humanities Organizations, 2018

A Closer Look at Spatiotemporal Convolutions for Action Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Detect-and-Track: Efficient Pose Estimation in Videos.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Self-Supervised Feature Learning for Semantic Segmentation of Overhead Imagery.
Proceedings of the British Machine Vision Conference 2018, 2018

2017
Multiple hypothesis colorization and its application to image compression.
Comput. Vis. Image Underst., 2017

SLAC: A Sparsely Labeled Dataset for Action Classification and Localization.
CoRR, 2017

Connectivity Learning in Multi-Branch Networks.
CoRR, 2017

Deep-Learning for Classification of Colorectal Polyps on Whole-Slide Images.
CoRR, 2017

BranchConnect: Large-Scale Visual Recognition with Learned Branch Connections.
CoRR, 2017

Learning to Inpaint for Image Compression.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Deciphering Severely Degraded License Plates.
Proceedings of the Media Watermarking, Security, and Forensics 2017, Burlingame, CA, USA, 29 January 2017, 2017

Recurrent Mixture Density Network for Spatiotemporal Visual Attention.
Proceedings of the 5th International Conference on Learning Representations, 2017

Looking Under the Hood: Deep Neural Network Visualization to Interpret Whole-Slide Image Analysis Outcomes for Colorectal Polyps.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

Convolutional Random Walk Networks for Semantic Image Segmentation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Local Perturb-and-MAP for Structured Prediction.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

2016
EXMOVES: Mid-level Features for Efficient Action Recognition and Video Analysis.
Int. J. Comput. Vis., 2016

ViCom: Benchmark and Methods for Video Comprehension.
CoRR, 2016

Colorization for Image Compression.
CoRR, 2016

Self-taught object localization with deep networks.
Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, 2016

Coupled depth learning.
Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, 2016

Network of Experts for Large-Scale Image Categorization.
Proceedings of the Computer Vision - ECCV 2016, 2016

Deep End2End Voxel2Voxel Prediction.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016

Semantic Segmentation with Boundary Neural Fields.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
What Can Pictures Tell Us About Web Pages? Improving Document Search Using Images.
IEEE Trans. Pattern Anal. Mach. Intell., 2015

Coarse-to-fine Depth Estimation from a Single Image via Coupled Regression and Dictionary Learning.
CoRR, 2015

Learning Spatiotemporal Features with 3D Convolutional Networks.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

High-for-Low and Low-for-High: Efficient Boundary Detection from Deep Object Features and Its Applications to High-Level Vision.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

DeepEdge: A multi-scale bifurcated deep network for top-down contour detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
Classemes: A Compact Image Descriptor for Efficient Novel-Class Recognition and Search.
Proceedings of the Registration and Recognition in Images and Videos, 2014

Weakly Supervised Learning.
Computer Vision, A Reference Guide, 2014

Learning discriminative localization from weakly labeled data.
Pattern Recognit., 2014

Classemes and Other Classifier-Based Features for Efficient Object Categorization.
IEEE Trans. Pattern Anal. Mach. Intell., 2014

Learning what is where from unlabeled images: joint localization and clustering of foreground objects.
Mach. Learn., 2014

EXMOVES: Classifier-based Features for Scalable Action Recognition.
Proceedings of the 2nd International Conference on Learning Representations, 2014

C3D: Generic Features for Video Analysis.
CoRR, 2014

AutoCaption: Automatic caption generation for personal photos.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2014

Improving tag transfer for image annotation using visual and semantic information.
Proceedings of the 12th International Workshop on Content-Based Multimedia Indexing, 2014

2013
A Dual Decomposition Approach to Feature Correspondence.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

CarSafe app: alerting drowsy and distracted drivers using dual cameras on smartphones.
Proceedings of the 11th Annual International Conference on Mobile Systems, 2013

Leveraging Structure from Motion to Learn Discriminative Codebooks for Scalable Landmark Classification.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
WalkSafe: a pedestrian safety app for mobile phone users who walk and talk while crossing roads.
Proceedings of the 2012 Workshop on Mobile Computing Systems and Applications, 2012

CarSafe: a driver safety app that detects dangerous driving behavior using dual-cameras on smartphones.
Proceedings of the 2012 ACM Conference on Ubiquitous Computing, 2012

CarSafe demo: supporting driver safety using dual-cameras on smartphones.
Proceedings of the 2012 ACM Conference on Ubiquitous Computing, 2012

Measuring Image Distances via Embedding in a Semantic Manifold.
Proceedings of the Computer Vision - ECCV 2012, 2012

Meta-class features for large-scale object categorization on a budget.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
PiCoDes: Learning a Compact Code for Novel-Category Recognition.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Scalable object-class retrieval with approximate and top-k ranking.
Proceedings of the IEEE International Conference on Computer Vision, 2011

2010
Exploiting weakly-labeled Web images to improve object classification: a domain adaptation approach.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Efficient Object Category Recognition Using Classemes.
Proceedings of the Computer Vision, 2010

Simultaneous point matching and 3D deformable surface reconstruction.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009
Unsupervised hierarchical modeling of locomotion styles.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Weakly supervised discriminative localization and classification: a joint learning process.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

Learning query-dependent prefilters for scalable image retrieval.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

2008
Nonrigid Structure-from-Motion: Estimating Shape and Motion with Hierarchical Priors.
IEEE Trans. Pattern Anal. Mach. Intell., 2008

Feature Correspondence Via Graph Matching: Models and Global Optimization.
Proceedings of the Computer Vision, 2008

2006
Large Margin Component Analysis.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

Learning Motion Style Synthesis from Perceptual Observations.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

2005
Learning models of human movement from video.
PhD thesis, 2005

2004
Automatic Non-rigid 3D Modeling from Video.
Proceedings of the Computer Vision, 2004

2003
Learning Non-Rigid 3D Shape from 2D Motion.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

2002
Space-Time Tracking.
Proceedings of the Computer Vision, 2002

2001
Tracking and Modeling Non-Rigid Objects with Rank Constraints.
Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), 2001

1997
Analysis and Encoding of Lip Movements.
Proceedings of the Audio- and Video-Based Biometric Person Authentication, 1997

