Triantafyllos Afouras

David Kant

Wei-Ning Hsu

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Voicevector: Multimodal Enrolment Vectors for Speaker Separation.

Akam Rahimi

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.

Santhosh Kumar Ramakrishnan

Oluwatumininu Oguntola

Giovanni Maria Farinella

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Eventfulness for Interactive Video Alignment.

Jiatian Sun

Longxiulin Deng

Santhosh Kumar Ramakrishnan

Andrew Owens

Abe Davis

ACM Trans. Graph., August, 2023

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.

CoRR, 2023

Video-Mined Task Graphs for Keystep Recognition in Instructional Videos.

Kumar Ashutosh

Kristen Grauman

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

HT-Step: Aligning Instructional Articles with How-To Videos.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning to Ground Instructional Articles in Videos through Narrations.

Effrosyni Mavroudi

Lorenzo Torresani

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

Deep Audio-Visual Speech Recognition.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Scaling Up Sign Spotting Through Sign Language Dictionaries.

Int. J. Comput. Vis., 2022

Reading to Listen at the Cocktail Party: Multi-Modal Speech Separation.

Akam Rahimi

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Sub-word Level Lip Reading With Visual Attention.

K. R. Prajwal

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Self-supervised object detection from audio-visual correspondence.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

BBC-Oxford British Sign Language Dataset.

CoRR, 2021

Aligning Subtitles in Sign Language Videos.

Hannah Bull

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

SeeHear: Signer Diarisation and a New Dataset.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Read and Attend: Temporal Localisation in Sign Language Videos.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Localizing Visual Sounds the Hard Way.

Honglie Chen

Weidi Xie

Arsha Nagrani

Andrea Vedaldi

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Visual Keyword Spotting with Attention.

K. R. Prajwal

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Audio-Visual Synchronisation in the wild.

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020

Spot the Conversation: Speaker Diarisation in the Wild.

Jaesung Huh

Arsha Nagrani

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Now You're Speaking My Language: Visual Language Identification.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

ASR is All You Need: Cross-Modal Distillation for Lip Reading.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

BSL-1K: Scaling Up Co-articulated Sign Language Recognition Using Mouthing Cues.

Neil Fox

Proceedings of the Computer Vision - ECCV 2020, 2020

Self-supervised Learning of Audio-Visual Objects from Video.

Andrew Owens

Proceedings of the Computer Vision - ECCV 2020, 2020

Seeing wake words: Audio-visual Keyword Spotting.

Themos Stafylakis

Proceedings of the 31st British Machine Vision Conference 2020, 2020

Watch, Read and Lookup: Learning to Spot Signs from Multiple Supervisors.

Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020

2019

My Lips Are Concealed: Audio-Visual Speech Enhancement Through Obstructions.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018

LRS3-TED: a large-scale dataset for visual speech recognition.

CoRR, 2018

Deep Lip Reading: A Comparison of Models and an Online Application.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

The Conversation: Deep Audio-Visual Speech Enhancement.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Counterfactual Multi-Agent Policy Gradients.

Jakob N. Foerster

Gregory Farquhar

Nantas Nardelli

Shimon Whiteson

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning.

Jakob N. Foerster

Nantas Nardelli

Gregory Farquhar

Philip H. S. Torr

Pushmeet Kohli

Shimon Whiteson

Proceedings of the 34th International Conference on Machine Learning, 2017

2015

An Application-Layer Restful Sleepy Nodes Implementation for Internet of Things Systems.

Matthias Thoma