Hervé Bredin

Orcid: 0000-0002-3739-925X

According to our database, Hervé Bredin authored at least 87 papers between 2005 and 2024.

Bibliography

2024
Diart: A Python Library for Real-Time Speaker Diarization.
J. Open Source Softw., 2024

On the calibration of powerset speaker diarization models.
CoRR, 2024

Premier système IRIT-MyFamillyUp pour la compétition sur la reconnaissance des émotions Odyssey 2024.
Actes des 35èmes Journées d'Études sur la Parole, 2024

IRIT-MFU Multi-modal systems for emotion classification for Odyssey 2024 challenge.
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings.
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

2023
Powerset multi-class cross entropy loss for neural speaker diarization.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

pyannote.audio 2.1 speaker diarization pipeline: principle, benchmark, and recipe.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Brouhaha: Multi-Task Training for Voice Activity Detection, Speech-to-Noise Ratio, and C50 Room Acoustics Estimation.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Continual Self-Supervised Domain Adaptation for End-to-End Speaker Diarization.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Bazinga! A Dataset for Multi-Party Dialogues Structuring.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

2021
End-To-End Speaker Segmentation for Overlap-Aware Resegmentation.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Overlap-Aware Low-Latency Online Speaker Diarization Based on End-to-End Local Segmentation.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification.
Proceedings of the Statistical Language and Speech Processing, 2020

A Metric Learning Approach to Misogyny Categorization.
Proceedings of the 5th Workshop on Representation Learning for NLP, 2020

End-to-End Domain-Adversarial Voice Activity Detection.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

An Open-Source Voice Type Classifier for Child-Centered Daylong Recordings.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Overlap-Aware Diarization: Resegmentation Using Neural End-to-End Overlapped Speech Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Pyannote.Audio: Neural Building Blocks for Speaker Diarization.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
The Speed Submission to DIHARD II: Contributions & Lessons Learned.
CoRR, 2019

LSTM Based Similarity Measurement with Spectral Clustering for Speaker Diarization.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

« Hé Manu, tu descends ? » : identification nommée du locuteur dans les dialogues.
Proceedings of the COnférence en Recherche d'Informations et Applications, 2019

2018
IRIM at TRECVID 2018: Instance Search.
Proceedings of the 2018 TREC Video Retrieval Evaluation, 2018

PLUMCOT at TRECVid Instance Search 2018.
Proceedings of the 2018 TREC Video Retrieval Evaluation, 2018

Low-latency speaker spotting with online diarization and detection.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Neural Speech Turn Segmentation and Affinity Propagation for Speaker Diarization.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

ODESSA at Albayzin Speaker Diarization Challenge 2018.
Proceedings of the Fourth International Conference, 2018

ODESSA/PLUMCOT at Albayzin Multimodal Diarization Challenge 2018.
Proceedings of the Fourth International Conference, 2018

2017
Multimodal person discovery in broadcast TV: lessons learned from MediaEval 2015.
Multim. Tools Appl., 2017

IRIM at TRECVID 2017: Instance Search.
Proceedings of the 2017 TREC Video Retrieval Evaluation, 2017

Speaker Change Detection in Broadcast TV Using Bidirectional Long Short-Term Memory Networks.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Combining Speaker Turn Embedding and Incremental Structure Prediction for Low-Latency Speaker Diarization.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

pyannote.metrics: A Toolkit for Reproducible Evaluation, Diagnostic, and Error Analysis of Speaker Diarization Systems.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

TristouNet: Triplet loss for speaker turn embedding.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
IRIM at TRECVID 2016: Instance Search.
Proceedings of the 2016 TREC Video Retrieval Evaluation, 2016

Improving Speaker Diarization of TV Series using Talking-Face Detection and Clustering.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Multimodal Person Discovery in Broadcast TV at MediaEval 2016.
Working Notes Proceedings of the MediaEval 2016 Workshop, 2016

Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

The CAMOMILE Collaborative Annotation Platform for Multi-modal, Multi-lingual and Multi-media Documents.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Post-Hoc Interactive Analytics of Errors in the Context of a Person Discovery Task.
Proceedings of the IEEE International Symposium on Multimedia, 2016

2015
Lexical speaker identification in TV shows.
Multim. Tools Appl., 2015

LIMSI at MediaEval 2015: Person Discovery in Broadcast TV Task.
Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Multimodal Person Discovery in Broadcast TV at MediaEval 2015.
Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Structured prediction for speaker identification in TV series.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Collaborative annotation for person identification in TV shows.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A Visual Analytics Approach to Finding Factors Improving Automatic Speaker Identifications.
Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, Seattle, WA, USA, November 09, 2015

2014
Hierarchical Late Fusion for Concept Detection in Videos.
Proceedings of the Fusion in Computer Vision - Understanding Complex Visual Content, 2014

Person instance graphs for mono-, cross- and multi-modal person recognition in multimedia data: application to speaker identification in TV broadcast.
Int. J. Multim. Inf. Retr., 2014

Person Instance Graphs for Named Speaker Identification in TV Broadcast.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

"Sheldon speaking, Bonjour!": Leveraging Multilingual Tracks for (Weakly) Supervised Speaker Identification.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

LIMSI @ MediaEval SED 2014.
Working Notes Proceedings of the MediaEval 2014 Workshop, 2014

TVD: A Reproducible and Multiply Aligned TV Series Dataset.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

A Web-Based Tool for the Visual Analysis of Media Annotations.
Proceedings of the 18th International Conference on Information Visualisation, 2014

Collaborative Annotation of Multimedia Resources.
Proceedings of the Cooperative Design, Visualization, and Engineering, 2014

2013
Towards a Better Integration of Written Names for Unsupervised Speakers Identification in Videos.
Proceedings of the First Workshop on Speech, 2013

Integer linear programming for speaker diarization and cross-modal identification in TV broadcast.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012
Vers un résumé automatique de séries télévisées basé sur une recherche multimodale d'histoires.
Document Numérique, 2012

A Public Audio Identification Evaluation Framework for Broadcast Monitoring.
Appl. Artif. Intell., 2012

StoViz: story visualization of TV series.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Unsupervised Speaker Identification using Overlaid Texts in TV Broadcast.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Segmentation of TV shows into scenes using speaker diarization and speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Community-driven hierarchical fusion of numerous classifiers: Application to video semantic indexing.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Hierarchical Late Fusion for Concept Detection in Videos.
Proceedings of the Computer Vision - ECCV 2012. Workshops and Demonstrations, 2012

Fusion of Speech, Faces and Text for Person Identification in TV Broadcast.
Proceedings of the Computer Vision - ECCV 2012. Workshops and Demonstrations, 2012

Toward plot de-interlacing in TV series using scenes clustering.
Proceedings of the 10th International Workshop on Content-Based Multimedia Indexing, 2012

2011

2010
IRIT @ TRECVid 2010: Hidden Markov Models for Context-aware Late Fusion of Multiple Audio Classifiers.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

2009
Audio-visual speech asynchrony detection using co-inertia analysis and coupled hidden markov models.
Pattern Anal. Appl., 2009

Talking-Face Identity Verification, Audiovisual Forgery, and Robustness Issues.
EURASIP J. Adv. Signal Process., 2009

IRIT @ TRECVid HLF 2009 - Audio to the Rescue.
Proceedings of the TRECVID 2009 workshop participants notebook papers, 2009

An interactive and multi-level framework for summarising user generated videos.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

2008
Rushes video summarization using a collaborative approach.
Proceedings of the 2nd ACM Workshop on Video Summarization, 2008

Dublin City University at the TRECVid 2008 BBC rushes summarisation task.
Proceedings of the 2nd ACM Workshop on Video Summarization, 2008

Some results from the biosecure talking face evaluation campaign.
Proceedings of the IEEE International Conference on Acoustics, 2008

Making talking-face authentication robust to deliberate imposture.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Audiovisual Speech Synchrony Measure: Application to Biometrics.
EURASIP J. Adv. Signal Process., 2007

Some Experiments in Audio-Visual Speech Processing.
Proceedings of the Advances in Nonlinear Speech Processing, 2007

Audio-Visual Speech Synchrony Measure for Talking-Face Identity Verification.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
GMM-based SVM for face recognition.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

Detecting Replay Attacks in Audiovisual Identity Verification.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Audio-visual Identity Verification: An Introductory Overview.
Proceedings of the Progress in Nonlinear Speech Processing, 2005