Ricard Marxer

CoRR, 2024

PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings.

Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

Speech Foundation Models on Intelligibility Prediction for Hearing-Impaired Listeners.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Scaling Properties of Speech Language Models.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

SUCRe: Leveraging Scene Structure for Underwater Color Restoration.

Proceedings of the International Conference on 3D Vision, 2024

2023

Eiffel Tower: A deep-sea underwater dataset for long-term visual localization.

Int. J. Robotics Res., August, 2023

Progress and Prospects for Spoken Language Technology: Results from Five Sexennial Surveys.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

On the Benefits of Self-supervised Learned Speech Representations for Predicting Human Phonetic Misperceptions.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022

Homography-Based Loss Function for Camera Pose Regression.

IEEE Robotics Autom. Lett., 2022

Variable-rate hierarchical CPC leads to acoustic unit discovery in speech.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Blind Speech Separation Through Direction of Arrival Estimation Using Deep Neural Networks with a Flexibility on the Number of Speakers.

Mohammed Hafsati

Kamil Bentounes

José Andrés González López

Proceedings of the 24th IEEE International Workshop on Multimedia Signal Processing, 2022

Contrastive Prediction Strategies for Unsupervised Segmentation and Categorization of Phonemes and Words.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Aligned Contrastive Predictive Coding.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Information Retrieval for ZeroSpeech 2021: The Submission by University of Wroclaw.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Voice Restoration with Silent Speech Interfaces (ReSSInt).

Inma Hernáez

Eva Navas

José Luis Pérez-Córdoba

Ibon Saratxaga

Gonzalo Olivares

Jon Sánchez de la Fuente

Alberto Galdón

Víctor García Romillo

Míriam González-Atienza

Proceedings of the Fifth International Conference, 2021

2020

A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Robust Training of Vector Quantized Bottleneck Models.

Hans J. G. A. Dolfing

Sameer Khurana

Tanel Alumäe

Antoine Laurent

Proceedings of the 2020 International Joint Conference on Neural Networks, 2020

DOCC10: Open access dataset of marine mammal transient studies and end-to-end CNN classification.

Proceedings of the 2020 International Joint Conference on Neural Networks, 2020

Deep Learning and Domain Transfer for Orca Vocalization Detection.

Proceedings of the 2020 International Joint Conference on Neural Networks, 2020

Deep Learning Classification with Noisy Labels.

Proceedings of the 2020 IEEE International Conference on Multimedia & Expo Workshops, 2020

The "ScribbleLens" Dutch Historical Handwriting Corpus.

Hans J. G. A. Dolfing

Proceedings of the 17th International Conference on Frontiers in Handwriting Recognition, 2020

2019

Real-time Passive Acoustic 3D Tracking of Deep Diving Cetacean by Small Non-uniform Mobile Surface Antenna.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

2018

The impact of the Lombard effect on audio and visual speech recognition systems.

Speech Commun., 2018

DNN Driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017

An analysis of environment, microphone and data simulation mismatches in robust speech recognition.

Comput. Speech Lang., 2017

The third 'CHiME' speech separation and recognition challenge: Analysis and outcomes.

Comput. Speech Lang., 2017

Multi-microphone speech recognition in everyday environments.

Comput. Speech Lang., 2017

Binary Mask Estimation Strategies for Constrained Imputation-Based Speech Enhancement.

Jon Barker

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

An Innovative Speech-Based User Interface for Smarthomes and IoT Solutions to Help People with Speech and Motor Disabilities.

Proceedings of the Harnessing the Power of Technology to Improve Lives, 2017

The CHiME Challenges: Robust Speech Recognition in Everyday Environments.

Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016

Unsupervised Incremental Online Learning and Prediction of Musical Audio Signals.

Hendrik Purwins

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Vocal Interactivity in-and-between Humans, Animals, and Robots.

Serge Thill

Frontiers Robotics AI, 2016

Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR) (Dagstuhl Seminar 16442).

Serge Thill

Dagstuhl Reports, 2016

Progress and Prospects for Spoken Language Technology: Results from Four Sexennial Surveys.

María Luisa García Lecumberri

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Language Effects in Noise-Induced Word Misperceptions.

Jon Barker

Martin Cooke

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

CloudCAST - Remote Speech Technology for Speech Professionals.

Lorenzo Desideri

Fabio Tamburini

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

An Innovative Speech-Based Interface to Control AAL and IoT Solutions to Help People with Speech and Motor Disability.

Enrico Turri

Maria Rosaria Motolese

Proceedings of the Ambient Assisted Living, 2016

A Data Driven Approach to Audiovisual Speech Mapping.

Proceedings of the Advances in Brain Inspired Cognitive Systems, 2016

2015

Unsupervised Incremental Learning and Prediction of Audio Signals.

Hendrik Purwins

CoRR, 2015

Automatic dysfluency detection in dysarthric speech using deep belief networks.

Stacey Oue

Frank Rudzicz

Proceedings of the 6th Workshop on Speech and Language Processing for Assistive Technologies, 2015

Remote Speech Technology for Speech Professionals - the CloudCAST initiative.

Lorenzo Desideri

Proceedings of the 6th Workshop on Speech and Language Processing for Assistive Technologies, 2015

Knowledge transfer between speakers for personalised dialogue management.

Proceedings of the SIGDIAL 2015 Conference, 2015

A framework for the evaluation of microscopic intelligibility models.

Martin Cooke

Jon Barker

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Exploiting synchrony spectra and deep neural networks for noise-robust automatic speech recognition.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

The third 'CHiME' speech separation and recognition challenge: Dataset, task and baselines.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2012

A Tikhonov regularization method for spectrum decomposition in low latency audio source separation.

Jordi Janer

Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 2012

Combining a harmonic-based NMF decomposition with transient analysis for instantaneous percussion separation.

Jordi Janer

Keita Arimoto

Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 2012

Low-Latency Instrument Separation in Polyphonic Audio Using Timbre Models.
