Laurent Girin

Orcid: 0000-0002-9214-8760

According to our database, Laurent Girin authored at least 137 papers between 1995 and 2024.

Bibliography

2024
A multimodal dynamical variational autoencoder for audiovisual speech representation learning.
Neural Networks, 2024

Fill in the Gap! Combining Self-supervised Representation Learning with Neural Audio Synthesis for Speech Inpainting.
CoRR, 2024

Exploration de la représentation multidimensionnelle de paramètres acoustiques unidimensionnels de la parole extraits par des modèles profonds non supervisés.
Proceedings of the Actes des 35èmes Journées d'Études sur la Parole, 2024

Exploring the Multidimensional Representation of Unidimensional Speech Acoustic Parameters Extracted by Deep Unsupervised Models.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Learning and controlling the source-filter representation of speech with a variational autoencoder.
Speech Commun., March, 2023

Mixture of Dynamical Variational Autoencoders for Multi-Source Trajectory Modeling and Separation.
Trans. Mach. Learn. Res., 2023

Exploring the multidimensional representation of individual speech acoustic parameters extracted by deep unsupervised models.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Unsupervised speech enhancement with deep dynamical generative speech and noise models.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Speech Modeling with a Hierarchical Transformer Dynamical VAE.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Unsupervised Speech Enhancement Using Dynamical Variational Autoencoders.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

HiT-DVAE: Human Motion Generation via Hierarchical Transformer Dynamical VAE.
CoRR, 2022

Unsupervised Multiple-Object Tracking with a Dynamical Variational Autoencoder.
CoRR, 2022

BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Repeat after Me: Self-Supervised Learning of Acoustic-to-Articulatory Mapping by Vocal Imitation.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Make That Sound More Metallic: Towards a Perceptually Relevant Control of the Timbre of Synthesizer Sounds Using a Variational Autoencoder.
Trans. Int. Soc. Music. Inf. Retr., 2021

Variational Bayesian Inference for Audio-Visual Tracking of Multiple Speakers.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Dynamical Variational Autoencoders: A Comprehensive Review.
Found. Trends Mach. Learn., 2021

A Survey of Sound Source Localization with Deep Learning Methods.
CoRR, 2021

Unsupervised Speech Enhancement using Dynamical Variational Auto-Encoders.
CoRR, 2021

Multichannel CRNN for Speaker Counting: an Analysis of Performance.
CoRR, 2021

Saladnet: Self-Attentive Multisource Localization in the Ambisonics Domain.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Learning Robust Speech Representation with an Articulatory-Regularized Variational Autoencoder.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A Benchmark of Dynamical Variational Autoencoders Applied to Speech Spectrogram Modeling.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improved feature extraction for CRNN-based multiple sound source localization.
Proceedings of the 29th European Signal Processing Conference, 2021

2020
Audio-Visual Speech Enhancement Using Conditional Variational Auto-Encoders.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Evaluating the Potential Gain of Auditory and Audiovisual Speech-Predictive Coding Using Deep Learning.
Neural Comput., 2020

What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

A Recurrent Variational Autoencoder for Speech Enhancement.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

High-Resolution Speaker Counting in Reverberant Rooms Using CRNN with Ambisonics Features.
Proceedings of the 28th European Signal Processing Conference, 2020

2019
Multichannel Online Dereverberation Based on Spectral Magnitude Inverse Filtering.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Audio-Noise Power Spectral Density Estimation Using Long Short-Term Memory.
IEEE Signal Process. Lett., 2019

Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments.
IEEE J. Sel. Top. Signal Process., 2019

Assessing the performances of different neural network architectures for the detection of screams and shouts in public transportation.
Expert Syst. Appl., 2019

Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoder.
CoRR, 2019

Expectation-Maximization for Speech Source Separation Using Convolutive Transfer Function.
CoRR, 2019

Expectation-maximisation for speech source separation using convolutive transfer function.
CAAI Trans. Intell. Technol., 2019

Audio-Visual Variational Fusion for Multi-Person Tracking with Robots.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Speech Enhancement with Variational Autoencoders and Alpha-stable Distributions.
Proceedings of the IEEE International Conference on Acoustics, 2019

Semi-supervised Multichannel Speech Enhancement with Variational Autoencoders and Non-negative Matrix Factorization.
Proceedings of the IEEE International Conference on Acoustics, 2019

Bayesian time-domain multiple sound source localization for a stochastic machine.
Proceedings of the 27th European Signal Processing Conference, 2019

2018
Multichannel Identification and Nonnegative Equalization for Dereverberation and Noise Reduction Based on Convolutive Transfer Function.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

A cascaded multiple-speaker localization and tracking system.
CoRR, 2018

Autoencoders for music sound synthesis: a comparison of linear, shallow, deep and variational models.
CoRR, 2018

A variance Modeling Framework based on variational Autoencoders for speech enhancement.
Proceedings of the 28th IEEE International Workshop on Machine Learning for Signal Processing, 2018

Online Localization of Multiple Moving Speakers in Reverberant Environments.
Proceedings of the 10th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2018

Multisource MINT Using Convolutive Transfer Function.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Accounting for Room Acoustics in Audio-Visual Multi-Speaker Tracking.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Multiple-Speaker Localization Based on Direct-Path Features and Likelihood Maximization With Spatial Sparsity Regularization.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Extending the Cascaded Gaussian Mixture Regression Framework for Cross-Speaker Acoustic-Articulatory Mapping.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Automatic animation of an articulatory tongue model from ultrasound images of the vocal tract.
Speech Commun., 2017

Multichannel Source Separation and Speech Enhancement Using the Convolutive Transfer Function.
CoRR, 2017

An EM algorithm for audio source separation based on the convolutive transfer function.
Proceedings of the 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2017

Exploiting the intermittency of speech for joint separation and diarization.
Proceedings of the 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2017

Explaining the parameterized Wiener filter with alpha-stable processes.
Proceedings of the 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2017

A Bayesian Stochastic Machine for Sound Source Localization.
Proceedings of the IEEE International Conference on Rebooting Computing, 2017

Exploiting the Complementarity of Audio and Visual Data in Multi-speaker Tracking.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Audio source separation based on convolutive transfer function and frequency-domain lasso optimization.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

An EM algorithm for joint source separation and diarisation of multichannel convolutive speech mixtures.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Adaptation of a Gaussian Mixture Regressor to a New Input Distribution: Extending the C-GMR Framework.
Proceedings of the Latent Variable Analysis and Signal Separation, 2017

On the Use of Latent Mixing Filters in Audio Source Separation.
Proceedings of the Latent Variable Analysis and Signal Separation, 2017

2016
Estimation of the Direct-Path Relative Transfer Function for Supervised Sound-Source Localization.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

A Variational EM Algorithm for the Separation of Time-Varying Convolutive Audio Mixtures.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces.
PLoS Comput. Biol., 2016

Voice activity detection based on statistical likelihood ratio with adaptive thresholding.
Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016

Reverberant sound localization with a robot head based on direct-path relative transfer function.
Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016

Non-stationary noise power spectral density estimation based on regional statistics.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Deep neural networks for automatic detection of screams and shouted speech in subway trains.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

An inverse-gamma source variance prior with factorized parameterization for audio source separation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Speaker-Adaptive Acoustic-Articulatory Inversion Using Cascaded Gaussian Mixture Regression.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Co-Localization of Audio Sources in Images Using Binaural Features and Locally-Linear Regression.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Binaural Sound Source Localization based on Direct-Path Relative Transfer Function.
CoRR, 2015

A variational EM algorithm for the separation of moving sound sources.
Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015

Real-time control of a DNN-based articulatory synthesizer for silent speech conversion: a pilot study.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Estimation of relative transfer function in the presence of stationary noise based on segmental power spectral density matrix subtraction.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Local relative transfer function for sound source localization.
Proceedings of the 23rd European Signal Processing Conference, 2015

2014
Co-Localization of Audio Sources Using Binaural Features and Locally-Linear Regression.
CoRR, 2014

Robust articulatory speech synthesis using deep neural networks for BCI applications.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Sound representation and classification benchmark for domestic robots.
Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014

Perceptual coding-based Informed Source Separation.
Proceedings of the 22nd European Signal Processing Conference, 2014

Mapping sounds onto images using binaural spectrograms.
Proceedings of the 22nd European Signal Processing Conference, 2014

2013
Fast and Accurate Direct MDCT to DFT Conversion With Arbitrary Window Functions.
IEEE Trans. Speech Audio Process., 2013

Informed Source Separation from compressed mixtures using spatial Wiener filter and quantization noise estimation.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Informed source separation through spectrogram coding and data embedding.
Signal Process., 2012

Professionally-produced Music Separation Guided by Covers.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

A Simple Hybrid Acoustic / Morphologically-Constrained Technique for the Synthesis of Stop Consonants in Various Vocalic Contexts.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Sound-event recognition with a companion humanoid.
Proceedings of the 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), Osaka, Japan, November 29, 2012

Informed audio source separation: A comparative study.
Proceedings of the 20th European Signal Processing Conference, 2012

2011
Informed Source Separation of Linear Instantaneous Under-Determined Audio Mixtures by Source Index Embedding.
IEEE Trans. Speech Audio Process., 2011

"Sparsification" of Audio Signals Using the MDCT/IntMDCT and a Psychoacoustic Model - Application to Informed Audio Source Separation.
Proceedings of the AES International Conference Semantic Audio 2011, 2011

Informed Audio Source Separation from Compressed Linear Stereo Mixtures.
Proceedings of the AES International Conference Semantic Audio 2011, 2011

An Informed Source Separation System for Speech Signals.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Long-Term Harmonic Plus Noise Model for Speech Signals.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

2010
A Watermarking-Based Method for Informed Source Separation of Audio Signals With a Single Sensor.
IEEE Trans. Speech Audio Process., 2010

Adaptive Long-Term Coding of LSF Parameters Trajectories for Large-Delay/Very- to Ultra-Low Bit-Rate Speech Coding.
EURASIP J. Audio Speech Music. Process., 2010

Informed source separation of underdetermined instantaneous stereo mixtures using source index embedding.
Proceedings of the IEEE International Conference on Acoustics, 2010

Interactive Music with Active Audio CDs.
Proceedings of the Exploring Music Contents - 7th International Symposium, 2010

2009
A watermarking-based method for single-channel audio source separation.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Estimation of the voicing cut-off frequency contour of natural speech based on harmonic and aperiodic energies.
Proceedings of the IEEE International Conference on Acoustics, 2008

Long-term flexible 2D cepstral modeling of speech spectral amplitudes.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Log-Rayleigh Distribution: A Simple and Efficient Statistical Representation of Log-Spectral Coefficients.
IEEE Trans. Speech Audio Process., 2007

Mixing Audiovisual Speech Processing and Blind Source Separation for the Extraction of Speech Signals From Convolutive Mixtures.
IEEE Trans. Speech Audio Process., 2007

Perceptual Long-Term Variable-Rate Sinusoidal Modeling of Speech.
IEEE Trans. Speech Audio Process., 2007

Visual voice activity detection as a help for speech source separation from convolutive mixtures.
Speech Commun., 2007

Using a Visual Voice Activity Detector to Regularize the Permutations in Blind Separation of Convolutive Speech Mixtures.
Proceedings of the 15th International Conference on Digital Signal Processing, 2007

Long-Term Quantization of Speech LSF Parameters.
Proceedings of the IEEE International Conference on Acoustics, 2007

Two novel visual voice activity detectors based on appearance models and retinal filtering.
Proceedings of the 15th European Signal Processing Conference, 2007

Audiovisual speech source separation: a regularization method based on visual voice activity detection.
Proceedings of the Auditory-Visual Speech Processing 2007, 2007

Development and comparison of two approaches for visual speech analysis with application to voice activity detection.
Proceedings of the Auditory-Visual Speech Processing 2007, 2007

2006
An Analysis of Visual Speech Information Applied to Voice Activity Detection.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Theoretical and experimental bases of a new method for accurate separation of harmonic and noise components of speech signals.
Proceedings of the 14th European Signal Processing Conference, 2006

2005
Comparing several models for perceptual long-term modeling of amplitude and phase trajectories of sinusoidal speech.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Solving the indeterminations of blind source separation of convolutive speech mixtures.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Perceptually Weighted Long Term Modeling of Sinusoidal Speech Amplitude Trajectories.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Joint matrix quantization of face parameters and LPC coefficients for low bit rate audiovisual speech coding.
IEEE Trans. Speech Audio Process., 2004

Developing an audio-visual speech source separation algorithm.
Speech Commun., 2004

Using audiovisual speech processing to improve the robustness of the separation of convolutive speech mixtures.
Proceedings of the IEEE 6th Workshop on Multimedia Signal Processing, 2004

Long term modeling of phase trajectories within the speech sinusoidal model framework.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Characterizing and classifying cued speech vowels from labial parameters.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Watermarking of speech signals using the sinusoidal model and frequency modulation of the partials.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Speech extraction based on ICA and audio-visual coherence.
Proceedings of the Seventh International Symposium on Signal Processing and Its Applications, 2003

Extracting an AV speech source from a mixture of signals.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Further experiments on audio-visual speech source separation.
Proceedings of the AVSP 2003, 2003

Pure audio McGurk effect.
Proceedings of the AVSP 2003, 2003

2002
Separation of Audio-Visual Speech Sources: A New Approach Exploiting the Audio-Visual Coherence of Speech Stimuli.
EURASIP J. Adv. Signal Process., 2002

Audio-visual speech sources separation: a new approach exploiting the audio-visual coherence of speech stimuli.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2001
Speech signals separation: a new approach exploiting the coherence of audio and visual speech.
Proceedings of the Fourth IEEE Workshop on Multimedia Signal Processing, 2001

1998
Audiovisual speech enhancement: new advances using multi-layer perceptrons.
Proceedings of the Second IEEE Workshop on Multimedia Signal Processing, 1998

An audio-visual distance for audio-visual speech vector quantization.
Proceedings of the Second IEEE Workshop on Multimedia Signal Processing, 1998

A signal processing system for having the sound "pop-out" in noise thanks to the image of the speaker's lips: new advances using multi-layer perceptrons.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Fusion of auditory and visual information for noisy speech enhancement: a preliminary study of vowel transitions.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

A preliminary study of an audio-visual speech coder: Using video parameters to reduce an LPC vocoder bit rate.
Proceedings of the 9th European Signal Processing Conference, 1998

Audiovisual Speech Coder: Using Vector Quantization To Exploit The Audio/Video Correlation.
Proceedings of the Auditory-Visual Speech Processing, 1998

1997
Noisy speech enhancement by fusion of auditory and visual information: a study of vowel transitions.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Can the visual input make the audio signal "pop out" in noise? A first study of the enhancement of noisy VCV acoustic sequences by audio-visual fusion.
Proceedings of the ESCA Workshop on Audio-Visual Speech Processing, 1997

1995
Noisy speech enhancement with filters estimated from the speaker's lips.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

