Julien Pinquier

Orcid: 0000-0003-1556-1284

According to our database, Julien Pinquier authored at least 87 papers between 2002 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.





CoNeTTE: An Efficient Audio Captioning System Leveraging Multiple Datasets With Task Embedding.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

SAMI: an M-Health application to telemonitor intelligibility and speech disorder severity in head and neck cancers.
Frontiers Artif. Intell., 2024

EMVD dataset: a dataset of extreme vocal distortion techniques used in heavy metal.
CoRR, 2024

Les représentations de locuteurs pour prédire l'intelligibilité de la parole lors de conversations médicales.
Proceedings of the Actes des 35èmes Journées d'Études sur la Parole, 2024

Peut-on évaluer la compréhensibilité de la parole sans référence quant aux intentions de communication du locuteur ? Une étude auprès d'apprenants germanophones de FLE.
Proceedings of the Actes des 35èmes Journées d'Études sur la Parole, 2024

Erreurs de prononciation en L2 : comparaison de méthodes pour la détection et le diagnostic guidés par la didactique.
Proceedings of the Actes des 35èmes Journées d'Études sur la Parole, 2024

Étude des liens acoustico-moteurs après cancer oral ou oropharyngé, via la réalisation d'un inventaire phonémique automatique des consonnes.
Proceedings of the Actes des 35èmes Journées d'Études sur la Parole, 2024

Detection of Pharyngolaryngeal Activities in Real-World Settings Using Wearable Sensors.
Proceedings of the 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2024

Audio-video fusion strategies for active speaker detection in meetings.
Multim. Tools Appl., April, 2023

Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval?
CoRR, 2023

Comparing phoneme recognition systems on the detection and diagnosis of reading mistakes for young children's oral reading evaluation.
Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Towards Reducing Patient Effort for the Automatic Prediction of Speech Intelligibility in Head and Neck Cancers.
Proceedings of the IEEE International Conference on Acoustics, 2023

Multitask Learning in Audio Captioning: A Sentence Embedding Regression Loss Acts as a Regularizer.
Proceedings of the 31st European Signal Processing Conference, 2023

Can We Use Speaker Embeddings On Spontaneous Speech Obtained From Medical Conversations To Predict Intelligibility?
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Deep neural networks for automatic speech processing: a survey from large corpora to limited data.
EURASIP J. Audio Speech Music. Process., 2022

Automatic Assessment of Speech Intelligibility using Consonant Similarity for Head and Neck Cancer.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Prediction of L2 speech proficiency based on multi-level linguistic features.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Is my Automatic Audio Captioning System so Bad? SPIDEr-max: A Metric to Consider Several Caption Candidates.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

End-to-end acoustic modelling for phone recognition of young readers.
Speech Commun., 2021

C2SI corpus: a database of speech disorder productions to assess intelligibility and quality of life in head and neck cancers.
Lang. Resour. Evaluation, 2021

Improving vehicle re-identification using CNN latent spaces: Metrics comparison and track-to-track extension.
IET Comput. Vis., 2021

Multimodal Neural Network for Sentiment Analysis in Embedded Systems.
Proceedings of the 16th International Joint Conference on Computer Vision, 2021

Multimodal human interaction analysis in vehicle cockpit.
Proceedings of the 24th IEEE International Intelligent Transportation Systems Conference, 2021

Simulating Reading Mistakes for Child Speech Transformer-Based Phone Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Automatic macro segmentation into interaction sequence: a silence-based approach for meeting structuring.
Proceedings of the 18th International Conference on Content-Based Multimedia Indexing, 2021

Towards a content-based prediction of personalized musical preferences using transfer learning.
Proceedings of the 18th International Conference on Content-Based Multimedia Indexing, 2021

Étude des facteurs affectant la compréhensibilité de documents multimodaux : une étude expérimentale (Factors affecting the comprehensibility of multimodal documents: an experimental study).
Proceedings of the Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 2020

Reconnaissance de phones fondée sur du Transfer Learning pour des enfants apprenants lecteurs en environnement de classe (Transfer Learning based phone recognition on children learning to read, with speech recorded in a classroom environment).
Proceedings of the Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 2020

Analyse de l'effet de la réverbération sur la reconnaissance automatique de la parole (Analyzing how reverberation affects Automatic Speech Recognition).
Proceedings of the Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 2020

Une nouvelle mesure de la réverbération pour prédire les performances a priori de la transcription de la parole (A new reverberation measure to predict a priori ASR performance).
Proceedings of the Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 2020

Subjective Evaluation of Comprehensibility in Movie Interactions.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Automatic Prediction of Speech Intelligibility Based on X-Vectors in the Context of Head and Neck Cancer.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Audiovisual Annotation Procedure for Multi-view Field Recordings.
Proceedings of the MultiMedia Modeling - 25th International Conference, 2019

Toulouse campus surveillance dataset: scenarios, soundtracks, synchronized videos with overlapping and disjoint views.
Proceedings of the 9th ACM Multimedia Systems Conference, 2018

Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Perceptual and Automatic Evaluations of the Intelligibility of Speech Degraded by Noise Induced Hearing Loss Simulation.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Catégorisation libre d'extraits musicaux et analyse automatique.
Proceedings of the COnférence en Recherche d'Informations et Applications, 2018

Unsupervised Speech Unit Discovery Using K-means and Neural Networks.
Proceedings of the Statistical Language and Speech Processing, 2017

Music Feature Maps with Convolutional Neural Networks for Music Genre Classification.
Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing, 2017

A multi-modal perception based assistive robotic system for the elderly.
Comput. Vis. Image Underst., 2016

Influence de la quantité de données sur une tâche de segmentation de phones fondée sur les réseaux de neurones (Phone-level speech segmentation with neural networks: influence of the amount of data).
Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 1 : JEP, 2016

CNN-Based Phone Segmentation Experiments in a Less-Represented Language.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Using Phonologically Weighted Levenshtein Distances for the Prediction of Microscopic Intelligibility.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Online Audiovisual Signature Training for Person Re-identification.
Proceedings of the 10th International Conference on Distributed Smart Camera, 2016

A Multi-modal Perception based Architecture for a Non-intrusive Domestic Assistant Robot.
Proceedings of the Eleventh ACM/IEEE International Conference on Human Robot Interaction, 2016

Filterbank coefficients selection for segmentation in singer turns.
Proceedings of the 14th International Workshop on Content-Based Multimedia Indexing, 2016

Automatic intelligibility measures applied to speech signals simulating age-related hearing loss.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Perceiving user's intention-for-interaction: A probabilistic multimodal data fusion scheme.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Pyc2Sound: a Python tool to convert images into sound.
Proceedings of the Audio Mostly 2015 on Interaction With Sound, 2015

Comparaison de mesures perceptives et automatiques de l'intelligibilité. Application à de la parole simulant la presbyacousie.
Trait. Autom. des Langues, 2014

Hierarchical Hidden Markov Model in detecting activities of daily living in wearable videos for studies of dementia.
Multim. Tools Appl., 2014

Segmentation in singer turns with the Bayesian information criterion.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A particle swarm optimization inspired tracker applied to visual tracking.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Segmentations sonore et audiovisuelle ?
2014

Superposed speech localisation using frequency tracking.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Two-step detection of water sound events for the diagnostic and monitoring of dementia.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Water sound recognition based on physical models.
Proceedings of the IEEE International Conference on Acoustics, 2013

Audio indexing including frequency tracking of simultaneous multiple sources in speech and music.
Proceedings of the 11th International Workshop on Content-Based Multimedia Indexing, 2013

Detecting individual role using features extracted from speaker diarization results.
Multim. Tools Appl., 2012

Strategies for multiple feature fusion with Hierarchical HMM: Application to activity recognition from wearable audiovisual sensors.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

Water flow detection from a wearable device with a new feature, the spectral cover.
Proceedings of the 10th International Workshop on Content-Based Multimedia Indexing, 2012

Feasibility of the detection of choirs for ethnomusicologic music indexing.
Proceedings of the 10th International Workshop on Content-Based Multimedia Indexing, 2012

Distinguishing Monophonies From Polyphonies Using Weibull Bivariate Distributions.
IEEE Trans. Speech Audio Process., 2011

Activities of daily living indexing by hierarchical HMM for dementia diagnostics.
Proceedings of the 9th International Workshop on Content-Based Multimedia Indexing, 2011

The IMMED project: wearable video monitoring of people with age dementia.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Speaker role recognition to help spontaneous conversational speech detection.
Proceedings of the 2010 International Workshop on Searching Spontaneous Conversational Speech, 2010

Looking for relevant features for speaker role recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Exploiting speaker segmentations for automatic role detection. An application to broadcast news documents.
Proceedings of the 2010 International Workshop on Content-Based Multimedia Indexing, 2010

Improved speaker diarization system for meetings.
Proceedings of the IEEE International Conference on Acoustics, 2009

Singing voice detection in monophonic and polyphonic contexts.
Proceedings of the 17th European Signal Processing Conference, 2009

Monophony vs Polyphony: A New Method Based on Weibull Bivariate Models.
Proceedings of the Seventh International Workshop on Content-Based Multimedia Indexing, 2009

Dynamic organization of audiovisual database using a user-defined similarity measure based on low-level features.
Proceedings of the International Conference on Image Processing, 2008

Wearable video monitoring of people with age Dementia: Video indexing at the service of healthcare.
Proceedings of the International Workshop on Content-Based Multimedia Indexing, 2008

Singing voice characterization for audio indexing.
Proceedings of the 15th European Signal Processing Conference, 2007

ACADI showcase - automatic character indexing in audiovisual document.
Proceedings of the 6th ACM International Conference on Image and Video Retrieval, 2007

Fast Hierarchical Multimodal Structuring of Time Slots.
Proceedings of the International Workshop on Content-Based Multimedia Indexing, 2007

Association of Audio and Video Segmentations for Automatic Person Indexing.
Proceedings of the International Workshop on Content-Based Multimedia Indexing, 2007

Audio indexing: primary components retrieval.
Multim. Tools Appl., 2006

Intervenant Classification in an Audiovisual Document.
Proceedings of the SIGMAP 2006, 2006

Evaluation of classification techniques for audio indexing.
Proceedings of the 13th European Signal Processing Conference, 2005

Indexation sonore : recherche de composantes primaires pour une structuration audiovisuelle. (Audio classification: search of primary components for audiovisual structuring).
PhD thesis, 2004

Jingle detection and identification in audio documents.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Fusion of descriptors for speech / music classification.
Proceedings of the 2004 12th European Signal Processing Conference, 2004

Fusion de paramètres pour une classification automatique parole/musique robuste. Séparation parole/musique dans les fichiers a.
Tech. Sci. Informatiques, 2003

A fusion study in speech/music classification.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Robust speech / music classification in audio documents.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Speech and music classification in audio documents.
Proceedings of the IEEE International Conference on Acoustics, 2002
