Thomas Pellegrini

Orcid: 0000-0001-8984-1399

According to our database, Thomas Pellegrini authored at least 85 papers between 2006 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
CoNeTTE: An Efficient Audio Captioning System Leveraging Multiple Datasets With Task Embedding.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Enhancing Synthetic Training Data for Speech Commands: From ASR-Based Filtering to Domain Adaptation in SSL Latent Space.
CoRR, 2024

CNN-based Compressor Mass Flow Estimator in Industrial Aircraft Vapor Cycle System.
CoRR, 2024

Adaptation de modèles auto-supervisés pour la reconnaissance de phonèmes dans la parole d'enfant.
Proceedings of the Actes des 35èmes Journées d'Études sur la Parole, 2024

Premier système IRIT-MyFamillyUp pour la compétition sur la reconnaissance des émotions Odyssey 2024.
Proceedings of the Actes des 35èmes Journées d'Études sur la Parole, 2024

IRIT-MFU Multi-modal systems for emotion classification for Odyssey 2024 challenge.
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

2023
Audio-video fusion strategies for active speaker detection in meetings.
Multim. Tools Appl., April, 2023

Audio classification with Dilated Convolution with Learnable Spacings.
CoRR, 2023

Multilingual Audio Captioning using machine translated data.
CoRR, 2023

Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval?
CoRR, 2023

Dilated Convolution with Learnable Spacings: beyond bilinear interpolation.
CoRR, 2023

Comparing phoneme recognition systems on the detection and diagnosis of reading mistakes for young children's oral reading evaluation.
Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Adapting a ConvNeXt Model to Audio Classification on AudioSet.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Dilated convolution with learnable spacings.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Multitask Learning in Audio Captioning: A Sentence Embedding Regression Loss Acts as a Regularizer.
Proceedings of the 31st European Signal Processing Conference, 2023

2022
PCEDNet: A Lightweight Neural Network for Fast and Interactive Edge Detection in 3D Point Clouds.
ACM Trans. Graph., 2022

Comparison of semi-supervised deep learning algorithms for audio classification.
EURASIP J. Audio Speech Music. Process., 2022

Language-Based Audio Retrieval with Textual Embeddings of Tag Names.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

Is my Automatic Audio Captioning System so Bad? SPIDEr-max: A Metric to Consider Several Caption Candidates.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021
End-to-end acoustic modelling for phone recognition of young readers.
Speech Commun., 2021

Improving Deep-learning-based Semi-supervised Audio Tagging with Mixup.
CoRR, 2021

Low-Activity Supervised Convolutional Spiking Neural Networks Applied to Speech Commands Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Deep-Learning-Based Central African Primate Species Classification with MixUp and SpecAugment.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Simulating Reading Mistakes for Child Speech Transformer-Based Phone Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Fast Threshold Optimization for Multi-Label Audio Tagging Using Surrogate Gradient Learning.
Proceedings of the IEEE International Conference on Acoustics, 2021

Comparison of Deep Co-Training and Mean-Teacher Approaches for Semi-Supervised Audio Tagging.
Proceedings of the IEEE International Conference on Acoustics, 2021

Weakly supervised discourse segmentation for multiparty oral conversations.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Automatic macro segmentation into interaction sequence: a silence-based approach for meeting structuring.
Proceedings of the 18th International Conference on Content-Based Multimedia Indexing, 2021

2020
PCEDNet: A Neural Network for Fast and Efficient Edge Detection in 3D Point Clouds.
CoRR, 2020

Deep learning with weakly-annotated data: a sound event detection use case (and hate speech detection here and there) (abstract).
Proceedings of the Workshop on Machine Learning for Trend and Weak Signal Detection in Social Networks and Social Media, 2020

Informations segmentales pour la caractérisation phonétique du locuteur : variabilité inter- et intra-locuteurs (An automatic classification task involving 44 speakers was performed using convolutional neural networks (CNN) on broadband spectrograms extracted from 2-second sequences of a spontaneous speech corpus (NCCFr)).
Proceedings of the Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 2020

Reconnaissance de phones fondée sur du Transfer Learning pour des enfants apprenants lecteurs en environnement de classe (Transfer Learning based phone recognition on children learning to read, with speech recorded in a classroom environment).
Proceedings of the Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 2020

2019
Technical report: supervised training of convolutional spiking neural networks with PyTorch.
CoRR, 2019

Evaluation of Post-Processing Algorithms for Polyphonic Sound Event Detection.
Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

The Airbus Air Traffic Control Speech Recognition 2018 Challenge: Towards ATC Automatic Transcription and Call Sign Detection.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Char+CV-CTC: Combining Graphemes and Consonant/Vowel Units for CTC-Based ASR Using Multitask Learning.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Cosine-similarity penalty to discriminate sound classes in weakly-supervised sound event detection.
Proceedings of the International Joint Conference on Neural Networks, 2019

A Convolutional Neural Network for 250-MHz Quantitative Acoustic-microscopy Resolution Enhancement.
Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2019

2018
Group emotion recognition strategies for entertainment robots.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

Sound event detection from weak annotations: weighted-GRU versus multi-instance-learning.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2018

2017
Unsupervised Speech Unit Discovery Using K-means and Neural Networks.
Proceedings of the Statistical Language and Speech Processing, 2017

Lexical Emphasis Detection in Spoken French Using F-BANKs and Neural Networks.
Proceedings of the Statistical Language and Speech Processing, 2017

Densely connected CNNs for bird audio detection.
Proceedings of the 25th European Signal Processing Conference, 2017

Music Feature Maps with Convolutional Neural Networks for Music Genre Classification.
Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing, 2017

2016
Réseau de neurones convolutif pour l'évaluation automatique de la prononciation (CNN-based automatic pronunciation assessment of Japanese speakers learning French).
Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 1 : JEP, 2016

Influence de la quantité de données sur une tâche de segmentation de phones fondée sur les réseaux de neurones (Phone-level speech segmentation with neural networks: influence of the amount of data).
Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 1 : JEP, 2016

Inferring Phonemic Classes from CNN Activation Maps Using Clustering Techniques.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

CNN-Based Phone Segmentation Experiments in a Less-Represented Language.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Pronunciation Assessment of Japanese Learners of French with GOP Scores and Phonetic Information.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Sinusoidal Modelling for Ecoacoustics.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Filterbank coefficients selection for segmentation in singer turns.
Proceedings of the 14th International Workshop on Content-Based Multimedia Indexing, 2016

2015
Automatic Assessment of Speech Capability Loss in Disordered Speech.
ACM Trans. Access. Comput., 2015

Predicting disordered speech comprehensibility from Goodness of Pronunciation scores.
Proceedings of the 6th Workshop on Speech and Language Processing for Assistive Technologies, 2015

Time-continuous Estimation of Emotion in Music with Recurrent Neural Networks.
Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Comparing SVM, softmax, and shallow neural networks for eating condition classification.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Pyc2Sound: a Python tool to convert images into sound.
Proceedings of the Audio Mostly 2015 on Interaction With Sound, 2015

2014
Correlating ASR errors with developmental changes in speech production: a study of 3-10-year-old European Portuguese children's speech.
Proceedings of the 4th Workshop on Child, Computer and Interaction, 2014

Improving Speech Recognition through Automatic Selection of Age Group - Specific Acoustic Models.
Proceedings of the Computational Processing of the Portuguese Language, 2014

Automatically Recognising European Portuguese Children's Speech - Pronunciation Patterns Revealed by an Analysis of ASR Errors.
Proceedings of the Computational Processing of the Portuguese Language, 2014

El-WOZ: a client-server wizard-of-oz interface.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Segmentation in singer turns with the Bayesian information criterion.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Speaker age estimation for elderly speech recognition in European Portuguese.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

The goodness of pronunciation algorithm applied to disordered speech.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Towards soundpainting gesture recognition.
Proceedings of the Audio Mostly 2014, AM '14, 2014

2013
ASR-based exercises for listening comprehension practice in European Portuguese.
Comput. Speech Lang., 2013

A corpus-based study of elderly and young speakers of European Portuguese: acoustic correlates and their impact on speech recognition performance.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012
Less errors with TTS? A dictation experiment with foreign language learners.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Impact of Age in ASR for the Elderly: Preliminary Experiments in European Portuguese.
Proceedings of the Advances in Speech and Language Technologies for Iberian Languages, 2012

Overview of Computer-assisted Language Learning for European Portuguese at L²f.
Proceedings of the CSEDU 2012, 2012

2011
Listening comprehension games for portuguese: exploring the best features.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2011

Automatic Generation of Listening Comprehension Learning Material in European Portuguese.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Browsing videos by automatically detected audio events.
Proceedings of EUROCON 2011, 2011

2010
Multimedia learning materials.
Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Improving ASR error detection with non-decoder based features.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Context dependent modelling approaches for hybrid speech recognizers.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

2009
Automatic Word Decompounding for ASR in a Morphologically Rich Language: Application to Amharic.
IEEE Trans. Speech Audio Process., 2009

Error Detection in Broadcast News ASR Using Markov Chains.
Proceedings of the Human Language Technology. Challenges for Computer Science and Linguistics, 2009

Detecting audio events for semantic video search.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Audio contributions to semantic video search.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

2008
Transcription automatique de langues peu dotées. (Automatic speech recognition for less-represented languages).
PhD thesis, 2008

Are audio or textual training data more important for ASR in less-represented languages?
Proceedings of the First International Workshop on Spoken Languages Technologies for Under-Resourced Languages, 2008

Developments of "Lëtzebuergesch" Resources for Automatic Speech Processing and Linguistic Studies.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

2007
Using phonetic features in unsupervised word decompounding for ASR with application to a less-represented language.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

2006
Experimental detection of vowel pronunciation variants in Amharic.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Investigating automatic decomposition for ASR in less represented languages.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

