Mirco Ravanelli

Orcid: 0000-0002-3929-5526

According to our database, Mirco Ravanelli authored at least 92 papers between 2012 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Generalization limits of Graph Neural Networks in identity effects learning.
Neural Networks, 2025

A protocol for trustworthy EEG decoding with neural networks.
Neural Networks, 2025

Speech self-supervised representations benchmarking: A case for larger probing heads.
Comput. Speech Lang., 2025

2024
CL-MASR: A Continual Learning Benchmark for Multilingual ASR.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

What Are They Doing? Joint Audio-Speech Co-Reasoning.
CoRR, 2024

LMAC-TD: Producing Time Domain Explanations for Audio Classifiers.
CoRR, 2024

ProGRes: Prompted Generative Rescoring on ASR n-Best.
CoRR, 2024

Open-Source Conversational AI with SpeechBrain 1.0.
CoRR, 2024

DASB - Discrete Audio and Speech Benchmark.
CoRR, 2024

How Should We Extract Discrete Audio Tokens from Self-Supervised Models?
CoRR, 2024

Phoneme Discretized Saliency Maps for Explainable Detection of AI-Generated Voice.
CoRR, 2024

Listenable Maps for Zero-Shot Audio Classifiers.
CoRR, 2024

Bayesian Deep Learning for Remaining Useful Life Estimation via Stein Variational Gradient Descent.
CoRR, 2024

Are LLMs Robust for Spoken Dialogues?
CoRR, 2024

SpeechBrain-MOABB: An open-source Python library for benchmarking deep neural networks applied to EEG signals.
Comput. Biol. Medicine, 2024

Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers.
Proceedings of the 34th IEEE International Workshop on Machine Learning for Signal Processing, 2024

Listenable Maps for Audio Classifiers.
Proceedings of the Forty-first International Conference on Machine Learning, 2024


SKILL: Similarity-Aware Knowledge Distillation for Speech Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, 2024

Resource-Efficient Separation Transformer.
Proceedings of the IEEE International Conference on Acoustics, 2024

Focal Modulation Networks for Interpretable Sound Classification.
Proceedings of the IEEE International Conference on Acoustics, 2024

Adaptation Odyssey in LLMs: Why Does Additional Pretraining Sometimes Fail to Improve?
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

TARIC-SLU: A Tunisian Benchmark Dataset for Spoken Language Understanding.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Dynamic HumTrans: Humming Transcription Using CNNs and Dynamic Programming.
Proceedings of the Artificial Neural Networks in Pattern Recognition, 2024

Explaining Network Decision Provides Insights on the Causal Interaction Between Brain Regions in a Motor Imagery Task.
Proceedings of the Artificial Neural Networks in Pattern Recognition, 2024

Multi-modal Decoding of Reach-to-Grasping from EEG and EMG via Neural Networks.
Proceedings of the Artificial Neural Networks in Pattern Recognition, 2024

2023
Exploring Self-Attention Mechanisms for Speech Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch.
CoRR, 2023

Audio Editing with Non-Rigid Text Prompts.
CoRR, 2023

Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets.
CoRR, 2023

Speech Emotion Diarization: Which Emotion Appears When?
CoRR, 2023

Posthoc Interpretation via Quantization.
CoRR, 2023

Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Fine-Tuning Strategies for Faster Inference Using Speech Self-Supervised Models: A Comparative Study.
Proceedings of the IEEE International Conference on Acoustics, 2023

Simulated Annealing in Early Layers Leads to Better Generalization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Speech Emotion Diarization: Which Emotion Appears When?
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

RescueSpeech: A German Corpus for Speech Recognition in Search and Rescue Domain.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

TorchAudio 2.1: Advancing Speech Recognition, Self-Supervised Learning, and Audio Processing Components for PyTorch.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Learning Representations for New Sound Classes With Continual Self-Supervised Learning.
IEEE Signal Process. Lett., 2022

Resource-Efficient Separation Transformer.
CoRR, 2022

On Using Transformers for Speech-Separation.
CoRR, 2022

OSSEM: one-shot speaker adaptive speech enhancement using meta learning.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

SoundChoice: Grapheme-to-Phoneme Models with Semantic Disambiguation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Real-M: Towards Speech Separation on Real Mixtures.
Proceedings of the IEEE International Conference on Acoustics, 2022

MetricGAN-U: Unsupervised Speech Enhancement/Dereverberation Based Only on Noisy/Reverberated Speech.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
SpeechBrain: A General-Purpose Speech Toolkit.
CoRR, 2021

Transformers with Competitive Ensembles of Independent Mechanisms.
CoRR, 2021

Timers and Such: A Practical Benchmark for Spoken Language Understanding with Numbers.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

The Energy and Carbon Footprint of Training End-to-End Speech Recognizers.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

ECAPA-TDNN Embeddings for Speaker Diarization.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Attention Is All You Need In Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2021

Interpretable SincNet-based Deep Learning for Emotion Recognition from EEG brain activity.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

2020
BIRD: Big Impulse Response Dataset.
CoRR, 2020

Towards Unsupervised Learning of Speech Representations.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Quaternion Neural Networks for Multi-Channel Distant Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Multi-Task Self-Supervised Learning for Robust Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Using Speech Synthesis to Train End-To-End Spoken Language Understanding Models.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Learning Speaker Representations with Mutual Information.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Learning Problem-Agnostic Speech Representations from Multiple Self-Supervised Tasks.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Speech Model Pre-Training for End-to-End Spoken Language Understanding.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Quaternion Recurrent Neural Networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

The PyTorch-Kaldi Speech Recognition Toolkit.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Light Gated Recurrent Units for Speech Recognition.
IEEE Trans. Emerg. Top. Comput. Intell., 2018

Automatic context window composition for distant speech recognition.
Speech Commun., 2018

Speech and Speaker Recognition from Raw Waveform with SincNet.
CoRR, 2018

Interpretable Convolutional Filters with SincNet.
CoRR, 2018

Speech recognition with quaternion neural networks.
CoRR, 2018

Speaker Recognition from Raw Waveform with SincNet.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Twin Regularization for Online Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017
Deep Learning for Distant Speech Recognition.
PhD thesis, 2017

Deep Learning for Distant Speech Recognition.
CoRR, 2017

The DIRHA-English corpus and related tasks for distant-speech recognition in domestic environments.
CoRR, 2017

Improving Speech Recognition by Revising Gated Recurrent Units.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

A network of deep neural networks for Distant Speech Recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Batch-normalized joint training for DNN-based distant speech recognition.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Discussion.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Realistic Multi-Microphone Data Simulation for Distant Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015
Insights into Audio-Based Multimedia Event Classification with Neural Networks.
Proceedings of the 2015 Workshop on Community-Organized Multimodal Mining: Opportunities for Novel Solutions, 2015

Contaminated speech training methods for robust DNN-HMM distant speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A multi-channel corpus for distant-speech interaction in presence of known interferences.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

The DIRHA-ENGLISH corpus and related tasks for distant-speech recognition in domestic environments.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
The DIRHA simulated corpus.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

TANDEM-bottleneck feature combination using hierarchical Deep Neural Networks.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

On the selection of the impulse responses for distant-speech recognition based on contaminated speech training.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

The DIRHA-GRID corpus: baseline and tools for multi-room distant speech recognition using distributed microphones.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Audio-concept features and hidden Markov models for multimedia event detection.
Proceedings of the 2nd International Workshop on Speech, Language and Audio in Multimedia, 2014

A speech event detection and localization task for multiroom environments.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

Audio concept classification with Hierarchical Deep Neural Networks.
Proceedings of the 22nd European Signal Processing Conference, 2014

2013
Embedding speech recognition to control lights.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Audio Concept Ranking for Video Event Detection on User-Generated Content.
Proceedings of the First Workshop on Speech, 2013

2012
Impulse response estimation for robust speech recognition in a reverberant environment.
Proceedings of the 20th European Signal Processing Conference, 2012

