Alessio Brutti

Orcid: 0000-0003-4146-3071

According to our database, Alessio Brutti authored at least 80 papers between 2005 and 2024.

Bibliography

2024
End-to-end integration of speech separation and voice activity detection for low-latency diarization of telephone conversations.
Speech Commun., 2024

MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages.
CoRR, 2024

Large Language Models Are Strong Audio-Visual Speech Recognition Learners.
CoRR, 2024

Federating Dynamic Models using Early-Exit Architectures for Automatic Speech Recognition on Heterogeneous Clients.
CoRR, 2024

Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters.
CoRR, 2024

Detection and Classification of Cardiovascular Diseases Using Neural Networks.
Proceedings of the Signal Processing: Algorithms, 2024

Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers.
Proceedings of the 34th IEEE International Workshop on Machine Learning for Signal Processing, 2024

Training Early-Exit Architectures for Automatic Speech Recognition: Fine-Tuning Pre-Trained Models or Training from Scratch.
Proceedings of the IEEE International Conference on Acoustics, 2024

LDASR: An Experimental Study on Layer Drop Using Conformer-Based Architecture.
Proceedings of the 32nd European Signal Processing Conference, 2024

MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Continual Contrastive Spoken Language Understanding.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
An experimental review of speaker diarization methods with application to two-speaker conversational telephone speech recordings.
Comput. Speech Lang., July, 2023

Direct enhancement of pre-trained speech embeddings for speech processing in noisy conditions.
Comput. Speech Lang., June, 2023

Training dynamic models using early exits for automatic speech recognition on resource-constrained devices.
CoRR, 2023

Improving the Intent Classification accuracy in Noisy Environment.
CoRR, 2023

Scaling strategies for on-device low-complexity source separation with Conv-Tasnet.
CoRR, 2023

Towards Speaker-Independent Voice Conversion for Improving Dysarthric Speech Intelligibility.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Sequence-Level Knowledge Distillation for Class-Incremental End-to-End Spoken Language Understanding.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

An Investigation of the Combination of Rehearsal and Knowledge Distillation in Continual Learning for Spoken Language Understanding.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022
Audio-Visual Tracking of Concurrent Speakers.
IEEE Trans. Multim., 2022

Time-Domain Joint Training Strategies of Speech Enhancement and Intent Classification Neural Models.
Sensors, 2022

Exploring the Joint Use of Rehearsal and Knowledge Distillation in Continual Learning for Spoken Language Understanding.
CoRR, 2022

Low-Latency Speech Separation Guided Diarization for Telephone Conversations.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Using Seq2seq voice conversion with pre-trained representations for audio anonymization: experimental insights.
Proceedings of the IEEE International Smart Cities Conference, 2022

Enhancing Embeddings for Speech Classification in Noisy Conditions.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Is Cross-Attention Preferable to Self-Attention for Multi-Modal Emotion Recognition?
Proceedings of the IEEE International Conference on Acoustics, 2022

Scalable Neural Architectures for End-to-End Environmental Sound Classification.
Proceedings of the IEEE International Conference on Acoustics, 2022

End-to-End Low Resource Keyword Spotting Through Character Recognition and Beam-Search Re-Scoring.
Proceedings of the IEEE International Conference on Acoustics, 2022

Optimizing PhiNet architectures for the detection of urban sounds on low-end devices.
Proceedings of the 30th European Signal Processing Conference, 2022

Low-Complexity Acoustic Scene Classification in DCASE 2022 Challenge.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021
Learning to Rank Microphones for Distant Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Robust Latent Representations Via Cross-Modal Translation and Alignment.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Speech Enhancement Front-End for Intent Classification in Noisy Environments.
Proceedings of the 29th European Signal Processing Conference, 2021

Automatic Assessment of English CEFR Levels Using BERT Embeddings.
Proceedings of the Eighth Italian Conference on Computational Linguistics, 2021

2020
Compact Recurrent Neural Networks for Acoustic Event Detection on Low-Energy Low-Complexity Platforms.
IEEE J. Sel. Top. Signal Process., 2020

Supervised Online Diarization with Sample Mean Loss for Multi-Domain Data.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Speech Enhancement Using Dilated Wave-U-Net: an Experimental Analysis.
Proceedings of the 27th Conference of Open Innovations Association, 2020

2019
Multi-Speaker Tracking From an Audio-Visual Sensing Device.
IEEE Trans. Multim., 2019

ConflictNET: End-to-End Learning for Speech-Based Conflict Intensity Estimation.
IEEE Signal Process. Lett., 2019

The Speed Submission to DIHARD II: Contributions & Lessons Learned.
CoRR, 2019

LOCATA challenge: speaker localization with a planar array.
CoRR, 2019

Neural Network Distillation on IoT Platforms for Sound Event Detection.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Accurate Target Annotation in 3D from Multimodal Streams.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
3D Mouth Tracking from a Compact Microphone Array Co-Located with a camera.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Online Cross-Modal Adaptation for Audio-Visual Person Identification With Wearable Cameras.
IEEE Trans. Hum. Mach. Syst., 2017

Optimizing DNN Adaptation for Recognition of Enhanced Speech.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Unsupervised Cross-Modal Deep-Model Adaptation for Audio-Visual Re-identification with Wearable Cameras.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

3D audio-visual speaker tracking with an adaptive particle filter.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
On the relationship between Early-to-Late Ratio of Room Impulse Responses and ASR performance in reverberant environments.
Speech Commun., 2016

Multi-channel i-vector combination for robust speaker verification in multi-room domestic environments.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Increasing the environment-awareness of rake beamforming for directive acoustic sources.
Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016

A Phase-Based Time-Frequency Masking for Multi-Channel Speech Enhancement in Domestic Environments.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015
Multi-channel speaker verification based on total variability modelling.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Multi-room speech activity detection using a distributed microphone network in domestic environments.
Proceedings of the 23rd European Signal Processing Conference, 2015

2014
Acoustic modeling based on early-to-late reverberation ratio for robust ASR.
Proceedings of the 14th International Workshop on Acoustic Signal Enhancement, 2014

On the use of Early-To-Late Reverberation ratio for ASR in reverberant environments.
Proceedings of the IEEE International Conference on Acoustics, 2014

A speech event detection and localization task for multiroom environments.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

2013
An environment aware ML estimation of acoustic radiation pattern with distributed microphone pairs.
Signal Process., 2013

Tracking of multidimensional TDOA for multiple sources with distributed microphone pairs.
Comput. Speech Lang., 2013

Geometric contamination for GMM/UBM speaker verification in reverberant environments.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012
Maximum a Posteriori Trajectory Estimation for Acoustic Source Tracking.
Proceedings of the IWAENC 2012 - International Workshop on Acoustic Signal Enhancement, Proceedings, RWTH Aachen University, Germany, September 4th, 2012

Environment aware estimation of the orientation of acoustic sources using a line array.
Proceedings of the 20th European Signal Processing Conference, 2012

2011
Sub-band spectral variance feature for noise robust ASR.
Proceedings of the 19th European Signal Processing Conference, 2011

Inference of acoustic source directivity using environment awareness.
Proceedings of the 19th European Signal Processing Conference, 2011

Multiple source tracking by sequential posterior kernel density estimation through GSCT.
Proceedings of the 19th European Signal Processing Conference, 2011

2010
WOZ acoustic data collection for interactive TV.
Lang. Resour. Evaluation, 2010

Multiple Source Localization Based on Acoustic Map De-Emphasis.
EURASIP J. Audio Speech Music. Process., 2010

A joint particle filter to track the position and head orientation of people using audio visual cues.
Proceedings of the 18th European Signal Processing Conference, 2010

2009
Person Tracking.
Proceedings of the Computers in the Human Interaction Loop, 2009

A sequential Monte Carlo approach for tracking of overlapping acoustic sources.
Proceedings of the 17th European Signal Processing Conference, 2009

Acoustic Based Surveillance System for Intrusion Detection.
Proceedings of the Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance, 2009

2008
Localization of multiple speakers based on a two step acoustic map analysis.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Classification of Acoustic Maps to Determine Speaker Position and Orientation from a Distributed Microphone Network.
Proceedings of the IEEE International Conference on Acoustics, 2007

A Person Tracking System for CHIL Meetings.
Proceedings of the Multimodal Technologies for Perception of Humans, 2007

2006
Speaker localization based on oriented global coherence field.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

A Generative Approach to Audio-Visual Person Tracking.
Proceedings of the Multimodal Technologies for Perception of Humans, 2006

2005
Speaker Localization in CHIL Lectures: Evaluation Criteria and Results.
Proceedings of the Machine Learning for Multimodal Interaction, 2005

Oriented global coherence field for the estimation of the head orientation in smart rooms equipped with distributed microphone arrays.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Automatic Speech Activity Detection, Source Localization, and Speech Recognition on the Chil Seminar Corpus.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

