Sabato Marco Siniscalchi
Orcid: 0000-0002-0770-0507
According to our database, Sabato Marco Siniscalchi authored at least 139 papers between 2002 and 2024.
Bibliography
2024
How word semantics and phonology affect handwriting of Alzheimer's patients: A machine learning based analysis.
Comput. Biol. Medicine, February, 2024
An Explicit Consistency-Preserving Loss Function for Phase Reconstruction and Speech Enhancement.
CoRR, 2024
CoRR, 2024
Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition.
CoRR, 2024
Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition.
CoRR, 2024
Bayesian adaptive learning to latent variables via Variational Bayes and Maximum a Posteriori.
CoRR, 2024
Exploiting Consistency-Preserving Loss and Perceptual Contrast Stretching to Boost SSL-Based Speech Enhancement.
Proceedings of the 26th IEEE International Workshop on Multimedia Signal Processing, 2024
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Boosting End-to-End Multilingual Phoneme Recognition Through Exploiting Universal Speech Attributes Constraints.
Proceedings of the IEEE International Conference on Acoustics, 2024
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
2023
A step-by-step training method for multi generator GANs with application to anomaly detection and cybersecurity.
Neurocomputing, June, 2023
Generative error correction for code-switching speech recognition using large language models.
CoRR, 2023
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction.
CoRR, 2023
S-HR-VQVAE: Sequential Hierarchical Residual Learning Vector Quantized Variational Autoencoder for Video Prediction.
CoRR, 2023
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the 33rd IEEE International Workshop on Machine Learning for Signal Processing, 2023
Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Description and analysis of the KPT system for NIST Language Recognition Evaluation 2022.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
The Multimodal Information Based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization and Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
2022
Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
A Study on Joint Modeling and Data Augmentation of Multi-Modalities for Audio-Visual Scene Classification.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Audio-Visual Wake Word Spotting in MISP2021 Challenge: Dataset Release and Deep Analysis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer.
Proceedings of the IEEE International Conference on Acoustics, 2022
The First Multimodal Information Based Speech Processing (MISP) Challenge: Data, Tasks, Baselines and Results.
Proceedings of the IEEE International Conference on Acoustics, 2022
Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022
2021
IEEE Signal Process. Lett., 2021
A multimodal retina-iris biometric system using the Levenshtein distance for spatial feature comparison.
IET Biom., 2021
A Study of Low-Resource Speech Commands Recognition based on Adversarial Reprogramming.
CoRR, 2021
A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification.
CoRR, 2021
A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021
PATE-AAE: Incorporating Adversarial Autoencoder into Private Aggregation of Teacher Ensembles for Spoken Command Classification.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
Analyzing Upper Bounds on Mean Absolute Errors for Deep Neural Network-Based Vector-to-Vector Regression.
IEEE Trans. Signal Process., 2020
Maximal Figure-of-Merit Framework to Detect Multi-Label Phonetic Features for Spoken Language Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
IEEE Trans. Cogn. Dev. Syst., 2020
IEEE Signal Process. Lett., 2020
Device-Robust Acoustic Scene Classification Based on Two-Stage Categorization and Data Augmentation.
CoRR, 2020
Sequence-to-Sequence Articulatory Inversion Through Time Convolution of Sub-Band Frequency Signals.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Relational Teacher Student Learning with Neural Label Embedding for Device Adaptation in Acoustic Scene Classification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
A Cross-Task Transfer Learning Approach to Adapting Deep Speech Enhancement Models to Unseen Background Noise Using Paired Senone Classifiers.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Tensor-To-Vector Regression for Multi-Channel Speech Enhancement Based on Tensor-Train Network.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Performance Analysis for Tensor-Train Decomposition to Deep Neural Network Based Vector-to-Vector Regression.
Proceedings of the 54th Annual Conference on Information Sciences and Systems, 2020
2019
A Theory on Deep Neural Network Based Vector-to-Vector Regression With an Illustration of Its Expressive Power in Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2019
Improving Mispronunciation Detection of Mandarin Tones for Non-Native Learners With Soft-Target Tone Labels and BLSTM-Based Deep Tone Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2019
Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Improving Audio-visual Speech Recognition Performance with Cross-modal Student-teacher Training.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the 27th European Signal Processing Conference, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
2018
Improving Mandarin Tone Recognition Based on DNN by Combining Acoustic and Articulatory Features Using Extended Recognition Networks.
J. Signal Process. Syst., 2018
Improving Mandarin Tone Mispronunciation Detection for Non-Native Learners with Soft-Target Tone Labels and BLSTM-Based Deep Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
Adaptation to New Microphones Using Artificial Neural Networks With Trainable Activation Functions.
IEEE Trans. Neural Networks Learn. Syst., 2017
Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2017
Hierarchical Bayesian combination of plug-in maximum a posteriori decoders in deep neural networks-based speech recognition and speaker adaptation.
Pattern Recognit. Lett., 2017
An End-to-End Deep Learning Approach to Simultaneous Speech Dereverberation and Acoustic Modeling for Robust Speech Recognition.
IEEE J. Sel. Top. Signal Process., 2017
A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation.
EURASIP J. Adv. Signal Process., 2017
IEEE Access, 2017
Improving Mispronunciation Detection for Non-Native Learners with Multisource Information and LSTM-Based Deep Models.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Joint Training of Multi-Channel-Condition Dereverberation and Acoustic Modeling of Microphone Array Speech for Robust Distant Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
A transfer learning and progressive stacking approach to reducing deep model sizes with an application to speech enhancement.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
A unified deep modeling approach to simultaneous speech dereverberation and recognition for the reverb challenge.
Proceedings of the Hands-free Speech Communications and Microphone Arrays, 2017
2016
IEEE ACM Trans. Audio Speech Lang. Process., 2016
A unified approach to transfer learning of deep neural networks with applications to speaker adaptation in automatic speech recognition.
Neurocomputing, 2016
Deep learning with maximal figure-of-merit cost to advance multi-label speech attribute detection.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016
Detecting Mispronunciations of L2 Learners and Providing Corrective Feedback Using Knowledge-Guided and Data-Driven Decision Trees.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Improving non-native mispronunciation detection and enriching diagnostic feedback with DNN-based speech attribute modeling.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Using tone-based extended recognition network to detect non-native Mandarin tone mispronunciations.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Boosting universal speech attributes classification with deep neural network for foreign accent characterization.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
2014
Neurocomputing, 2014
Feature space maximum a posteriori linear regression for adaptation of deep neural networks.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
2013
A Bottom-Up Modular Search Approach to Large Vocabulary Continuous Speech Recognition.
IEEE Trans. Speech Audio Process., 2013
Hermitian Polynomial for Speaker Adaptation of Connectionist Speech Recognition Systems.
IEEE Trans. Speech Audio Process., 2013
IEEE Signal Process. Lett., 2013
An Information-Extraction Approach to Speech Processing: Analysis, Detection, Verification, and Recognition.
Proc. IEEE, 2013
Neurocomputing, 2013
IET Signal Process., 2013
Universal attribute characterization of spoken languages for automatic spoken language recognition.
Comput. Speech Lang., 2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
An experimental study on structural-MAP approaches to implementing very large vocabulary speech recognition systems for real-world tasks.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
2012
Experiments on Cross-Language Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data.
IEEE Trans. Speech Audio Process., 2012
Combining speech attribute detection and penalized logistic regression for phoneme recognition.
Neurocomputing, 2012
A new confidence measure combining Hidden Markov Models and Artificial Neural Networks of phonemes for effective keyword spotting.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Consumer-level multimedia event detection through unsupervised audio signal modeling.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
2011
A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State Machines.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Bootstrapping a spoken language identification system using unsupervised integrated sensing and processing decision trees.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011
2010
Penalized Logistic Regression With HMM Log-Likelihood Regressors for Speech Recognition.
IEEE Trans. Speech Audio Process., 2010
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Experimental studies on continuous speech recognition using neural architectures with "adaptive" hidden activation functions.
Proceedings of the IEEE International Conference on Acoustics, 2010
2009
A study on integrating acoustic-phonetic information into lattice rescoring for automatic speech recognition.
Speech Commun., 2009
Proceedings of the 10th International Society for Music Information Retrieval Conference, 2009
Exploring universal attribute characterization of spoken languages for spoken language recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the IEEE International Conference on Acoustics, 2008
2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
High-Accuracy Phone Recognition By Combining High-Performance Lattice Generation and Knowledge Based Rescoring.
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007
2006
Riconoscimento del parlato basato su tecniche di soppressione del rumore o di integrazione della conoscenza articolatoria.
PhD thesis, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the 2006 International Conference on Parallel Processing Workshops (ICPP Workshops 2006), 2006
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
Noise Robust Aurora-2 Speech Recognition Employing a Codebook-Constrained Kalman Filter Preprocessor.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
2005
Application of EαNets to Feature Recognition of Articulation Manner in Knowledge-Based Automatic Speech Recognition.
Proceedings of the Neural Nets, 16th Italian Workshop on Neural Nets, 2005
Proceedings of the Embedded Software and Systems, Second International Conference, 2005
2004
Proceedings of the Biological and Artificial Intelligence Environments, 2004
Proceedings of the 2004 Euromicro Symposium on Digital Systems Design (DSD 2004), Architectures, Methods and Tools, 31 August, 2004
2002
MIP: A New Hybrid Multi-Agent Architecture for the Coordination of a Robot Colony Activities.
Proceedings of the 15th European Conference on Artificial Intelligence, 2002