Siddharth Sigtia

Erik Marchi

CoRR, January, 2025

2024

A Multimodal Approach to Device-Directed Speech Detection with Large Language Models.

Dominik Wagner

Panayiotis G. Georgiou

Matt Mirsamadi

Aarshee Mishra

Erik Marchi

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models.

Dominik Wagner

Panayiotis G. Georgiou

Matt Mirsamadi

Aarshee Mishra

Erik Marchi

CoRR, 2023

2022

Improving Voice Trigger Detection with Metric Learning.

Varun Lakshminarasimhan

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Streaming Transformer for Hardware Efficient Voice Trigger Detection and False Trigger Mitigation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Progressive Voice Trigger Detection: Accuracy vs Latency.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020

Hybrid Transformer/CTC Networks for Hardware Efficient Voice Triggering.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Multi-Task Learning for Speaker Verification and Voice Trigger Detection.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Multi-Task Learning for Voice Trigger Detection.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

2018

Efficient Voice Trigger Detection for Low Resource Hardware.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Generalised Discriminative Transform via Curriculum Learning for Speaker Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017

Neural networks for analysing music and environmental audio.

PhD thesis, 2017

Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

2016

Automatic Environmental Sound Recognition: Performance Versus Computational Cost.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

An End-to-End Neural Network for Polyphonic Piano Music Transcription.

Emmanouil Benetos

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Fully Deep Neural Networks Incorporating Unsupervised Feature Learning for Audio Tagging.

CoRR, 2016

Learning to Generate Genotypes with Neural Networks.

Chrisantha Fernando

CoRR, 2016

2015

An End-to-End Neural Network for Polyphonic Music Transcription.

Emmanouil Benetos

CoRR, 2015

Chime-home: A dataset for sound source recognition in a domestic environment.

Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015

Audio Chord Recognition with a Hybrid Recurrent Neural Network.

Nicolas Boulanger-Lewandowski

Proceedings of the 16th International Society for Music Information Retrieval Conference, 2015

A hybrid recurrent neural network for music transcription.

Nicolas Boulanger-Lewandowski

Emmanouil Benetos

Tillman Weyde

Artur S. d'Avila Garcez

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

2014

A Denoising Autoencoder that Guides Stochastic Search.

Chrisantha Fernando

CoRR, 2014

An RNN-based Music Language Model for Improving Automatic Music Transcription.

Artur S. d'Avila Garcez

Proceedings of the 15th International Society for Music Information Retrieval Conference, 2014

Improved music feature learning with deep neural networks.
