Wei-Ning Hsu

Proceedings of the Interspeech 2020, 2020

Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech.

David Harwath

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling.

CoRR, 2019

Transfer Learning from Audio-Visual Grounding to Speech Recognition.

David Harwath

Proceedings of the Interspeech 2019, 2019

An Unsupervised Autoregressive Model for Speech Representation Learning.

Proceedings of the Interspeech 2019, 2019

Hierarchical Generative Modeling for Controllable Speech Synthesis.

Proceedings of the 7th International Conference on Learning Representations, 2019

Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Semi-supervised Training for Improving Data Efficiency in End-to-end Speech Synthesis.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

2018

Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data.

CoRR, 2018

Unsupervised Representation Learning of Speech for Dialect Identification.

Suwon Shon

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

A Study of Enhancement, Augmentation and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition.

Proceedings of the Interspeech 2018, 2018

Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition.

Hao Tang

Proceedings of the Interspeech 2018, 2018

Scalable Factorized Hierarchical Variational Autoencoder Training.

Proceedings of the Interspeech 2018, 2018

A Noise-Robust Self-Adaptive Multitarget Speaker Detection System.

Proceedings of the 24th International Conference on Pattern Recognition, 2018

Extracting Domain Invariant Features by Unsupervised Learning for Robust Automatic Speech Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017

Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Learning Latent Representations for Speech Generation and Transformation.

Proceedings of the Interspeech 2017, 2017

Automatic speech recognition of Arabic multi-genre broadcast media.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

Recurrent Neural Network Encoder with Attention for Community Question Answering.

CoRR, 2016

A prioritized grid long short-term memory RNN for speech recognition.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Development of the MIT ASR system for the 2016 Arabic Multi-genre Broadcast Challenge.

Tuka Al Hanai

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

SLS at SemEval-2016 Task 3: Neural-based Approaches for Ranking in Community Question Answering.

Proceedings of the 10th International Workshop on Semantic Evaluation, 2016

Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition.

Proceedings of the Interspeech 2016, 2016

Neural Attention for Learning to Rank Questions in Community Question Answering.

Salvatore Romeo

Giovanni Da San Martino

Alberto Barrón-Cedeño

Proceedings of the COLING 2016, 2016

2015

Enhancing automatically discovered multi-level acoustic patterns considering context consistency with applications in spoken term detection.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

Active Learning by Learning.
