Ron Hoory

ORCID: 0009-0006-1327-5160

According to our database, Ron Hoory authored at least 48 papers between 1994 and 2024.

Creating an African American-Sounding TTS: Guidelines, Technical Challenges, and Surprising Evaluations.
Proceedings of the 29th International Conference on Intelligent User Interfaces, 2024

Speak While You Think: Streaming Speech Synthesis During Text Generation.
Proceedings of the IEEE International Conference on Acoustics, 2024

Modeling Turn-Taking in Human-To-Human Spoken Dialogue Datasets Using Self-Supervised Features.
Proceedings of the IEEE International Conference on Acoustics, 2023

Speech Emotion Recognition Using Self-Supervised Features.
Proceedings of the IEEE International Conference on Acoustics, 2022

A New Data Augmentation Method for Intent Classification Enhancement and its Application on Spoken Conversation Datasets.
Proceedings of the IEEE International Conference on Acoustics, 2022

Speaker Normalization for Self-Supervised Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

Towards A Common Speech Analysis Engine.
Proceedings of the IEEE International Conference on Acoustics, 2022

An autonomous debating system.
Nat., 2021

RNN Transducer Models for Spoken Language Understanding.
Proceedings of the IEEE International Conference on Acoustics, 2021

Principal Style Components: Expressive Style Control and Cross-Speaker Transfer in Neural TTS.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Siamese X-Vector Reconstruction for Domain Adapted Speaker Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End Spoken Language Understanding Without Full Transcripts.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

New Advances in Speaker Diarization.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Leveraging Unpaired Text Data for Training End-To-End Speech-to-Intent Systems.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

High Quality, Lightweight and Adaptable TTS Using LPCNet.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Neural TTS Voice Conversion.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

The IBM Virtual Voice Creator.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Word Emphasis Prediction for Expressive Text to Speech.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Efficient Emotion Recognition from Speech Using Deep Learning on Spectrograms.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Weakly-Supervised Phrase Assignment from Text in a Speech-Synthesis System Using Noisy Labels.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Voice-transformation-based data augmentation for prosodic classification.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Using continuous lexical embeddings to improve symbolic-prosody prediction in a text-to-speech front-end.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Using deep bidirectional recurrent neural networks for prosodic-target prediction in a unit-selection text-to-speech system.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Fusion of voice signal information for detection of mild laryngeal pathology.
Appl. Soft Comput., 2014

Speech-based automatic and robust detection of very early dementia.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Prosody contour prediction with long short-term memory, bi-directional, deep recurrent neural networks.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Exploring modulation spectrum features for speech-based depression level classification.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Multi-modal biometrics for mobile authentication.
Proceedings of the IEEE International Joint Conference on Biometrics, Clearwater, 2014

F0 contour prediction with a deep belief network-Gaussian process hybrid model.
Proceedings of the IEEE International Conference on Acoustics, 2013

Towards automatic phonetic segmentation for TTS.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Towards Goat Detection in Text-Dependent Speaker Verification.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Improved Spoken Query Transcription Using Co-Occurrence Information.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

New Developments in Voice Biometrics for User Authentication.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Speech processing and retrieval in a personal memory aid system for the elderly.
Proceedings of the IEEE International Conference on Acoustics, 2011

The IBM Submission to the 2008 Text-to-Speech Blizzard Challenge.
Proceedings of the Blizzard Challenge 2008, 2008

Spoken document retrieval from call-center conversations.
SIGIR 2006: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2006

High Quality Sinusoidal Modeling of Wideband Speech for the Purposes of Speech Synthesis and Modification.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

The IBM Submission to the 2006 Blizzard Text-to-Speech Challenge.
Proceedings of the Blizzard Challenge 2006, Pittsburgh, PA, USA, September 16, 2006

Small footprint concatenative text-to-speech synthesis system using complex spectral envelope modeling.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Automatic analysis of call-center conversations.
Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management, Bremen, Germany, October 31, 2005

The ETSI extended distributed speech recognition (DSR) standards: client side processing and tonal language recognition evaluation.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

The ETSI extended distributed speech recognition (DSR) standards: server-side speech reconstruction.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Reducing the footprint of the IBM trainable speech synthesis system.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Efficient periodicity extraction based on sine-wave representation and its application to pitch determination of speech signals.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Conversational networking: conversational protocols for transport, coding, and control.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Speech reconstruction from mel frequency cepstral coefficients and pitch frequency.
Proceedings of the IEEE International Conference on Acoustics, 2000

Low bit rate speech compression for playback in speech recognition systems.
Proceedings of the 10th European Signal Processing Conference, 2000

Speech synthesis for a specific speaker based on a labeled speech database.
Proceedings of the 12th IAPR International Conference on Pattern Recognition, 1994
