Dhananjaya Gowda

Bajibabu Bollepalli

Sudarsana Reddy Kadiri

IEEE Access, 2021

Streaming End-to-End Speech Recognition with Jointly Trained Neural Feature Enhancement.

Proceedings of the IEEE International Conference on Acoustics, 2021

Neural Utterance Confidence Measure for RNN-Transducers and Two Pass Models.

Proceedings of the IEEE International Conference on Acoustics, 2021

Comparative Study of Different Tokenization Strategies for Streaming End-to-End ASR.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

A Comparison of Streaming Models and Data Augmentation Methods for Robust Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Semi-Supervised Transfer Learning for Language Expansion of End-to-End Speech Recognition Models to Low-Resource Languages.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Voice to Action: Spoken Language Understanding for Memory-Constrained Systems.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

HiTNet: Byte-to-BPE Hierarchical Transcription Network for End-to-End Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Two-Pass End-to-End ASR Model Compression.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Time-Varying Quasi-Closed-Phase Analysis for Accurate Formant Tracking in Speech Signals.

Sudarsana Reddy Kadiri

Brad H. Story

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Utterance Confidence Measure for End-to-End Speech Recognition with Applications to Distributed Speech Recognition Scenarios.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Utterance Invariant Training for Hybrid Two-Pass End-to-End Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Streaming On-Device End-to-End ASR System for Privacy-Sensitive Voice-Typing.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Hierarchical Multi-Stage Word-to-Grapheme Named Entity Corrector for Automatic Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

A Review of On-Device Fully Neural End-to-End Automatic Speech Recognition Algorithms.

Proceedings of the 54th Asilomar Conference on Signals, Systems, and Computers, 2020

2019

Improved Vocal Tract Length Perturbation for a State-of-the-Art End-to-End Speech Recognition System.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multi-Task Multi-Resolution Char-to-BPE Cross-Attention Decoder for End-to-End Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End Training of a Large Vocabulary End-to-End Speech Recognition System.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Power-Law Nonlinearity with Maximally Uniform Distribution Criterion for Improved Neural Network Training in Automatic Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Attention Based On-Device Streaming Speech Recognition with Large Speech Corpus.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Improved Multi-Stage Training of Online Attention-Based Encoder-Decoder Models.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Speaker recognition from whispered speech: A tutorial survey and an application of time-varying linear prediction.

Speech Commun., 2018

2017

Time-Varying Autoregressions for Speaker Verification in Reverberant Conditions.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

Whispered Speech Detection Using Fusion of Group-Delay-Based Subband Modulation Spectrum and Correntropy Features.

IEEE Signal Process. Lett., 2016

Time-Varying Quasi-Closed-Phase Weighted Linear Prediction Analysis of Speech for Accurate Formant Detection and Tracking.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Quasi closed phase analysis of speech signals using time varying weighted linear prediction for accurate formant tracking.

Manu Airaksinen

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Vowel Enhancement in Early Stage Spanish Esophageal Speech Using Natural Glottal Flow Pulse and Vocal Tract Frequency Warping.

Rizwan Ishaq

Begonya Garcia-Zapirain

Proceedings of the 6th Workshop on Speech and Language Processing for Assistive Technologies, 2015

AM-FM based filter bank analysis for estimation of spectro-temporal envelopes and its application for speaker recognition in noisy reverberant environments.

Rahim Saeidi

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014

On the role of missing data imputation and NMF feature enhancement in building synthetic voices using reverberant speech.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

The Simple4All entry to the Blizzard Challenge 2014.

Proceedings of the Blizzard Challenge 2014, Singapore, Singapore, September 19, 2014, 2014

2013

Spectro-temporal analysis of speech signals using zero-time windowing and group delay function.

Bayya Yegnanarayana

Speech Commun., 2013

Analysis of Acoustic Events in Speech Signals Using Bessel Series Expansion.

Chetana Prakash

Circuits Syst. Signal Process., 2013

Robust spectral representation using group delay function and stabilized weighted linear prediction for additive noise degradations.

Proceedings of the 7th Conference on Speech Technology and Human-Computer Dialogue, 2013

Robust formant detection using group delay function and stabilized weighted linear prediction.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Analysis of breathy, modal and pressed phonation based on low frequency spectral density.

Mikko Kurimo

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

Effect of Tongue Tip Trilling on the Glottal Excitation Source.

Vinay Kumar Mittal

Bayya Yegnanarayana

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011

Exploring Bessel Features for Detection of Glottal Closure Instants.

Chetana Prakash

Anand Joseph Xavier Medabalimi

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Decomposition of speech signals for analysis of aperiodic components of excitation.

Bayya Yegnanarayana

Proceedings of the IEEE International Conference on Acoustics, 2011

Acoustic-phonetic information from excitation source for refining manner hypotheses of a phone recognizer.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Voiced/Nonvoiced Detection Based on Robustness of Voiced Epochs.

IEEE Signal Process. Lett., 2010

2008

Speaker change detection in casual conversations using excitation source features.

Speech Commun., 2008

Analysis of glottal stops in speech signals.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Features for automatic detection of voice bars in continuous speech.

S. Rajendran

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Video Shot Segmentation Using Late Fusion Technique.

C. Krishna Mohan

Proceedings of the Seventh International Conference on Machine Learning and Applications, 2008

2006

Correlation-Based Similarity Between Signals for Speaker Verification with Limited Amount of Speech Data.

Proceedings of the Multimedia Content Representation, 2006

2004

Speaker Segmentation Based on Subsegmental Features and Neural Network Models.

Sunitha Guruprasad